
Machine Learning for Social and Behavioral Research

Ross Jacobucci, Kevin J. Grimm, and Zhiyong Zhang

Hardcover
July 17, 2023
ISBN 9781462552931
Price: $93.00
416 Pages
Size: 7" x 10"
Paperback
July 11, 2023
ISBN 9781462552924
Price: $62.00
416 Pages
Size: 7" x 10"
e-book
May 30, 2023
PDF
Price: $62.00
416 Pages
print + e-book
Paperback + e-Book (PDF)
Price: $74.40 (list price $124.00)
416 Pages

Today's social and behavioral researchers increasingly need to know: “What do I do with all this data?” This book provides the skills needed to analyze and report large, complex data sets using machine learning tools, and to understand published machine learning articles. Techniques are demonstrated using actual data (Big Five Inventory, early childhood learning, and more), with a focus on the interplay of statistical algorithm, data, and theory. The book also emphasizes the identification of heterogeneity, measurement error, regularization, and decision trees. It covers basic principles as well as a range of methods for analyzing univariate and multivariate data (factor analysis, structural equation models, and mixed-effects models). Analysis of text and social network data is also addressed. End-of-chapter “Computational Time and Resources” sections include discussions of key R packages; the companion website provides R programming scripts and data for the book's examples.

This title is part of the Methodology in the Social Sciences Series, edited by Todd D. Little, PhD.


“Current, highly informative, and useful, this is a 'go-to' book for social science graduate students, faculty, and practitioners seeking a strong introduction to machine learning. Unlike typical, more technical machine learning books, this one is unique in providing the strong psychological measurement guidance required to apply these techniques most appropriately. It walks the reader through general principles of machine learning, regression- and tree-based predictive models, text- and network-based methods of clustering, and—most innovatively—machine learning–based psychometric approaches (CFA and SEM).”

—Fred Oswald, PhD, Professor and Herbert S. Autrey Chair in Social Sciences, Department of Psychological Sciences, Rice University


“This book is very timely. Social scientists need to be educated about the pros and cons of machine learning methods and about how, when, and why these methods can be applied to their research topics. The book describes key techniques in enough detail to enable readers to subsequently digest more specialized journal articles or software applications, but not in so much detail as to lose momentum.”

—Sonya K. Sterba, PhD, Department of Psychology and Human Development, Vanderbilt University


“Jacobucci, Grimm, and Zhang's ambitious book takes the reader on an in-depth tour of machine learning methods. Its strength is that the authors link machine learning to more traditional topics of regression, structural equation modeling, factor analysis, and network analysis methods. This book should be required reading for the new generation of psychology graduate students who are interested in more advanced quantitative methods.”

—James W. Pennebaker, PhD, Regents Centennial Professor of Liberal Arts and Professor of Psychology, The University of Texas at Austin


“A 'must read' for social scientists who want to familiarize themselves with machine learning but don’t know where to start. Understanding the practices and principles of machine learning is fundamental to modern data analysis. Many social scientists will be surprised by how well their traditional statistical training has prepared them to grasp the material in the book.”

—Alexander Christensen, PhD, Department of Psychology and Human Development, Vanderbilt University

Table of Contents

I. Fundamental Concepts

1. Introduction

- Why the Term Machine Learning?

- Why do We Need Machine Learning?

- How is this Book Different?

- Definitions

- Software

- Datasets

2. The Principles of Machine Learning Research

- Overview

- Principle #1: Machine Learning is Not Just Lazy Induction

- Principle #2: Orienting Our Goals Relative to Prediction, Explanation, and Description

- Principle #3: Labeling a Study as Exploratory or Confirmatory is too Simplistic

- Principle #4: Report Everything

- Summary

3. The Practices of Machine Learning

- Comparing Algorithms and Models

- Model Fit

- Bias-Variance Tradeoff

- Resampling

- Classification

- Conclusion

II. Algorithms for Univariate Outcomes

4. Regularized Regression

- Linear Regression

- Logistic Regression

- Regularization

- Rationale for Regularization

- Alternative Forms of Regularization

- Bayesian Regression

- Summary

5. Decision Trees

- Introduction

- Decision Tree Algorithms

- Miscellaneous Topics

6. Ensembles

- Bagging

- Random Forests

- Gradient Boosting

- Interpretation

- Empirical Example

- Important Notes

- Summary

III. Algorithms for Multivariate Outcomes

7. Machine Learning and Measurement

- Defining Measurement Error

- Impact of Measurement Error

- Assessing Measurement Error

- Weighting

- Alternative Methods

- Summary

8. Machine Learning and Structural Equation Modeling

- Latent Variables as Predictors

- Predicting Latent Variables

- Using Latent Variables as Outcomes and Predictors

- Can Regularization Improve Generalizability in SEM?

- Nonlinear Relationships and Latent Variables

- Summary

9. Machine Learning with Mixed-Effects Models

- Mixed-Effects Models

- Machine Learning with Clustered Data

- Regularization with Mixed-Effects Models

- Illustrative Example

- Additional Strategies for Mining Longitudinal Data

- Summary

10. Searching for Groups

- Finite Mixture Model

- Structural Equation Model Trees

- Summary

IV. Alternative Data Types

11. Introduction to Text Mining

- Key Terminology

- Data

- Basic Text Mining

- Text Data Preprocessing

- Basic Analysis of the Teaching Comment Data

- Sentiment Analysis

- Topic Models

- Summary

12. Introduction to Social Network Analysis

- Network Visualization

- Network Statistics

- Basic Network Analysis

- Network Modeling

- Summary

References


About the Authors

Ross Jacobucci, PhD, is Assistant Professor in Quantitative Psychology in the Department of Psychology at the University of Notre Dame. His research interests include the development and application of machine learning for clinical research, with a focus on suicide and nonsuicidal self-injury. Dr. Jacobucci is an active developer of open-source software for the R statistical environment, with five packages that implement some form of machine learning. His website is www.rjacobucci.com.

Kevin J. Grimm, PhD, is Professor of Psychology at Arizona State University. His research interests include multivariate methods for the analysis of change, multiple group and latent class models for understanding divergent developmental processes, nonlinearity in development, machine learning techniques for psychological data, and mathematics and reading ability development. Dr. Grimm is a recipient of the Early Career Research Award and the Barbara Byrne Book Award (for Growth Modeling: Structural Equation and Multilevel Modeling Perspectives) from the Society of Multivariate Experimental Psychology.

Zhiyong Zhang, PhD, is Professor in Quantitative Psychology in the Department of Psychology at the University of Notre Dame, where he directs the Lab for Big Data Methodology. He has conducted research in the areas of Bayesian methods, structural equation modeling, longitudinal data analysis, and missing data and non-normal data analysis. His recent research involves the development of new methods and software for social network and text analysis. Dr. Zhang is the founding editor of the Journal of Behavioral Data Science. His website is https://bigdatalab.nd.edu.

Audience

Researchers, instructors, and graduate students in psychology, human development and family studies, education, management, sociology, social work, nursing, and public health.

Course Use

Will serve as a core text in courses on machine learning or data management/science, or as a supplement in advanced quantitative methods courses.