Psychometric Methods

Theory into Practice

Larry R. Price

Hardcover
December 12, 2016
ISBN 9781462524778
Price: $87.00
552 Pages
Size: 7" x 10"

e-book
December 13, 2016
Price: $87.00
552 Pages

print + e-book
Hardcover + e-Book (PDF)
Price: $95.70 (list $174.00)
552 Pages

Grounded in current knowledge and professional practice, this book provides up-to-date coverage of psychometric theory, methods, and interpretation of results. Essential topics include measurement and statistical concepts, scaling models, test design and development, reliability, validity, factor analysis, item response theory, and generalizability theory. Also addressed are norming and test equating, topics not typically covered in traditional psychometrics texts. Examples drawn from a dataset on intelligence testing are used throughout the book, elucidating the assumptions underlying particular methods and providing SPSS (or alternative) syntax for conducting analyses. The companion website presents datasets for all examples as well as PowerPoint slides of figures and key concepts. Pedagogical features include equation boxes with explanations of statistical notation, and end-of-chapter glossaries. The Appendix offers extensions of the topical chapters with example source code from SAS, SPSS, IRTPRO, BILOG-MG, PARSCALE, TESTFACT, and DIMTEST.

This title is part of the Methodology in the Social Sciences Series, edited by Todd D. Little, PhD.

“This book is suitable for emerging assessment professionals and practitioners who are interested in learning psychometrics but with little knowledge in statistics. It provides not only a theoretical foundation to the topics but also worked examples to highlight their practical applications. The syntax, output, and interpretations based on software programs like SPSS and BILOG-MG will help readers to bridge theory and methods with hands-on examples. This book would be a convenient toolbox for applied researchers who would like to conduct psychometric analyses, and it would also serve as a handbook for graduate students who study measurement and psychometrics.”


“This excellent book explores the basic concepts of psychometric knowledge and practice.... This would be a great addition to the libraries of graduate students and researchers.”

Doody's Review Service

“I like the book and it meshes well with what I plan to do in my course. I particularly like the generalizability theory, norms and test equating, scaling, and validation process chapters. It is very easy reading—I am planning to use the book next spring.”

—R. J. de Ayala, PhD, Chair of Educational Psychology, University of Nebraska–Lincoln

“I encourage all psychologists and educators to read this marvelous book. I learned a lot from reading it. The key terms are very useful, as are the chapter summaries. Readers of all levels will find material relevant to them, including SPSS code and GfGc datasets on intelligence that will be quite useful in trying out the ideas. I give this book my highest recommendation and think it will be a great classroom text.”

—John J. McArdle, PhD, Department of Psychology, University of Southern California

“This book is both comprehensive and accessible, laying the foundation for all the requisite skills needed to be both a successful consumer and producer of psychometrics. Scholars who are unfamiliar with measurement could easily teach themselves from this text, becoming quite proficient at psychometrics. There is excellent integration of quantitative statistics throughout, so that readers will be able not only to understand the psychometric concepts, but also to apply their knowledge. This is a useful text for a graduate-level Psychometric Methods or Measurement class.”

—Debbie L. Hahs-Vaughn, PhD, Department of Educational and Human Sciences, University of Central Florida

“An encyclopedia of psychometric issues—a real 'must have' for anyone teaching Tests and Measurement or Research Methods, or directing student research projects. The book's high level of detail makes it invaluable for any professional who works with, creates, or analyzes psychometric material. The use of intelligence testing data throughout the chapters helps bring cohesiveness.”

—John Wallace, PhD, Department of Psychological Science, Ball State University

“With vast expertise in psychometric instrument development, statistical applications, and research, Price has produced a theoretically informed, practical volume. Professionals in health-related fields will find this book extremely valuable for guidance in the development of rigorous instruments, such as patient-reported outcome measures. Featuring examples using a range of software, this text is ideal for graduate courses on measurement in schools of medicine, public health, nursing, or health professions.”

—Byron J. Gajewski, PhD, Department of Biostatistics, University of Kansas Medical Center

Table of Contents

1. Introduction

1.1 Psychological Measurement and Tests

1.2 Tests and Samples of Behavior

1.3 Types of Tests

1.4 Origin of Psychometrics

1.5 Definition of Measurement

1.6 Measuring Behavior

1.7 Psychometrics and Its Importance to Research and Practice

1.8 Organization of This Book

Key Terms and Definitions

2. Measurement and Statistical Concepts

2.1 Introduction

2.2 Numbers and Measurement

2.3 Properties of Measurement in Relation to Numbers

2.4 Levels of Measurement

2.5 Contemporary View on the Levels of Measurement and Scaling

2.6 Statistical Foundations for Psychometrics

2.7 Variables, Frequency Distributions, and Scores

2.8 Summation or Sigma Notation

2.9 Shape, Central Tendency, and Variability of Score Distributions

2.10 Correlation, Covariance, and Regression

2.11 Summary

Key Terms and Definitions

3. Criterion, Content, and Construct Validity

3.1 Introduction

3.2 Criterion Validity

3.3 Essential Elements of a High-Quality Criterion

3.4 Statistical Estimation of Criterion Validity

3.5 Correction for Attenuation

3.6 Limitations to Using the Correction for Attenuation

3.7 Estimating Criterion Validity with Multiple Predictors: Partial Correlation

3.8 Estimating Criterion Validity with Multiple Predictors: Higher-Order Partial Correlation

3.9 Coefficient of Multiple Determination and Multiple Correlation

3.10 Estimating Criterion Validity with More Than One Predictor: Multiple Linear Regression

3.11 Regression Analysis for Estimating Criterion Validity: Development of the Regression Equation

3.12 Unstandardized Regression Equation for Multiple Regression

3.13 Testing the Regression Equation for Significance

3.14 Partial Regression Slopes

3.15 Standardized Regression Equation

3.16 Predictive Accuracy of a Regression Analysis

3.17 Predictor Subset Selection in Regression

3.18 Summary

Key Terms and Definitions

4. Statistical Aspects of the Validation Process

4.1 Techniques for Classification and Selection

4.2 Discriminant Analysis

4.3 Multiple-Group Discriminant Analysis

4.4 Logistic Regression

4.5 Logistic Multiple Discriminant Analysis: Multinomial Logistic Regression

4.6 Model Fit in Logistic Regression

4.7 Content Validity

4.8 Limitations of the Content Validity Model

4.9 Construct Validity

4.10 Establishing Evidence of Construct Validity

4.11 Correlational Evidence of Construct Validity

4.12 Group Differentiation Studies of Construct Validity

4.13 Factor Analysis and Construct Validity

4.14 Multitrait–Multimethod Studies

4.15 Generalizability Theory and Construct Validity

4.16 Summary and Conclusions

Key Terms and Definitions

5. Scaling

5.1 Introduction

5.2 A Brief History of Scaling

5.3 Psychophysical versus Psychological Scaling

5.4 Why Scaling Models Are Important

5.5 Types of Scaling Models

5.6 Stimulus-Centered Scaling

5.7 Thurstone’s Law of Comparative Judgment

5.8 Response-Centered Scaling

5.9 Scaling Models Involving Order

5.10 Guttman Scaling

5.11 The Unfolding Technique

5.12 Subject-Centered Scaling

5.13 Data Organization and Missing Data

5.14 Incomplete and Missing Data

5.15 Summary and Conclusions

Key Terms and Definitions

6. Test Development

6.1 Introduction

6.2 Guidelines for Test and Instrument Development

6.3 Item Analysis

6.4 Item Difficulty

6.5 Item Discrimination

6.6 Point–Biserial Correlation

6.7 Biserial Correlation

6.8 Phi Coefficient

6.9 Tetrachoric Correlation

6.10 Item Reliability and Validity

6.11 Standard Setting

6.12 Standard-Setting Approaches

6.13 The Nedelsky Method

6.14 The Ebel Method

6.15 The Angoff Method and Modifications

6.16 The Bookmark Method

6.17 Summary and Conclusions

Key Terms and Definitions

7. Reliability

7.1 Introduction

7.2 Conceptual Overview

7.3 The True Score Model

7.4 Probability Theory, True Score Model, and Random Variables

7.5 Properties and Assumptions of the True Score Model

7.6 True Score Equivalence, Essential True Score Equivalence, and Congeneric Tests

7.7 Relationship between Observed and True Scores

7.8 The Reliability Index and Its Relationship to the Reliability Coefficient

7.9 Summarizing the Ways to Conceptualize Reliability

7.10 Reliability of a Composite

7.11 Coefficient of Reliability: Methods of Estimation Based on Two Occasions

7.12 Methods Based on a Single Testing Occasion

7.13 Estimating Coefficient Alpha: Computer Programs and Example Data

7.14 Reliability of Composite Scores Based on Coefficient Alpha

7.15 Reliability Estimation Using the Analysis of Variance Method

7.16 Reliability of Difference Scores

7.17 Application of the Reliability of Difference Scores

7.18 Errors of Measurement and Confidence Intervals

7.19 Standard Error of Measurement

7.20 Standard Error of Prediction

7.21 Summarizing and Reporting Reliability Information

7.22 Summary and Conclusions

Key Terms and Definitions

8. Generalizability Theory

8.1 Introduction

8.2 Purpose of Generalizability Theory

8.3 Facets of Measurement and Universe Scores

8.4 How Generalizability Theory Extends Classical Test Theory

8.5 Generalizability Theory and Analysis of Variance

8.6 General Steps in Conducting a Generalizability Theory Analysis

8.7 Statistical Model for Generalizability Theory

8.8 Design 1: Single-Facet Person by Item Analysis

8.9 Proportion of Variance for the p x i Design

8.10 Generalizability Coefficient and CTT Reliability

8.11 Design 2: Single-Facet Crossed Design with Multiple Raters

8.12 Design 3: Single-Facet Design with the Same Raters on Multiple Occasions

8.13 Design 4: Single-Facet Nested Design with Multiple Raters

8.14 Design 5: Single-Facet Design Multiple Raters Rating on Two Occasions

8.15 Standard Errors of Measurement: Designs 1–5

8.16 Two-Facet Designs

8.17 Summary and Conclusions

Key Terms and Definitions

9. Factor Analysis

9.1 Introduction

9.2 Brief History

9.3 Applied Example with GfGc Data

9.4 Estimating Factors and Factor Loadings

9.5 Factor Rotation

9.6 Correlated Factors and Simple Structure

9.7 The Factor Analysis Model, Communality, and Uniqueness

9.8 Components, Eigenvalues, and Eigenvectors

9.9 Distinction between Principal Components Analysis and Factor Analysis

9.10 Confirmatory Factor Analysis

9.11 Confirmatory Factor Analysis and Structural Equation Modeling

9.12 Conducting Factor Analysis: Common Errors to Avoid

9.13 Summary and Conclusions

Key Terms and Definitions

10. Item Response Theory

10.1 Introduction

10.2 How IRT Differs from CTT

10.3 Introduction to IRT

10.4 Strong True Score Theory, IRT, and CTT

10.5 Philosophical Views on IRT

10.6 Conceptual Explanation of How IRT Works

10.7 Assumptions of IRT Models

10.8 Test Dimensionality and IRT

10.9 Type of Correlation Matrix to Use in Dimensionality Analysis

10.10 Dimensionality Assessment Specific to IRT

10.11 Local Independence of Items

10.12 The Invariance Property

10.13 Estimating the Joint Probability of Item Responses Based on Ability

10.14 Item and Ability Information and the Standard Error of Ability

10.15 Item Parameter and Ability Estimation

10.16 When Traditional IRT Models Are Inappropriate to Use

10.17 The Rasch Model

10.18 The Rasch Model, Linear Models, and Logistic Regression Models

10.19 Properties and Results of a Rasch Analysis

10.20 Item Information for the Rasch Model

10.21 Data Layout

10.22 One-Parameter Logistic Model for Dichotomous Item Responses

10.23 Two-Parameter Logistic Model for Dichotomous Item Responses

10.24 Item Information for the Two-Parameter Model

10.25 Three-Parameter Logistic Model for Dichotomous Item Responses

10.26 Item Information for the Three-Parameter IRT Model

10.27 Choosing a Model: A Model Comparison Approach

10.28 Summary and Conclusions

Key Terms and Definitions

11. Norms and Test Equating

11.1 Introduction

11.2 Norms, Norming, and Norm-Referenced Testing

11.3 Planning a Norming Study

11.4 Scaling and Scale Scores

11.5 Standard Scores Under Linear Transformation

11.6 Percentile Rank Scale

11.7 Interpreting Percentile Ranks

11.8 Normalized z- or Scale Scores

11.9 Common Standard Score Transformations or Conversions

11.10 Age- and Grade-Equivalent Scores

11.11 Test Score Linking and Equating

11.12 Techniques for Conducting Equating: Linear Methods

11.13 Design I: Random Groups—One Test Administered to Each Group

11.14 Design II: Random Groups with Both Tests Administered to Each Group, Counterbalanced (Equally Reliable Tests)

11.15 Design III: One Test Administered to Each Study Group, Anchor Test Administered to Both Groups (Equally Reliable Tests)

11.16 Equipercentile Equating

11.17 Test Equating Using IRT

11.18 IRT True Score Equating

11.19 Observed Score, True Score, and Ability

11.20 Summary and Conclusions

Key Terms and Definitions

Appendix. Mathematical and Statistical Foundations


Author Index

Subject Index

About the Author

About the Author

Larry R. Price, PhD, is Professor of Psychometrics and Statistics at Texas State University, where he is also Director of Methodology, Measurement, and Statistical Analysis. This university-wide role involves conceptualizing and writing the analytic segments of large-scale competitive grant proposals in collaboration with interdisciplinary research teams. Previously, he served as a psychometrician and statistician at the Emory University School of Medicine (Department of Psychiatry and Behavioral Sciences) and at the Psychological Corporation (now part of Pearson's Clinical Assessment Group). Dr. Price is a Fellow of the American Psychological Association, Division 5 (Evaluation, Measurement, and Statistics), and an Accredited Professional Statistician of the American Statistical Association.


Audience

Behavioral researchers; testing and assessment professionals; graduate students and instructors in psychology, neuroscience, education, management, sociology, and public health.

Course Use

Will serve as a text in graduate-level courses such as Psychometric Methods, Measurement Theory, and Tests and Measurement.

The first printing included some errors, which are listed in the errata sheet. These errors will be corrected in future printings.