psychometrics

Psychometric Considerations for Learning Maps-Based Assessments

Learning map models are a type of cognitive model composed of multiple interconnected learning targets and other critical knowledge and skills. The Dynamic Learning Maps (DLM) Alternate Assessment System uses learning map models as the basis for …

Empirical methods for evaluating maps: Illustrations and results

This presentation is part of a coordinated session, Beyond Learning Progressions: Maps as Assessment Architecture. Learning progressions (LPs) are commonly used in educational assessments to identify interim steps on a pathway toward a grade-level target. LPs describe typical expected pathways, but may not represent the multiple pathways by which students develop knowledge in a domain. Another type of cognitive model, the learning map, is better suited to describing heterogeneous pathways that support learning for all students, including those with the most significant cognitive disabilities.

Measuring reliability of student mastery classification at multiple levels

As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting are needed. The purpose of this paper …

A hierarchical IRT model for identifying group-level aberrant growth

As cheating on high-stakes tests continues to threaten the validity of score interpretations, approaches for detecting cheating proliferate. Most research focuses on individual scores, but recent events show group-level cheating is also occurring. …

Using simulation to evaluate retest reliability of assessment results

As diagnostic assessment systems become more prevalent as large-scale operational assessments, consideration must be given to the method of reporting reliability. Alternatives to traditional reliability methods must be explored that are consistent …

Construct Irrelevance

Construct irrelevance, as the name might suggest, refers to measuring phenomena that are not included in the definition of the construct. This is generally considered to be one of the two biggest threats to the validity of an assessment, along with …

Visualizing different levels of compensation in multidimensional item response theory models

This graphic shows the probability of providing a correct response to an item in a multidimensional item response theory (MIRT) model. The colors represent the probability of a correct response, and the contours represent chunks of 10% probability …

A hierarchical IRT model for identifying group-level aberrant growth to detect cheating

As cheating on high-stakes tests continues to threaten the validity of score interpretations, approaches for detecting cheating proliferate. Most research focuses on individual scores, but recent events show group-level cheating is also occurring. …

Creating an R package for a reproducible workflow in educational assessment

The assessment of an individual’s knowledge, skills, or abilities is a fundamental aspect of any educational system or program. Without some form of assessment, it is impossible to know whether or not an individual has gained the necessary skills for the …