Who Put the "Stakes" In "High-Stakes Testing"?
"Evidence of validity, reliability, and fairness for each purpose for which a test is used in a program evaluation, policy study, or accountability system should be collected and made available." (Standard 13.4, p. 210, emphasis mine)
This statement is well worth unpacking, because it sits right at the heart of the ongoing debate about "high-stakes testing" and, therefore, influences even the current presidential race.
A core principle of psychometrics is that the evaluation of tests can't be separated from the evaluation of how their outcomes will be used. As Samuel Messick, one of the key figures in the field, put it:
"Hence, what is to be validated is not the test or observation device as such but the inferences derived from test scores or other indicators -- inferences about score meaning or interpretation and about the implications for action that the interpretation entails." [1] (emphasis mine)
He continues:
"Validity always refers to the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions based on test scores." [1] (emphasis mine)
I'm highlighting "actions" here because my point is this: You can't fully judge a test without considering what will be done with the results.
To be clear: I'm not saying items on tests, test forms, grading rubrics, scaling procedures,
CONTINUE READING: Jersey Jazzman: Who Put the "Stakes" In "High-Stakes Testing"?