Monday, February 16, 2015

Sorting the Big Standardized Tests | Peter Greene




Sorting the Big Standardized Tests




Since the beginnings of the current wave of test-driven accountability, reformsters have been excited about stack ranking -- the process of sorting out items from the very best to the very worst (and then taking a chainsaw to the very worst).
This has been one of the major supporting points for continued large-scale standardized testing -- if we didn't have test results, how would we compare students to other students, teachers to other teachers, schools to other schools? The devotion to sorting has been foundational, rarely explained but generally presented as an article of faith, a self-evident value -- well, of course we want to compare and sort schools and teachers and students!
But you know what we still aren't sorting?
The big standardized tests.
Since last summer the rhetoric to pre-empt the assault on testing has focused on "unnecessary" or "redundant" or even "bad" tests, but we have done nothing to find these tests.
Where is our stack ranking for the tests?
We have two major BSTs -- the PARCC and the SBA. In order to better know how my child is doing (isn't that one of our repeated reasons for testing?), I'd like to know which one of these is a better test. There are other state-level BSTs that we're flinging at our students willy-nilly. Which one of these is the best? Which one is the worst?
I mean, we've worked tirelessly to sort and rank teachers in our efforts to root out the bad ones, because apparently "everybody" knows some teachers are bad. Well, apparently everybody knows some tests are bad, so why aren't we tracking them down, sorting them out, and publishing their low test ratings in the local paper?
We've argued relentlessly that I need to be able to compare my student's reading ability with the reading ability of Chris McNoname in Iowa, so why can't I compare the tests that each one is taking?
I realize that coming up with a metric would be really hard, but so what? We use value-added measures to sort out teachers, and those have been debunked by everyone except people who work for the USED. I think we've established that the sorting instrument doesn't have to be good or even valid -- it just has to generate some sort of rating.
So let's get on this. Let's come up with a stack-ranking method for sorting out the SBA and the PARCC and the Keystones and the Indiana Test of Essential Student Swellness and whatever else is out there. If we're going to rate every student and teacher and school, why would we not also rate the raters? And then once we've got the tests rated, we can throw out the bottom ten percent of them. We can offer a "merit bonus" to the company that made the best one (and peanuts to everyone else).