Latest News and Comment from Education

Monday, July 13, 2015

Pennsylvania Manufactures Massive Test Failure | Peter Greene

Pennsylvania Manufactures Massive Test Failure

Brace yourself, Pennsylvanians. The new cut scores for last year's Big Standardized Tests have been set, and they are not pretty.
It was only this week the State Board of Education met to accept the recommendations of their Council of Basic Education. Because, yes -- cut scores are set after test results are in, not before. You'll see why shortly.
A source at those meetings passed along some explanation of how all this works. We'll get to the bad news in a minute, but first -- here's how we get there.
How Are Scores Set?
In PA, when it comes to ranking students, we stick with good, old-fashioned Below Basic, Basic, Proficient and Advanced. The cut scores -- the scores that decide where we draw the line between those designations -- come from two groups.
First, we have the Bookmark Participants. The bookmark participants are educators who take a look at the actual test questions and consider the Performance Level Descriptors, a set of guidelines that basically say "A proficient kid can do the following things." These "have been in place since 1999," which doesn't really tell us whether they've ever been revised or not. According to the state's presentation:
By using their content expertise in instruction, curriculum, and the standards, educators made recommendations about items that distinguished between performance levels (eg Basic/Proficient) using the Performance Level Descriptors. When educators came to an item with which students had difficulty, they would place a bookmark on that question. 
In other words, this group set dividing lines between levels of proficiency in the way that would kind of make sense -- Advanced students can do X, Y, and Z, while Basic students can at least do X. (It's interesting to note that, as with a classroom test, this approach doesn't really get you a cut score until you fiddle with the proportion of items on the test. In other words, if I have a test that's all items about X, every student gets a 95 percent, but if I have a test that's all Z, only the proficient kids so much as pass. Makes you wonder who decides how much of what to put on the Big Standardized Test and how they decide it.)
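To make the item-mix point concrete, here is a minimal sketch. It is not anything the state or DRC uses; it is an idealized toy in which a student answers every item within their skill set and misses every other item. The skill sets, the middle "Proficient" level, and the item counts are all assumptions made up for illustration.

```python
# Hypothetical skill sets per performance level (assumed for illustration;
# the post only says Advanced can do X, Y, Z and Basic can at least do X).
SKILLS = {
    "Basic": {"X"},
    "Proficient": {"X", "Y"},
    "Advanced": {"X", "Y", "Z"},
}


def percent_correct(level, item_mix):
    """Percent of items a student at `level` gets right.

    `item_mix` maps an item type ("X", "Y", "Z") to how many such items
    appear on the test. Idealized: a student gets an item right if and
    only if its type is in their skill set.
    """
    total = sum(item_mix.values())
    correct = sum(n for kind, n in item_mix.items() if kind in SKILLS[level])
    return 100 * correct / total


def score_table(item_mix):
    """Percent correct for every level on a test with the given item mix."""
    return {level: percent_correct(level, item_mix) for level in SKILLS}


all_x = score_table({"X": 20})                   # everybody aces it
all_z = score_table({"Z": 20})                   # only the top group scores
mixed = score_table({"X": 10, "Y": 6, "Z": 4})   # scores spread out

for name, table in [("all X", all_x), ("all Z", all_z), ("mixed", mixed)]:
    print(name, table)
```

Same students, same skills, three very different score distributions. The only thing that changed is how many of each kind of item went on the test.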
Oh, and where do the committee members come from? My friend clarifies:
The cut score panelists were a group that answered an announcement on the Data Recognition website, who were then selected by PDE staff. 
DRC is the company that runs testing in PA, so they get to select the folks who will score their work. We also add a couple of outside experts. One of the outside "experts" was from the National Center for the Improvement of Educational Assessment, one more group that thanks the Gates Foundation for support.
This means that everyone in the room for this process is a person who, in case Something Bad shows up, is predisposed to believe that the problem couldn't possibly be the test.

But Wait -- That's Not All

But if we set cut scores based on difficulty of various items on the Big Standardized Test, why can't we set cut scores before the test is even given? Why do we wait until after the tests have been administered and scored?
One reason might be that setting the curve after the results are in guarantees failure. Every student in PA could do better than 95 percent, and the Board could still declare, "Those kids who got a measly 95 percent are Below Basic. They suck, their teachers suck, and their school needs to be shut down."
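Here is a small sketch of that hypothetical. It does not describe how Pennsylvania actually computes its cut scores; it only shows what happens if cuts are picked from the results themselves. The function names, the score list, and the 20/50/80 percent shares are all invented for illustration.

```python
LEVELS = ("Below Basic", "Basic", "Proficient", "Advanced")


def cuts_from_results(scores, percent_below):
    """Pick cut scores AFTER seeing results (hypothetical procedure).

    For each entry p in `percent_below`, the cut is the score that
    roughly p percent of test-takers fall below.
    """
    ranked = sorted(scores)
    return [ranked[p * len(ranked) // 100] for p in percent_below]


def label(score, cuts):
    """Performance level = number of cut scores the student meets or beats."""
    return LEVELS[sum(score >= cut for cut in cuts)]


# Every student scores 95 percent or better (made-up data).
scores = [95.0, 95.5, 96.0, 96.5, 97.0, 97.5, 98.0, 98.5, 99.0, 99.5]

# Decide up front that 20% land below the first cut, 50% below the
# second, and 80% below the third, whatever the scores turn out to be.
cuts = cuts_from_results(scores, percent_below=(20, 50, 80))
labels = [label(s, cuts) for s in scores]

for s, lvl in zip(scores, labels):
    print(s, lvl)
```

With cuts drawn from the results, the share of students in each bin is settled before anyone looks at how well they actually did. In this toy run the two students with 95 and 95.5 percent come out Below Basic.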
But we should also note that the Bookmark Group is not the end of the line. Their recommendations go on to the Review Committee, and according to the state's [...]