Despite Progress, the ‘Charade’ of High-Stakes Testing Persists

The proliferation of high-stakes testing is most often associated with No Child Left Behind (NCLB), the national education law that was in place for more than a decade. With good reason – the law imposed a crushing accountability regime that turned many of our schools into test prep factories and corrupted what it meant to teach and to learn. But test-based accountability was well-established long before NCLB was signed into law in 2002. And it persists today, two years after the law was replaced by the Every Student Succeeds Act (ESSA), says Daniel Koretz, author of The Testing Charade: Pretending to Make Schools Better.

While the campaign led by educators, parents and students against overtesting has helped bring about real improvements, Koretz, professor of education at the Harvard Graduate School of Education, cautions that too many policymakers refuse to give up their “blind reliance” on high-stakes tests – to the detriment of students, schools and the teaching profession. “Test-based accountability has become an end in itself in American education,” he writes in The Testing Charade, “unmoored from clear thinking about what should be measured, how it should be measured, or how testing can fit into a rational plan for evaluating and improving our schools.” Koretz recently spoke with NEA Today.

Test-based accountability didn’t begin with NCLB. When did this shift begin and what were the factors that triggered it?

Daniel Koretz: This started at least as early as minimum competency testing in the 1970s and “measurement-driven instruction” in the 1980s. However, looking back, what we called “high-stakes testing” back in the 1980s was very weak tea compared with what we see now. The pressure really started ramping up in the early 1990s, when states began putting in place concrete sanctions and rewards based on test scores. NCLB federalized these programs and made them harsher, but most of its components we had already seen before.

Many people attribute some of the impetus for serious test-based accountability to the publication of A Nation at Risk in 1983, which painted a dire picture of the condition of American education. However, within a short time, this approach developed so much momentum that few people questioned it. There were ongoing arguments about how to use tests for accountability, but few questioned the basic approach.

You’re not anti-testing, so what’s the “charade” around testing as you see it?

DK: You’re right, I am certainly not anti-testing. Good tests, used appropriately, can be very valuable. They can provide information about the condition of education that we can’t get anywhere else, and they can provide teachers with useful, specialized information.

The charade is the phony signs of progress—the badly inflated test-score gains—that test-based accountability has produced. This misleads the public and allows those in charge to pretend that these policies are making schools better, while in reality they are not. In fact, in many instances, they have generated more problems than they have solved.

There is an irony in this: high-stakes testing has actually undermined the value of tests because it leads to inflated scores that aren’t accurate and informative.

In the book, you talk a lot about the impact of excessive test prep. What are the most damaging outcomes?

DK: One of the most damaging outcomes is the score inflation I mentioned above. It’s not uncommon for studies to show that gains on the tests used for accountability are three to six times as large as they should be, and we have cases of large gains on a high-stakes test that are accompanied by no real gains in learning whatever. This is what has allowed the charade to continue for so long.

Worse, some schools, districts, and states have more severe score inflation than others. This means that we sometimes identify the wrong teachers, schools, or programs as effective or ineffective.

Second, bad test prep undermines the quality of instruction. Teachers have only so many hours a day, and time wasted on bad test prep can’t be recovered for real instruction. And bad test prep is often mind-numbingly boring, which turns kids off to school and, worse, to learning. And teachers have a strong incentive to forgo good instruction for test prep: test prep has the promise of increasing scores more quickly, albeit by producing bogus gains.

Third, test prep has undermined the very notion of good teaching. Many new teachers are taught not only to use bad test prep, but even that this test prep constitutes “good instruction.” And many of these teachers have never seen any instructional program that isn’t dominated by test prep.

As you discuss in the book, even some of the architects of standardized testing decades ago warned about the potential misuse of testing. Why have these warnings, along with the mountain of evidence supporting them, continued to be ignored by many policymakers?

DK: I wish it were only that the reformers haven’t seen the evidence. To be fair, that is undoubtedly the case for many of them, even though that evidence has been accumulating for nearly 30 years. However, it’s certainly not true of all of them. Time and time again, I have encountered people who had been presented with the evidence and simply ignored it, insisted that it must be wrong, or denied that it would apply to their own policies. This denial was facilitated by the many researchers and evaluators who used scores from high-stakes tests as if they were trustworthy measures.

This was one of my primary motivations for writing Charade. I finally lost patience with the pretense. My hope is that the book will make it a bit harder to hide from the facts about test-based accountability.

Test-based accountability is defended as a tool for equity – that it’s essential to hold struggling schools accountable so that our most vulnerable students don’t fall through the cracks. What does the evidence tell us?

DK: A desire for improved equity was one of the primary motivations of some advocates of test-based accountability. For example, the expectation that test-based accountability would improve the education of low-performing groups was one of the main reasons why some leading liberals in Congress supported NCLB.

Good intentions notwithstanding, test-based accountability has largely failed to improve equity. There is no evidence that these policies have contributed to the slow and erratic narrowing of the gap between minority and non-minority children, which started much earlier, and the gap between rich and poor students has been widening. There is some evidence of small relative gains by students at the bottom of the distribution of test scores, but that evidence is both inconsistent and limited.

We know some of the reasons why. Bad test prep tends to be more extensive in disadvantaged schools, and by the same token, we sometimes find that disadvantaged students experience much more severe score inflation. This leads to an illusion that equity has improved when it really hasn’t.

Despite the mobilization against testing in recent years by educators and parents, dislodging the system is a major challenge. What are the encouraging signs moving forward? 

DK: There are a few reasons to be guardedly optimistic. While many in the press are still credulous about test-based accountability, a growing number have become aware of the problems it has created. A slowly growing number of people in the education policy community admit that test-based accountability has been problematic, even if their responses to those problems haven’t always been adequate. And there are signs that parents are increasingly fed up with what test-based accountability has done to their schools.

However, this isn’t enough, and there is a long way to go. While ESSA did end some of the worst excesses of NCLB, many policymakers remain wedded to the old approach. For example, the Minnesota legislature is considering a bill, SF 2816, that would establish a rating system for schools and districts that is based entirely on test scores for elementary schools and only on test scores and graduation rates for high schools and districts.

To bring about the larger changes in direction that we need will take real work. Those who want this real change—educators, parents, and other concerned citizens—will have to make their voices heard. That’s why I wrote The Testing Charade.