In the first large-scale analysis of new systems that evaluate teachers based partly on student test scores, education researchers have found a weak to nonexistent relationship between value-added models (VAM) of teacher performance and the content or quality of classroom instruction. This at best tenuous correlation calls into question the appropriateness of using the data in evaluating teachers or improving classroom instruction, the report says.
This study, conducted by Morgan Polikoff and Andrew Porter and published by the American Educational Research Association, is just the latest addition to the mountain of evidence that value-added teacher evaluation – a collection of statistical techniques used for analyzing student test scores – is an unreliable, cookie-cutter method that leads to unfair and inaccurate performance evaluations.
Still, as more states design and implement new evaluation systems, the pressure to place undue importance on value-added measures is proving hard to resist. Thirty-five states and the District of Columbia require “student achievement,” or test scores, to be a “significant” or the “most significant” factor in teacher evaluations. Only ten states do not require test scores to be used in teacher evaluations.
Polikoff and Porter analyzed data from 327 fourth- and eighth-grade math and English teachers in six school districts – New York City, Dallas, Denver, Charlotte-Mecklenburg, Memphis, and Hillsborough County, Florida. The data was collected as part of a larger project, funded by the Bill and Melinda Gates Foundation, known as the Measures of Effective Teaching (oddly enough, the Gates Foundation also funded the AERA study).
The two researchers found that some teachers who were well-regarded based on measures such as classroom observations, student surveys, and other indicators nonetheless had students whose test scores were below average. At the same time, some teachers whose students had higher test scores didn’t do so well on those other measures.
Polikoff said the results were surprising.
“What we expected to find was that there were strong positive relationships between instructional alignment with these measures of quality, that it would predict student learning on state tests,” Polikoff explained. “But what we actually found was that there were very weak to zero relationships between pedagogical quality with the value-added measures.”
Although Polikoff and Porter believe that value-added measures do provide some useful information, they say the measures are nonetheless not picking up the qualities most people associate with good teaching.
“Our results suggest that it’s going to be difficult to use these systems to improve teacher performance,” said Polikoff. “Given the growing extent to which states are using these measures for a wide array of decisions, our findings are troubling.”
But with the rollout of Common Core State Standards proceeding, Polikoff says it is imperative that all stakeholders develop a deeper understanding of the ways effective teachers implement the standards in the classroom.
The findings in the study, the two researchers said, lead to a disconcerting question that should be addressed by policymakers who are pushing these unreliable models: “If VAMs are not meaningfully associated with either the content or quality of instruction, what are they measuring?”
Because of its unreliability, VAM is being called into question across the country. In 2013, teachers in Florida, supported by the National Education Association and the Florida Education Association, filed a lawsuit charging that the state’s new evaluation system, which used a bizarre formula that incorporated test data from students whom some teachers had never taught, violated the equal protection and due process clauses of the 14th Amendment to the U.S. Constitution. Although the judge agreed that the system was unreliable and unfair, he dismissed the suit because the problems did not meet the standard to be declared unconstitutional. In April, the Tennessee Education Association took similar legal action against the state’s value-added assessment system.