NJEA urges caution in tying test scores to teacher evaluation

Researchers insist model has serious limitations

Published on Wednesday, March 2, 2011

It sounds so simple: if tests measure what a student learns, why shouldn’t they be used to measure the effectiveness of teachers?

That appears to be the reasoning behind Governor Chris Christie’s insistence that “at least 50 percent” of teacher evaluations be tied to student test score improvement.

But leading educational researchers say the idea of evaluating, paying, and even firing teachers based on their ability to raise student test scores is fraught with problems.

“We believe student test scores have a place in the evaluation process,” said NJEA President Barbara Keshishian, “but we also agree with highly regarded researchers that they should not play a determining role in high-stakes personnel decisions. There are a lot of flashing yellow lights suggesting policymakers should proceed with caution before putting too much emphasis on test score improvement.”

On Jan. 19, Educational Testing Service hosted a symposium on “Standardized Tests and Teacher Accountability” at its Princeton campus.  A panel of distinguished researchers discussed so-called “value-added models” (VAM), which Christie believes should be used to decide how to reward – or punish – teachers.  The researchers cited a number of concerns.

  • Current tests aren’t reliable enough.  VAM scores do little to distinguish the performance of one teacher from another.  “Value-added models ask more from current tests than they can provide,” said panel moderator Howard Wainer, a research scientist at the National Board of Medical Examiners.  “There is a lot that testing companies could do to address this issue, but it will take time and cost money.”

  • Most subjects and grades are not tested.  In fact, between 70 and 80 percent of teachers can’t currently be evaluated with test score-based models.  It will take hundreds of millions of unbudgeted dollars to create tests for all subject areas and grades.  “It’s hard to imagine how you could use VAM anytime soon in areas that aren’t tested,” said Arthur E. Wise, president emeritus of the National Council for Accreditation of Teacher Education. “Are we ready to pay for them?  What will we have to give up?”

  • The use of VAM will narrow the curriculum.  Non-tested subjects will be abandoned in favor of those that have consequences.  Schools are already over-emphasizing standardized test-taking and preparation, to the point where “we’ve adopted a strategy that focuses on drilling basic skills and narrows the curriculum,” said symposium keynoter Richard Rothstein, a research associate at the Economic Policy Institute.

  • VAM ignores factors beyond most teachers’ control.  Research shows that only one-third of the variables affecting student achievement occur in school, according to Rothstein.  The other two-thirds are outside the school.  Students from families struggling with poverty, unemployment, homelessness, illness, divorce, abuse, or any other issue will almost certainly score lower on standardized tests.

  • VAM may make it harder to dismiss bad teachers.  Most experts agree that three to five years of data are needed before VAM results can have a reasonable degree of reliability.  This reality may lead to the unintended consequence of making it harder to fire clearly underperforming teachers after one or two years in the classroom.

  • Missing data is a serious concern.  High student mobility rates (students who are not in the same school for the entire year) and/or absenteeism lead to missing test data.  “The problem of missing data is even more pernicious,” said panelist Henry Braun of Boston College.  “[VAM] doesn’t take into account the fact that responsible professionals will pay attention to the students who are there.  In schools with highly mobile populations, a great deal of a teacher’s efforts will be spent on students who aren’t there when the test is given.”

“If the point of VAM is to differentiate among teachers it does a really bad job because most teachers are not really different from the average,” Braun said.  “VAM can only identify those teachers at the extremes, but that’s where the statistical uncertainty is the greatest.”

“The best VAM can currently do is identify those teachers who are systematically very high or very low performing after multiple years of observation,” added panelist Sean Corcoran, a researcher and assistant professor at New York University. “Didn’t we already know who these teachers are?”

“Perhaps the deepest concern shared by parents and teachers is that an over-emphasis on improving test scores will lead to even more ‘teaching to the test,’” said Keshishian.  “That’s already a problem in public schools, and it’s taking a terrible toll on teacher morale and the quality and depth of instruction.”

(For detailed coverage of the symposium, see the article in the March NJEA Review.)

Video highlights from “Standardized Tests and Teacher Accountability: The Research” and links to presenter papers on VAM and teacher evaluation are available at njspotlight.com/ets_symposium.
