Learning, Improvement, and Accountability:
Outlining the Dimensions of the Data Discourse

By Pamela Burdman

It seems that more and more, discussions of education research and practice have data at their core: data to measure student achievement, data to evaluate teachers, data to rank schools, data to inform policy makers. The education reform initiative that does not view data as key to ensuring all children the opportunity to learn is increasingly rare.

Nine years after the Bush Administration launched No Child Left Behind, the Obama Administration is investing hundreds of millions to build longitudinal data systems. The Data Quality Campaign is ranking states on the quality of their educational data. Leading educational organizations such as the Carnegie Foundation for the Advancement of Teaching are emulating the evidence-based improvement work of the Institute for Healthcare Improvement. And major philanthropic foundations are backing initiatives to increase the use of data in schools and colleges.

As stimulus dollars throw additional weight behind the data parade and push states and schools to join the “race to the top,” it sometimes feels as though an inexorable movement has taken hold. That makes this a particularly opportune moment to step back and reflect on what role data can and should play in improving the nation’s schools.

At this month’s American Educational Research Association conference, the Spencer Foundation devoted a Saturday evening to that very question. Spencer, a Chicago-based philanthropy, was founded in 1968 to support educational research. Like the Foundation’s Data Use and Educational Improvement initiative, the event was designed to provide context for discussions about how and whether data can be used to make schools better.

The session in Denver was not billed as a debate. But the invitation for the event – titled What Should We Measure and Why: The Uses and Limits of Data in Improving Education and featuring Harvard professors Richard Murnane and Howard Gardner – left little doubt in the minds of the audience as to who would be championing the promise of data and who would be warning of the dangers of misuse.

Murnane, an economist who studies the interaction of education and the economy, has a particular interest in the links between schooling and social inequality. His collaboration with Boston schools led to the development of technology tools that help educators analyze student data and implement evidence-based improvements for reducing achievement gaps. The process that emerged was dubbed “Data Wise.”

A psychologist best known as the father of the theory of multiple intelligences and a critic of classical psychometrics, Gardner could surely be expected to counter Murnane’s faith in the positive use of data. The very idea of multiple intelligences casts doubt on the power of numbers to measure student learning.

Moderator David Leonhardt, a New York Times economics columnist, is no stranger to debates about how data shape societal improvement. He has written in depth about health care, exposing the tension over whether physicians should make decisions based on reasoning and judgment or rely on evidence-based protocols and checklists.

“The debate between intuition and empiricism is as old as Plato, who thought that knowledge came from intuitive reasoning, and Aristotle, who preferred observation,” wrote Leonhardt in the Times Magazine last fall. “The argument has seemed especially intense lately, as one field after another has struggled to define the role of human judgment in a data-saturated society.”

In medicine, the dichotomy is epitomized by two Harvard professors and prolific writers, cancer researcher Jerome Groopman and surgeon Atul Gawande. In How Doctors Think, Groopman criticizes the cognitive errors frequently committed by physicians, calling for an overhaul in how doctors assess and communicate with their patients. Gawande’s most recent book, The Checklist Manifesto, advocates a different remedy: the quality improvement approaches used by companies such as Boeing and Wal-Mart.

Unwittingly, Groopman and Gawande played a role in the genesis of Spencer’s dialogue about education data. Earlier this year, Gardner, in a conversation with Spencer Foundation President Mike McPherson, positioned himself on the Groopman-Gawande scale: “When it comes to medicine, I’m with Gawande, but when it comes to education, I’m with Groopman.”

Gardner’s affinity with Groopman wasn’t altogether surprising, and he noted subsequently that the application of the physicians’ views to education is much more complicated than his casual statement conveyed. Nevertheless, the parallels intrigued McPherson and inspired him to organize the discourse with Murnane and Leonhardt. Though they have known each other for years, the Harvard colleagues say McPherson’s invitation led to their most extensive conversation about education policy to date. Clearly McPherson was not alone in his curiosity. That night, at least 300 AERA attendees – including numerous luminaries in the education research world – descended to the depths of the Denver Marriott to witness the interaction.

The setting was genteel, and so was the tenor. Gardner in particular demurred at the implication of a face-off or confrontation, insisting that he had little dispute with Murnane’s views. In fact, Leonhardt’s words in last fall’s Times article about health care could have been written for this event: “These disagreements can sometimes be exaggerated, because everyone agrees that intuition and empiricism both have a role to play. But the fight over how to balance the two is a real one.”

Both scholars spoke approvingly of learning organizations, those that use evidence as feedback to improve their practices. Gardner didn’t reject numerical metrics out of hand; he just wished that all in education would use data as wisely as Murnane has modeled in his work with schools. Nor did Murnane deny the prevalence of data misuse; he urged Gardner to continue cautioning against it. Gardner said he longed for a Supreme Court ruling against educational rankings. Murnane concurred that test data should never be used to make high-stakes decisions about teachers and schools. Yet the difference in emphasis was hard to hide.

“I worry a great deal about the implicit or explicit messages in having such a focus on tests, data, failing kids, failing schools, rankings, rankings, and rankings,” noted Gardner in his opening remarks. “As a psychologist, I think there’s been a massive figure-ground confusion. The figure has been consistently, for the last twenty years, tests and data, tests and data, tests and data. And the rest of life, the kind of society we live in, the kind of people we want to be, has receded so far into the background that in many of our most troubled schools they’re absent altogether.”

He held up a copy of U.S. News and World Report’s ranking of the nation’s 100 best high schools, calling it a “nonsensical document.” He deplored the shallowness of parents and principals urging kids to chew gum in class because of research that says it could boost their test scores by a few points. He decried cases such as Mississippi, where 82 percent of students are considered proficient on state tests, compared with 18 percent on national assessments. “What’s going on is smoke and mirrors, and every place is Lake Wobegon,” Gardner sighed.

The fixation with rankings, in Gardner’s view, fosters an acquisitive and amoral society. Enron, AIG, and Goldman Sachs are further evidence that graduates of the nation’s best schools are obsessed with “money, markets, and me,” what he calls the three M’s. “This league table mentality is so powerful that it’s going to take psychosurgery to get people to think differently,” he lamented. “I don’t think anything as complicated as a school or teacher can be evaluated on one thing.”

He advocated an approach to assessments in which the instrument would change every year to prevent gaming. Several times he returned to the paradigm of Reggio Emilia, a town in northern Italy often considered home to the world’s best schools for young children. In lieu of test scores and data, he said, Reggio has a caring and respectful culture and a community that works, the kind of community where Gardner would like to live – and a direct antithesis to the America he described.

Murnane shared Gardner’s distress about the direction of American democracy. But while Gardner began with a focus on values, Murnane grounded his thinking in a concern about equity. He sympathized with the pluralistic vision of schooling espoused by Gardner but questioned its pragmatism.

“I see the inability of low-income families to obtain a good education for their children as a critical reason why socioeconomic mobility is so low in the United States today relative to other countries. I see that as an enormous threat to our democracy. I want for all kids the kind of schools that you want,” he said to Gardner. “I just don’t see any way to get there.”

“I want the voice of Howard and others who express similar views to be heard widely and loudly throughout the country as we think about the future of education in the United States. Unfortunately, I don’t think those voices, by themselves, will carry the day. A great many Americans do not want the types of schools Howard advocates. People with views very different from Howard … have substantial power in determining what happens in local schools.”

Against the backdrop of local control of schools, Murnane finds hope in No Child Left Behind. Education researchers are deeply ambivalent about the law, which requires states to test all students in basic skills as a condition of federal funding. Schools are expected to meet annual targets for improvement. Without defending its flaws, Murnane insisted that NCLB provides critical leverage for the federal government to attempt to improve schools that are failing their students. Most importantly, he noted, “The increased attention to the achievement of disadvantaged children is a singular accomplishment … a foundation on which to build in improving the law.”

Plus, he added, not all data are test scores that can be gamed. Murnane noted that, unlike state standards tests, the National Assessment of Educational Progress is very helpful in elucidating achievement patterns over time and across regions. NAEP data, in fact, were used to reveal Mississippi’s low proficiency standard. He also mentioned MDRC’s recent evaluation of career academies, which showed that though the programs didn’t produce differences in test scores, their students were earning 11 percent more than similar students eight years after leaving high school. “It’s increasingly low cost to look at a variety of kinds of data,” he pointed out.

True to his calling as an economist, Murnane most often mentioned conventional educational outcomes: high school graduation rates, college-going rates, labor market outcomes. Gardner, the psychologist, turned more readily to psycho-social measures like acts of violence, cases of bullying, and students’ ability to listen to diverse points of view.

Though he didn’t find fault with Murnane’s empiricism per se, Gardner presented a contrasting picture. “If there’s one point where there’s a real fault line this evening, it’s that I’m much, much more afraid of the abuse of data and the misuse for society, whether it’s in business or in education,” said Gardner. “Anything I can do to pluralize, mess things up, get people to think more rounded and more complexly, is good. At the end of the day, we want communities and societies that work, and not ones where the stock market is over the top and then we find out it’s a house of cards.”

Murnane took issue with the implication that empirical data required an abandonment of ethics or community-spiritedness: “Those are values I share. I don’t see those as inconsistent with an emphasis on teaching kids core skills. With enough capacity, with enough investment, you can use the pressure from well-done standards and assessments like in Massachusetts to both raise kids’ scores and do it in a way in which the adults and kids respect each other.”

Each scholar seemed to be striving assiduously to find points of agreement even as they highlighted their differences. This may have allowed both perspectives to resonate with the audience. According to interviews after the event, many who had come sympathetic to one of the two perspectives left seeing new value in the other. Neither viewpoint seemed fully sewn up.

Murnane didn’t have easy answers about how to navigate the policy challenge of ensuring that more states follow the lead of Massachusetts, while avoiding the pitfalls that Gardner fears.

But it was Gardner who faced the most challenging questions from Leonhardt, who confessed to being a Gawande man. Asked by Leonhardt to explain how to measure whether schools are achieving their mission of building strong communities, Gardner fell back on his own discernment, saying “I can walk into any school in America … and I can tell you in minutes whether there’s an air of respect.” He advocated strong teacher communities that set their own standards against which they measure their own work.

In comments after the session, Gardner clarified that he did not intend for each school to devise its own measures, but suggested that networks of schools could collectively develop their own assessments to use for accountability purposes. He envisioned that these networks could form around curricular pathways, such as one emphasizing E.D. Hirsch-type “core knowledge” and another that is more inquiry-based.

That approach certainly avoids the “league table” mentality, but Leonhardt pressed him on how accountability can exist in the absence of common benchmarks that are susceptible to ranking. To that, Gardner confessed, “This debate is really about how the best is the enemy of the good. Dick is right in the sense that I’m really a best kind of person, and good often gets in my way.”

Gardner’s insight seemed key to understanding the evening. While opponents of data and testing may appear suspicious and mistrustful, their vision of what schools can accomplish in fact demonstrates a deep faith – a faith that teachers working together have both the desire and capacity to build strong schools and communities in the absence of carrots and sticks. It’s an idealistic assumption that is not easy to vet in the current American educational environment.

Those who pin their hopes on NCLB and positive uses of data may have a far less utopian goal. To them, the risks of crude measures and manipulated evidence are a price worth paying in exchange for a lever for changing, at least incrementally, the type of education many disadvantaged children receive. While doing so may not ensure these students the best possible education, it could prevent them from experiencing the worst.

If one thing was elucidated by the discussion, it is that researchers probably can’t investigate the uses and limits of data for improving education without encountering these deeper ethical and strategic questions. Ultimately, their answers may depend on how they define improvement, and how much improvement they dare to seek.

Pamela Burdman, a senior project director at WestEd, previously served as a program officer at the William and Flora Hewlett Foundation and an education reporter at the San Francisco Chronicle.

Howard Gardner is the John H. and Elisabeth A. Hobbs Professor of Cognition and Education at Harvard University. Gardner is best known for his theory of multiple intelligences, a critique of the notion that there exists but a single human intelligence that can be adequately assessed by standard psychometric instruments. During the past two decades, Gardner has been involved in the design of performance-based assessments; education for understanding; and development of more personalized curriculum, instruction, and pedagogy. He is currently investigating trust in contemporary society and ethical dimensions entailed in the use of the new digital media. He has two forthcoming books: Good Work: Theory and Practice, the ninth in a series, as well as a collection of lectures at the Museum of Modern Art on classical educational virtues viewed from the perspective of the post-modern digital era. He is also a Spencer Foundation board member.

Richard Murnane is the Juliana W. and William Foss Thompson Professor of Education and Society at Harvard University. Murnane’s research focuses on the relationship between education and the economy, teacher labor markets, and strategies to improve education for disadvantaged children. His 1996 book with Frank Levy, Teaching the New Basic Skills, explains how changes in the U.S. economy have increased the skills needed to earn a middle-class living. Murnane is currently co-directing a major research project on social inequality and educational disadvantage in the United States. In 2005, he co-edited Data Wise: A Step-by-Step Guide to Using Assessment Results to Improve Teaching and Learning. His book with John Willett, Methods Matter: Improving Causal Inference in Educational and Social Science Research, will be published in September 2010.

David Leonhardt is a New York Times writer and author of the column Economic Scene. Leonhardt has been writing about economics for The Times since 2000 and has been a columnist since 2006. He was part of a team that produced the paper’s award-winning series and book, Class Matters. He is also a staff writer for The Times Magazine and contributes to the Economix blog. A New York native, he studied applied mathematics at Yale. Before joining The Times, Leonhardt worked for Business Week magazine and for the Washington Post.

The Spencer Foundation was established in 1962 by Lyle M. Spencer. The Foundation received its major endowment upon Spencer’s death in 1968 and began formal grant making in 1971. The Foundation is intended, by Spencer’s direction, to investigate ways in which education, broadly conceived, can be improved around the world. The Spencer Foundation believes that cultivating knowledge and new ideas about education will ultimately improve students’ lives and enrich society. The Foundation pursues its mission by awarding research grants and fellowships and through other activities to strengthen the connections among education research, policy, and practice.