Errors, Ambiguities Plague U.S. News Rankings; Data Uncertain

By Gregory N. Price
Sep. 14, 2007

MIT’s fall from fourth to seventh place in this year’s U.S. News and World Report college ranking was driven in part by changes in how MIT defines and computes class sizes. Corrections in how MIT reports its entering class’s SAT scores also contributed to the drop.

“The language is muddy,” Director of Institutional Research Lydia S. Snover said of the class-size definition used by U.S. News. “We’ve been having these debates on and off over what is a class.” Ambiguities in MIT’s administrative databases, resolved differently from year to year, also affect the data.

Similar errors and uncertainties affect data reported by other colleges. It “happens regularly, but not too often” that a school’s reported numbers change from year to year with interpretation, said Robert J. Morse, director of data research at U.S. News, who manages the rankings. As for outright errors, “There are many, many schools that make mistakes in their data” and correct them privately, Morse said, but “there are so many schools and so many pieces of data that [the fraction of schools making mistakes] is probably under 1 percent.”

Since 1994, MIT’s ranking in the annual survey has ranged as high as third, in 1999, and as low as seventh, in 2005 and this year. (See table on page 17.) The precise formula is not published, but the rankings are based on measures of admissions selectivity, reputation among other schools’ administrators, faculty and financial resources, graduation rates, and alumni generosity.

SAT scores omitted

In the admissions process, MIT ignores certain standardized-test scores from some students — the SAT verbal score of an international student who also took the Test of English as a Foreign Language, or the less favorable score of a student who took both the SAT and the ACT. Interim Director of Admissions Stuart Schmill ’86 explains that, in past years, “if we didn’t use a student’s score … we didn’t report it. This year we did.” The Common Data Set definition, used by U.S. News, calls for SAT and ACT quartiles based on “all enrolled, degree-seeking, first-time, first-year [freshman] students who submitted test scores.”
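The effect of the two reporting practices can be illustrated with a minimal sketch. The scores below are invented for illustration and are not MIT data; the `used` flag is a hypothetical stand-in for whether admissions considered a given score. The sketch only shows the direction of the effect: including every submitted score can pull the first quartile down.

```python
from statistics import quantiles

def first_quartile(scores):
    """Return the 25th-percentile score, interpolating between data points."""
    return quantiles(scores, n=4, method="inclusive")[0]

# Hypothetical SAT verbal scores (not real data). The flag marks whether
# admissions actually used the score when evaluating the student.
submitted = [
    (780, True), (760, True), (740, True), (720, True),
    (700, True), (690, True), (610, False), (580, False),
]

# Past practice as described above: report only the scores that were used.
used_only = [score for score, used in submitted if used]

# Common Data Set wording: all enrolled students who submitted scores.
all_submitted = [score for score, _ in submitted]

q1_used = first_quartile(used_only)
q1_all = first_quartile(all_submitted)

print(f"First quartile, used scores only:     {q1_used}")
print(f"First quartile, all submitted scores: {q1_all}")
```

With these invented numbers, the first quartile computed over all submitted scores is lower than the one computed over used scores only, which is the kind of shift a change in reporting practice can produce.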

The errors are “really surprising,” said Morse. “It’s very rare that a school admits to something like this.” On his blog on the U.S. News Web site, Morse describes a five-step process of checking the data. This process includes pointing out big changes and asking the schools to acknowledge them, and making sure the schools report the same data to U.S. News as to other sources. To critics of the rankings’ accuracy, Morse has responded, “U.S. News does verify the data.”

For this year’s ranking, MIT’s reported first-quartile SAT scores for the 2006 incoming class totalled 1380 between the verbal and math exams, a drop of 50 points from 2005 and 30 from 2004 and 2003. The lower scores place MIT well below the California Institute of Technology, which edged ahead of MIT in this year’s ranking, and in line with Princeton, Harvard, and Yale, which occupy the top three overall spots. (See table on page 17.) With this year’s corrected data, MIT dropped from second to third in “selectivity,” one of the seven components of the rankings.

‘The language is muddy’

For the class-size distribution, MIT’s Student Information System central database lacks the information required to answer the Common Data Set questions precisely. The various offices that have prepared the data in different years have filled in the gaps differently, producing more favorable numbers in 2006 than in 2007, 2005, and 2004.

In the 2006 data, MIT reported that 68 percent of its classes had fewer than 20 students and only 11 percent had 50 or more; in the other years, between 60 and 63 percent of classes had fewer than 20 students and between 14 and 16 percent had 50 or more. (See table on page 17.) The 2007 class-size proportions place MIT below all other schools in the top 10; the 2006 numbers would place MIT in line with Harvard University, which ranks next lowest on this measure among the overall top 10.

“That class size data played a factor in why MIT fell in the rankings this year,” Morse said.

The variation from year to year arises because MIT class numbers do not always correspond to the units in which instruction is actually held — one class number may correspond to multiple sections meeting at different times and places. In this case, the Common Data Set requires these multiple sections to be treated as separate classes for the purpose of reporting class size.
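How much the choice of counting unit matters can be seen in a minimal sketch. The catalog below is invented for illustration and is not MIT data; the class labels and the `size_distribution` helper are hypothetical. It computes the two figures cited in this article (the share of classes with fewer than 20 students and the share with 50 or more) under each interpretation.

```python
def size_distribution(enrollments):
    """Return (percent with fewer than 20, percent with 50 or more)."""
    total = len(enrollments)
    under_20 = sum(1 for n in enrollments if n < 20)
    fifty_plus = sum(1 for n in enrollments if n >= 50)
    return 100 * under_20 / total, 100 * fifty_plus / total

# Hypothetical catalog: class number -> enrollment of each section.
catalog = {
    "A": [120],         # one large lecture with no separate sections
    "B": [15, 18, 17],  # taught in three small sections
    "C": [12],
    "D": [25, 30],
}

# Interpretation 1: each class number counts once, at its total enrollment.
by_class_number = [sum(sections) for sections in catalog.values()]

# Interpretation 2: each section counts as a separate class, as the
# Common Data Set requires when sections meet separately.
by_section = [n for sections in catalog.values() for n in sections]

for label, sizes in [("by class number", by_class_number),
                     ("by section", by_section)]:
    small, large = size_distribution(sizes)
    print(f"{label}: {small:.0f}% under 20, {large:.0f}% 50 or more")
```

In this toy catalog, counting by section more than doubles the share of small classes. When the database does not record which unit is the real locus of instruction, each analyst has to pick one, and the reported percentages move accordingly.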

“The idea is to let students know what to expect in terms of how many students will be in the classroom with them at any one time,” Snover explained. But two problems arise, Snover said. First, in some classes, most instruction may take place in small sections but occasional lectures or demonstrations occur in the larger group. In these edge cases, the Common Data Set definition may be interpreted in different ways.

Second, for some classes, the administration may not know whether the class or the sections are the locus of instruction, or just how many students are in each section. “MIT does not keep track of this centrally,” Snover said. “There’s a lot of unknowns in the data.”

Filling in the unknowns requires a mixture of estimation, use of other data sources, and guesswork. In some past years, Snover’s Institutional Research office has prepared the reported numbers. For the data reported in 2006, which ended up differing in MIT’s favor from other recent years, Snover said she called up instructors across the Institute to resolve ambiguities and fill in gaps. “I supplied the data,” she said, “but it took a month. You can’t have people calling up every year.”

For this year’s data, the Registrar’s Office took over, drawing on its access to a broader set of databases. But even so, Snover said, “The way the database is set up, it’s very difficult right now, and really subject to the interpretation of the people doing the analysis.”

Schmill said that the use of class-size data leaves MIT “somewhat disadvantaged. … The rankings are geared towards what [its authors] consider an ideal education to be.” In a humanities class, he explains, it’s important to have a small class for discussion with the instructor, and MIT’s humanities classes are comparable in size to those at other institutions. But in an engineering class, Schmill said, “Personally, I’d rather be in a large course with a great faculty member than a small course with a mediocre one.”

Variations in how a school interprets such ambiguous data are not uncommon, Morse said. “Different people are assigned the external reporting of the data, and then the person in the most recent year will say we can’t speak for how it was done last year, all we know is this year’s is right, and we can prove that,” Morse said.

Associate Registrar Ri Romano, who prepared the data that were reported this year, said, “U.S. News is very specific about what to count and what not to count. I know that I can reproduce the numbers that I gave.” Romano said she does not know how the data have been prepared and reported in past years by the Institutional Research office.

Just how often differences in reporting affect the rankings is not easy to determine. “I don’t have time to do [a close review] for each school,” Morse said. “We don’t have time to spend, if we spend as much time as [a reporter has] to understand it, to challenge to get to the right person, to get the explanation, we’d never finish.”

U.S. News dates each edition of its rankings a year ahead of the data on which it is based: the “America’s Best Colleges 2008” list is based on data from 2007. All dates in this article refer to the data, not the date U.S. News uses in advertising its ranks.