Abolish the SAT? Score: 3 wrong, 5 omitted

July 20, 2007

by Cal Lanier

Charles Murray has parted ways with the SAT, an announcement that has generated a predictable response.

Leave aside, for the moment, Mr. Murray's knowledge gaps--for example, the test prep industry doesn't even count the SAT as its most profitable exam. Ignore his failure to mention the ACT, which demolishes his aptitude/achievement dichotomy: the ACT is an achievement test that is both the academic equivalent of the SAT and the proud owner of its own healthy test prep sector.

Focus only on the foundation of Mr. Murray's argument, the Geiser/Studley study, the “pivotal analysis” that convinced him to change his views, and problems abound.

Mr. Murray makes three errors in his characterization and use of the study, quoted below:

The pivotal analysis was published in 2001 by the University of California (UC), which requires all applicants to take both the SAT and achievement tests (three of them at the time the data were gathered: reading, mathematics, and a third of the student’s choosing). Using a database of 77,893 students who applied to UC from 1996 to 1999, Saul Geiser and Roger Studley analyzed the relationship among high school grades, SAT scores, achievement test scores, and freshman grades in college. .....All freshman grades are not created equal, so the UC study took the obvious differences into account. It broke down its results by college campus (an A at Berkeley might not mean the same thing as an A at Santa Cruz) and by freshman major (an A in a humanities course might not mean the same thing as an A in a physical science course). The results were unaffected. Again, the SAT was unnecessary; it added nothing to the forecasts provided by high school grades and achievement tests.

The quoted passage contains the following errors:

  1. Mr. Murray describes the population as 77,893 applicants, but in fact the study population was accepted students.
  2. While the study did break down the results for each campus it included, at no point did it consider the difference between an A at Berkeley and one at Santa Cruz, as Mr. Murray implies, for the simple reason that there were no As at Santa Cruz (nor Bs, Cs, Ds, or Fs) at the time of the study. The formerly hippy-dippy campus didn’t grade its students until 2002. The study also excluded Riverside students for two of the four years, as well as all students accepted without SAT scores.
  3. The University of California required three subject tests: writing (not reading), math, and a third of the student’s choosing.

These mistakes may seem small, but they undermine all of Mr. Murray's conclusions.

By describing the Geiser/Studley study as a study of all students who “applied” to UC, Mr. Murray sidesteps an obvious selection bias. In 1999, the 25th/75th percentiles for all college-bound students were 430/580 Verbal, 430/590 Math. The equivalent 1999 numbers for UC Davis (which are roughly similar to those for UCSB through UCSC) are 510/630, 550/650. UC students can't be considered representative of the entire population.

Moreover, UC students are accepted on a sliding scale of "statewide eligibility". The lower the high school grades, the higher the required test scores. Students who have high test scores and lower grades in high school will, as a rule, continue that trend in college. Thus, the study "identifies" a distinction that is built into the UC acceptance matrix.
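The effect of a sliding admissions scale can be sketched with a toy simulation. Everything in it is an assumption made for illustration: the single "ability" factor, the noise levels, the cutoff, and the rule of admitting on the sum of grades and scores are invented, and none of it uses the real UC eligibility index or the Geiser/Studley data. It only shows that admitting students on a combined grades-plus-scores threshold can, by itself, weaken or even reverse the relationship between grades and scores among the admitted.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, cutoff=1.2, seed=0):
    """Toy model (all parameters hypothetical, not UC's actual index).

    Each applicant has one latent ability; high school grades and test
    scores are that ability plus independent noise. Admission requires
    grades + scores > cutoff, so low grades demand higher scores.
    """
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        grades = ability + rng.gauss(0, 1)
        scores = ability + rng.gauss(0, 1)
        applicants.append((grades, scores))

    # Sliding scale: the two measures compensate for each other.
    admitted = [(g, s) for g, s in applicants if g + s > cutoff]

    r_all = pearson(*zip(*applicants))
    r_adm = pearson(*zip(*admitted))
    return r_all, r_adm


r_applicants, r_admitted = simulate()
print(f"grades vs. scores, all applicants: {r_applicants:.2f}")
print(f"grades vs. scores, admitted only:  {r_admitted:.2f}")
```

In this setup the correlation among all applicants is clearly positive, while among the admitted it drops sharply. A study restricted to admitted students is looking at a population the admissions rule has already reshaped.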

The study further undercuts its findings by excluding the Santa Cruz campus, two years of Riverside scores, and students with no SAT scores. The last two omissions are particularly significant. Students who don't submit SAT scores are overwhelmingly students accepted because of "life challenges", otherwise known as "students with extremely low test scores". Riverside has been the lowest performing UC campus for a decade or more. Santa Cruz, while ranked next to last of the UC campuses, was at that time the campus of choice for bright white underachievers, although Prop 209 and the campus's decision to adopt grades have improved its fortunes considerably.

Thus, by leaving out a group of students very likely to have low test scores and high GPAs, and another group likely to have high test scores and lower GPAs, the study inadvertently omits substantial populations that might change its results.

Mr. Murray’s final error on the three required subject tests may seem even more trivial, but it's the most troubling of the three. Apparently, he confused the old Writing subject test with the English Literature subject test, thinking that the latter was one of the required tests.

Using a College Board study’s finding of a .83 correlation between SAT I Verbal and SAT II Literature tests, Mr. Murray argues that the SAT is redundant. The Literature test can provide the same information.

Normally, subject tests aren't useful for drawing these conclusions. Students don't take subject tests unless they apply to an elite college or the UCs; consequently, selection bias makes study results difficult to apply to a larger population. However, the College Board study focused solely on California students and, says Mr. Murray, UC requires all applicants to take the Literature test. That would be a sufficiently representative sample, so the high correlation could be taken at face value.

Except, of course, UC didn’t require the Literature test. It required the Writing test. The Literature test is taken by exactly the self-selecting group that Mr. Murray thought he was free of. At the time of the study, five times as many California students took the Writing test as took the Literature test.

The College Board study shows that Writing and SAT Verbal scores have a .79 correlation. Mr. Murray might point out that the error is minor--substitute the Writing test in his argument and the same logic applies.

And here the full extent of his error becomes clear. Since 2005, the Writing test has been part of the SAT, which Mr. Murray wishes to abolish. No existing subject test with a sufficiently broad testing population has demonstrated any correlation with the SAT Critical Reading section.

Mr. Murray makes the same argument for the Math 1c, another test whose status has changed in the last two years. While that subject test hasn't been completely eliminated, UC no longer requires it because the College Board moved much of the tested material into the SAT Math section, and most other colleges have de-emphasized it. Since the 2005 changes, Math 2c testers have outnumbered Math 1c testers three to one, the reverse of the year before.

Thus, the Math 1c subject test and the Writing subject test that Mr. Murray recommends as replacements for the SAT are in essence the SAT Math and SAT Writing sections, which Mr. Murray wishes to abolish. Normally, Mr. Murray is the first to object to distinctions without a difference; almost certainly, he is unaware that his errors have led him to draw one himself.

Mr. Murray's error stems from his misunderstanding of the original Writing and Math 1c subject tests and their relationship to the other subject tests. The Writing test was (and is, as part of the SAT) a straightforward English usage and essay test. The Math 1c tested second-year algebra and basic trigonometry, and was originally used to supplement the SAT, which at that time tested only through geometry (it now tests through second-year algebra). The remaining subject tests are considerably more difficult, functionally equivalent to the multiple-choice sections of the corresponding Advanced Placement tests (with the exception of Math 2c, which tests through pre-calculus).

Mr. Murray calls for ending the SAT in favor of three subject tests, citing the Geiser/Studley study as support. However, the Geiser/Studley study has absolutely nothing to say about the predictive nature of the difficult subject tests. While the Geiser/Studley report does state that "SAT II tests" did a slightly better job of predicting GPA, it also qualified this finding, limiting it to the predictive value of the Writing and Math 1c tests (a finding that has since been disputed). Its only conclusions involve a test that no longer exists, and a math test that the UC and many other schools consider redundant to the SAT--both tests much easier than the remaining subject tests. Since Mr. Murray himself points out that any study conclusions involving the third subject test are unfounded, he has absolutely no basis for using the Geiser/Studley analysis as support.

Even had Mr. Murray accurately characterized the study, he must still account for his decision to use it as the basis of his argument in the first place. As mentioned above, some of the findings have been disputed, and the study has problems that could have been avoided: the lower-performing student populations were omitted, and the fact that two of the three SAT II tests were considerably less difficult was not considered relevant.

More importantly, though, why would Geiser and Studley use freshman GPA as a meaningful predictor of college completion? For over fifteen years, the hot-button factor in college completion rates has been remediation rates. Not only does the study fail to control for remediation, the word is never mentioned.

The implications of that omission are stunning. The study doesn't distinguish between an A received in a remedial course (which, in the UC system, does not count toward graduation credits) and an A in Calculus. Undoubtedly some students get As in remedial courses, just as they received As in high school for substandard work. Likewise, a student who got a B in AP Calculus will probably get a B in multivariate calculus.

If Geiser/Studley had controlled for remedial courses, they could have eliminated all remedial courses from GPA consideration, or separated students who required more than one remedial course from those who required one or none. Imagine the findings: "Once we control for remediation, high school GPA is a better predictor of freshman GPA than SAT for students who can't write or calculate at a 10th grade level." or "High school GPA predicts non-remedial course GPA for students who aren't capable of college level work (what non-remedial courses could they take?)". Ultimately, one must ask: of what value is freshman GPA when a substantial portion of the population isn't capable of doing freshman-level work?

California's second university system, the CSU, has declared limits on remedial courses--one year and out. UC campuses often mandate remedial courses to be completed in the first year as well.

Presumably, Geiser and Studley knew the importance of remediation in college graduation rates. However, the political nature of the study--dictated as it was by Atkinson's request for an SAT replacement--required that a more subjective predictive criterion be used.

In fact, no study is needed to determine whether test scores or GPA better predicts a student's need for remediation--although certainly, studies have demonstrated this before now. Clearly, test scores are more accurate indicators. If grades were accurate, there'd be no need for the question.

Implicit in the question is the admission that grades don't predict academic preparation. Colleges accept students with high GPAs. If GPAs were valuable predictors, they wouldn't need to remediate students at all. A high GPA would signify that the student had all the skills necessary for college.

I know of no college that accepts a high GPA as proof of academic knowledge. In contrast, hundreds of colleges, including most elite schools and certainly including the UC system, accept certain scores as proof of competency in English or math. SAT and ACT test scores are valid currency everywhere; grades are not.

Thus, the entire debate about grades vs. SAT is moot. While most admissions directors can look at a school's zip code and student income in combination with a student's GPA and accurately predict academic abilities, they can't do it on GPA and course load alone. Grades are a joke. At best, they predict effort and have some limited relationship with the student's academic standing relative to the rest of his school class.

Mr. Murray asserts that subject tests can replace the SAT. I am extremely familiar with all the major subject tests, and they are far too difficult to accurately predict the need for remediation. While basic skills are necessary, they are insufficient for a high score. Subject tests would produce too many false negatives. In my own student base alone, I can think of a dozen high school boys with a 2100 SAT and subject test scores in the 500s. (Mr. Murray is perhaps unaware that many suburban schools restrict access to advanced classes; a bad freshman year and unassertive parents can leave one barred from AP courses for an entire high school career.)

The SAT and the ACT, in contrast, are superb predictors of college readiness. Given the current climate of "everybody goes to college", a score of 500 or above on each section of the SAT (or a 20 composite on the ACT) is the minimum level of competency required to complete a college degree. If we restricted college to a truly advanced education, the score baseline should be 550 (23) or even 600 (26). Undoubtedly, some students with lower test scores are being passed through to a degree, further degrading the value of a college diploma. However, the graduation rate of students with scores below 500 (20) is dismal.

The SAT's excellence at predicting the need for remediation may explain why UC, which commissioned the Geiser/Studley study, came to a different conclusion than Mr. Murray did. Recall that UC President Richard Atkinson, like Mr. Murray, originally called for abolishing the SAT in favor of subject tests, and in the interim doubled the weight of subject tests relative to SAT scores.

Despite the study's results and a strong call for change, the UC system still requires the SAT or ACT. It has returned to weighting the SAT/ACT equally with the subject tests, and also reduced the overall math burden of its test requirements.

So UC began where Charles Murray is now, and ended up keeping the SAT and weighting it as strongly as ever. Mr. Murray might want to ask why.
