Quixotic Reimagining of Standardized Tests (Part 2)

If you remember, Part 1 was here and my goal is to construct a theoretical system of standardized tests that I would be satisfied by. Here’s what I’ve got. As usual, because of the daily posting streak I have openly committed to, standard disclaimers apply.

  • We’d have a first-tier test like the SAT, except this will be explicitly designed not to distinguish among the high performers.

    The goal of the test is to assess basic proficiency in reading, writing, and mathematics. Nothing else. Most good students, those who have a shot at “good colleges” and know it, will be able to ace this test with minimal effort and can spend their time studying for other things or engaging in other pursuits. Students who can’t ace it will still have to study and it will probably be boring, but the hope is that, especially if you’re motivated to get into a good college, there won’t be much of that studying.

    For colleges, the intention of this test is to allow them to require this test score from everybody without having to put up disclaimers that go like,

    there is really not a difference in our process between someone who scores, say, a 740 on the SAT math, and someone who scores an 800 on the SAT math. So why, as the commentor asks, is there such a difference in the admit rate? Aha! Clearly we DO prefer higher SAT scores!

    Well no, we don’t. What we prefer are things which may coincide with higher SAT scores…

    I know that the SAT is sometimes criticized for being unable to distinguish high-performers. But that’s what the AP does with its five-point scale and people get along fine with it. Admittedly it seems APs are primarily intended for use after getting admitted to college for getting credit, but I think there’s enough feeling that APs look good on one’s transcript that admissions people wouldn’t ignore those scores. The important thing is that there are other, more specific tests to pick up the slack.

    If you wanted to get something like this with minimal effort, it’s about what you’d get if you took the SAT, eliminated all the scores in each subsection above, say, 600, and replaced them all with 600. (In addition to that, I’d probably also support giving out less precise scores even among the lower ranges, again moving closer to AP tests.)

    And maybe (wild guess) test prep companies will have a hard time marketing this because there won’t be that much material to go over and most of the people who can pay exorbitant fees for prep could teach their children enough to ace the test with equally minimal effort.
  • So what comes above our redesigned basic test? It’s not a single test, but a large list of them in different subject areas, like the hard end of the SATs, SAT Subject Tests, and AP tests. The only novel rule here I’d like to introduce is the following: each student is not allowed to have more than eight hard test scores. If you take more, the worst ones just get thrown away. (Or you get to pick which ones to throw away; maybe you want a mediocre music score instead of an excellent physics score if you’re applying to an art-heavy school.)

    The scores should look like AP scores in the sense that, again, you don’t need to get every question correct to get the highest possible score. I think this is really important to minimize variance from stupid mistakes or a bad test day.

    The goal is that, once you’ve collected eight perfect hard scores through your high school career, you no longer derive any benefit from studying additional subjects to only this mediocre level. You have to become pointy. Diversity in one’s studies is heartbreakingly overrated.

    Of course, eight is a somewhat arbitrary magic number that there’s room for adjusting.

    And to really let people choose what they want to study there would have to be many harder tests. This seems like a lot of effort but there is one large optimization in this direction that I think wouldn’t be that difficult, which is to add more variations on subjects. The biology SAT II already lets you choose to answer either 20 questions on molecular biology or 20 questions on ecology. There’s no reason we couldn’t expand this more. As the crudest division, I could imagine mathematics tests with a track in continuous subjects (algebra, geometry) and a track in discrete subjects (counting and probability and basic number theory). Of course this is predicated on the guess that both sides of this division mean things to colleges or certain fields of study. I think some studies could produce a meaningful set of tracks, or at least a set much more meaningful than whatever I’m making up.
  • Finally, we have the superhard tests, which are more or less the equivalent of high-school competitions, both in the test-taking sense and the research project sense. Actually, if I were standardized test czar, I probably wouldn’t need to care about these or include them in my system; their difficulty and prestige would probably stand for themselves. For example, there are boxes for the AMC and AIME on MIT’s app. I do not have many critical thoughts here, but I haven’t thought very carefully about them. (One thing I heard was that the influx of countries with very little experience to the IMO may be dragging down the difficulty of the easiest IMO problems with their votes, understandably because having their contestants not solve anything is kind of embarrassing. Somebody proposed fixing this by adding one very easy problem to each day of the IMO and bumping the problems which would have been 1s and 4s back to their level. I think this is a reasonable idea, even though breaking tradition is always a bit jarring, but as far as I know, nothing came of it. I’ve heard arguments for various other issues in math competitions like the mismatch between short-answer and proof-based contests and the case for including more higher mathematics, but this is too far from my topic and I’m on a tight schedule.)
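
For the programmers in the audience, here is a rough sketch of the two mechanical rules above: flattening the basic test at a ceiling, and keeping at most eight hard scores. Everything in it is my own assumption for illustration: the function names, representing scores as a dictionary from subject to number, and breaking ties by the order the tests were taken. It is not how any real testing body does anything.

```python
from typing import Dict, List, Optional

BASIC_CAP = 600       # the "say, 600" ceiling from the first bullet
MAX_HARD_SCORES = 8   # the admittedly arbitrary magic number


def cap_basic_scores(sections: Dict[str, int], cap: int = BASIC_CAP) -> Dict[str, int]:
    """Replace every subsection score above the cap with the cap itself."""
    return {name: min(score, cap) for name, score in sections.items()}


def kept_hard_scores(
    scores: Dict[str, int],
    limit: int = MAX_HARD_SCORES,
    chosen: Optional[List[str]] = None,
) -> Dict[str, int]:
    """Return the hard-test scores that count.

    By default the worst scores beyond the limit are thrown away.
    If the student names the subjects to keep, honor that instead.
    """
    if chosen is not None:
        if len(chosen) > limit:
            raise ValueError(f"at most {limit} hard scores may be kept")
        return {subject: scores[subject] for subject in chosen}
    # sorted() is stable, so tied scores keep the order they were taken in.
    best = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(best[:limit])


if __name__ == "__main__":
    print(cap_basic_scores({"math": 740, "reading": 560}))
    print(kept_hard_scores({"physics": 5, "music": 3}, limit=1))
```

The `chosen` argument covers the “mediocre music score instead of an excellent physics score” case: pass the subjects you want reported and the rest are dropped.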

So far I have said absolutely nothing about what the questions on these tests would be like. This is a question I’d really like to answer using solid research backed by a couple million dollars or so, to determine how effectively and reliably these questions distinguish people and how they affect study habits or thinking patterns and so on. But instead I’m just blogging to meet a self-imposed deadline, so here are my feelings extrapolated from instinct and my narrow experience:

We start by getting the easy ones out of the way, by which I mean the ones I don’t know much about.

  • Critical reading? Eh, I suspect I wouldn’t change it that much from what the SAT or ACT is offering.
  • Mathematics? The logistical overhead of grading would still probably force any mathematics questions in the most basic test to be multiple-choice and grid-in. In fact maybe I wouldn’t change them that much. At that level, I have little idea how mathematics questions affect people’s studying and distinguish between abilities. Sorry. To satisfy my whims I might relabel the mathematical section in the easiest test “Mathematical Computation” or something like that, and it would be nice if some of the harder tests were based on proof-writing, but I can’t think of any other concrete issues.
  • Writing…

    First, let’s talk mechanics. If there’s grammar on the basic test, I would test for grammar errors like confusing “its” and “it’s” or not capitalizing the first letter of a sentence. “Errors” I would not test include singular “they”, comma splices, “the reason why” or “the reason is because”, “10 items or less”, and so on.

    Is this part going to be boring to study for? Probably still yes, unfortunately, for all except the weirdest people who can derive perverse amusement from weird rules (read: me).

    random aside: English grammar is weird. Here is one of the weirdest corner cases I had no idea I had internalized until some article pointed it out to me. I so wish I remembered where I first saw it. Why does sentence 4 sound so much more ungrammatical than the other three?

    1. Turn the radio up.
    2. Turn up the radio.
    3. Turn it up.
    4. *Turn up it.

    Still, I think the skill of recognizing these uncontroversial errors is more widely applicable. I imagine one would want to avoid these particular errors in most writing, including formal writing but also informal letters and blog posts and maybe even post-apocalyptic novels where you don’t use any punctuation to symbolize lawlessness or something. I mean. Symbolization is up for interpretation, really.

    When referring to a person with unknown gender, you can choose “he or she”, “one”, “he”, “she”, “it”, singular they, or one of the many hipster newly-invented gender-neutral pronouns (I sometimes use Spivak myself). I’d like to see the prescriptivists squirm when I do those things.

    As for actual writing… oh boy. Honestly I don’t think there’s an easy way out of this one.

    The amount of time it takes to write a well-structured essay without sacrificing any part of the process is simply impractical for any standardized test, and you’d probably want more than one essay to control for variance. My gut feeling is that it would be better to give test-takers time to write and revise one- or two-paragraph responses to pointier questions. You could also have them create outlines and grade those by some unfortunately artificial rubric. Still, I think something like the latter is important because I think one of the worst facets of the old SAT essay is how it rewards rushed voluminous writing, or, to give them a lot of the benefit of the doubt, appears to reward it well enough that that’s the goal everybody is aiming for in practice.

    I think clear organization with precise and brief goals should be rewarded for at least part of the test. One of the hardest parts of writing, something I often feel in my blogging (and I’m still pretty terrible at it), is deciding what to cut and what to keep. In the real world, readers don’t have all the time in the world to read what you wrote; you need to get your point across before they give up on you.

As usual, way too close to midnight aaaa

(note: the commenting setup here is experimental and I may not check my comments often; if you want to tell me something instead of the world, email me!)