Sabermetrics and Grades

I’m a huge baseball fan, having played the game from six years old through college and watched it all my life. During those years, one of the more interesting developments has been the rise of interest in sabermetrics, an innovative attempt at empirically analyzing in-game activities to measure success. For example, traditional statistics measured straightforward outcomes like batting average (hits divided by at-bats) and aggregate totals like home runs, runs batted in, etc. Over time, teams began to realize that these statistics did not always measure the kind of outcomes that won games. For instance, a fielder may make more errors than another fielder, but actually be an overall better defender for his team. For years, such judgments were made by the “eye test,” but eventually sabermetricians began to explore ways to answer the question of the best defender in a way that more accurately reflected game outcomes. For instance, a player may make one more error per 100 chances than another player, but he may actually record an additional four outs per hundred chances because of better range, reaction, and instinct. Out of one hundred chances, then, he was actually three outs better than his counterpart though he had a lower fielding percentage.
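For readers who like to see the arithmetic spelled out, here is a minimal sketch of the fielding comparison above. The function name and the simple "one out gained, one out lost" accounting are my own assumptions for illustration; real defensive metrics weigh plays far more carefully.

```python
def net_outs_per_100(extra_outs_from_range: int, extra_errors: int) -> int:
    """Net outs gained per 100 chances relative to another fielder.

    Assumes the simple accounting used in the example: each extra out
    counts +1 and each extra error counts -1. This is an illustration,
    not an actual sabermetric formula.
    """
    return extra_outs_from_range - extra_errors


# Hypothetical numbers taken from the example in the text.
advantage = net_outs_per_100(extra_outs_from_range=4, extra_errors=1)
print(f"Net advantage: {advantage} outs per 100 chances")
```

The point is only that a single counting statistic (errors) can hide a larger effect (range) that shows up once both are put on the same scale.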

In recent years, statistics like Ultimate Zone Rating and Defensive Runs Saved have arisen in an attempt to empirically measure such outcomes. Similarly, the WAR stat (Wins Above Replacement) has not only become one of the top statistics listed on sites such as ESPN, but it has also seemingly played a key role in MVP voting in the past few seasons. Admittedly, these statistics are somewhat subjective: the data is fed into a formula that assigns values to various in-game activities, and not everyone agrees on how valuable each activity is. Nevertheless, in baseball, at least, sabermetrics seems to have been helpful for scouting departments in assessing talent and for coaches in determining whom to play at what times.

As an educator, I have experienced a similar dissatisfaction with the status quo of grades. Traditional scoring of exams and papers yields a fairly wooden, and oftentimes skewed, picture of reality. A student may be diligent in memorizing notes taken in lecture (or even borrowed from another student!) and have a good enough short-term memory to repeat them on an exam. The end result is an A for the semester, yet very little actual learning has taken place. It is like the baseball player who hits 25 home runs but strikes out 180 times. He may have a good aggregate total of home runs, but the overall data shows he is not a better player than a normal replacement who, albeit hitting fewer home runs, makes a more positive impact for the team. In sports, the end goal is easy to define; but in education, I think our problem with grades lies deeper.

For one, are we playing a team game or is education strictly individual? For example, a traditional B student who is willing to take time to help her peers improve is a much better contributor to the class than the A student who, when asked for help, scoffs that he is too busy doing homework to help. If the end goal is individual education, perhaps this is acceptable. At my school, School of the Ozarks, however, we truly try to create an environment of learning where each student is willing to help the other. If that is the culture of our school, shouldn’t a grade reflect a student’s contribution to the learning of the whole and not just himself?

Second, is the end goal of education repeating the material in the short term, “mastering” the material for the long term, or learning how to learn for the rest of one’s life? Again, traditional examinations and projects test only a small skill set, usually short-term memory (and, one might add, diligence in study), but this is only one skill, and likely not even the most important one, among those that will help students succeed in future schooling, work, and life. We need greater clarity regarding the end goal of education and the means to assess the broad range of such skills.

Third, what kind of student are we trying to produce? An A on the grade card would seem to suggest that we have produced an exemplary student, whereas a C suggests mediocrity (or in the age of grade inflation, perhaps even ineptitude in the subject). Yet many A students are far from exemplary. Most educators can likely name, without hesitation, a handful of students who received A’s in school but who they feared would not succeed in future endeavors. Likewise, many C students were clearly on the path to success, yet traditional methods of grading were unable to capture this difference between the two kinds of students.

As a classical Christian educator, my goal is to teach students to glorify God by loving the things that God loves, pursuing truth, demonstrating curiosity about the world, and possessing the skills and tools necessary to search out that truth about which they are curious. If this is my goal, then shouldn’t my students’ grades reflect their level of attainment towards that goal? Mortimer Adler explores a somewhat similar idea with respect to grading in his proposed Paideia schools. In The Paideia Program, he suggests that a different method of instruction such as he proposes would require a radically different kind of grading. As classical Christian schools that aim to offer a radically different education than the status quo of our culture, I think it is time that we do a better job of grading in accordance with our distinct missions.

In subsequent posts, I hope to explore more deeply the concrete goals of our education at School of the Ozarks, categorize the skills, tools, values, participation, and more that help students achieve this goal, and work towards quantifying the observable data with an aim towards creating an empirical statistic for educational value, a sort of WAR for the classroom (still working on a name for this). I would then hope to run previous graduating classes through this formula to see how their WAR stat matches up with their GPA. Moreover, as students progress throughout college, it may be possible to observe whether classroom WAR or GPA is a better indicator of success in college and beyond. It is my hope that we (for I certainly don’t think I can do this alone) can hone this formula such that it becomes a useful (perhaps even primary) tool for assessing our current and future students. Who knows, perhaps other schools will see the value in this approach and, like the WAR stat in baseball, it will slowly but surely make a helpful contribution to the field of education, but more importantly to the lives of the students in whom we invest.

Have you thought about similar grade reorientation? What have you used? I’d love to get some of your ideas in the comments section or via email to


4 thoughts on “Sabermetrics and Grades”

  1. This may end up as more of a debriefing than useful input in and of itself, but I want to say it. I have never valued grades. In optimal conditions, they are statistics that simply show how often the student, as if a machine, will produce the desired result. It is a very dehumanizing system reminiscent of many industrial-age factory economies. Teachers often try to circumvent the very product-based system by attributing grades to certain steps in the desired process by which the desired product will supposedly be produced, but all this guarantees is more complications required out of the student.

    The majority of curricula in the modern American school (as I knew it) in the end serve the student, again, no further guaranteed purpose than to provide more instances of supposed accuracy in the student’s mechanical computations.

    The most immediate proposal for correction that comes to my mind is to break this obsession with accuracy. Accuracy has its place, no doubt; grades as we use them now shouldn’t be removed entirely, although I am partial to a more Peter Principle-esque implementation of standards. Unfortunately, for a new, worthwhile assessment system for students to take primacy, its exact goal is a necessity.

