Chapter 3: The Process Is the Point
A programmer hooked up a neural network to a Roomba. The aim was to see if it could be trained to clean floors while avoiding collisions with objects. This endeavor faced challenges that won’t surprise you if you’re well read about robots and computers. When this machine learning algorithm was given the task of avoiding collisions, it found a foolproof solution: Do not move at all, and you will never collide with objects. Dissatisfied, the programmer said it must move while avoiding collisions. The Roomba went in circles in its immediate vicinity, satisfying the updated requirements and still avoiding all collisions. Still unsatisfied, the programmer imposed constraints that required the Roomba to maximize the area covered. This time, the Roomba found a rather different algorithm: Its solution was to move through the room backward, colliding regularly (and happily) with objects. Why did this satisfy the constraints? Since the collision sensors were on the front of the Roomba and not the back, collisions went undetected, and this approach gave it a perfect performance rating by the standard indicated.[1]
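The Roomba's behavior is an instance of what AI researchers call specification gaming: an agent maximizes the reward it was literally given rather than the outcome its designer intended. A minimal sketch (hypothetical reward functions written for this book, not the original programmer's code) shows how each of the three objectives in the anecdote can be gamed:

```python
# Minimal sketch of specification gaming (hypothetical; not the original Roomba code).
# Each reward function below is maximized by a degenerate strategy the designer did not intend.

def reward_v1(collisions: int) -> int:
    """Reward: avoid collisions. Gamed by never moving at all."""
    return -collisions

def reward_v2(collisions: int, distance: float) -> float:
    """Reward: move while avoiding collisions. Gamed by driving in tight circles."""
    return distance - 10 * collisions

def reward_v3(front_sensor_hits: int, area_covered: float) -> float:
    """Reward: maximize area covered, penalize *detected* collisions.
    Gamed by driving backward: the front bumper never registers a hit."""
    return area_covered - 10 * front_sensor_hits

# Driving backward: plenty of area covered, zero detected collisions -> a "perfect" score.
print(reward_v3(front_sensor_hits=0, area_covered=42.0))  # 42.0
```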
You probably laughed at this anecdote, but human students are not that different from the Roomba in the story, and, as we’ve alluded to before, will take a rational shortcut if one is available.[2] Studying using the best practices recommended by educational professionals will give a decent probability of achieving the goal of an A grade. However, whether this is the best path depends on how one defines “best.” If a student emphasizes the goal of getting an A in a given course, doing so with the least possible effort is rational (at least from a certain point of view). Educators are regularly amazed by the ingenuity shown by students who understand this rational goal. If getting an A is what matters, then why not modify a ballpoint pen so that small scrolls containing text can be rolled up, hidden inside, and easily pulled out during an exam? The student can then succeed with far less effort than traditional studying, and they can also stop stressing about the possibility they might forget some important bit of information.
But that’s cheating, some readers will object. It is. But cheating is often an understandable human response to seemingly arbitrary rules. For example, our campus has many sidewalks, but also a number of worn paths through grass along routes that humans wish to travel. These paths are an implicit response to a sense that the sidewalk layout is suboptimal and that following the sidewalks will delay, and thus disadvantage, the pedestrian. Therefore, the rational choice is to carve a new path, a choice which, as their worn state attests, is shared by many. It is not only our students who do this sort of thing. When there is a rule telling us to do something and we are not convinced that the rule is good, we may break it if we think we can. Tempering that temptation is the mental balancing act we engage in between risk, reward, and the consequences of breaking a given rule. Human beings break rules all the time when we feel the risk of being caught is very low (jaywalking), the reward is commensurate with the risk (getting somewhere faster), or the consequences, even if we are caught, are not severe (a small fine). Pirating music or movies is another commonly broken rule. Humans tend to follow rules more closely when this calculus is not in our favor, subject to our individual risk profiles and perception.
Speed limits are an excellent and familiar example of this phenomenon. We are likely to drive five to ten miles per hour over the speed limit simply because we want to get somewhere faster. We know it’s wrong, but we perceive that it doesn’t matter, for whatever reason. We are even more likely to break the speed limit if we have a strong motivation to do so, such as a medical emergency. In fact, when we speed for such a reason, we justify the decision to ourselves and hold ourselves blameless. Right or wrong, students who cheat may feel like they are in an emergency situation themselves, and they often explain away their behavior, citing the need for a particular GPA to retain their scholarships, for example. In other words, they are deciding between following the rules—which would likely lead to decent achievement but with the risk of failure looming dangerously large—or, alternatively, taking shortcuts and gaming the system—cheating, even—to maximize the chance that they complete the course with the grade they need to continue studying and earn their diploma. We emphasized in the previous chapter that educators need to provide a more compelling answer to students about why cheating isn’t the best solution to their problem. The current chapter offers you some of those better answers.
Robotic Weightlifting
Most professors are worried about AI, even though automated technology is nothing new. In this section, as an analogy, we explore a different sort of machine from the kind that motivated you to read this book: a robotic arm that can lift fifty pounds. If you were taking a physical fitness course at a university, would it be acceptable to bring this machine with you to lift the weights on your behalf? Would the risks and consequences seem more worth taking if you felt that this course, which satisfies a core curriculum requirement, were a seemingly arbitrary burden on your engineering degree?
The answer probably seems rather obvious to you: The point of weightlifting, in this context, is to build human muscle, not to simply lift weights for the sake of doing so. If you are unable to later satisfy a common job requirement to lift fifty pounds, it probably won’t be satisfactory to bring along your trusty robot arm to do that lifting for you.[3] You cannot use a substitute for your own physical skill (or lack thereof) in such a setting.
The biggest issue with your robot arm plan isn’t actually that you’re breaking the (implied) rules of the physical fitness course. Rather, it’s that you’re depriving yourself of the chance to strengthen your own muscles. If we think of the brain as a muscle of sorts, the metaphor carries over to AI as a “thought machine,” and it transfers directly to the humanities. To be clear, however, it is not an exact parallel since, as we have already discussed, there are meaningful limitations to the capabilities of AI.
Similarly, computer scientists can get Copilot (an analogue of ChatGPT trained specifically on large repositories of code) to write programs for them. In a computer science environment, this application of AI resembles an accepted industry practice: borrowing existing code and building on it rather than reinventing the wheel each time. It is much the same as needing to write a “cease and desist letter” or a “not guilty plea brief,” searching online for a template, and then modifying it accordingly.[4]
Copilot does not, however, always produce correctly functioning programs or even solve the problem that was presented. A computer science student who uses Copilot must still check or trace its output to confirm that the program works as intended. This process of evaluation and validation requires the same knowledge and building blocks that a good computer science curriculum aims to instill in the first place. If students rely entirely on Copilot for easier exercises (where Copilot does a good job), they won’t develop the “muscle memory” they need to write more complex code. In that line of work, sooner or later they will need to write programs that exceed the range of code AI is able to generate. Furthermore, many companies use proprietary software and coding environments that Copilot won’t have had available as training data (analogous, in the humanities, to paywalled or copyrighted work), and developers at those companies are often not allowed to share code snippets with Copilot precisely because the code is proprietary. A student who bypassed the learning process would be destined for failure in this scenario.
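To make the validation point concrete, here is a small illustration (the buggy helper is invented for this example, not actual Copilot output): a generated function that looks plausible but fails on an edge case, and the kind of check a student still has to know how to write.

```python
# Hypothetical example: a plausible-looking, AI-suggested helper with a subtle bug.
def average_grade(scores):
    """Return the mean of a list of scores."""
    return sum(scores) / len(scores)   # crashes on an empty list

# The student still needs the fundamentals to write the checks that catch the bug.
def test_average_grade():
    assert average_grade([80, 90, 100]) == 90
    assert average_grade([]) == 0       # fails: ZeroDivisionError, not 0

# Running the test exposes the gap between "code was generated" and "code is correct".
```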
The analogy is preserved with weightlifting as well: Lifting light weights is a necessary precondition for lifting heavier ones—but also for the real-world applications of lifting furniture, maybe even lifting a car off of a person trapped beneath it. The exercises we do to build up such strength are, we understand, a means toward cultivating our own abilities.[5] Yet somehow when it comes to building up mental capacities—comprehension, creativity, originality, clarity, accuracy—we forget all this and think that merely submitting an accurate product is all that matters, not the effort to achieve it or the things we learn through doing so. This discussion is a long-winded way of saying that it’s the journey, not the destination, that matters the most. Our students are prone to miss this and forget it. We need to tell them and remind them periodically.
Later in this book, we explore concrete suggestions for assignments in which bypassing the learning fundamentals by using AI will lead to student failure. We also tailor assignments based on their level: some are designed more for introductory courses, whereas others are better suited to more advanced courses. At an introductory level, many of the skills we want students to hone are developed through simpler assignments, and the end products of these correspond to outputs that AI can produce. If students bypass doing their own work at these stages, they will not be prepared when they are asked to do something beyond an AI’s capability. One’s skills do not become advanced without first being rudimentary. Even though we have calculators, for example, we still teach second graders how to add and subtract numbers by hand. This simple process helps students make mathematical connections and understand the fundamental relationship between numbers. We do the same thing with reading—kids learn to read by actually reading, rather than asking an AI to read things to them. In this sense, the key responsibility of an educator is to inspire and educate students about the rationale behind doing the more elementary activities in introductory courses.
As another example, suppose you need to be able to run at a certain speed as a football[6] player in order to make the team, and the evaluation is based entirely on the speed at which a treadmill is recorded to move. In theory, you could put a robot on the treadmill and have it run for you. Why not do so? Unless the motivation for getting on the treadmill is the value of the running itself, there is little reason to do the running ourselves. (Also, that football player is going to get absolutely destroyed on the field.) For students to become disciplined and consistent readers, thinkers, and writers, educators need to help them find intrinsic motivation.
Unfortunately, an all-too-common challenge with inspiring motivation is that the payoff (or reward) we point to often lies months or years in the future. This is difficult for students to imagine or extrapolate to, which makes sense. Hindsight is 20/20, but the more distant the payoff, the weaker the motivation it exerts now. Even if a student is motivated to do the work and reap the benefits along the way, the risk and potential negative consequences of doing the work themselves may be perceived as too high, in which case they may still opt to break the rules and take shortcuts. Sometimes, we resort to the blunt tool of compulsion, which isn’t the answer either. Even when we genuinely enjoy something, studies show that adding compulsion into the equation demotivates us.[7]
To recap, there are myriad reasons why students might cheat or feel compelled to cheat, or situations in which cheating seems the most rational choice. Nevertheless, most students do not cheat, and even when they do, they know that they are, at some level, wrong to do so. Even so, success in your course isn’t the destination—it’s one stop in a long life ahead. Students need to understand that they must master the skills in an introductory course and be able to do the work for themselves to progress further. They may not always believe us, but sometimes the problem is that we do not explain things clearly. It is worth articulating that the kinds of cheating that may get you through an intro course with an A you didn’t earn will leave you unable to cope in more advanced courses in which the same type of cheating won’t work. As author and teacher John Warner notes,
I stood in front of a class of first-year students on the second day of our writing course and I presented a hypothetical where I give them all A grades, but class would never meet, they would [do] no assignments, they would get no feedback or instruction. They would learn nothing. That first time I did it, about 60-65% of students said they’d take that deal.
Disturbing.
The last time I did it, six or seven years later, 85% said they’d take that deal.
Disastrous.
The students were not lazy or entitled. They were responding rationally to the incentives of the system. An A without learning anything was far more valuable than learning anything, and risking a grade lower than an A.[8]
We leave it to the curious reader to wonder, and perhaps research, how the percentages would change if the promised grade were instead a B. Doing so might illuminate how students weigh a risk-reward continuum against a fixed reward.[9] If we make it all about the grade, then we reap the consequences: we make it impossible for students to focus on learning for learning’s sake. We will return to this topic, because there are approaches to grading that shift the focus back to where it should be—on learning. These approaches also happen to be useful in our era of AI.
There is a heartwarming ad from Dutch online pharmacy DocMorris in which an older gentleman suddenly starts working out in a very specific and unusual way. He finds an object of a particular weight and lifts it up as high as he is able to, over and over again, until eventually he can lift it into the air higher than his head. At the end of the ad, we see the reason for this very targeted workout: He wants to be able to give his granddaughter a star for the top of their Christmas tree and, when he does, to be able to lift her up so that she can put it on the tree. This type of targeted workout is possible when you have a very specific goal to accomplish.
In education, we do that type of focused intellectual workout in preparing students for specific careers through courses in their major. What they may not realize is that the career they envision themselves in may not be what they spend their lives doing. Even if it happens to be, the skills needed to get promoted are more than those needed to get hired. Giving presentations, writing reports, managing interpersonal conflict, cultural understanding—you might be able to get an entry-level job without those skills in certain career paths, but managers and directors will be expected to have them.
Students don’t just want training for an entry-level job. They want to be promoted. They want to be successful. We can help them understand how being broadly educated prepares them for that more comprehensive view of career readiness. You may not know, as a student, precisely which narrow skills you will need over the course of your career, especially not with the rapid pace of technological change. Just as getting broadly and generally in physical shape is the best way to be prepared for whatever lifting, pushing, or sprinting you might find yourself suddenly needing to do, getting in general educational shape is the best way to be ready for whatever comes, and for success and promotion as you do so.
Professor as Coach
A professor is like a coach in the context of sports. A coach guides you and holds you accountable to your goals. But a coach cannot exercise for you.
We should not blame students for rationally navigating course grades by cheating, any more than we should blame the Roomba for its tactics for avoiding collisions. In both cases, the demands are being met as set. Now, you may object that both the Roomba and the human student have missed the point. True, but the point was not woven into the evaluation process. Why is this so often the case? Because the things we consider most valuable in the humanities are often notoriously difficult to evaluate in quantifiable terms. When we ask students to read a novel, what is the point? What do we hope they will get out of it? For me, the reason I read novels is the same reason I assign novels in certain courses. The experience is transformational, horizon broadening, and introspective. Reading is the point in and of itself.
Because our educational system has increasingly become about letter or percentage grades, a problem arises. How do I distinguish between a student who engaged with the reading at a B level of thoughtfulness and one who engaged at an A– level? Another concern is how to ensure that a student cannot bluff their way through the course by saying vague things. We need to ask about specifics and (to make a Friends reference) expect a better answer than “the specifics were the best part.”[10] To ensure that the reading has been done, it has thus become customary to quiz students about details: character names, places, dates. This type of assignment in turn leads students to focus on memorizing such information rather than immersing themselves in the story.
There was a time when such quizzing on details could at least indirectly support the development of the important skill of memorization. Today and in the future, in an era in which the Internet is ubiquitous and information flows freely, memorization is a far less important component of a humanities education. When a student ten years after graduation needs to know the date of event X or the name of character Y from the Bhagavad Gita, they will turn to their phone, not rack their brain trying to remember. I remember fondly an argument I had with several of my friends in high school about the diameter of the largest hamburger ever made.[11] Today, such an argument would never happen—a simple Google search will forever relegate that experience to a simpler past.[12]
Instead of getting nostalgic, let’s focus on the subtle, yet crucial skill shift in our example, which changed from remembering to discerning. Being able to tell whether an online source is reliable is now more important than committing any particular fact to memory. This evolution did not result from the development of generative AI; many of us had switched focus from remembering to evaluating sources a decade before the release of ChatGPT. However, these skills are even more crucial now, since the same issues that confront someone Googling to find an answer also pertain to someone querying ChatGPT, even if not in quite the same way.
Recall that the latest AI chatbots (LLMs) are speech-generating applications. They are not designed to provide information. That doesn’t mean they don’t regularly include information, even true information, in the speech they generate. Because they mimic patterns of words in speech, and humans often use speech to convey information, the patterns they generate will often carry information as well. To the extent that the content can be linked to information in the same cloud of words, an LLM produces information. When we talk about which novel by Margaret Atwood or which album by Taylor Swift is our favorite, we use the names in consistent combinations, and an LLM mimicking these patterns of words will often produce an imitation of the correct information.
Google is a search engine, which is likewise distinct from being a provider of reliable information. The fact that something is at the top of the search results says nothing whatsoever about its reliability, only about the extent of keyword match and the result’s popularity.[13] So too, what an LLM offers is new text based on the linguistic patterns of everything on the Internet. The fact that an assertion is woven into that text does not even guarantee that it is woven into the data on which the LLM was trained, much less that the assertion is correct. As autocomplete demonstrates, there are many possible words that could reasonably follow any particular word, and fluency is no guarantee of accuracy. While it is a gross oversimplification to call an LLM a highly advanced autocomplete, the analogy may help you understand why an LLM sometimes says things that seem ludicrous or simply turn out to be wrong.
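To see why fluent continuation is not the same as reliable information, consider a toy next-word model (a deliberately crude sketch, nothing like a real LLM's architecture): it picks whichever word happened to follow the previous one in its tiny training text, producing plausible-sounding sequences with no notion of truth.

```python
import random
from collections import defaultdict

# Toy "autocomplete": a bigram model built from a tiny corpus.
# It mimics word patterns; it has no concept of whether a continuation is true.
corpus = ("the tallest mountain is everest "
          "the tallest building is in dubai "
          "the tallest mountain is in nepal").split()

followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def continue_text(start: str, length: int = 6) -> str:
    words = [start]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))  # fluent, not fact-checked
    return " ".join(words)

print(continue_text("tallest"))  # e.g. "tallest mountain is in dubai" -- plausible, wrong
```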
The skill of evaluating sources of information is needed across the board and not only in connection with LLMs. In an era in which misinformation is commonplace, it is more crucial than ever. It doesn’t matter whether one is dealing with a malicious human, a poorly informed human, an AI trained to spread misinformation, or a well-meaning AI that happens to produce false output because there is nothing inherent in its function to prevent that. In each case it is essential for students to be able to investigate and evaluate claims. Their very lives may depend on it. If someone turns to an LLM thinking that it is genuinely intelligent and reliable, and therefore asks it for medical advice, or whether it is necessary to evacuate their home as announced on the news, the automated speech imitation of the LLM could literally cost them their lives. The fault will not be with the technology but with those who didn’t inform themselves adequately about what it is and does.
Solving the Problems We Already Have
Addressing the new technology of LLMs also addresses a long-standing problem that has been responsible for students submitting poor essays: Most students have no concept of how long it takes to research a topic and write up one’s findings. Academics in the humanities are prone to forget that there was a time when we did not know this either. Hopefully, if you are a humanities researcher, as soon as you agree to contribute to an edited volume, you put the deadline in your calendar and set yourself a reminder at least one month—perhaps two or three—before the deadline. Most of your students have probably never used this feature in their calendars; neither K-12 nor higher education institutions tend to talk students through this process. As a result, when students realize they have a deadline, they aim to be finished just in time to meet it. In a best-case scenario, a student will submit as the final version of their essay what ought to have been a first draft. It will have been written hastily in one sitting and finished a few minutes before the essay was due, if the student is a conscientious one.
As an academic in the humanities, I was long under the illusion that I was teaching students research and writing skills simply by giving them assignments that required them to research and write. Research essays do provide opportunities to practice these skills, if students already have them at a basic level. However, just as merely moving fingers on a piano may do as much to reinforce bad habits as good ones, without guidance in the process of research and writing, students will fall back on whatever habits and methods—good or bad—they are already familiar with. By waiting until a point at which it is too close to the deadline for the project to turn out well, they may resort to having AI create their essay for them. They were not intentionally nefarious, lazy, or insincere. They simply didn’t know how to do the job, and then desperation set in. They essentially backed themselves into a corner. It is easy to judge such students as irresponsible, but if we as educators had the opportunity to teach them how to do the work, but we instead just told them to do it and assumed they knew the process, surely some of the blame lies with us.
This point connects directly with one of the ways that some educators have sought to ensure that students will not have ChatGPT write their assignment for them. The traditional way of doing it, which also forces students to follow appropriate research procedures that they otherwise might not, is to require that the essay be created in stages, and that each stage be submitted by periodic deadlines.[14] It is very difficult to get an LLM to provide materials that will genuinely look like successive drafts and component parts of an essay, and to do so in a way that meets specific course requirements. The effort to submit each draft or component in a manner that avoids detection will typically involve more work than researching and writing the essay oneself. This strategy is essentially decreasing the reward (reduced time saved) while simultaneously increasing the risk (higher likelihood of being detected). Keeping the risk-reward-consequence continuum in mind is a useful measure by which to gauge the effectiveness of any assignment that you read about or design yourself. That is the key point that underpins many of the assignments discussed in this book. As educators, we should do our best to ensure that it is genuinely advantageous to the student to do the work honestly.
Show Your Work: Creation of Essays Ex Nihilo
Although most of our suggestions for assignments appear in subsequent chapters, we mention one here not because it is the best solution for all scenarios, but because it is an illustrative example of how an educator may shift the focus back onto the process and discourage the use of AI at the same time. It is also a strategy that can be used for any type of extended written assignment. In math-based courses, it is customary to say that students must show their work. This prevents reliance on (although not necessarily use of) a calculator; students can’t get credit unless they show how they arrived at their final solution. In the humanities, we have sometimes treated essays as though the reasoning articulated in them “shows a student’s work” by presenting the path from the question posed at the beginning to the conclusion drawn at the end. This leaves to one side the fact that the process of creating the essay is also part of the thinking process.[15] Most of us do not and could not do all of our thinking in our heads and only begin writing once we have everything figured out. We write to think. The process of writing is itself a process of figuring out what we think, of puzzling our way through evidence and arguments to our answer. In the humanities, we have rarely required students to show their work. It is about time we caught up with our math colleagues in this regard. Today’s technology makes it straightforward to see, if not strictly all the stages in the composition process, at least all the points at which a file was saved.
Unless it is something generated by an LLM in response to a prompt, a text that answers a question does not appear suddenly all at once, without genealogy, without ancestral drafts that preceded its polished final form. To be able to see how students create an essay and not merely what it looks like in its submitted form, have them save the file in a folder that you set up for their work in a cloud storage location such as Google Drive, OneDrive, or Dropbox. Then, simply require students to have a draft by a certain date, and a revised and improved draft by another, both before the final submission deadline.[16] This can be done without any necessity that you grade them or provide feedback (although doing so is an option and, in many respects, desirable). The mere fact that you have the option of reviewing steps in a student’s paper-creation process will typically be enough to discourage students from submitting AI-generated content as though it were their own work. In most instances, you will not need to look closely at saved versions and may not even need to look at them at all. It is sufficient to have the option of doing so if what a student submits raises your suspicions. For instance, if an essay is offered in perfect form by the first deadline and the student makes no changes thereafter, you can investigate more closely and take the appropriate action.[17] Inform students that you will be doing this, and they will most likely decide it is not worth the risk of trying to use AI to cheat.
Educators reading this book will find it useful to look at the version history of things they themselves have worked on. If you have never done so, it may be an eye-opening activity for you to learn about your own creative process, as well as help you learn what to look for in student work. While writing this section, I thought that the most meaningful illustration for readers would be for me to look at the version history of this very chapter. The earliest draft saved in my Dropbox folder was from just about a month prior to when I wrote the words you are reading now. At that point, the draft of this chapter was not yet a full page in Microsoft Word. When I wrote what you just read, doing so took me onto the sixth page of the document. A great deal changed in the interim. If you are not familiar with Word’s compare documents feature, you can use it to see precisely what has been changed between an earlier draft and the latest form of your document.
Ideally, you will want students to work on a document that is saved in a shared folder created by you, rather than them, but if they set one up and grant you full rights of access to the folder, then you can monitor what students do in the same manner as if their saved file were in a folder you yourself created. Keep in mind that cloud storage providers often keep drafts for a limited time, so you will either want to ensure that the time frame for an assignment corresponds to how long earlier versions are kept, or otherwise periodically download versions and keep them in a separate folder. If you require students to write something in the document at the beginning of the semester during class, this will also provide you with a writing sample to which you can compare later work. If a student does not write clearly and effectively, you can help them or direct them to other sources of support. If they suddenly submit something that is grammatically perfect when they have consistently failed to do so previously, you can investigate further.
On one level, LLMs are actually quite good at taking drafts and revising them. What is to prevent a student from getting an LLM to generate a draft and then getting the LLM to keep making revisions to it? Nothing will prevent a student from doing that, but if you compare student-written drafts and revisions, or your own, with what an LLM does in this respect, you will see the difference in the output. It will stand out to you visually. An LLM does not approach revision the way a human being does. Given that the chance that even a long-standing academic author will write a first draft with no flaws of word choice is close to zero, the odds that a student will do so are lower still. Yet there will always be students who write well, so with this strategy you do not need to rely on looking for signs of human fallibility. If a student is relying on AI throughout the writing process, then both the first draft and the revision will look very different from human work. The differences between the two drafts will stand out starkly as something other than what one gets when humans revise their own writing or copyedit someone else’s. Don’t just imagine what this is like. Take a look for yourself. Download an earlier version of one of your documents and, using the compare documents feature in Word, examine the edits and revisions. Now do the same with text revised by an LLM, and you will see the difference.[18]
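If you prefer to inspect revisions outside of Word, a few lines of Python can produce the same kind of comparison. This is a sketch using the standard library's difflib; the two draft texts are invented stand-ins for versions you would download from the shared folder.

```python
import difflib

# Two versions of a paragraph, standing in for drafts saved weeks apart.
earlier = """The essay will argue that cities grew because of trade.
Trade routes connected towns and made merchants wealthy.
""".splitlines(keepends=True)

later = """This essay argues that medieval cities grew primarily because of trade.
Trade routes connected towns, made merchants wealthy, and funded new guilds.
A final section considers the limits of this explanation.
""".splitlines(keepends=True)

# A unified diff shows exactly which lines were added, removed, or reworded,
# much like Word's compare-documents feature.
for line in difflib.unified_diff(earlier, later, fromfile="draft_week3", tofile="draft_week6"):
    print(line, end="")
```

Human revision tends to show scattered, uneven edits of this kind; wholesale replacement of entire paragraphs between "drafts" is the pattern that merits a closer look.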
For an added layer of protection against students getting AI to create their essays, implement the flipped classroom model and have students work on their essays in class. You can wander the classroom, browse what students are doing in real time, and provide comments and feedback right in their documents as they are working on them. If large chunks of text suddenly appear in a student’s document, that will be something worth looking into. If an essay is modified outside of class time through the deletion of what was there and the substitution of something different, that too will be something to investigate. In subsequent chapters, this book offers some other types of activities and assignments which, when substituted for or added alongside traditional essays, make it even more difficult for students to rely on AI to do their work for them.
For some students, the creation of an essay is a rigidly formulaic process. You have probably heard of, and may even have been taught, the five-paragraph essay. It is worth pointing out that the instructions for creating one are an algorithm.[19] The word algorithm is widely used today, often by people who do not know where it comes from or what exactly it means. An algorithm is a set of instructions for how to perform a task.[20] Like the word algebra, it originally comes from Arabic, so if you discuss algorithms in class you can also provide some important history about the Arab world as a center of learning that influenced Europe in the Middle Ages.[21] By pointing out that mechanical, formulaic jobs are the ones in which robots are most likely to replace human workers, you may help students understand why it is crucial to their own success that they develop creativity and not rely on formulaic approaches to tasks in the humanities. This is their opportunity to develop the skills needed to do what AI cannot.
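To underline that point in class, you could even write the five-paragraph formula out as literal code. The sketch below (a deliberately mechanical template of our own devising, not a recommendation) shows how little judgment the recipe itself requires; everything interesting has to come from the human filling in the blanks.

```python
def five_paragraph_essay(thesis: str, points: list[str]) -> str:
    """The five-paragraph formula written as an algorithm: a fixed recipe
    that assembles a skeleton regardless of whether the ideas are any good."""
    paragraphs = [f"Introduction: {thesis} This essay will discuss "
                  + ", ".join(points) + "."]
    for i, point in enumerate(points, start=1):
        paragraphs.append(f"Body paragraph {i}: {point}. [Evidence and analysis go here.]")
    paragraphs.append(f"Conclusion: In summary, {thesis.lower()}")
    return "\n\n".join(paragraphs)

print(five_paragraph_essay(
    "Community gardens strengthen neighborhoods.",
    ["They provide fresh food", "They create shared space", "They build local pride"]))
```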
When it comes to the revision of drafts, students regularly fail to make meaningful changes. An LLM, however, often makes arbitrary and unnecessary ones. To get an LLM to begin to take an original, mediocre output and make it genuinely not bad, one would have to know how to prompt the AI to make not just revisions but good revisions. Only a student who truly understands what good writing entails and which authors have produced examples of it can prompt the AI to rework their story in the style of Flannery O’Connor or Ursula Le Guin. The LLM will not produce something that bears strong resemblance to the writing of those authors, but giving it that sort of prompt does sometimes lead to stylistic improvement nonetheless.
The authors whose works students have enjoyed used their human abilities to tell stories that readers found worth spending time on. None of students’ favorite stories were purely AI generated; no purely AI-generated novel or movie has become a bestseller. That isn’t to say AI cannot have a role in the creation of great works. Not at all. The point is that stories are appreciated only by humans; therefore, crafting a story that connects with our expectations while also surprising us requires a human element in the process, however much that human may utilize AI to realize their vision. An AI has no vision in that sense, and that is the difference we must emphasize to students. Their futures depend on it.
Changing Grades
For a long time, I noticed discussions about “ungrading” and viewed them with curiosity but paid little attention.[22] It was only sometime later that I discovered that a technique I had developed in my own courses could be considered a form of ungrading. The terminology I was familiar with was gamification. (If that piqued your interest, there will be more about games as learning activities later in the book.) If you think games can only be useful in elementary education, you’d be wrong. In the present context, I am not discussing individual game-like assignments but a grading system for the course as a whole.
Gamified grading is nothing more than the use of a points-based rather than a percentage-based grading system in which all points are cumulative. As in a video game, your ultimate score depends on your total points earned, not on how many attempts it took you to get there. To ensure the course has rigor, students still need to achieve certain goals (“level up”) in order to have a shot at a “high score,” that is, a grade in the A range. In most courses, you will not want it to be possible to get an A just by doing 93 easy, one-point tasks. You should set up the course so that certain things must be attempted, or a certain benchmark of competence achieved, to earn an A. The precise details will depend on the course, and there are a lot of resources available on this topic, should you need them.[23]
I have presented at GenCon (a local gaming convention here in Indianapolis) and other places about this topic. There, I always mention the news story that persuaded me that educators were missing something crucially important about games and gaming. It was the story of players of the game Halo trying to get into an empty room. (A room within the game, in case that wasn’t clear.) People said it could not be done, but a set of players was determined to prove them wrong. They spent five years working on the problem, and in the end, they cracked it.
What struck me first about this story was realizing that, in that same five years, you can become decently fluent in a language or learn to play a musical instrument. We have apparently made learning useful things so little fun that people prefer to spend their time trying to get into an empty room that does not really exist. The second thing that struck me was that the only way to achieve the goal in this or any game scenario is to be able to try and fail over and over and over again. Our traditional grading systems are set up to make that all but impossible. An educator may tell students to value learning rather than grades. When they try something hard and do poorly, the educator may praise them verbally. The student still gets a C on the assignment, however, and it impacts their grade in the course and perhaps their scholarship or their shot at getting into grad school. If we want students to learn, they must be given the opportunity to do the thing we all do on our way to success—fail.
Hence my integration of a cumulative, points-based system. All points are cumulative. This means that if a student writes a short essay potentially worth 10 points and earns only 3, the student is not on the cusp of losing all hope of an A. The student just needs to do something else to earn points, perhaps the same type of assignment on a different topic. Rather than being permanently penalized for failure, the student can do an assignment knowing that if all they get out of it is knowing how to do the next one better, that is fine. While others use elaborate point systems, I have found that students are easily confused about how to earn an A. I thus make the points equivalent to the traditional percentage scale: if they reach 80 points, they are at a B–, and if they make it to 93, they have an A. I am struck time and time again by the fact that students who reach the total number of points needed for an A rarely just check out and stop participating. Many end the semester with a surplus of points, indicating that they were motivated by something other than just the grade.
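Translating that scheme into code makes its simplicity visible. The thresholds below mirror the mapping just described (93 points for an A, 80 for a B–); the intermediate cutoffs are illustrative assumptions, since the exact boundaries will vary from instructor to instructor.

```python
def letter_grade(total_points: float) -> str:
    """Cumulative points map onto the traditional percentage scale.
    93 -> A and 80 -> B- follow the text; the other cutoffs are illustrative."""
    cutoffs = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
               (77, "C+"), (73, "C"), (70, "C-"), (60, "D")]
    for minimum, grade in cutoffs:
        if total_points >= minimum:
            return grade
    return "F"

# A weak early essay (3 of 10 points) is not a catastrophe; the student simply
# keeps accumulating points on later assignments.
print(letter_grade(3))    # F -- but it is early in the semester, with room to climb
print(letter_grade(95))   # A -- and nothing stops a student from passing 93
```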
While the freedom to fail is one crucial element from games that educators need to embrace, so too is the element of fun, or perhaps it would be better to say enjoyment and satisfaction. Your course on the Middle Ages will never be as fun as Mario Kart.[24] It doesn’t have to be; the point is to make it engaging and rewarding. Psychologist Mihaly Csikszentmihalyi explored how we can lose ourselves in an activity that is benefiting us, such as reading, writing, or playing a musical instrument, and not even notice the passing of time. He called this concept “flow.”[25] Whatever you call it, I hope you have experienced it, or that one day you will. Most educators have. If I did not have pop-up reminders from my calendar to tell me to get to a meeting, I might read or write right through it. The key to this experience of losing yourself in a task is for it to be at that perfect balance point where it is difficult enough to be challenging but not so difficult as to be frustrating. That spot changes as our ability improves, and as long as the learning activities keep pace with our abilities and progress, we will keep evolving.[26] Educators should give conscious attention to this so that we can foster students’ experiencing flow and the learning that accompanies it.
A points-based system can be used no matter the course content and is flexible enough to accommodate changes to its focus. Traditionally, in the humanities, we have sought to cover extensive content that students ought to know and have set quizzes to test their recall of key facts. With the widespread availability of information both accurate and inaccurate, the ability to evaluate sources is now more crucial than the ability to recall a large number of facts. This frees the educator to design assignments that focus more on cultivating research skills than on demonstrating recall of the widest possible array of facts.
Under a points-based system, quizzes can still be used to ensure that reading is done and understood, core facts are remembered, et cetera. The quizzes can be multiple choice or short answer so that grading can be automated. Questions can be drawn at random from a bank so that students do, in fact, have to learn the content. If it takes them more than one attempt to do so, what matters is that they get there before the end of the course, or by whatever point in the semester you choose. The main essay assignment in the course can then be a single project, the components or stages of which are due periodically throughout the semester, so that students are forced to show their work. Peer feedback can also be integrated into the course. If you’ve already begun to incorporate such changes into your courses, they will prepare you well for further adapting assignments in response to AI technology.
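A randomized question bank is straightforward to set up even outside a learning management system. The sketch below (with hypothetical question data invented for the example) draws a different subset of questions for each attempt, so repeating the quiz still requires knowing the content rather than memorizing an answer key.

```python
import random

# Hypothetical question bank; in practice this would live in your LMS or a file.
question_bank = [
    {"q": "Who is the narrator of the opening chapter?", "a": "Ishmael"},
    {"q": "What is the name of the white whale?", "a": "Moby Dick"},
    {"q": "What object does the captain nail to the mast?", "a": "A gold doubloon"},
    {"q": "Which port does the ship sail from?", "a": "Nantucket"},
    {"q": "What is the name of the ship?", "a": "The Pequod"},
]

def draw_quiz(bank, n=3):
    """Each attempt gets a different random subset, so retakes still require learning."""
    return random.sample(bank, k=n)

for item in draw_quiz(question_bank):
    print(item["q"])
```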
There is a need to keep students engaged with content; merely implementing a points-based grading system and calling it “gamified” is not sufficient to accomplish this. Educators need to be entertainers—not by going for lighthearted laughs, but by offering thought-provoking content that energizes students, encourages them to commit to the experience, and makes them think and reflect. If we fail to provide engaging content, we have only ourselves to blame when students prefer to have their minds and browser tabs occupied with other things during our classes. Some of the assignments we recommend throughout this book can certainly be implemented in dull and uninteresting ways. But none of them has to be approached that way, and, wherever possible, we have shown how to make these assignments more engaging.
- The anecdote is from a tweet, quoted in Janelle Shane’s excellent book, You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It’s Making the World a Weirder Place (Voracious, 2019), at the beginning of chapter 5. The original source is Custard Smingleigh (@Smingleigh), “I hooked a neural network up to my Roomba,” Twitter (now X), November 7, 2018, https://twitter.com/Smingleigh/status/1060325665671692288. It also ended up being mentioned on a long list of similar instances of AI doing exactly what it was told but not what the programmers had hoped or intended. The technical term for this phenomenon, in which an AI satisfies the literal objective it was given while missing what its designers intended, is specification gaming. ↵
- Obviously, the authors are aware that the average student also differs from a Roomba in other ways, perhaps most notably with respect to how frequently they vacuum. ↵
- We are tempted to create a job posting that requires an applicant wearing Navy blue jeans to use a robot arm to gather heavy Marine life and bring it up into the Air (with) Force. The robot arm would be named Army, of course. We’re still workshopping it. ↵
- No, we won’t ask what you did, though it is an oddly specific series of documents you requested . . . ↵
- After all, one does not simply “Hulk smash” without first “Hulk moderately damage.” ↵
- Our football example is intentionally vague as to which sport we refer to: American football or soccer. In this way, we hope to placate audiences across the globe. ↵
- Richard M. Ryan and Edward L. Deci, “When Rewards Compete with Nature: The Undermining of Intrinsic Motivation and Self-Regulation,” in Intrinsic and Extrinsic Motivation: The Search for Optimal Motivation and Performance, ed. Carol Sansone and Judith M. Harackiewicz (Academic Press, 2000), 13–54. ↵
- John Warner, “ChatGPT Can’t Kill Anything Worth Preserving,” The Biblioracle Recommends (blog), December 11, 2022, https://biblioracle.substack.com/p/chatgpt-cant-kill-anything-worth. ↵
- Interestingly, this sort of thought process is codified and discussed in computer science circles as the difference between constraints and preferences when designing AI. We use the concept of a “lottery” to provide a rational (and consistent) framework for when one ought to choose a fixed reward vs. an uncertain reward with a potentially higher payout. The goal is ultimately to maximize one’s expected utility. ↵
- Friends fans, think of Rachel when she accompanied Phoebe to her literature class in the season 5 episode “The One with Ross’s Sandwich.” For a clip, see “Friends: Rachel and Phoebe Take a Literature Class (Season 5 Clip),” TBS, July 17, 2021, https://youtu.be/5xiYuZJU4n4?si=8uRByn67V3Hs2W9j. The quote in the main body is of course from "The One Where Rachel Is Late" which first aired May 9, 2002. ↵
- The answer, incidentally, is a 6,040-pound beef patty measuring 24 feet in diameter, which was cooked in Montana in 1999. ↵
- Another equally dated example of things that will no longer happen is when someone else in your house picked up your landline and eavesdropped on your conversation. Good times, the ’80s and ’90s. ↵
- Google does attempt to provide results that are consistent with true information, by using heuristics such as the PageRank algorithm to identify supposedly high-fidelity sources. But as with all heuristics, they are guesses that can sometimes be wrong. Discernment, therefore, was necessary even in the age before generative AI. ↵
- We will explore a more streamlined design for this approach in the next section. ↵
- Shane Parrish, “Writing to Think,” Farnam Street (blog), Brain Food No. 552, November 26, 2023, https://fs.blog/writing-to-think/; Herbert Lui, “Don’t Think to Write, Write to Think,” Herbert Lui (blog), August 12, 2022, https://herbertlui.net/dont-think-to-write-write-to-think/; D. Alexis Hart, “Keeping the ‘Human’ in the Humanities,” Campus, February 24, 2023, https://alleghenycampus.com/22783/opinion/keeping-the-human-in-the-humanities/. ↵
- I am not the only educator to come up with this solution: see Dave Sayers, “A Simple Hack to ChatGPT-Proof Assignments Using Google Drive,” Times Higher Education, May 25, 2023, https://www.timeshighereducation.com/campus/simple-hack-chatgptproof-assignments-using-google-drive. ↵
- We can’t unilaterally declare that such a student should fail automatically, because they could be the one unicorn example of a student who thinks first and then writes perfectly afterward. ↵
- Ryan Watkins, “Update Your Course Syllabus for ChatGPT,” Medium.com, December 18, 2022, https://medium.com/@rwatkins_7167/updating-your-course-syllabus-for-chatgpt-965f4b57b003, also mentions the usefulness of the track changes feature in many contexts. ↵
- Also pointed out by Peter Greene, “No, ChatGPT Is Not the End of High School English: But Here’s the Useful Tool It Offers Teachers,” Forbes, December 11, 2022, https://www.forbes.com/sites/petergreene/2022/12/11/no-chatgpt-is-not-the-end-of-high-school-english-but-heres-the-useful-tool-it-offers-teachers/. ↵
- On algorithms as things humans use and not only computers, see Brian Christian and Tom Griffiths, Algorithms to Live By: The Computer Science of Human Decisions (Henry Holt and Company, 2016). ↵
- I (the computer science author) was “today years old” when I learned the etymology of the word algorithm. They should really teach this stuff in college or something... ↵
- Lynn Aaron, Santina Abbate, et al., Optimizing AI in Higher Education, 2nd ed. (SUNY Press, 2024), 71, also mention alternative grading as relevant to our present context. ↵
- James Paul Gee, What Video Games Have to Teach Us About Learning and Literacy, 2nd ed. (St. Martin’s, 2014) is a pioneering work on this subject. Much more has been published since his book first appeared in 2003. ↵
- And that’s because Mario Kart is pretty dope. ↵
- Mihaly Csikszentmihalyi, Flow: The Psychology of Optimal Experience (HarperCollins, 1991). ↵
- For a game example of what it is like when something is too challenging to be enjoyable, I usually refer to Unfair Mario. See “Unfair Mario,” accessed Feb. 1, 2025, https://www.unfair-mario.com/. The game appeared online in 2013. ↵