It’s actually remarkable that ChatGPT has been around for only six months; not so much because of how much it’s developed and transformed our lives in that time, but because of the volume of discussion about whether it’s about to transform our lives, restructure the entire economy, destroy traditional approaches to education etc. – it feels like I’ve been reading this stuff for ages. I’m aware that I’ve contributed to this, and am indeed egging myself on, so to speak, by attending exploratory workshops on the topic that encourage me to reflect further on its potential impact and how we should respond, and then write blog posts about it. Am I ahead of the game in trying to get to grips with this new technology and its implications – or am I getting caught up in the hype, in the way that people were once convinced that MOOCs would sweep away old-fashioned universities or got incredibly excited about the pedagogical potential of Second Life?
There is one very obvious argument for taking this one more seriously. To generalise wildly, previous techno-panics in HE have either been entirely top-down – creation of would-be disruptive new forms of course delivery driven by venture capital and university leadership trying to jump on the bandwagon, such as MOOCs – or tangential – identifying or imagining that The Kids are doing something new, like disappearing into virtual worlds, and trying to harness this to higher education. What’s different with ChatGPT is that it’s bottom-up: students are using it already (15,000 visits to the website using U. of Exeter WiFi in January – and I don’t use the Exeter WiFi for this, so it’s not me) – and they’re using it for their assessments.
They’re not human! They’re here already! You’re next!
There is, I think, a reasonable argument to be made that the biggest threat from ChatGPT in relation to assessment is not the thing itself but the likely reaction to it; namely, either a reversion to traditional unseen in-person exams, to check that the plausible but vacuous bullshit is the student’s own regurgitation of short-term memory, or a resort to technological counter-measures (note: professors saying on social media that they asked ChatGPT to identify whether it had generated a given bit of text is not a great look for higher education). As I’ve said before, if ChatGPT can perform well in the assessment tasks we set, given that it is a generator of plausible but vacuous bullshit with no actual understanding, then the problem lies with our assessment tasks, even if we then try to ChatGPT-proof them.
It’s also potentially an issue with marking criteria and how we use them. I can see the logic of breaking things down into individual components like ‘knowledge’, ‘understanding’ and ‘argument’ to try to explain to students what we’re looking for in their work – but the risk is that, firstly, they start to think of these as separate whereas actually they are interdependent, and, secondly, that we start to use them in the same way, giving an essay credit for having a load of information without any trace of understanding, or rewarding a bold argument even in the absence of any supporting evidence. As I say to my students early in the first year, at university level we take it for granted that you should know the material, it’s what you do with it that counts – but to be honest I don’t entirely follow through on that in my marking, not least because the criteria encourage me not to.
So, we need to think again about exactly what we expect students to learn, and make sure that this is what we assess – and consider how much of this can be generated automatically, and how much this matters. Some of the more positive/optimistic takes on ChatGPT suggest that we could think of it as a tool like a spell-checker; we don’t penalise students for using one of those, and we might reward them for a well-presented piece of work as a result. If ChatGPT helps produce a clear, accurate, well-written version of an analysis based on the student’s own knowledge and understanding, then that’s not a problem unless we’re over-valuing form of expression rather than content.
Obviously it is a problem if a student resorts to ChatGPT to substitute for their own knowledge and understanding, since at that point we would be giving them credit not for the effective use of a tool but for getting someone or something to do all the work for them. (How far we can tell the difference is another thing.) It is worth thinking about motive here. There is certainly a tendency for some academics to adopt a basic stance of “all/most/many students are lazy/cheating bastards who will seize every chance to get away with what they can”. I tend to go to the opposite extreme of assuming that the ones who resort to plagiarism are generally stressed, panicking, struggling, confused and/or desperate; they may not have done enough work, they may not be up to the task, but if they know they’re doing wrong – not all do – this is in the spirit of not feeling they have any choice, rather than of “heh heh this’ll fool the idiots”. I imagine ChatGPT gets fired up in a similar spirit – but I would really like to have some data on this.
Of course, as with plagiarism, a generated essay isn’t likely to do the student an enormous amount of good, except insofar as their sole goal is to submit something with a chance of passing. Part of the danger of the ChatGPT hype, especially the constant reference to ‘Artificial Intelligence’, is that it might encourage confidence in something that is actually rather crap (cf. claims about bespoke essay-writing services). The workshop I attended yesterday did, perhaps just for reasons of balance, include some discussion of possible positive uses of ChatGPT, for example to produce introductory literature reviews and summaries of complex ideas. Really? The thing has no knowledge, no understanding; it generates text strings that are statistically plausible in terms of word sequence but have no necessary connection to reality. I could use it to generate research ideas by spotting points where it goes seriously off-piste; but if you don’t have enough prior knowledge, you won’t recognise when it has gone wrong – so selling it as a source of basic information is actually more dangerous than selling it as a high-level random ideas generator.
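If that sounds like an exaggeration, a deliberately crude sketch may help make the point: the toy bigram model below (in Python, with an invented mini-‘corpus’; obviously nothing like the scale or sophistication of the real thing) generates word sequences purely on the basis of what tends to follow what, with no representation anywhere of whether the output is true.

```python
# A toy bigram 'language model' - emphatically not ChatGPT, just an
# illustration of the principle: text can be locally plausible purely
# on the basis of word-sequence statistics, with no model of truth.
import random
from collections import defaultdict

# Invented mini-corpus, purely for illustration.
corpus = (
    "the ancient economy was embedded in social relations and "
    "the ancient economy was primitive and the roman economy was "
    "driven by agriculture and social relations shaped all exchange"
).split()

# Record which words are observed to follow which.
following = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word].append(next_word)

# Generate by repeatedly sampling a statistically plausible next word.
word = "the"
output = [word]
for _ in range(12):
    candidates = following.get(word)
    if not candidates:
        break
    word = random.choice(candidates)
    output.append(word)

print(" ".join(output))
# e.g. 'the roman economy was embedded in social relations shaped all exchange'
# Grammatical-ish, confident-sounding, and connected to reality only by accident.
```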
So, the student turns in the work with a fair chance that it’s so terrible they fail anyway – and on top of that they haven’t learned anything that the assessment was supposed to teach them, or even any lessons from its failure; whatever happens, it’s just file and forget. But of course that is true of most assessments anyway; students and lecturers alike treat them as one-off moments of credentialling, even if the latter pay lip service to higher ideals – outrage at the possibility of ChatGPT involvement is much more about the idea the student has got a mark they don’t “deserve”, rather than about the failures of the learning process and the possibility that the student is now in an even weaker position going forward, whether to further study or to work.
Part of my solution to all this is ‘slow assessment’ – a process of ongoing development, feedback, revision, discussion. Just having a single draft-feedback-revision system in my final-year modules has meant that most students – the ones who engage with the process and actually come and talk to me – produce substantially better work, and my hope is that direct experience of responding to feedback will be valuable in itself, besides the opportunity to help them understand where and why their arguments aren’t working properly, to highlight really good things and so forth. This could all be developed further; collective discussion of work in class, for example, in lieu of conventional ‘here is a load of information’ presentations. The exercise that is in some ways closest to this in form, the dissertation, is in other ways the furthest removed, insofar as it lacks the structure for regular review; it’s a struggle to get even the best students to engage with meetings or start writing early in the year.
In the case of the dissertation, this is exacerbated by baroque rules about how much draft work we’re allowed to look at, for fear of creating an unmanageable workload if all students asked for it. Proper assessment simply is expensive in terms of time, even if it’s more rewarding (and hunting down plagiarism or non-existent ChatGPT references is both time-consuming and very unrewarding). Part of the reason that ChatGPT feels such a threat is that even relatively progressive, ‘down with traditional exams’ types like me had got into a reasonable routine for working through scores of assessments quite efficiently, by setting tasks so I can quickly see whether or not the student is doing what I expect. Slow Assessment is more individually focused, and hence takes longer and demands more concentration – and, no, out-sourcing it to ChatGPT is not the answer. ChatGPT is never the answer…
Update: when I posted something about this on Mastodon, I got a comment to the effect of ‘ChatGPT is here to stay, students need to learn how to use it in work and lecturers need to teach them how to evaluate its outputs’. On the whole, yes, though I’m still waiting for a concrete sense of how it might actually be a useful tool for quality work rather than a bare-minimum-standard way for companies to slash staff costs. The problem, from my perspective, comes if students use it as a means to manipulate learning and assessment processes that are intended to develop the skills and knowledge to use it critically and know when it’s producing bullshit. In terms of effect, it’s like cheating in a driving test or a medical exam – in other words, you don’t want these people behind the wheel or poking about in your insides – and that’s the critical thing. It’s just an added paradox, for which so far I’m failing to think of an analogy, that it is possible to use the thing to cheat without having the slightest expertise in its use.
Blagging your way into a job as a lawyer (hello, Community), summoning demonic forces to get into wizard college, or manipulating people to help you embark on a political career all suggest prior possession of the necessary skills and understanding. Using ChatGPT is like calling on family connections to get into a top university and a career in journalism – and I should stop this line of thought before I start thinking more positively of it…
Update 2: since these posts are almost always ‘thoughts in progress’ rather than polished conclusions… The obvious point to make about the “we don’t have a problem with students using spell checkers, do we?” argument is that polishing the presentation is a rather different thing from generating the core content – not just in relation to our valuation of different aspects of the task but also in relation to the likelihood of success. Spell-checkers do of course regularly suggest the wrong word or fail to identify homophones, and that doesn’t matter very much – we just sigh and recommend better proof-reading. ChatGPT suggesting plausible but erroneous content is harder to spot without having already done lots of reading, and because it comes at the beginning of the process it undermines the whole thing.
Update 3: chatting on the Twitter with Greg Woolf about Second Life, I was suddenly reminded of the classic depiction of Virtual Reality in the Community episode ‘Lawnmower Maintenance and Postnatal Care’:
https://www.youtube.com/watch?v=z4FGzE4endQ
Just to illustrate that final point about ChatGPT never being the answer, one of the activities in yesterday’s workshop was to play around a bit with the thing – I guess they weren’t sure how much experience workshop participants would have had – and I took the opportunity to ask it about my own research. This was, as always, illuminating…
Neville Morley has published numerous articles on a wide range of topics in ancient history and classics. Some of his highly cited articles include:
1. Morley, Neville. “The Ancient Economy and Modern Historiography.” Classical Antiquity 21, no. 2 (2002): 283-324. doi:10.1525/ca.2002.21.2.283.
2. Morley, Neville. “Class Conflict and the Roman Republic.” Historia: Zeitschrift für Alte Geschichte 45, no. 4 (1996): 390-416. doi:10.2307/4436295.
3. Morley, Neville. “Beyond the Polis? Greek Regionalism and the Archaeology of the Greek Countryside.” World Archaeology 29, no. 3 (1997): 360-374. doi:10.1080/00438243.1998.9980392.
4. Morley, Neville. “Thucydides and the Idea of Progress.” Classical Quarterly 57, no. 2 (2007): 375-388. doi:10.1017/s0009838807000504.
5. Morley, Neville. “The Invention of the First Professional Archaeologists in Britain and their Mediterranean Interests.” Antiquity 74, no. 286 (2000): 319-327. doi:10.1017/s0003598x00060155.
These articles were published in some of the top journals in the field of classics and ancient history, such as Classical Antiquity, Historia, World Archaeology, and Antiquity.
So, three of these do have plausible titles, given my research interests – although I wasn’t working seriously on Thucydides in time to publish an article in 2007 – and I have published in Historia, though on bees rather than class struggle. They’re all fake, but they do look so convincing…
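Incidentally, hunting down this sort of thing doesn’t have to be entirely manual; here’s a minimal sketch (Python, querying the public Crossref API; error handling kept skeletal) that looks up each DOI and flags the ones that don’t resolve – bearing in mind that a DOI which does resolve may still point to an entirely different publication, so the returned title needs checking against the citation too.

```python
# Minimal check of suspect DOIs against the public Crossref API;
# a 404 strongly suggests a fabricated reference. Skeletal error
# handling - a sketch, not a robust tool.
import requests

# The DOIs from the ChatGPT output above.
dois = [
    "10.1525/ca.2002.21.2.283",
    "10.2307/4436295",
    "10.1080/00438243.1998.9980392",
    "10.1017/s0009838807000504",
    "10.1017/s0003598x00060155",
]

for doi in dois:
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code == 200:
        # Crossref returns the title as a list; compare it with the citation.
        title = resp.json()["message"].get("title") or ["(no title)"]
        print(f"{doi}: resolves - {title[0]}")
    else:
        print(f"{doi}: does not resolve (HTTP {resp.status_code}) - suspicious")
```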