
Yesterday (23 March 2023) I attended a KnowledgeMakers workshop organised by KMi (Knowledge Media Institute) at The Open University in Milton Keynes titled ‘ChatGPT and Friends: How Generative AI is Going to Change Everything.’
It was a fantastic event: 17 presentations in two hours from researchers, academics, developers and managers from across the OU, covering the current state and capabilities of ChatGPT/GPT-3/GPT-4, and applications in learning, teaching and assessment. It was impressive to see the range of uses of and responses to genAI at the OU.
The event opened with an intro to ChatGPT/GPT-3/GPT-4 from Prof John Domingue of KMi, tracing the development of the tools in terms of parameter count (117 million for GPT-1 in 2018, 1.5 billion for GPT-2 in 2019, 175 billion for GPT-3 in 2020 – a roughly 1,500-fold increase in two years) and training data (from around 4.5GB for GPT-1 to 570GB for GPT-3, on which ChatGPT is built). John described GPT as a statistical prediction model, ‘a text predictor on steroids.’ He outlined the AI agents ecosystem being developed at the OU, comprising five main AI services: an AI digital assistant for every OU student, an AI careers advice agent (the ‘dream machine’), assessment AIs, a smart course delivery platform, and tutor AI assistants.
Next up was a presentation on safeguards, trustworthiness and social responsibility in ChatGPT from Shuang Ao (research student in STEM/KMi). Shuang outlined three ‘failures’ of ChatGPT: failures of reasoning and logic; factual errors (or ‘hallucinations’ – it makes up stuff that doesn’t exist); and bias and discrimination (it generates answers containing racism, sexism, homophobia and misinformation). Shuang’s summary: ‘LLMs are still uncontrollable, not transparent and unstable.’
But can it make decisions? asked Lucas Anastasiou (also a research student in STEM/KMi). The answer: not really. Lucas described a series of nano-experiments in which genAIs were set to play chess and poker, build a stock portfolio, and make geopolitical predictions. The tools performed badly at all of these, though least badly on the stock portfolio. They lean heavily on already-codified human knowledge – of chess openings, for example. How would they perform without that human input?
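To give a flavour of how such a nano-experiment might be wired up, here is a minimal sketch – my own assumptions about prompt and model, not Lucas’s actual code – that asks a GPT model for a chess move and uses the python-chess library to check whether the move is even legal:

```python
# Minimal sketch of a chess nano-experiment: ask an LLM for a move, then
# check whether the move is even legal. Assumes the (legacy, early-2023)
# openai client and the python-chess library; prompt and model are my guesses.
import chess
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: supplied via env/config in practice

def ask_model_for_move(board: chess.Board) -> str:
    """Ask the model for a single move in SAN, given the current position."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # assumed model; GPT-4 would be analogous
        messages=[{
            "role": "user",
            "content": f"We are playing chess. The position in FEN is "
                       f"{board.fen()}. Reply with one legal move in SAN "
                       f"notation and nothing else.",
        }],
    )
    return response.choices[0].message.content.strip()

board = chess.Board()
move = ask_model_for_move(board)
try:
    board.push_san(move)  # raises ValueError if the move is illegal or malformed
    print(f"Legal move: {move}")
except ValueError:
    print(f"Illegal or unparsable move: {move}")
```

Even legality, never mind quality, is not guaranteed – which is rather the point.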
Some of the themes of the opening two presentations from Shuang and Lucas echoed through the talks that followed from OU academics who are using genAI tools in teaching, learning and assessment. Alistair Willis (Senior Lecturer in Computing in the OU STEM faculty) has given OU undergraduate assessment tasks to ChatGPT and found it did very well on structured tasks, including writing code, but was ‘rubbish’ on more open tasks and interpretation. Chris Douce (Senior Lecturer and Staff Tutor in STEM) has used ChatGPT to write Java code to solve problems and found it very effective. Tony Hirst (Senior Lecturer in Telematics in STEM) is integrating ChatGPT with authoring workflows (including simple Python APIs) to co-author generative documents incorporating scripted asset creation (e.g. diagrams and mathematical figures). Naturally, all of this raises the issues of plagiarism and integrity of assessment, and colleagues discussed the use of generative content detectors such as ZeroGPT and the need for good, clear policy.
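Tony’s workflow is his own, but the general pattern is easy to sketch. Assuming the then-current openai Python package, and with the prompt, model name and file layout as illustrative assumptions, a generative-document step that scripts an asset might look like this:

```python
# Sketch of one step in a generative-document workflow: ask the model to
# write a matplotlib script for a named figure, then save the script as an
# asset for the document build. Prompt, model and paths are illustrative.
import os
import openai

openai.api_key = "YOUR_API_KEY"

prompt = (
    "Write a self-contained Python matplotlib script that plots y = sin(x) "
    "for x in [0, 2*pi] and saves the figure as 'sine.png'. Return only code."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
script = response.choices[0].message.content

# Persist the generated script; the authoring pipeline would then execute it
# (ideally sandboxed) and embed the resulting sine.png in the output document.
os.makedirs("figures", exist_ok=True)
with open("figures/sine_figure.py", "w") as f:
    f.write(script)
```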
Concerns about plagiarism and factual errors were also rehearsed in the presentation from the Core team (David Pride and Matteo Cancellieri). Core is the world’s largest collection of open access research papers. David and Matteo used ChatGPT to create an entirely fictional academic – Emeritus Professor Dr Jeffrey Bakker, a professor of stochastic modelling at the University of Utrecht – and to generate his biography and a list of Prof Bakker’s five most-cited publications. In a demo that used the GPT API to search the Core aggregator for research trends in LLMs, the answer included information relating to photomechanisms of gene mutations in neurological diseases. ‘LLMs are still uncontrollable, not transparent and unstable,’ remember.
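A rough sketch of that retrieve-then-summarise pattern – grounding the model in real search results to rein in the hallucinations – might look as follows; the Core v3 endpoint and response fields are my assumptions from Core’s public API docs, and the prompt is illustrative:

```python
# Sketch: fetch real papers from the Core aggregator, then ask a GPT model to
# summarise trends from those results only, to limit hallucination. The Core
# v3 endpoint and the 'results'/'title' fields are assumptions from public docs.
import requests
import openai

CORE_API_KEY = "YOUR_CORE_KEY"
openai.api_key = "YOUR_OPENAI_KEY"

resp = requests.get(
    "https://api.core.ac.uk/v3/search/works",
    params={"q": "large language models", "limit": 5},
    headers={"Authorization": f"Bearer {CORE_API_KEY}"},
)
titles = [hit.get("title", "") for hit in resp.json().get("results", [])]

answer = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "Summarise the research trends suggested by these paper "
                   "titles, using only the titles given:\n" + "\n".join(titles),
    }],
)
print(answer.choices[0].message.content)
```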
Others embrace the randomness. Christian Nold (Lecturer in Design) outlined his approach to using LLMs’ ‘performative fakeness’ as a means to defamiliarise and engender creativity. Monoj Nanda (Associate Lecturer) is actively encouraging OU students to use OpenAI and DALL-E 2, and audio tools like soundraw.io, to create content while learning skills in prompt engineering. Nicole Lotz (Senior Lecturer in Design) is using genAI with Level-1 design students to generate inspiration and as a starting point for creative projects, enabling students to get over the ‘fear of the blank page’. Both Nicole and Monoj described how they support students to use the tools iteratively and continuously, rather than episodically and transactionally – to experiment, to engender playfulness and reflectivity, and to develop their own style (and avatars) – while also teaching critical issues like accuracy, ownership and bias.

Irina Rets (Research Fellow in the Institute of Educational Technology) asked whether there are learning losses as well as learning gains associated with genAI – gains such as unintended learning and serendipitous discovery and insight. Aisling Third (Research Fellow in KMi) gave a fun demo of the text-to-image diffusion model Stable Diffusion and its capacity to support playfulness and experimentation, as well as learning around issues of genAI skills and literacy.
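For anyone who wants to play along at home, a minimal Stable Diffusion sketch using Hugging Face’s diffusers library looks something like this (the checkpoint named is a commonly used public one, not necessarily the one Aisling demoed, and it wants a GPU):

```python
# Minimal text-to-image sketch with Stable Diffusion via the diffusers
# library. Needs a GPU; the checkpoint is a commonly used public one.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "a watercolour of the Open University campus at dawn"
image = pipe(prompt).images[0]  # each call is stochastic: embrace the randomness
image.save("campus.png")
```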
Some contributors discussed the impact of genAI on assessment and ways educators and institutions might respond. ‘We can’t and shouldn’t just ban it.’ Dhouha Kbaier (Senior Lecturer in Computing) suggested a number of ways to make assessment more robust, including requiring students to submit work in formats other than text (e.g. video/audio), and assessments which include ChatGPT use and develop skills in prompt engineering. Should we now assess students on the quality of their questions, rather than on the quality of their answers?
Finally, in the closing discussion, Arosha Bandara (Professor of Software Engineering) and others wondered what is being lost in discussions of genAI in learning. Is language all of knowledge? Have we even defined what intelligence is? Where is subjectivity here? Consciousness? Values? Belief? Connection? Empathy? Aren’t these involved with learning, knowing and wisdom? Others called for the urgent development of clear student-facing policy on the use of genAI, and skills development for students and staff alike.
Fascinating two hours. John Domingue ended by saying there is no better place than the OU to be working on AI in education. As always, he’s right.