It HAS to hallucinate: The true nature of LLMs

Sunday, January 28, 2024

Though generative AI is receiving a great deal of attention lately, I am not entirely sure that discussions of these technologies and their impact on education and society at large genuinely engage with their true nature. In fact, I have argued (as in this keynote presentation) that one could take most statements made about generative AI, replace the term “AI” with any other technology—social media, the internet, print, take your pick—and the statement would still hold true. This is problematic because, unless we engage with what this technology truly is, we will keep seeing it as being the same as technologies that have come before and framing our discussions in light of that comparison. This is not to say that there isn’t anything we can learn from the past, but rather that we need to approach this emerging future not just with an understanding of what has come before, but also with a recognition that technologies have embedded within them ways of thinking and being that play a critical role in their impact on the world.

Marshall McLuhan, famous for his “The Medium is the Message” aphorism, wrote that:

The medium is the message because it is the medium that shapes and controls the scale and form of human association and action. The content or uses of such media are as diverse as they are ineffectual in shaping the form of human associations. Indeed, it is only too typical that the content of any medium blinds us to the character of the medium.

Our conventional responses to all media, namely that it is how they are used that counts, is the numb stance of the technological idiot. For the “content” of the medium is like the juicy piece of meat carried by the burglar to distract the watchdog of the mind.

Thus, as we look at any medium, we need to go beyond the “content” being conveyed to the deeper ways of being in the world that are captured by the medium.

As Ezra Klein wrote in a column in the NYTimes (I didn’t want it to be true, but the medium really is the message):

We’ve been told — and taught — that mediums are neutral and content is king. You can’t say anything about television. The question is whether you’re watching “The Kardashians” or “The Sopranos,” “Sesame Street” or “Paw Patrol.” To say you read books is to say nothing at all: Are you imbibing potboilers or histories of 18th-century Europe? Twitter is just the new town square; if your feed is a hellscape of infighting and outrage, it’s on you to curate your experience more tightly.

There is truth to this, of course. But there is less truth to it than to the opposite. McLuhan’s view is that mediums matter more than content; it’s the common rules that govern all creation and consumption across a medium that change people and society. Oral culture teaches us to think one way, written culture another. Television turned everything into entertainment, and social media taught us to think with the crowd.

Readers of this blog will know that this is not a new idea for me. I have written quite a bit about it, especially in this series on how media influence our thinking.

There are those who argue that these large language models are nothing but “stochastic parrots”—statistical models, trained on large gobs of data, that predict the next word in a sentence. This may have been true of earlier models, but I am not sure it is true of the models we have today. It is clear to me that these models have developed some kind of higher-order conceptual structures (maybe akin to the schemas and mental models that we humans have). That said, these higher-order conceptual structures are NOT identical to the ones we have—primarily because they have no referent in the real world. As I wrote elsewhere, “ChatGPT is a bullshit artist.”
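To make the “predict the next word” description concrete, here is a minimal sketch of what such a model does at each step. (This is an illustrative example, using the openly available Hugging Face transformers library with GPT-2 as a small stand-in model; it is not the code behind ChatGPT or any specific model discussed here.)

```python
# A minimal sketch of the "predict the next word" view of a language model,
# using the Hugging Face transformers library and GPT-2 as a small stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The medium is the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The model's probability distribution over the *next* token, given the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r}  p = {prob.item():.3f}")
```

Every word an LLM produces is sampled from a distribution like this one; the stochastic-parrot argument is that this is all that is going on.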

That said, there are overlaps between our conceptual structures and the ones created by the large language models, mainly because they have been trained on large quantities of information created by us. These technologies are reflections of us—the good, the average and the bad.

The stochastic parrot argument has never really made sense to me—particularly when I see some of the things I have explored with these new tools. I could give lots of examples, but let us just look at one: getting ChatGPT to write a mathematical proof in verse, which has an error in it (see Creativity, AI and Education: A reflection and an example).

What is amazing is not just that it wrote a poem (incidentally something that surprised AI researchers as well) but the kind of error it inserted in the poem—an error so subtle that even some mathematicians make the same mistake.

Now step back and consider what it takes to write such a poem. It requires understanding what a proof is, what a “good” error in the proof would be (something that undermines the proof and is difficult for people to catch), what a poem is, and, finally, the ability to mash these understandings together to create an original poem. And to do this in a variety of styles, on demand.

To claim that this is the same as predicting the next word in a sentence undervalues all that is going on here.

There is no magic here—that is not the point I am making. This is most definitely not an argument for intentionality or agency on the part of the LLM, just a statement about its way of functioning, and a claim that these feats are best explained by attributing some level of higher-order conceptual complexity to the LLM.

And here is where it gets interesting. The act of creating an abstraction means that some information is lost in the process. Concepts are by definition broader and more general than facts—that is their strength. But at the same time, concepts are also fuzzy; they smudge details to get at a deeper or broader representation.

Thus, an important consequence of the fact that these LLMs form higher-order conceptual abstractions is that they will, by necessity, hallucinate.
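A deliberately toy illustration may help here. (This is plain NumPy and emphatically not how an LLM is built; the “facts,” the prototype, and the numbers are all invented for the example.) Once individual facts have been smudged into an abstraction, anything generated from that abstraction looks plausible but is almost never one of the original facts.

```python
# Toy illustration: generating from a lossy abstraction yields items that
# were never in the original data. (An analogy for "concepts smudge details,"
# not a model of how LLMs actually work.)
import numpy as np

rng = np.random.default_rng(0)

# "Facts": twenty specific observations, each a 3-dimensional feature vector.
facts = rng.normal(loc=[2.0, -1.0, 0.5], scale=0.3, size=(20, 3))

# The "concept": a single prototype plus a spread -- the details of the
# individual facts are lost at this step.
prototype = facts.mean(axis=0)
spread = facts.std(axis=0)

# Generating from the concept: the outputs look like the facts, yet none of
# them is an exact fact from the original data.
generated = rng.normal(loc=prototype, scale=spread, size=(5, 3))

matches = (generated[:, None, :] == facts[None, :, :]).all(-1).any(-1).sum()
print("generated items identical to an original fact:", matches)  # almost surely 0
```

The analogy is loose, but the shape of the argument is the same: a system that works from abstractions rather than from a lookup table of its training data will, when asked for specifics, produce specifics it never saw.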

I have believed this for a while, but some support for this perspective comes from new research on the nature of large language models. As this article in Quanta Magazine (New Theory Suggests Chatbots Can Understand Text) says:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data…. [In other words] From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

Jerome Bruner in his classic book (Beyond the Information Given) argued that “the most characteristic thing about mental life, over and beyond the fact that one apprehends the events of the world around one, is that one constantly goes beyond the information given.” Clearly these LLMs do not apprehend the events of the world, BUT in another key aspect these technologies do go beyond the information given. In essence, what this research demonstrates is that “LLMs don’t just rely on combinations of skills they saw in their training data”; rather, they are generalizing, generating prose that was NOT in the training data.

And this brings us to the key idea about the nature of this medium. This act of generalization is essentially a process we should see as being “creative,” a leap into the unknown as it were (at the risk of being inaccurate; indeed, accuracy is not the goal of these systems). The article quotes one of the researchers (Arora) as saying that their research does not provide any insight into the accuracy of what the LLMs write. On the contrary, he says:

“In fact, it’s arguing for originality… These things have never existed in the world’s training corpus. Nobody has ever written this. It has to hallucinate.”

(Italics mine.)

Let this sink in.

Hallucinations are not a bug; rather, they are a feature of this technology. This is extremely important to keep in mind.

We may try to guardrail against them, but these hallucinations will emerge whether we like it or not, whether in implicit biases (for instance, see the amazing work by my colleague Melissa Warr on implicit bias – here and here) or in the ways that AI is weird (here and here).

THIS is the true nature of this new technology—and it is important we recognize and accept it.

These tools can be incredibly powerful as long as we recognize their true nature—and, as my colleague Ron Beghetto has suggested, they can be powerful tools for possibility thinking. Or, alternatively, and a bit more tongue-in-cheek, as I have suggested, we should think of them as a smart, drunk intern.
