It HAS to hallucinate: The true nature of LLMs

Sunday, January 28, 2024

Though Generative AI is receiving a great deal of attention lately, I am not entirely sure that the discussions of these technologies and their impact on education and society at large genuinely engage with the true nature of these technologies. In fact, I have argued (as in this keynote presentation) that one could take most statements made about generative AI and replace the term “AI” with any other technology—social media, the internet, print, take your pick—and the statement would still hold true. This is problematic because unless we engage with what this technology truly is, we will forever see it as being the same as other technologies that have come before and frame our discussions in light of that comparison. This is not to say that there isn’t anything we can learn from the past, but rather that we need to approach this emerging future with not just an understanding of what has come before but also a recognition that technologies have embedded within them ways of thinking and being that play a critical role in their impact on the world.

Marshall McLuhan, famous for his “the medium is the message” aphorism, wrote:

The medium is the message because it is the medium that shapes and controls the scale and form of human association and action. The content or uses of such media are as diverse as they are ineffectual in shaping the form of human associations. Indeed, it is only too typical that the content of any medium blinds us to the character of the medium.

Our conventional responses to all media, namely that it is how they are used that counts, is the numb stance of the technological idiot. For the “content” of the medium is like the juicy piece of meat carried by the burglar to distract the watchdog of the mind.

Thus as we look at any medium, we need to go beyond the “content” being conveyed to the deeper ways of being in the world that are captured by the medium.

As Ezra Klein wrote in a column in the NYTimes (I didn’t want it to be true, but the medium really is the message):

We’ve been told — and taught — that mediums are neutral and content is king. You can’t say anything about television. The question is whether you’re watching “The Kardashians” or “The Sopranos,” “Sesame Street” or “Paw Patrol.” To say you read books is to say nothing at all: Are you imbibing potboilers or histories of 18th-century Europe? Twitter is just the new town square; if your feed is a hellscape of infighting and outrage, it’s on you to curate your experience more tightly.

There is truth to this, of course. But there is less truth to it than to the opposite. McLuhan’s view is that mediums matter more than content; it’s the common rules that govern all creation and consumption across a medium that change people and society. Oral culture teaches us to think one way, written culture another. Television turned everything into entertainment, and social media taught us to think with the crowd.

Readers of this blog will know that this is not a new idea for me. I have written quite a bit about it, especially in this series on how media influence our thinking.

There are those who argue that these large language models are nothing but “stochastic parrots”—statistical models, trained on large gobs of data, that predict the next word in a sentence. This may have been true of earlier models, but I am not sure it is true of the models we have today. It is clear to me that these models have developed some kind of higher order conceptual structures (perhaps akin to the schemas and mental models that we humans have). That said, these higher order conceptual structures are NOT identical to the higher order conceptual structures we have—primarily because they have no referent in the real world. As I wrote elsewhere, “ChatGPT is a bullshit artist.”
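To make the “stochastic parrot” framing concrete, here is a minimal sketch (my own illustration, not from the original post) of what next-word prediction looks like at the level of a single step. It assumes the Hugging Face transformers library and uses GPT-2 purely as a small, convenient stand-in model; the prompt and sampling choices are arbitrary.

```python
# A minimal sketch of "predicting the next word": the model produces a
# probability distribution over possible next tokens, and one is sampled.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The medium is the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Probability distribution over the next token, given everything so far
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# The "stochastic" part: sample from the distribution rather than always
# taking the single most likely token
sampled_id = torch.multinomial(next_token_probs, num_samples=1)
print("sampled continuation:", tokenizer.decode(sampled_id.tolist()))

# The five most likely continuations, with their probabilities
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tokenizer.decode([token_id])!r:>12}  p={prob:.3f}")
```

Whatever one makes of the “parrot” critique, this is all the mechanism offers at any single step: a ranked distribution over plausible continuations and a draw from it. The question is what kind of internal structure has to exist for those draws to add up to the behaviors described below.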

That said, there are overlaps between our conceptual structures and the ones created by the large language models, mainly because the models have been trained on large quantities of information created by us. These technologies are reflections of us—the good, the average and the bad.

The stochastic parrot argument has never really made sense to me—particularly when I see some of the things I have explored with these new tools. I could give lots of examples, but let us just look at one—getting ChatGPT to write a mathematical proof in verse that has an error in it (see Creativity, AI and Education: A reflection and an example).

What is amazing is not just that it wrote a poem (incidentally, something that surprised AI researchers as well) but the kind of error it inserted into the poem—an error so subtle that even some mathematicians make the same mistake.

Now step back and consider what it takes to write such a poem. It requires understanding what a proof is, what a “good” error in the proof would be (something that undermines the proof and is difficult for people to catch), what a poem is and finally the ability to mash these understandings together to create an original poem. And to do this in a variety of styles, on demand.

To claim that this is the same as predicting the next word in a sentence undervalues all that is going on here.

There is no magic here—that is not the point I am making. This is most definitely not an argument for intentionality or agency on the part of the LLM, just a statement about its way of functioning, and a claim that these feats are best explained by attributing some level of higher-level conceptual complexity to the LLM.

And here is where it gets interesting. The act of creating an abstraction means that some information is lost in the process. Concepts are by definition broader and more general than facts—that is their strength. But at the same time concepts are also fuzzy; they smudge details to get at a deeper or broader representation.
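As a toy illustration of this lossiness (my own sketch, not from the post or the research it cites), consider collapsing several specific “fact” vectors into a single averaged “concept” vector: the average captures the general tendency, but no individual fact can be recovered from it.

```python
# A toy sketch of why abstraction is lossy: averaging several specific
# "fact" vectors into one "concept" vector keeps the general tendency but
# discards the details of each individual fact.
import numpy as np

facts = np.array([
    [9.0, 1.0, 0.0],   # imagine: "robins fly"
    [8.0, 2.0, 1.0],   # imagine: "sparrows fly"
    [0.0, 9.0, 7.0],   # imagine: "penguins swim"
])

concept = facts.mean(axis=0)  # the abstraction: a single "birds" vector
print("concept:", concept)

# From the concept alone, the best reconstruction of any fact is the concept
# itself; the nonzero error is exactly the information that was smudged away.
errors = np.linalg.norm(facts - concept, axis=1)
print("reconstruction error per fact:", errors)
```

The same trade-off, in a vastly more sophisticated form, applies to whatever compressed representations a large model builds from its training data: generality is bought by giving up detail.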

Thus, an important consequence of the fact that these LLMs build such higher order conceptual abstractions is that they will, by necessity, hallucinate.

I have believed this for a while, but support for this perspective comes from new research on the nature of large language models. As this article in Quanta Magazine (New Theory Suggests Chatbots Can Understand Text) says:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data…. [In other words] From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

Jerome Bruner, in his classic book Beyond the Information Given, argued that “the most characteristic thing about mental life, over and beyond the fact that one apprehends the events of the world around one, is that one constantly goes beyond the information given.” Clearly these LLMs do not apprehend the events of the world, BUT in another key respect these technologies do go beyond the information given. In essence, what this research demonstrates is that these “LLMs don’t just rely on combinations of skills they saw in their training data”; rather, they are generalizing, generating prose that was NOT in the training data.

And this brings us to the key idea about the nature of this medium. This act of generalization is essentially a process we should see as being “creative,” a leap into the unknown as it were (even at the risk of being inaccurate; indeed, accuracy is not the goal of these systems). The article quotes one of the researchers (Arora) as saying that their research does not provide any insight into the accuracy of what the LLMs write. Instead, he says:

“In fact, it’s arguing for originality… These things have never existed in the world’s training corpus. Nobody has ever written this. It has to hallucinate.”

Let this sink in.

(Italics mine)

Hallucinations are not a bug; rather, they are a feature of this technology. This is extremely important to keep in mind.

We may try to guardrail against them, but these hallucinations will emerge whether we like it or not—whether in implicit biases (for instance, see the amazing work by my colleague Melissa Warr on implicit bias – here and here) or in the ways that AI is weird (here and here).

THIS is the true nature of this new technology—and it is important we recognize and accept it.

These tools can be incredibly powerful as long as we recognize their true nature—and, as my colleague Ron Beghetto has suggested, they can be powerful tools for possibility thinking. Or, alternatively, and a bit more tongue-in-cheek, as I have suggested, we could think of them as a smart, drunk intern.
