It HAS to hallucinate: The true nature of LLMs

Sunday, January 28, 2024

Though Generative AI is receiving a great deal of attention lately, I am not entirely sure that the discussions of these technologies and their impact on education and society at large genuinely engage with the true nature of these technologies. In fact, I have argued (as in this keynote presentation) that one could take most statements made about generative AI and replace the term “AI” with any other technology—social media, the internet, print, take your pick—and the statement would still hold true. This is problematic because unless we engage with what this technology truly is, we will forever see it as being the same as other technologies that have come before and frame our discussions in light of that comparison. This is not to say that there isn’t anything we can learn from the past, but rather that we need to approach this emerging future with not just an understanding of what has come before but also a recognition that technologies have embedded within them ways of thinking and being that play a critical role in their impact on the world.

Marshall McLuhan, famous for his “the medium is the message” aphorism, wrote:

The medium is the message because it is the medium that shapes and controls the scale and form of human association and action. The content or uses of such media are as diverse as they are ineffectual in shaping the form of human associations. Indeed, it is only too typical that the content of any medium blinds us to the character of the medium.

Our conventional responses to all media, namely that it is how they are used that counts, is the numb stance of the technological idiot. For the “content” of the medium is like the juicy piece of meat carried by the burglar to distract the watchdog of the mind.

Thus as we look at any medium, we need to go beyond the “content” being conveyed to the deeper ways of being in the world that are captured by the medium.

As Ezra Klein wrote in a column in the NYTimes (I didn’t want it to be true, but the medium really is the message):

We’ve been told — and taught — that mediums are neutral and content is king. You can’t say anything about television. The question is whether you’re watching “The Kardashians” or “The Sopranos,” “Sesame Street” or “Paw Patrol.” To say you read books is to say nothing at all: Are you imbibing potboilers or histories of 18th-century Europe? Twitter is just the new town square; if your feed is a hellscape of infighting and outrage, it’s on you to curate your experience more tightly.

There is truth to this, of course. But there is less truth to it than to the opposite. McLuhan’s view is that mediums matter more than content; it’s the common rules that govern all creation and consumption across a medium that change people and society. Oral culture teaches us to think one way, written culture another. Television turned everything into entertainment, and social media taught us to think with the crowd.

Readers of this blog will know that this is not a new idea for me. I have written quite a bit about it, especially in this series on how media influence our thinking.

There are those who argue that these large language models are nothing but “stochastic parrots”—statistical models, trained on large gobs of data, that predict the next word in a sentence. This may have been true of earlier models, but I am not sure it is true of the models we have today. It is clear to me that these models have developed some kind of higher order conceptual structures (perhaps akin to the schemas and mental models that we humans have). That said, these higher order conceptual structures are NOT identical to the higher order conceptual structures we have—primarily because they have no referent in the real world. As I wrote elsewhere, “ChatGPT is a bullshit artist.”
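To make the “stochastic parrot” framing concrete, here is a minimal sketch (my own illustration, not from the original post) of what next-word prediction looks like at the level of a single step. It assumes the Hugging Face transformers library and uses GPT-2 purely as a small, convenient stand-in model; the prompt and sampling choices are arbitrary.

```python
# A minimal sketch of "predicting the next word": the model produces a
# probability distribution over possible next tokens, and one is sampled.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The medium is the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# Probability distribution over the next token, given everything so far
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# The "stochastic" part: sample from the distribution rather than always
# taking the single most likely token
sampled_id = torch.multinomial(next_token_probs, num_samples=1)
print("sampled continuation:", tokenizer.decode(sampled_id.tolist()))

# The five most likely continuations, with their probabilities
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tokenizer.decode([token_id])!r:>12}  p={prob:.3f}")
```

Whatever one makes of the “parrot” critique, this is all the mechanism offers at any single step: a ranked distribution over plausible continuations and a draw from it. The question is what kind of internal structure has to exist for those draws to add up to the behaviors described below.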

That said, there are overlaps between our conceptual structures and the ones created by the large language models, mainly because the models have been trained on large quantities of information created by us. These technologies are reflections of us—the good, the average and the bad.

The stochastic parrot argument has never really made sense to me—particularly when I see some of the things I have explored with these new tools. I could give lots of examples, but let us just look at one—getting ChatGPT to write a mathematical proof in verse that has an error in it (see Creativity, AI and Education: A reflection and an example).

What is amazing is not just that it wrote a poem (incidentally, something that surprised AI researchers as well) but the kind of error it inserted into the poem—an error so subtle that even some mathematicians make the same mistake.

Now step back and consider what it takes to write such a poem. It requires understanding what a proof is, what a “good” error in the proof would be (something that undermines the proof and is difficult for people to catch), what a poem is and finally the ability to mash these understandings together to create an original poem. And to do this in a variety of styles, on demand.

To claim that this is the same as predicting the next word in a sentence undervalues all that is going on here.

There is no magic here—that is not the point I am making. This is most definitely not an argument for intentionality or agency on the part of the LLM, just a statement about its way of functioning, and a claim that these feats are best explained by attributing some level of higher-level conceptual complexity to the LLM.

And here is where it gets interesting. The act of creating an abstraction means that some information is lost in the process. Concepts are by definition broader and more general than facts—that is their strength. But at the same time concepts are also fuzzy; they smudge details to get at a deeper or broader representation.
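As a toy illustration of this lossiness (my own sketch, not from the post or the research it cites), consider collapsing several specific “fact” vectors into a single averaged “concept” vector: the average captures the general tendency, but no individual fact can be recovered from it.

```python
# A toy sketch of why abstraction is lossy: averaging several specific
# "fact" vectors into one "concept" vector keeps the general tendency but
# discards the details of each individual fact.
import numpy as np

facts = np.array([
    [9.0, 1.0, 0.0],   # imagine: "robins fly"
    [8.0, 2.0, 1.0],   # imagine: "sparrows fly"
    [0.0, 9.0, 7.0],   # imagine: "penguins swim"
])

concept = facts.mean(axis=0)  # the abstraction: a single "birds" vector
print("concept:", concept)

# From the concept alone, the best reconstruction of any fact is the concept
# itself; the nonzero error is exactly the information that was smudged away.
errors = np.linalg.norm(facts - concept, axis=1)
print("reconstruction error per fact:", errors)
```

The same trade-off, in a vastly more sophisticated form, applies to whatever compressed representations a large model builds from its training data: generality is bought by giving up detail.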

Thus, an important consequence of the fact that these LLMs build such higher order conceptual abstractions is that they will, by necessity, hallucinate.

I have believed this for a while, but support for this perspective comes from new research on the nature of large language models. As this article in Quanta Magazine (New Theory Suggests Chatbots Can Understand Text) says:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data…. [In other words] From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

Jerome Bruner, in his classic book Beyond the Information Given, argued that “the most characteristic thing about mental life, over and beyond the fact that one apprehends the events of the world around one, is that one constantly goes beyond the information given.” Clearly these LLMs do not apprehend the events of the world, BUT in another key respect these technologies do go beyond the information given. In essence, what this research demonstrates is that these “LLMs don’t just rely on combinations of skills they saw in their training data”; rather, they are generalizing, generating prose that was NOT in the training data.

And this brings us to the key idea about the nature of this medium. This act of generalization is essentially a process we should see as being “creative,” a leap into the unknown as it were (even at the risk of being inaccurate; indeed, accuracy is not the goal of these systems). The article quotes one of the researchers (Arora) as saying that their research does not provide any insight into the accuracy of what the LLMs write. Instead, he says:

“In fact, it’s arguing for originality… These things have never existed in the world’s training corpus. Nobody has ever written this. It has to hallucinate.”

Let this sink in.

(Italics mine)

Hallucinations are not a bug; rather, they are a feature of this technology. This is extremely important to keep in mind.

We may try to guardrail against them, but these hallucinations will emerge whether we like it or not—whether in implicit biases (for instance, see the amazing work by my colleague Melissa Warr on implicit bias – here and here) or in the ways that AI is weird (here and here).

THIS is the true nature of this new technology—and it is important we recognize and accept it.

These tools can be incredibly powerful as long as we recognize their true nature—and, as my colleague Ron Beghetto has suggested, they can be powerful tools for possibility thinking. Or, alternatively, and a bit more tongue-in-cheek, as I have suggested, we could think of them as a smart, drunk intern.
