ChatGPT as a blurry jpeg of the web

by | Sunday, February 12, 2023

Ted Chiang is one of the greatest, most insightful writers working today. I have written previously about one of his short stories, The truth of fact and the truth of feeling, in a post titled Truth of fact and feeling: Unpacking McLuhan (2/3). (If you haven’t read it, please do so.)

In a recent piece in the New Yorker (ChatGPT is a blurry jpeg of the web), Ted Chiang focuses on ChatGPT and offers one of the best descriptions of the technology, drawing an analogy with lossy compression systems (such as the JPEG file format that we usually use to compress images). The essay is worth reading in full, but here are some key quotes:

… Think of ChatGPT as a blurry jpeg of all the text on the Web. It retains much of the information on the Web, in the same way that a jpeg retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry jpeg, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.

This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation—that is, estimating what’s missing by looking at what’s on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them. (“When in the Course of human events, it becomes necessary for one to separate his garments from their mates, in order to maintain the cleanliness and order thereof. . . .”) ChatGPT is so good at this form of interpolation that people find it entertaining: they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

Given GPT-3’s failure at a subject taught in elementary school, how can we explain the fact that it sometimes appears to perform well at writing college-level essays? Even though large language models often hallucinate, when they’re lucid they sound like they actually understand subjects like economic theory. Perhaps arithmetic is a special case, one for which large language models are poorly suited. Is it possible that, in areas outside addition and subtraction, statistical regularities in text actually do correspond to genuine knowledge of the real world?

I think there’s a simpler explanation. Imagine what it would look like if ChatGPT were a lossless algorithm. If that were the case, it would always answer questions by providing a verbatim quote from a relevant Web page. We would probably regard the software as only a slight improvement over a conventional search engine, and be less impressed by it. The fact that ChatGPT rephrases material from the Web instead of quoting it word for word makes it seem like a student expressing ideas in her own words, rather than simply regurgitating what she’s read; it creates the illusion that ChatGPT understands the material. In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.
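
The interpolation Chiang describes in the quotes above (filling in a lost pixel by averaging its neighbors) can be sketched in a few lines of Python. This is only a toy illustration and not how the JPEG format actually works internally: the function names `downsample` and `reconstruct`, the one-dimensional "row of pixels," and the choice to drop every other value are all assumptions made for the example.

```python
def downsample(pixels):
    """Lossy step (toy assumption): keep every other value, discard the rest."""
    return pixels[::2]


def reconstruct(kept, original_length):
    """Rebuild a row of the original length.

    Values that were kept are copied back unchanged. Values that were
    discarded are estimated as the average of the kept values on either side.
    """
    rebuilt = []
    for i in range(original_length):
        left = i // 2
        if i % 2 == 0:
            # This position survived compression, so copy it back.
            rebuilt.append(kept[left])
        else:
            # This position was lost: average the two nearest survivors.
            # At the right edge there may be no right neighbor, so reuse the left.
            right = min(left + 1, len(kept) - 1)
            rebuilt.append((kept[left] + kept[right]) / 2)
    return rebuilt


# A smooth ramp with one sharp spike (the 90).
row = [10, 20, 30, 40, 50, 90, 50]
approximation = reconstruct(downsample(row), len(row))
print(approximation)
```

In this sketch the smooth part of the row comes back looking right, but the spike at 90 is replaced by a plausible, averaged value. That is the flavor of the analogy: the reconstruction is fluent and mostly faithful, yet the exact original detail is gone.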

Do read the actual piece… it is worth your time: ChatGPT is a blurry jpeg of the web

