ChatGPT does not have a user manual. Let’s not create one.

by | Friday, May 17, 2024

Note: This is the next post in the shared blogging experiment with Melissa Warr and Nicole Oster. This time we question what and how we should be teaching about generative AI. The core idea and first draft came from Melissa, to which Nicole and I added revisions and edits. The final version emerged through a collaborative process among all three of us.

The hottest new trend of the ChatGPT era is “prompt engineering”: the art and science of writing prompts that help GenAI tools meet the needs of their users. It has been described as the job of the future. There are books, workshops, YouTube videos, and more that promise the keys to making generative AI more useful, productive, and accurate. We agree that learning to interact with GenAI tools is an important skill, and one that takes practice and experience to develop.

That said, we believe there is a fundamental mismatch here, one that comes from treating generative AI like any other computer program, with the prompt as a way of getting the program to do exactly what we want. There is some truth to this view, but it misses key factors rooted in the very nature of these technologies.

Prompt Engineering: Does it have to be this hard?

Ethan Mollick has taken a balanced view of this approach, acknowledging that in some cases, such as when you are creating a prompt that will be reused or embedded in a program, it is worthwhile to learn to write something that gives consistent results (note: consistent here doesn’t mean “the same,” it means meeting specific needs based on the requested topic or content). However, Mollick emphasizes that complex prompt engineering is not required to use GenAI effectively, and this is likely to become even more true as refinement of these models continues. There are tips and tricks, such as giving the model a persona or asking for multiple responses. But the complex techniques, such as chain-of-thought prompts, contextual arguments, and so on, are not necessarily needed to be productive with these tools.
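Mollick’s point about reused or embedded prompts can be made concrete with a short sketch. Assuming a chat-style API that accepts a list of role/content messages (as OpenAI-style APIs do), an embedded prompt fixes the persona, task, and constraints and leaves only the user’s topic variable, which is what keeps results consistent in form. The function name and the teacher persona below are our own illustrations, not from any particular product.

```python
# A minimal sketch of a reusable, embedded prompt. Everything except the
# topic is fixed, so responses stay consistent in form across topics.
# The persona, task, and constraint text here are illustrative only.

def build_messages(topic: str) -> list[dict]:
    """Build a chat-style payload for a reusable lesson-hook prompt."""
    system_prompt = (
        "You are an experienced middle-school science teacher. "   # persona
        "Write a one-paragraph lesson hook for the given topic. "  # task
        "Avoid jargon and keep it under 80 words."                 # constraints
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Topic: {topic}"},
    ]

messages = build_messages("photosynthesis")
```

In a real application this list would be passed to a chat API; the design choice is simply that the engineering effort lives in the template once, rather than in every user’s prompt.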

Here are four reasons we might want to reduce our emphasis on prompt engineering for everyday users (and learners), and shift that attention to something else, such as critical thinking about what we are getting back:

Prompt engineering is often presented as a computational process: develop an input that yields a certain output. This is how most digital technologies work: code tells the machine exactly what to do, and it does it. However, GenAI tools are far more complex and relational. As Ethan Mollick explained:

“if you think about it like programming, then you end up in trouble. In fact, there’s some early evidence that programmers are the worst people at using A.I. because it doesn’t work like software. It doesn’t do the things you would expect a tool to do. Tools shouldn’t occasionally give you the wrong answer, shouldn’t give you different answers every time, shouldn’t insult you or try to convince you they love you.

And A.I.s do all of these things. And I find that teachers, managers, even parents, editors, are often better at using these systems, because they’re used to treating this as a person. And they interact with it like a person would, giving feedback. And that helps you.”

A mindset that sees LLMs as programmable input-output machines focuses on patterns defined by the user, and success is defined by whether the output of the LLM is what the user expects. With prompt engineering, our attention tends to be on “did it do what I wanted it to do,” making it easy to miss the unexpected.

It is critical to consider the weirdness of these models and pay attention to unexpected behavior. Much of our own work has illustrated the bias in these models, bias that may be invisible to the naked eye (for example, see here and here). Rather than looking for expected results, users should be looking for the unexpected, the surprising, and even the disturbing. It is this open mindset that will allow users to discover both new uses of the tool and potentially harmful responses.

Prompt engineering techniques are built on what works for today’s LLMs—but we are in the early days of widespread use of LLMs, and the developers of these tools continue to refine their behavior to better match human needs. Creators of these tools have already begun to incorporate prompt engineering techniques into their models automatically. 

A few examples:

  • Perplexity’s Pro mode will ask clarifying questions when a prompt lacks the detail needed for exploration.
  • ChatGPT 4 (and 4o) seems to do its own chain-of-thought process, breaking a complex task into steps and performing each in turn. Instead of the user spelling out each step (chain-of-thought prompting), the model can do the breakdown itself. Explicit chain-of-thought prompting may still work better in some cases, but as these flagship models improve, the most effective approaches will be integrated into the models themselves.
  • When building a CustomGPT, ChatGPT provides suggestions of what to include as well as a framework for key elements such as role, context, tone, and constraints, all through a friendly chatbot.
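For readers who have not seen chain-of-thought prompting, the zero-shot version is simpler than the name suggests: it is the same prompt with an instruction appended that tells the model to reason before answering. A minimal sketch, with illustrative wording of our own:

```python
# A minimal sketch contrasting a plain prompt with a zero-shot
# chain-of-thought variant. The classic trick is simply to append an
# instruction like "think step by step" so the model writes out its
# intermediate reasoning before giving a final answer.

question = (
    "A train leaves at 3:40 pm and arrives at 6:05 pm. "
    "How long is the trip?"
)

plain_prompt = question

cot_prompt = (
    question
    + "\nLet's think step by step, then state the final answer on its own line."
)
```

The point made above is that newer models increasingly do this decomposition on their own, so the appended instruction is becoming less necessary for everyday use.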

For example, when Melissa was recently working with ChatGPT 4o, it offered two choices for the response. Response 1 started by breaking down the task step by step, as if thinking through the process before executing it, something prompt engineering methods suggest. Response 2 skipped this process and simply gave the result. The model asked her to choose which response she preferred. The file given in Response 2 had several errors, whereas Response 1 was higher quality, so she chose Response 1 as the best. This tells the model that preferred results are more likely to come from the first approach, making it more likely to use that approach in the future.

Whether or not these are deliberate applications of prompt engineering techniques, they are approaches developers are using to increase the effectiveness of their tools. In this way, prompt engineering becomes embedded into the model itself rather than requiring special skills from every user.

GPT4o: Embedded Prompt Engineering

Furthermore, as the models progress, the most effective prompt engineering techniques will likely change (already, some techniques work better on some models than on others). So learning a discrete set of techniques is unlikely to be beneficial in the long run. It may be far better to develop flexible ways of thinking: mindsets that grasp the variability and creativity required to take best advantage of these LLMs.

For everyday users, the belief that we need complex skills to use GenAI tools effectively could limit our perceived agency to simply explore and try things. If we have to carefully construct every prompt, how do we play and discover what the models can do? How do we come up with original uses if every use is carefully planned out? It is through this exploration that we build the mindsets and skills to interact with this strange and ever-changing thing. Returning to Ethan Mollick:

There is no list of how you should use this as a writer, or as a marketer, or as an educator. They [the developers] don’t even know what the capabilities of these systems are. They’re all sort of being discovered together. And that is also not like a tool. It’s more like a person with capabilities that we don’t fully know yet.

We must play with GenAI to discover new capabilities, not just replicate templates for uses others have developed.

At some level, the question is not WHAT we should teach but HOW we teach. We have in previous publications argued for a form of open-ended design-based pedagogical approaches that we have called Deep-Play. Briefly, deep-play refers to the immersive engagement with the intricate problems of pedagogy, technology, and their interrelationships. Through deep-play, educators transition from being passive technology users to active designers who creatively repurpose tools and artifacts to achieve their goals and desires. The core objective of learning technology through design courses is to shift away from a utilitarian view of technology, fostering a flexible, context-sensitive, and learner-driven approach. We have argued that learning to use technology for teaching requires adopting a new mindset that prioritizes developing adaptable strategies for deeply considering the role of technology in education over merely mastering specific tools. 

We believe that these same strategies and mindsets apply to generative AI.  
