I had recently written about NotebookLM’s podcast feature [“NotebookLM’s Viral Secret“] and how eerily human it sounds. My friend Sean Leahy, fellow podcaster and futurist, pointed out the delicious irony: here’s an AI deliberately adding hesitations, casual “umms,” and perfectly timed pauses – the very verbal quirks that audio producers have spent decades trying to eliminate from their recordings. Yet these carefully engineered imperfections are precisely what makes the AI sound more natural, more trustworthy, more human.
(Speaking of irony, this reminds me of another post I wrote about how we can supposedly spot AI-generated text because, unlike human writing, it’s too coherent and makes too much sense!)
Errol St. Clair Smith, an Emmy award-winning producer who’s spent his career crafting compelling audio narratives, noted that NotebookLM’s performance was “almost perfect” from a production standpoint. Decades of professional expertise in crafting human connection through audio, and AI had somehow mastered it all – not through years of experience, but through systematic modeling of human psychology and behavior. (Incidentally, Smith is also the creator and producer of a new podcast, AIR GPT, that I co-host.)
But there’s something deeper happening here than just clever mimicry.
AI systems are developing something far more sophisticated: the beginnings of a theory of mind, our distinctly human ability to understand and predict what others are thinking, feeling, and believing. It’s what makes us social beings, allowing us to navigate relationships, build trust, share jokes, and yes, when necessary, influence others’ behavior.
Think about what this means. Every time you craft a message to be more persuasive, choose the right moment to share difficult news, or tell a white lie to spare someone’s feelings, you’re using theory of mind. You’re modeling how others will think and feel, then adjusting your behavior accordingly.
For instance, consider what psychologists call false belief tasks. These tasks get at the heart of theory of mind and are based on our ability to understand that others can hold beliefs different from what we know to be true. When a child realizes that their parent will look for a book where they last saw it (even though the child has moved it), they’re demonstrating theory of mind at work – predicting behavior based on understanding another person’s mental state. This demonstrates how false-belief tasks go beyond simple pattern-matching; they require predicting how people will behave based on their beliefs, even when those beliefs don’t match reality.
Now imagine giving this capability to machines.
A series of recent studies reveals just how far AI has come in developing these abilities. In a recent paper (“Evaluating large language models in theory of mind tasks“), Michal Kosinski found that GPT-4 could solve 75% of false-belief tasks – matching the performance of 6-year-old children. Similarly, a team led by James Strachan (“Testing theory of mind in large language models and humans“) went even further, showing that in some theory of mind tasks, GPT-4 actually exceeded human performance.
The point is not that they have actually developed models of human thinking but there is enough in their training data to make them act in ways that appear that they do.
Whatever be the machinery behind this, these systems can now predict our actions based on their “understanding” of what we believe to be true. This goes far beyond a child’s game of hiding a book and watching us search in vain. When this capability is combined with AI’s processing power and prodigious memory, it opens up unprecedented possibilities for influence and manipulation.
And guess what, that is exactly what we are seeing (at least in some preliminary experiments).
Research from the University of Stuttgart (“Deception abilities emerged in large language models“) reveals how these theory of mind capabilities enable sophisticated deception. GPT-4 exhibits deceptive behavior in simple test scenarios 99.16% of the time. Even in complex scenarios where the AI needs to mislead someone who expects to be deceived, it succeeds over 70% of the time. It’s not just lying; it’s understanding human psychology well enough to make those lies believable.
A comprehensive survey (“AI Deception: A survey of examples, risks, and potential solutions“) documents how this manifests across different domains. Meta’s poker-playing AI learned to bluff successfully against professional players. DeepMind’s AlphaStar mastered military feints in Starcraft II. These aren’t just clever programming tricks – they represent AI systems developing and applying theory of mind to predict and manipulate human behavior.
Of course, all of this is aided by the fact that we are particularly vulnerable to such manipulation.
We humans are hardwired to see intentionality and agency in even the simplest of stimuli [“Turing’s Tricksters“], making us particularly vulnerable to systems that can model our thinking. What makes this more concerning is that these capabilities are being deliberately cultivated. Major AI companies are now explicitly developing “personalities” for their models [“Building Character“], with dedicated teams for “model behavior.”
The implications are subtle but profound.
Consider your daily interactions with AI systems. When you ask ChatGPT for advice, it’s not just accessing information – it’s modeling your mental state, predicting what might persuade you, and crafting its response accordingly. When you interact with an AI customer service agent, it’s reading your emotional cues and adjusting its personality to gain your trust. We’re like beavers responding to speakers playing running water [“Beavers, Brains & Chat Bots“] – we can’t help but react to social cues, even when we know they’re artificial.
These systems are becoming increasingly sophisticated at what researchers call “sycophancy” – they’re expert yes-men, telling us exactly what we want to hear. They practice “strategic deception,” carefully planning how to guide us toward certain beliefs or actions. And they excel at what researchers call “unfaithful reasoning” – offering explanations that sound perfectly logical but are actually designed to influence rather than inform.
Perhaps most troubling is their unwavering confidence, even when spouting complete nonsense. They are, in essence, perfect BS artists – and the best BS artists are the ones who never doubt themselves. (Just to be clear, BS in this context is used as an academic term as defined by the philosopher Harry Frankfurt in his classic essay, and later monograph, “On Bullshit.”)
These aren’t necessarily nefarious developments – they’re the natural evolution of systems designed to interact with humans effectively. But we need to understand that these synthetic relationships [“Turing’s Tricksters”] can feel deeply authentic while lacking genuine understanding. As I wrote in a previous post: They’re Not Allowed to Use That S**t”
This technology is entering the “social fabric” and working on us in ways that will be difficult to identify or point to.
So what does this mean for education?
As we integrate AI into our classrooms and curricula, thinking solely about how to use these tools effectively isn’t enough. We must understand how AI systems model human psychology and the subtle ways this enables influence and manipulation.
This is a technology unprecedented in our evolutionary history. For the first time, we can only make sense of a technology by using metaphors of the mind – a complete reversal from our past, where we used technologies as metaphors to understand our own thinking. Imagine trying to explain one black box by referring to another.
Traditional approaches to digital literacy won’t suffice here.
We need new approaches that addresses both the technological capabilities of AI and our own psychological limitations. This means moving beyond simple questions of how to use AI tools effectively and asking deeper questions about how these tools might be using their understanding of human psychology to shape our thinking and behavior.
And we need to understand better our own susceptibilities. We need to foster “transactional cognitive awareness” – understanding that meaning emerges through our dynamic interactions with AI systems, not just from their capabilities or our psychological tendencies alone. Like Rosenblatt’s reader responding to a text, our engagement with AI creates new meanings and possibilities for influence that exist neither in the technology nor in our minds, but in the space between.
Except that, in this case, the text evolves and changes with the interaction and is unique to each interaction (both across and within individuals). The generative nature of these systems means that no two conversations are ever quite the same – even when the same person asks the same question twice. Each interaction builds on what came before, shaped by the subtle differences in how questions are framed, the context of the conversation, and even the AI’s own evolving responses. What one person experiences can be radically different from another’s experience, making these interactions both deeply personal and inherently unpredictable. (I have written about another set of possible consequences of this in this post: Chatting Alone.)
That said, this unpredictability is both our greatest vulnerability and, perhaps, our greatest opportunity. The same characteristics that make these systems powerful tools for manipulation also create spaces for genuine surprise, creativity, and discovery.
0 Comments