S’more Problems: Generative AI, Marshmallows, and the Flattening of Culture

by | Monday, September 02, 2024

A few days ago, The Washington Post published a story that caught my eye. Titled “The Marshmallow Test and other predictors of success have bias built in, researchers say,” the article discusses the famous Marshmallow Test, long heralded as a predictor of future success.

The Marshmallow Test is a psychological experiment developed in the late 1960s by Stanford psychologist Walter Mischel. In this test, children are offered a choice: eat one marshmallow immediately or wait for a short period (usually about 15 minutes) to receive two marshmallows. The ability to delay gratification, as measured by waiting for the second marshmallow, was later correlated with various positive outcomes in life, including better academic performance, higher SAT scores, lower body mass index (BMI), and even higher income in adulthood.

This simple test gained immense popularity and has been widely cited as a predictor of future success. It’s been used to argue for the importance of self-control and delayed gratification in achieving long-term goals. The test and its interpretations have influenced educational policies, parenting advice, and even economic theories about personal success and social mobility.

However, the WaPo story complicates the narrative. For instance, the story described how, when the same test was applied to Yucatec Maya children, the results were puzzling. Instead of patiently waiting for a second marshmallow, many children simply left the room. And this was true not just of this test but of a range of other tests that have often been used to study children’s “executive functions,” i.e., the mental muscles that help us stay focused, think on our feet, and actually get stuff done.

The researchers argue that the children left not because they lacked self-control but because sitting alone in a room doing nothing made little sense in their cultural context. As the article says,

The researchers raised a pointed question: If a child from a poor family or a child from a different culture doesn’t perform well, is the fault in the child or the test?

And these findings really complicate a whole area of research that has focused on determining human universals in how the mind develops. What this research shows is that these “universals” depend on culture and context.

Of course, these are preliminary findings, and much more research is needed to explicate the complex relationship between human development and culture.

But enough about marshmallows. Let’s sink our teeth into the AI flavor of the year: Large Language Models (LLMs).

What do these findings mean for the training and outputs of LLMs such as ChatGPT, Gemini, Claude, and more?

As I have argued elsewhere, when our AI systems are fed with data primarily from Western, middle-class contexts, we risk creating technology that misunderstands, misclassifies, or even discriminates against vast swathes of the global population. In other words, these data are WEIRD (Western, Educated, Industrialized, Rich, and Democratic). Just as the Marshmallow Test failed to account for cultural differences in how children behave and learn, AI trained on WEIRD datasets may fail to understand or properly serve diverse global populations.

Another prime example of a WEIRD construct deeply embedded in AI systems is the concept of the “terrible twos.” (For instance, see this video about the role of culture in creating our reality). This Western notion characterizes two-year-olds as particularly difficult, defiant, and prone to tantrums. However, this is far from a universal experience across cultures. In many non-Western societies, where children are integrated differently into family and community life, this phase is not observed or is not seen as particularly challenging. Yet, an AI trained on predominantly Western data will likely reinforce this idea globally, potentially pathologizing normal developmental stages in cultures where the “terrible twos” simply don’t exist.

These examples do not just illustrate a potential danger—they expose a present and persistent problem in AI, particularly in Large Language Models (LLMs). These models have already been trained on vast corpora of data that speak glowingly of the Marshmallow Test and similar WEIRD psychological constructs. The biases are not a future risk, but a current reality deeply embedded in these systems.

The implications are far-reaching and resistant to change. New studies challenging these WEIRD concepts, if they appear at all, will barely make a dent in the massive corpus of data that affirms them. This means that for years to come, LLMs will continue to propagate these biases.

Perhaps most alarmingly, as the article points out, these culturally biased assessments can make children who are perfectly functional in their own contexts feel inadequate or “behind.” When theories developed in WEIRD contexts are taught as universal truths about human psychology, we risk creating a world where cultural differences are pathologized rather than celebrated. A psychologist or parent in Ghana or Mexico, consulting AI for advice, may be told their child lacks “executive control” or “self-regulation” based on behaviors that are perfectly normal and adaptive in their cultural context.

As an academic studying these issues, I feel compelled to point out these problems and make people aware of them. However, I’m increasingly pessimistic about our ability to truly fix this situation. The world of AI development is deeply entrenched in WEIRD perspectives, and the momentum of technological progress often outpaces our ability to correct course.

The homogenizing force of AI, trained on biased datasets and deployed globally, seems almost unstoppable. While we can strive to be more aware of these biases in our own work and thinking, the broader trend of cultural flattening through technology appears set to continue.

And yet, the story of human culture gives us reason for cautious optimism. Consider the journey of cultural forms like jazz and rap. Born from the specific context of African American experiences in the United States, these art forms have spread across the world, taking on rich and varied flavors in different cultural contexts. From Japanese jazz fusion to French hip-hop, these once-localized forms of expression have been adopted, adapted, and transformed by diverse cultures worldwide. Perhaps AI will follow a similar path. As it spreads across the globe, different cultures may find ways to reshape and repurpose these technologies, infusing them with local knowledge, values, and ways of thinking. While the initial wave of AI may be WEIRD, the ingenuity and adaptability of human cultures might, just might, lead to newer, more culturally sensitive forms of AI.

To be honest, I am not too optimistic. The WEIRD forces we face are far too strong – as I have written about elsewhere.
