S’more Problems: Generative AI, Marshmallows, and the Flattening of Culture

by | Monday, September 02, 2024

A few days ago, The Washington Post published a story that caught my eye. Titled “The Marshmallow Test and other predictors of success have bias built in, researchers say,” the article discusses the famous Marshmallow Test, long heralded as a predictor of future success.

The Marshmallow Test is a psychological experiment developed in the late 1960s by Stanford psychologist Walter Mischel. In this test, children are offered a choice: eat one marshmallow immediately or wait for a short period (usually about 15 minutes) to receive two marshmallows. The ability to delay gratification, as measured by waiting for the second marshmallow, was later correlated with various positive outcomes in life, including better academic performance, higher SAT scores, lower body mass index (BMI), and even higher income in adulthood.

This simple test gained immense popularity and has been widely cited as a predictor of future success. It’s been used to argue for the importance of self-control and delayed gratification in achieving long-term goals. The test and its interpretations have influenced educational policies, parenting advice, and even economic theories about personal success and social mobility.

However, the WaPo story complicates the narrative. For instance, the story describes how, when the same test was given to Yucatec Maya children, the results were puzzling. Instead of patiently waiting for a second marshmallow, many children simply left the room. And this was true not just of this test but of a range of other tests that have often been used to study children’s “executive functions,” i.e., the mental muscles that help us stay focused, think on our feet, and actually get stuff done.

The researchers argue that the children left not because they lacked self-control but because sitting alone in a room doing nothing made little sense in their cultural context. As the article says,

The researchers raised a pointed question: If a child from a poor family or a child from a different culture doesn’t perform well, is the fault in the child or the test?

And these findings really complicate a whole area of research that has focused on identifying universals in how the human mind develops. What this research shows is that these “universals” depend on culture and context.

Of course, these are preliminary findings, and much more research is needed to explicate the complex relationship between human development and culture.

But enough about marshmallows. Let’s sink our teeth into the AI flavor of the year: Large Language Models (LLMs).

What do these findings mean for the training and outputs of LLMs such as ChatGPT, Gemini, Claude, and more?

As I have argued elsewhere, when our AI systems are fed data primarily from Western, middle-class contexts, we risk creating technology that misunderstands, misclassifies, or even discriminates against vast swathes of the global population. In other words, these data are WEIRD (Western, Educated, Industrialized, Rich, and Democratic). Just as the Marshmallow Test failed to account for cultural differences in how children behave and learn, AI trained on WEIRD datasets may fail to understand or properly serve diverse global populations.

Another prime example of a WEIRD construct deeply embedded in AI systems is the concept of the “terrible twos.” (For instance, see this video about the role of culture in creating our reality). This Western notion characterizes two-year-olds as particularly difficult, defiant, and prone to tantrums. However, this is far from a universal experience across cultures. In many non-Western societies, where children are integrated differently into family and community life, this phase is not observed or is not seen as particularly challenging. Yet, an AI trained on predominantly Western data will likely reinforce this idea globally, potentially pathologizing normal developmental stages in cultures where the “terrible twos” simply don’t exist.

These examples do not just illustrate a potential danger—they expose a present and persistent problem in AI, particularly in Large Language Models (LLMs). These models have already been trained on vast corpora of data that speak glowingly of the Marshmallow Test and similar WEIRD psychological constructs. The biases are not a future risk, but a current reality deeply embedded in these systems.

The implications are far-reaching and resistant to change. New studies challenging these WEIRD concepts, if they appear at all, will barely make a dent in the massive corpus of data that affirms them. This means that for years to come, LLMs will continue to propagate these biases.

Perhaps most alarmingly, as the article points out, these culturally biased assessments can make children who are perfectly functional in their own contexts feel inadequate or “behind.” When theories developed in WEIRD contexts are taught as universal truths about human psychology, we risk creating a world where cultural differences are pathologized rather than celebrated. A psychologist or parent in Ghana or Mexico, consulting AI for advice, may be told their child lacks “executive control” or “self-regulation” based on behaviors that are perfectly normal and adaptive in their cultural context.

As an academic studying these issues, I feel compelled to point out these problems and make people aware of them. However, I’m increasingly pessimistic about our ability to truly fix this situation. The world of AI development is deeply entrenched in WEIRD perspectives, and the momentum of technological progress often outpaces our ability to correct course.

The homogenizing force of AI, trained on biased datasets and deployed globally, seems almost unstoppable. While we can strive to be more aware of these biases in our own work and thinking, the broader trend of cultural flattening through technology appears set to continue.

And yet, the story of human culture gives us reason for cautious optimism. Consider the journey of cultural forms like jazz and rap. Born from the specific context of African American experiences in the United States, these art forms have spread across the world, taking on rich and varied flavors in different cultural contexts. From Japanese jazz fusion to French hip-hop, these once-localized forms of expression have been adopted, adapted, and transformed by diverse cultures worldwide. Perhaps AI will follow a similar path. As it spreads across the globe, different cultures may find ways to reshape and repurpose these technologies, infusing them with local knowledge, values, and ways of thinking. While the initial wave of AI may be WEIRD, the ingenuity and adaptability of human cultures might, just might, lead to newer, better, and more culturally sensitive forms of AI.

To be honest, I am not too optimistic. The WEIRD forces we face are far too strong, as I have written elsewhere.
