Whose Voice? Whose Accent? Navigating Authenticity & Impact in AI-Generated Content

by | Friday, September 12, 2025

I’ve had the pleasure of co-hosting the AIR|GPT podcast, where I’ve gotten to know Errol St. Clair Smith as one of the most thoughtful curators of education-related news and information I’ve encountered. Errol has this uncanny knack for bringing diverse voices together through his BRN (BAM Radio Network) platform, creating spaces where meaningful conversations about education can flourish. His latest experiment pushes these boundaries even further—and it’s both impressive and maybe a bit unsettling.

Recently, Errol shared with me his ambitious experiment with NotebookLM and AI-generated podcasts. What caught my attention wasn’t just the technology itself, but his approach to it. While others might generate a podcast in minutes and call it done, Errol spent 18 hours crafting a hybrid model that combines AI capabilities with professional journalism and production expertise. This is the difference between an expert and a novice in any field.

Here is the episode titled: Moonshot: Google’s New AI Push Into Classrooms, Some Educators Are Thrilled, While Others See A Troubling Trend

This is part of a broader experiment that Errol is engaged in, which he calls “a playground” for using AI to create and curate new content, building on 18 years of audio archives. Learn more at BRN-X: The Gen AI Podcast Lab or at How Resisting Fixed Ideas About “AI Slop” Led to a More Reliable Way to Identify What Works in Education.

As I wrote in a recent blog post: “Expert teachers know when to stop prompting and start editing. Novice teachers may not.” Errol embodies this principle perfectly. He understands that the real work begins after the AI generates its first draft. The magic happens in the iterative process—the 10, 15, or 25 versions it takes to get the AI hosts to tell the story he envisions rather than the one the AI assumes he wants.

But here’s where things get complicated for me. While I admire Errol’s curatorial expertise and his thoughtful approach to AI experimentation, I find myself viscerally reacting to the voices that NotebookLM generates. It’s not that they sound artificial—they don’t. It’s that they carry what I can only describe as a “certain Midwestern TV-news blandness” that feels culturally homogenizing.

My voice is integral to who I am—Indian accent plus 30+ years of living in the US. It represents my journey, my perspective, my authenticity. And it is a “voice” that I have attempted to capture not just in my public speaking, or on this blog, but also in my academic writing. In fact, I still remember when Chip Bruce, one of my professors and mentors in graduate school, said that reading one of my papers was like listening to me speak! That, to me, was the highest compliment. That connection between my written and spoken voice is something I’ve cultivated deliberately over years.

I have also written earlier (in a post titled: How to fix your Indian accent) about how bothered I was when I discovered that Adobe’s Podcast Enhance tool—ostensibly designed just to clean up audio quality—was subtly altering my accent without my knowledge or consent. While the software claimed to simply remove background noise and improve audio quality, it was actually taking the ‘edges’ off my Indian accent, making me sound more ‘American.’ As my colleague Melissa Warr put it after listening to the enhanced version: ‘Adobe totally changed your voice—it doesn’t sound like you, it sounds fake… It really seems to take out your identity.’ This wasn’t advertised as a feature; it was just something built into the system. The software was making editorial decisions about whose voice was worth preserving and whose needed ‘correction.’

This raises profound questions about representation and authenticity in our AI-mediated future. Errol made an interesting point in our exchange: these bland, familiar voices “resonate with a broad audience” and represent “the voices that will be familiar to over 100 million students through Google Classroom.” There’s strategic wisdom in this observation—sometimes the message matters more than the messenger.

But I can’t help wondering: what are we losing when we default to these sanitized, culturally neutral voices? What happens to the rich tapestry of accents, inflections, and linguistic diversity that makes human communication so vibrant?

Despite my concerns about voice, I’m genuinely excited about what Errol is creating. His Gen AI Podcast Lab has already spawned five new AI-hosted shows, each tackling different aspects of education. More importantly, he’s not positioning this as an either-or proposition. As he put it, “AI generated podcasts are not likely to replace traditional podcasts—each can do things the other can’t.”

This experimental space he’s creating allows us to explore questions we couldn’t ask before: How can AI help us process and synthesize massive amounts of educational content? What happens when we can turn any collection of documents into an engaging conversation? How might these tools democratize access to complex information?

What makes Errol’s work particularly valuable is his understanding that being an expert in this medium makes him an excellent curator. He’s not just throwing content at NotebookLM and hoping for the best. He’s applying decades of broadcasting expertise to guide, shape, and refine the output. The result is something genuinely new—neither purely human nor purely AI, but a thoughtful collaboration between the two.

Errol’s experiments are opening up new spaces for all of us to think and experiment. The question isn’t whether AI will change how we communicate—it already has. The question is whether we’ll approach these changes with the thoughtfulness of an expert curator or the haste of someone who mistakes the first draft for the final product.

3 Comments

  1. Carlin Borsheim-Black

    Thank you so much for your posts about AI. I am planning to share your writing with a faculty work group at CMU on this topic. I just wanted to comment, because I think this post is particularly important–and I think it could be underscored even more emphatically. You said, “But I can’t help wondering: what are we losing when we default to these sanitized, culturally neutral voices? What happens to the rich tapestry of accents, inflections, and linguistic diversity that makes human communication so vibrant?” I would argue that these sanitized voices are not “culturally neutral” at all. Erasing accents is not neutral at all. This example is so, so important. Thank you for sharing!

    • Punya Mishra

      Thank you for your kind words.

  2. Errol St.Clair Smith

    Punya, I thoroughly enjoyed the rich and candid discussions we had on this topic. Your points around appreciating and preserving your authentic voice resonate with me deeply–as I also have a non-Midwestern voice that even Eleven Labs was unable to duplicate with fidelity.

    The cultural homogenization piece is interesting because it requires historical context to appreciate and evaluate. From my research, little is known about how Notebook LM’s synthetic voices were created. Did they use voice actors or curate audio from various sources? It’s a known unknown. What is clear is that the voice profiles are very familiar to most of us. As I watched the Falcons and Vikings game this weekend, I heard those voice profiles in the color commentary. As we discussed, the Midwestern vocal profile is ubiquitous because it’s broadly understood, which, of course, is priority one across the communications field.

    That said, much has changed since I was not accepted into a broadcast training program because of my “New York” accent. Back then, the Midwestern vocal profile was what most on-air talent was aiming to achieve, which likely explains why the profile is still ubiquitous. However, today’s media landscape encompasses a diverse range of voices, including some with an Indian flavor. Fareed Zakaria immediately comes to mind as one of the most globally respected voices in broadcast journalism. But Zakaria is not just a token; there’s Donie O’Sullivan, Christiane Amanpour, and Harry Enten, whose New York accent is unmistakable. Shift to late-night TV and there was Trevor Noah’s South African accent that viewers loved, James Corden, and John Oliver, whose British accent and sensibility took home another Emmy last night.

    Moreover, digital media has displaced broadcast media as the leading source of news, information, and entertainment. The shift has unleashed and popularized voices of every kind imaginable. 

    Finally, though I have no inside confirmation, I believe Google may already be working on adding a variety of voices. I’ve heard glimpses of them in some of the audio clips Notebook LM has generated. So I could be wrong, but I suspect that in the not-too-distant future, Notebook LM will offer voice options that will make you smile.
