A recent study by Neil Selwyn, Marita Ljungqvist, and Anders Sonesson titled When the Prompting Stops begins with GenAI’s beguiling offer to educators: “What can I do for you?”
But what unfolds isn’t a story of seamless automation; it’s a story of repair. Drawing on interviews with 57 secondary school teachers in Sweden and Australia, the researchers uncover the invisible labor behind AI-assisted teaching. It should be noted that these were all teachers who had “experience using GenAI tools,” and felt “confident in using GenAI for future teaching work.”
Far from simply prompting and pasting, teachers are checking, tweaking, rejecting, and re-authoring machine-generated outputs—not because they lack “prompt literacy,” but because AI lacks the contextual intelligence that teaching demands. As one teacher put it,
[GenAI] doesn’t know the context. It doesn’t know my students. It doesn’t know the school, the school values or school rules. So, what it provides is general lesson plans which then I have to modify a lot.
All of this should come as no surprise to any educators who have worked with these tools. A couple of recent blog posts on this site have addressed these very concerns—though it is good to have empirical evidence for the same.
For instance, just this past week, in a post titled The Edge Cases Are Endless: Google’s Digital Plastic and Other Curriculum-Shaped Objects, I unpacked three interrelated concepts: digital plastic (AI can be both useful and harmful to the broader ecosystem of education), curriculum-shaped objects (materials that mimic the form of good pedagogy while lacking its substance), and the insight that the edge cases of real teaching are endless. Selwyn’s study gives empirical weight to these ideas, showing that AI-generated content may pass surface inspection but consistently breaks down in the unpredictable, relational, and nuanced contexts where real teaching and learning happen.
In February, I wrote about the “expertise paradox” (in a post titled GenAI and Expertise Paradox: Why it makes expert work important but harder): generative AI tools, rather than diminishing the need for expertise, actually make it more essential. In fact, contrary to previous research which showed that expertise increased automaticity and reduced cognitive load, working with GenAI did the reverse. The research showed that the more expertise a worker had, the more cognitive effort they expended when using GenAI. Experts don’t just accept outputs; they interrogate them, restructure them, wrestle with them. AI amplifies their effort, not their ease.
Note, added August 5, 2025: A similar conclusion emerges from a field that has often been argued to be at the forefront of AI integration—programming. Here is an article from Ars Technica titled: Developer survey shows trust in AI coding tools is falling as usage rises. The key sentence is
“AI solutions that are almost right, but not quite” lead to more debugging work.

Debugging is now part of an expert educator’s workflow as well.
Selwyn’s study not only affirms this, but also illustrates a broader systemic illusion. As they write:
Rather than GenAI working for teachers, we find teachers having to undertake considerable amounts of work for GenAI – amending, rewriting, reworking and sometimes completely replacing generative AI outputs in order to make these outputs useable and useful for the classroom.
In their paper, they build on the argument made by Mateescu and Elish in their work on self-checkout machines, that the apparent success of automation often hinges on invisible human labor, i.e., people smoothing out the machine’s rough edges. They argue that the same holds for GenAI in schools: expert teachers are doing the equivalent of scanning barcodes twice, bypassing broken prompts, restocking cognitive shelves. They make the system look better than it is. Their expertise doesn’t just fill in the gaps; it masks them.
In this they are aided by two things. First, their expertise, not just of the content but also of their specific classroom context and of their learners. Second, the care and pride they take in their work, to get it right. (I addressed this issue of “caring” for the work in another post, Dewey or don’t we care? Addressing the Novice’s Dilemma in Learning with Gen AI.)
Now, the situation changes when we come to novice teachers—who may care deeply but just do not have the knowledge and expertise with which to judge the outputs of AI. Novices lack the pedagogical, curricular, and contextual knowledge to know when to intervene—or how. Without that professional filter, they are most vulnerable to mistaking curriculum-shaped objects for the real thing. While experts quietly repair, novices often don’t even recognize that something is broken.
They are in what we might call the “bottom-left quadrant” of my previous blog post: low on knowledge of AI as well as being low on expertise (which in this case could be content knowledge, pedagogical content knowledge, and even pedagogical sense-making, either individually or collectively). They lack the very filters that would allow them to recognize when GenAI is going wrong. And when you can’t see what’s broken, you can’t repair it.
Expert teachers know when to stop prompting and start editing (debugging). Novice teachers may not.
And clearly this is not a problem that more prompt engineering workshops will solve.
If anything, it reiterates the importance of contextual expertise, of knowledge and judgment. Paradoxically, the advent of GenAI has made teacher knowledge, judgment, and insight even more important. This is not just about technology integration. It’s about the cultivation of professional judgment.
If Selwyn et al. show us anything, it’s that AI doesn’t replace teacher expertise. It reveals it. It relies on it.





