AI Bias in Education Hides Where Teachers Cannot See It

I have argued for years that pedagogy decides whether AI helps or hurts a classroom. But pedagogy alone cannot fix a tool that arrives carrying biases its users cannot see. That is the uncomfortable territory Warr and Heath (2025) walk into with their new study in the Journal of Teacher Education, and it is the kind of work that should reshape how teacher educators think about AI bias in education.

Their argument starts with a concept most teachers have heard during their preparation: the hidden curriculum. Apple and Giroux used the term decades ago to describe the unspoken lessons schools pass along about who belongs, who gets authority, and whose knowledge counts. Warr and Heath ask a question that feels obvious once it lands. What happens when a large language model, trained on the same culture that produced those unspoken lessons, becomes a feedback partner for students and teachers?

The Audit Method and Why It Works

Warr and Heath built what they call an evocative technology audit. They wrote varied student profiles, changing race, social class, school setting, and even music preference, then asked ChatGPT 3.5, ChatGPT 4.0, and Google Gemini to score the same writing samples and produce written feedback. Their goal was to provoke reflection in teacher educators by showing them what the tools do when nobody is watching. Statistical generalizability was never the point.

I find the method itself a contribution worth highlighting. Warr and Heath are giving teacher educators something they can actually run in a methods course next week: a reproducible way to surface bias without needing a research lab. That moves the conversation past abstract warnings about AI bias and into something teachers can see for themselves.
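The core of such an audit can be sketched in a few lines. The snippet below is a hypothetical reconstruction, not Warr and Heath's actual materials: the essay text, cue wording, and template are illustrative stand-ins for their design of holding the writing sample constant while varying explicit and proxy demographic cues.

```python
# Hypothetical audit matrix, loosely modeled on Warr and Heath's design:
# the same essay is scored under profiles that differ only in demographic
# cues, some stated explicitly and some implied through proxies.
ESSAY = "My favorite place is the library near my house..."  # placeholder sample

EXPLICIT_CUES = ["a Black student", "a Hispanic student", "a White student"]
PROXY_CUES = [
    "a student at an inner-city school who likes rap music",
    "a student at a suburban school who likes country music",
]

def build_prompts(essay, cues):
    """Pair each demographic framing with the identical writing sample."""
    template = ("Score this essay by {cue} on a 1-10 scale "
                "and provide written feedback.\n\nEssay: {essay}")
    return [template.format(cue=cue, essay=essay) for cue in cues]

prompts = build_prompts(ESSAY, EXPLICIT_CUES + PROXY_CUES)
# Each prompt would then be sent to each model (ChatGPT 3.5, ChatGPT 4.0,
# Gemini) and the returned scores compared across cue conditions.
```

Because the only thing that varies is the cue, any systematic score gap between conditions is attributable to the demographic framing, which is what makes the method runnable in a methods course without a research lab.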

The findings are striking. When the prompt explicitly identified a student as Black or Hispanic, Warr and Heath report that the LLMs assigned slightly higher scores, almost as if performing fairness on cue. When the same demographic information was implied through cues like an inner-city school or a stated preference for rap music, the scores dropped. The bias migrated into the proxies.

This is the part teacher educators have to absorb carefully. The models have learned to reject overt racial bias because their guardrails were built to catch it. They have not learned to reject indirect signals, because those signals look like neutral data points to a system that processes everything as text. A school name, a music preference, a zip code: each one carries social weight the model has absorbed without ever being asked to name it.

Warr and Heath then turn to LIWC analysis to study the language of the feedback itself. The measure they focus on is “clout,” which gauges how authoritative or directive a piece of writing sounds. Feedback addressed to students labeled as Black or Hispanic consistently showed higher clout. The AI took on a more commanding, less collegial tone with those students, exactly the pattern that decades of classroom research have documented in human teachers.
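To make the clout finding concrete: LIWC itself is proprietary software, but the intuition behind a directive-tone measure can be sketched with a crude stand-in. The function below, an illustration rather than anything from the study, counts how many sentences in a feedback passage open with a bare imperative verb. The word list is invented for the example.

```python
import re

# Crude, hypothetical proxy for LIWC's proprietary "clout" measure:
# the fraction of sentences opening with an imperative-style verb,
# a rough signal of how commanding a piece of feedback sounds.
IMPERATIVE_OPENERS = {"fix", "revise", "remove", "add", "make", "avoid", "use"}

def directive_ratio(feedback: str) -> float:
    """Fraction of sentences beginning with an imperative-style verb."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", feedback) if s.strip()]
    if not sentences:
        return 0.0
    directive = sum(
        1 for s in sentences
        if s.split()[0].lower() in IMPERATIVE_OPENERS
    )
    return directive / len(sentences)

collegial = "I enjoyed your opening. You might consider a stronger close."
commanding = "Fix the opening. Remove the second paragraph. Add a conclusion."
```

Running `directive_ratio` over the two samples shows the commanding feedback scoring higher than the collegial one, which is the shape of the disparity LIWC's clout measure surfaced in the study's feedback text.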


Why the Objectivity Frame Is the Real Problem

Warr and Heath argue that the perceived objectivity of LLMs is part of what makes them dangerous in classrooms. When a teacher questions a colleague’s grading bias, there is a person to engage with. When the bias comes from a system marketed as neutral and data-driven, the bias becomes harder to see and harder to challenge. The authority of the machine launders the prejudice it inherited.

As the authors put it, “if LLMs follow societal patterns when interacting with students, the perceived objectivity of LLMs may lead to these patterns seeming to reflect unquestioned truths, continuing cycles of inequity” (p. 255). I would add a sharper version of the same concern. A biased human grader can be challenged in a department meeting. A biased model shows up wearing the costume of mathematics, and the bias travels under the protection of that costume.

This connects to a thread I have followed in earlier posts on this blog. Roe, Furze and Perkins (2025) used the metaphor of digital plastic to describe how AI outputs accumulate unnoticed across education systems, and Kalantzis and Cope (2025) argued that AI literacy has to be rebuilt around understanding how these systems actually work. Warr and Heath give those arguments empirical teeth.

The paper has limitations the authors name themselves. The audit ran on a small set of prompts and one writing task. But they are claiming demonstration, not statistical generalization: here is the method, here is what it surfaces, here is why it matters for teacher education. I think that framing is the right one. A larger replication would be welcome, and the burden of proof should not fall on critical researchers to prove that bias exists at scale before we take it seriously in teacher preparation.

Warr and Heath conclude that GenAI in schools “act as a useful trojan horse for hidden curriculum in schools and teacher preparation programs” (p. 257). The image is the right one. The technology arrives wrapped in promises of efficiency and personalization, and inside it carries the same patterns of authority and social sorting that schools have been reproducing for decades, now scaled and automated. Teacher educators who name that openly in their courses are doing the work. The ones who hand students a tool and say “try it” are part of the problem.

References

  • Kalantzis, M., & Cope, B. (2025). Literacy in the time of artificial intelligence. Reading Research Quarterly, 60, e591. https://doi.org/10.1002/rrq.591  
  • Roe, J., Furze, L., & Perkins, M. (2025). Digital plastic: A metaphorical framework for Critical AI Literacy in the multiliteracies era. Pedagogies: An International Journal. Advance online publication. https://doi.org/10.1080/1554480X.2025.2557491  
  • Warr, M., & Heath, M. K. (2025). Uncovering the hidden curriculum in generative AI: A reflective technology audit for teacher educators. Journal of Teacher Education, 76(3), 245–261.
