I’ve spent the last year covering research on AI literacy, AI assessment, and AI pedagogy. Most of those papers focus on higher education. Yim and Su’s (2025) scoping review shifts the lens downward, to K-12 classrooms, and that shift raises questions the higher education literature barely touches. What happens when the students are six years old? What tools make sense when learners can’t write code? And who’s actually testing whether any of this works?
The review covers 46 studies published between 1995 and 2023, drawn from ACM, EBSCO, Web of Science, and Scopus. Research activity surged after 2018, with most studies coming from the US, China, Finland, Hong Kong, and Spain. The studies concentrate heavily on primary schools. Kindergarten classrooms are nearly invisible in the literature, and that absence tells its own story about how the field has prioritized certain age groups over others.

Four Types of AI Learning Tools in K-12 Classrooms
Yim and Su organize the 46 studies around four categories of learning tools: intelligent agents (Google Teachable Machine, Learning ML), software-focused devices (Scratch, Python), hardware-focused devices (Lego Mindstorms, PopBots, Raspberry Pi), and unplugged activities that teach AI concepts without screens or code. Intelligent agents appeared in 20 studies, software tools in 19, hardware in 10, and unplugged activities in 6. Those counts add up to 55, more than the 46 studies reviewed, so some studies evidently used more than one type of tool.
The dominance of intelligent agents and software tools makes practical sense. They’re free or cheap, they run on school devices, and they don’t require a robotics lab. Yim and Su note that “without prior programming experience, these learning tools, such as Popbots, Teachable Machine, and Scratch, can help address the diverse needs of younger students across K-12 educational levels” (p. 117). That finding pushes against the assumption that AI education requires advanced technical skills before students can participate.
I want to flag something the review doesn’t address, though. These 46 studies predate the generative AI wave almost entirely. The tools under review are classical AI and machine learning platforms. Since 2023, the conversation has shifted dramatically toward large language models, AI chatbots, and generative tools that students encounter daily on their own.
A review anchored in pre-generative-AI tools gives us useful historical grounding, but K-12 education in 2026 looks nothing like the classrooms these studies documented. Teachers aren’t just deciding whether to introduce Scratch. They’re figuring out what to do when their students are already using ChatGPT for homework.
Pedagogy Comes First, Tools Come Second
Project-based learning dominated the pedagogical approaches, appearing in 27 of the 46 studies. Human-computer interaction showed up in 7, play-based learning in 5, and game-based approaches in several more. Yim and Su group these under an “authentic and constructive” orientation, meaning students are building, solving, and interacting with AI systems directly.
The grade-level differences are worth tracing. Kindergarten classrooms lean on play-based and participatory approaches with physical robots like PopBots. At the primary level, Scratch and Teachable Machine dominate, usually embedded in project-based learning designs. Secondary students get more structured programming environments, experiential tasks, and collaborative problem-solving. This pattern makes developmental sense, and I’ve seen similar age-appropriate scaffolding recommendations in work I covered on AI literacy for young learners (Su, Ng & Chu, 2023).
One finding that genuinely interests me is the emergence of analogy-based pedagogy in primary schools. Students compare how humans learn with how AI systems learn, using drawings and guided discussion to build that comparison. Yim and Su see this as a move toward dialogic learning, where students and AI systems develop shared thinking. I agree that this is a promising direction. It asks students to reason about intelligence itself, not just operate a tool. That kind of metacognitive engagement is exactly what gets lost when AI becomes a black box, something I’ve written about in the context of cognitive surrender research (Shaw & Nave, 2026).
The Assessment Problem Nobody Has Solved
Cognitive outcomes appeared in 31 studies, and most reported positive gains in AI knowledge, computational thinking, and content understanding. Affective outcomes were measured less often but showed increases in motivation, self-efficacy, and interest in AI. On the surface, that’s encouraging. But the assessment methods behind those numbers are inconsistent at best and flimsy at worst.
Yim and Su found that questionnaires and surveys were used in 30 studies, artifact-based evaluation in 17, and interviews in 14. Those counts total more than 46, so some studies combined methods. Some researchers used established instruments like the Torrance test for creativity. Many relied on self-report surveys or evaluated student projects without clear rubrics.
Several studies ran AI learning activities without formally assessing outcomes at all. The authors name this gap clearly: “there is currently insufficient theory-based, rigorous research on the effectiveness of AI educational tools to meet the diverse learning needs of students” (p. 119).
I’ve covered this problem elsewhere, through papers like the RAND survey on AI tools in K-12 classrooms (Diliberti et al., 2024), and the pattern holds at every level. The tools are multiplying faster than our ability to measure what they’re actually producing. You can’t build a case for scaling AI education across a school district when the evidence base consists of pilot studies with convenience samples and no standardized measures.
Yim and Su call for a standardized AI assessment mechanism that works across grade levels. They’re right, but I’d add that any such mechanism needs to go beyond knowledge recall and test for the kind of critical, evaluative thinking that AI literacy actually requires, something frameworks like those reviewed in Chee, Ahn, and Lee’s (2025) AI literacy competency work have started to map out.
What’s Missing from the Picture
The review identifies a significant shortage of theoretical frameworks guiding AI education research. Few of the 46 studies adopted formal conceptual models for designing curricula, activities, or tools. That’s a problem because without theoretical grounding, every intervention is ad hoc. You can run a great Scratch project in one school and have no basis for explaining why it worked or how to replicate it down the road.
Yim and Su also recommend that educators incorporate ethical dimensions into AI teaching, including fairness, transparency, data justice, and social responsibility. They argue that “it is essential to involve teachers in the design of learning tools and understand their perceptions regarding AI literacy education” (p. 119). I agree completely. Teachers who feel excluded from the design process are less likely to adopt tools with any depth or consistency.
The biggest limitation of this review, through no fault of the authors, is temporal. The 46 studies span 1995 to 2023, which means the entire generative AI era is absent. That doesn’t make the findings irrelevant. Classical AI and ML concepts are still foundational. If students understand how a classification algorithm works, that foundation helps them make sense of what a language model is doing at a much larger scale. But the pedagogical challenges educators face in 2026 have expanded dramatically beyond what Scratch projects and Teachable Machine workshops can address. The next wave of K-12 AI education research needs to account for a world where the tools aren’t just in the lab. They’re in every student’s pocket.
References
- Chee, H., Ahn, S., & Lee, J. (2025). A competency framework for AI literacy: Variations by different learner groups and an implied learning pathway. British Journal of Educational Technology, 56, 2146–2182. https://doi.org/10.1111/bjet.13556
- Diliberti, M. K., Schwartz, H. L., Doan, S., Shapiro, A., Rainey, L. R., & Lake, R. J. (2024). Using artificial intelligence tools in K–12 classrooms. RAND Corporation. https://www.rand.org/t/RRA956-21
- Shaw, S. D., & Nave, G. (2026). Thinking fast, slow, and artificial: How AI is reshaping human reasoning and the rise of cognitive surrender. Working paper, The Wharton School, University of Pennsylvania. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646 (see also https://medkharbach.com/cognitive-surrender-how-ai-is-quietly-reshaping-the-way-we-think/)
- Yim, I. H. Y., & Su, J. (2025). Artificial intelligence (AI) learning tools in K-12 education: A scoping review. Journal of Computers in Education, 12(1), 93–131. https://doi.org/10.1007/s40692-024-00316-7
