Assessment is where the AI conversation gets most heated. It’s also where it gets most important. Teachers want clarity. Students want consistency. And institutions want something defensible. The AI Assessment Scale (AIAS) is a practical framework for addressing all three.
Perkins, Roe, and Furze (2024) revised their original AIAS to reflect how quickly the landscape has changed since ChatGPT’s release. The authors are blunt about detection, stating that “AI detection is impossible and inefficient” (p. 4). I’ve been saying the same thing in workshops for over a year now. Spending energy on detection damages trust between teachers and students and pulls attention away from the more productive question: how do we design assessments that remain valid in an AI-saturated environment?

The AI Assessment Scale: Five Levels of Integration
The revised AIAS presents five levels. None is superior to the others. Each level aligns with specific learning goals, and the appropriate choice depends on what you’re trying to assess.
Level 1 (No AI) now requires supervised, controlled settings. The authors acknowledge that “merely declaring an AI-free assessment without environmental controls was increasingly untenable” (p. 10). Simply telling students not to use AI on a take-home assignment doesn’t work anymore. If you genuinely need to assess unassisted work, you need to create conditions where that’s possible, like in-class writing or supervised exams.
Level 2 (AI Planning) allows students to use AI for brainstorming, ideation, and organizing their thinking, but the development and execution of the work remain their responsibility. The assessment focuses on the quality of their development process, not just the final product.
Level 3 (AI Collaboration) is where things get particularly interesting from a pedagogical perspective. Students can use AI for drafting and composition, but they’re expected to critically evaluate everything AI produces. The authors introduce an interesting phrase, the “illusion of finality,” which they describe as “the tendency to accept AI-generated text as complete and authoritative” (p. 12).
Anyone who has watched students interact with ChatGPT has seen this happen. The output arrives looking polished, well-structured, and confident. And the instinct is to accept it. Level 3 directly targets that instinct by making critical evaluation the thing being assessed.
I wrote about a closely related phenomenon in a previous post on cognitive surrender, where Shaw and Nave (2026) showed how students defer to AI outputs without critically processing them. The AIAS addresses this head-on. Students don’t get credit for producing polished text. They get credit for demonstrating that they can question, revise, and improve what AI gives them.
The authors reinforce this: “Students must be supported in understanding the limitations of AI systems and recognising that their own knowledge, critical thinking skills, and subject expertise are essential” (p. 12).
Level 4 (Full AI) assesses strategic deployment. Can the student use AI effectively to achieve a specific learning outcome? The emphasis is on judgment, decision-making, and purposeful use.
Level 5 (AI Exploration) pushes furthest. Students co-create new forms of assessment, potentially working with multimodal and synthetic media systems. This level is forward-looking and experimental, designed for contexts where the goal is to explore what AI makes possible.
The AI Assessment Scale and Learning Theory
The theoretical grounding is clearer in this revision. Perkins et al. align the AIAS with Vygotsky’s social constructivism, positioning GenAI as a mediating technology within the Zone of Proximal Development. They explain that GenAI tools “can function within this zone by providing scaffolding that helps bridge the gap between their current and potential performance” (p. 5).
I like this framing because it gives teachers a principled reason for allowing AI in certain assessments. AI is not, and should never be, a shortcut; it’s scaffolding. And like all scaffolding, it should be removed as students develop independent competence. The level system makes that removal visible and structured.
On a different note, one of the strongest moves in the revised framework is the explicit prioritization of assessment validity. The authors rightly argue that assessment must reflect students’ actual knowledge and skills. If a student can produce a perfect essay by prompting ChatGPT but can’t explain the argument in a follow-up conversation, the assessment hasn’t measured what it was supposed to measure. The product looks right, but the learning behind it may be hollow.
This is where the AIAS connects naturally to the broader research on AI and cognition. Kosmyna et al. (2025) showed that students who used ChatGPT for writing couldn’t recall or quote their own essays minutes later. The polished output existed, but the understanding didn’t. The AIAS gives teachers a way to design around that problem by choosing the right level of AI involvement for the right learning goal.
The framework also fits with what Guo et al. (2025) found in their year-long classroom study: when the pedagogy is redesigned with intentionality, AI can support learning. When it isn’t, AI becomes a workaround. The AIAS gives teachers a structure for the intentional version.
Perkins et al. close with a call for dialogue: “educators must engage in open dialogue with students about appropriate AI integration in assessments” (p. 15). They also stress flexibility, writing that “the AIAS is not prescriptive; institutions should adapt it to their specific contexts and needs” (p. 15).
Honest About What’s Unresolved
I appreciate that the authors don’t present the AIAS as a finished solution. They acknowledge environmental costs, equity issues, and the fact that students are still reluctant to disclose AI use even when it’s permitted. Cultural change has to happen alongside structural reform.
As they caution: “frameworks for AI integration in education cannot ignore the broader ethical and environmental impacts of these technologies” (p. 14). Roe, Furze, and Perkins (2025) pick up exactly this thread in their Critical AI Literacy paper, where they argue that any framework for AI in education needs to address power, bias, and access alongside practical use.
The AIAS won’t solve every assessment challenge. But it gives teachers, departments, and institutions a shared language and a structured starting point. And right now, that’s exactly what most educators need.
References
- Guo, F., Li, T., & Cunningham, C. J. L. (2025). One year in the classroom with ChatGPT: Empirical insights and transformative impacts. Frontiers in Education, 10, 1574477. https://www.frontiersin.org/journals/education/articles/10.3389/feduc.2025.1574477/full
- Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X.-H., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing tasks. MIT Media Lab. https://www.media.mit.edu/publications/your-brain-on-chatgpt/
- Perkins, M., Roe, J., & Furze, L. (2024). The AI Assessment Scale revisited: A framework for educational assessment (Preprint). December 2024. https://arxiv.org/abs/2412.09029
- Roe, J., Furze, L., & Perkins, M. (2025). Digital plastic: A metaphorical framework for Critical AI Literacy in the multiliteracies era. Pedagogies: An International Journal. Advance online publication. https://doi.org/10.1080/1554480X.2025.2557491
- Shaw, S. D., & Nave, G. (2026). Thinking fast, slow, and artificial: How AI is reshaping human reasoning and the rise of cognitive surrender. Working paper, The Wharton School, University of Pennsylvania. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
