AI Assessment Policy: Why the All-or-Nothing Approach Fails

The temptation to sort AI use in assessment into clean categories is understandable. Educators are under pressure to respond to generative AI quickly, and binary rules feel decisive. But decisive is not the same as sound. Guy J. Curtis, an academic integrity researcher at the University of Western Australia, makes a pointed case in a 2025 comment piece that the so-called “two-lane” approach to genAI and assessment creates more problems than it solves.

The two-lane model, originally proposed by Liu and Bridgeman (2023), divides all assessments into two categories: Lane 1, where genAI is completely banned and assessments are secured (think supervised exams), and Lane 2, where students can use genAI without any restrictions. Curtis acknowledges that the proposal was well-intentioned and even well-argued.

The model’s advocates want educators to rethink assessment, focus on higher-order thinking, and account for the reality that students will use AI in the workplace. Those are goals Curtis shares. His objection is structural: a system that allows only full prohibition or full permission leaves no room for the kind of incremental, scaffolded learning that good education depends on.

The Driving Analogy and What AI Assessment Policy Gets Wrong

Curtis uses a driving analogy that’s hard to shake. Unlimited genAI access for early-stage students, he argues, is like handing a teenager the keys to a high-powered sports car and telling them to have fun. No one teaches driving that way. Learners start with rules, constraints, dual controls, and speed limits. They build competence gradually. Assessment in higher education works the same way, or at least it should. If we skip the scaffolding and let students use AI freely from the start, we’re ignoring the process that actually produces learning.

I’ve covered the AI Assessment Scale from Perkins, Roe and Furze (2024), and Curtis’s argument aligns directly with that framework. The Scale creates graduated levels of AI use, from no AI to full AI, with several middle stages. Those middle stages are what Curtis calls Lane 3. Some assessments are secured. Some allow full AI. And many others set specific boundaries on how genAI can be used: for brainstorming but not drafting, for research but not writing, for feedback but not final submission. The specific limitation depends on what the assessment is trying to measure.


Curtis walks through a telling example. A student assigned to write an essay on the theme of death in Hamlet, under an unlimited AI policy, could paste the prompt into ChatGPT, copy the output, and submit it with their name on top. Technically, that’s permitted under Lane 2. The student learned nothing. The essay might be polished, but the thinking, reading, and reflection that make the assignment valuable never happened.

I’ve written about this exact dynamic in the context of metacognitive laziness, where Fan et al. (2025) found that students using ChatGPT skipped the evaluation and orientation steps that lead to deeper learning. Curtis is making the same point from an assessment design angle: if the policy allows students to bypass the cognitive work, many of them will.

The Detection Argument Is a Dead End

One of the strongest sections of Curtis’s piece takes apart the logic behind “we can’t detect AI use, so we might as well allow it.” He draws a direct parallel to plagiarism before text-matching software existed. Plagiarism was hard to detect in that era, yet nobody proposed allowing unlimited plagiarism as a solution. The reasoning would have been absurd then, and Curtis argues it’s equally absurd now.

I agree with him here, and I’ve made similar arguments across several posts on this blog. The inability to catch every instance of rule-breaking has never been a reason to abandon the rule. Curtis puts it plainly: “the inability to prevent a breach of the rules does not stop the rule from being set because the rule has value and serves a purpose” (p. 2154). Most students, he notes, want to learn and will follow assessment rules when those rules are clearly communicated. A long line of academic integrity research supports that claim.

Curtis is careful not to dismiss AI detectors entirely. He acknowledges their biggest flaw (false positives) and the danger of using a detector score as the sole evidence for a misconduct charge. But he also points to the growing body of evidence suggesting these tools can provide useful probative information when combined with other indicators: metadata analysis, writing process logs, the student’s ability to discuss their work in person.

I covered this layered approach when writing about the wicked problem of AI and assessment from Corbin, Bearman, Boud and Dawson (2025). The Swiss Cheese Model that Curtis references, where multiple imperfect barriers create a strong collective defense, is exactly the kind of thinking the field needs.

Equity, Access, and the Cost of Doing Nothing

Curtis raises an equity concern that complicates the two-lane debate further. Students don’t have equal access to AI tools. Some use free versions with limited capabilities. Others pay for premium models that produce higher-quality outputs. An all-or-nothing policy that permits unlimited AI use in unsecured assessments rewards the students who can afford better technology. Setting consistent limitations across an assessment levels the playing field.

This connects to something Dawson, Bearman, Dollinger and Boud (2024) argued in their influential paper on validity: the focus should be on whether an assessment validly measures what it claims to measure. Curtis extends that logic. If unlimited AI use means the assessment no longer reflects the student’s own understanding, the validity problem is baked into the policy itself. You can’t fix a structural design flaw by policing behavior after the fact.

Curtis concludes that the all-or-nothing model risks “an impoverished approach to education and educational assessment” (p. 2151) by throwing away the assignments that build understanding gradually, through research, reflection, revision, and incremental thinking. It undervalues discipline-specific knowledge and treats evaluative judgment, the very skill required to use AI well, as something that can be skipped.

I find his argument convincing because it starts from pedagogy, not panic. His proposal asks institutions to take the middle road seriously: set clear rules about what AI can and can’t do in each assignment, communicate those rules to students, and back them up with layered security. Perfection is not the standard. Good-faith effort with reasonable safeguards is. The road to better AI policy runs through the middle, and we’ve spent too long arguing about whether to build walls or tear them all down.

References

  • Corbin, T., Bearman, M., Boud, D., & Dawson, P. (2025). The wicked problem of AI and assessment. Assessment & Evaluation in Higher Education, 1–17. https://doi.org/10.1080/02602938.2025.2553340
  • Dawson, P., Bearman, M., Dollinger, M., & Boud, D. (2024). Validity matters more than cheating. Assessment & Evaluation in Higher Education, 49(7), 1005–1016. https://doi.org/10.1080/02602938.2024.2386662
  • Curtis, G. J. (2025). The two-lane road to hell is paved with good intentions: Why an all-or-none approach to generative AI, integrity, and assessment is insupportable. Higher Education Research & Development, 44(8), 2151–2158. https://doi.org/10.1080/07294360.2025.2476516
  • Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., & Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530. https://doi.org/10.1111/bjet.13544 
  • Perkins, M., Roe, J., & Furze, L. (2024). The AI assessment scale revisited: A framework for educational assessment. arXiv preprint arXiv:2412.09029.
