AI-Resistant Assessments in Higher Education: Do They Actually Work?

The phrase “AI-resistant assessment” has become one of those terms that sounds right until you actually think about it. It frames the relationship between AI and assessment as a battle, something you defend against, something you build walls around. And I get why. Faculty are worried. Students are submitting work they didn’t write. Institutions are scrambling. But framing assessment design as resistance is a conceptual trap, and a recent perspective paper from Awadallah Alkouk and Khlaif (2024) shows both the promise and the limits of thinking about assessment this way.

The paper draws on training workshops the authors conducted with 333 educators across ten countries in the Global South, including Palestine, Egypt, Iraq, South Africa, Malaysia, and Morocco. The Global South perspective alone makes the paper worth reading, because most of the conversation about AI in education has been dominated by institutions with very different resources and institutional cultures.

The Process-Product Assessment Model

The most interesting contribution in the paper is what Awadallah Alkouk and Khlaif call the Process-Product Assessment Model. The idea is straightforward: don’t just grade the final output. Grade the process that got students there. That means evaluating how students developed their prompts, how they collaborated with AI tools, and the decisions they made along the way. The authors argue that this approach “evaluates how students develop prompts to guide AI tools and how they collaborate with AI throughout the learning journey” (p. 4).

I’ve seen this argument before, and I think it’s fundamentally correct. Assessment that focuses only on what students hand in has always been incomplete, and AI just made that incompleteness impossible to ignore. The process-product idea connects to a much larger body of work on assessment validity. I’ve covered Corbin, Bearman, Boud, and Dawson’s (2025) argument that AI has turned assessment into a wicked problem, one that can’t be solved with a single strategy. The process-product model is one piece of that puzzle, not the whole solution.

What I find less convincing is the assumption that documenting the process is itself sufficient evidence of learning. A student can describe their prompt engineering decisions in beautiful detail and still not understand the content they’re working with. Process documentation tells you what the student did. It doesn’t always tell you what the student learned.

Reflective Writing as AI-Resistant Assessment

The paper gives significant attention to reflective writing as a specific form of AI-resistant assessment. Awadallah Alkouk and Khlaif argue that reflection requires “personal, subjective experiences that are challenging for AI to replicate” (p. 5). Students use AI for brainstorming or drafting, then write a reflection documenting their prompts, their rationale for choosing certain AI-generated content, and how they critiqued or modified the AI’s output.

The logic is sound in principle. The nursing example in the paper, where students write reflective essays about ethical dilemmas they encountered in clinical practice, is a strong illustration. A student who actually faced that dilemma will write something very different from what any AI could generate.

But I’d add a caution. AI has gotten significantly better at producing plausible reflective writing since this paper’s workshops took place. The workshops ran during 2023 and 2024, and the paper was published in December 2024. We’re now in 2026. Models can mimic personal voice, fabricate plausible experiences, and produce reflective prose that reads as authentic.

Calling reflective writing “AI-resistant” was more defensible two years ago than it is today. The better framing is that reflection tied to verifiable experiences, ones the instructor can confirm actually happened, retains its value. Reflection as a genre, without that anchor, is increasingly vulnerable.

What’s Missing from the AI-Resistant Assessments Framework

The paper introduces the AI-Resistance Assessment Scale (AIAS) as a tool for guiding faculty in designing these tasks, but it never actually presents the scale. The authors mention plans to publish a detailed guide later. That’s a significant gap. Faculty reading this paper for practical guidance will find the concept but not the instrument.

There’s also no outcome data. This is a perspective paper, not an empirical study. We know 333 educators participated in the workshops. We know 51% of them already used generative AI daily and another 38% weekly. But we don’t know whether the assessment strategies discussed actually improved student learning, reduced AI misuse, or changed how faculty think about assessment design in the long run. The ideas are plausible, but they’re untested in the paper itself.

I also think the paper’s alignment of AI assessment with Bloom’s Taxonomy deserves scrutiny. Awadallah Alkouk and Khlaif suggest using AI at lower cognitive levels (brainstorming, idea generation) and requiring students to demonstrate their own thinking at higher levels (analysis, evaluation, creation).

That sounds clean, but current AI models are quite capable at analysis and evaluation too. Bloom’s doesn’t map onto AI capabilities the way it might have in 2023. Desai’s UNESCO report (2025) on the future of assessment makes a stronger case for rethinking assessment around learning goals and evidence, not around cognitive taxonomies that were never designed with AI in mind.

The Global South Context and Institutional Policy

One of the paper’s genuine strengths is its attention to Global South institutions. Awadallah Alkouk and Khlaif note that “progress in AI adoption remains slow” in many Global South countries despite UNESCO guidelines, and they point to attitudes, vision, and strategy as the real barriers, not just money. That attention is rare in the AI-in-education literature, which tends to draw from a narrow set of well-resourced institutions.

The policy discussion is practical. Faculty reported creating custom GPT models for their courses, uploading lecture content, and using AI to generate exam questions. Some faculty found this useful. Others complained that AI-generated questions were repetitive or misaligned with course material.

These are the kinds of ground-level observations that policy papers often miss. The conversation I covered on Moorhouse, Yeo, and Wan’s (2023) review of how top universities handled AI assessment guidelines dealt with a very different institutional context, but the underlying challenge is the same: institutions need clear policy, and faculty need room to experiment within it.

Reading This Paper in 2026

The concept of AI-resistant assessment is useful as a starting point, but it needs to evolve. Resistance implies that the goal is to keep AI out. The better goal, and the one that more recent research supports, is to design assessments where AI use is visible, intentional, and pedagogically grounded. Perkins and Roe (2025) made this argument forcefully when they called for the end of assessment as we know it, arguing that the old model of testing what students can produce alone is already gone.

Awadallah Alkouk and Khlaif are moving in the right direction. The process-product model, the emphasis on reflection, and the Global South perspective are real contributions. But the paper stops short of the harder question: what counts as evidence of learning when AI is part of every student’s workflow? That question doesn’t have a clean answer. The frameworks are still catching up to the technology.

The assessments we design today need to assume AI is in the room. Not as an intruder to resist, but as a tool whose use tells us something about how students think.

References

  • Awadallah Alkouk, W., & Khlaif, Z. N. (2024). AI-resistant assessments in higher education: Practical insights from faculty training workshops. Frontiers in Education, 9, 1499495. https://doi.org/10.3389/feduc.2024.1499495
  • Corbin, T., Tai, J., & Flenady, G. (2025). Understanding the place and value of GenAI feedback: A recognition-based framework. Assessment & Evaluation in Higher Education, 50(5), 718–731. https://doi.org/10.1080/02602938.2025.2459641
  • Desai, H. (2025). What’s worth measuring? The future of assessment in the AI age. UNESCO. https://www.unesco.org/en/articles/whats-worth-measuring-future-assessment-ai-age
  • Moorhouse, B. L., Yeo, M. A., & Wan, Y. (2023). Generative AI tools and assessment: Guidelines of the world’s top-ranking universities. Computers and Education Open, 5, 100151. https://doi.org/10.1016/j.caeo.2023.100151
  • Pensky, A. E. C., Usdan, J. H., & Chang, H. (2025). Generative AI’s impact on graduate student professional writing productivity and quality. International Journal of Artificial Intelligence in Education, 35, 4057–4082. https://doi.org/10.1007/s40593-025-00528-z
