AI Assessment Guidelines: What Top Universities Got Right and Wrong

When ChatGPT dropped in late 2022, universities scrambled. Some banned it outright. Others released cautious memos. A few tried to get ahead of the disruption with actual guidelines. Moorhouse, Yeo, and Wan (2023) decided to find out what the world’s top-ranking institutions actually told their instructors to do, and the results tell a story that’s equal parts progress and missed opportunity.

The study reviewed publicly available assessment guidelines from the top 50 universities according to the Times Higher Education 2023 World University Rankings. The search was conducted on June 15, 2023, which makes it an early snapshot, not a comprehensive audit.

Of those 50 institutions, only 23 had publicly available, instructor-facing guidelines at the university level. That number alone says something. Fewer than half of the world’s most prestigious universities had managed to produce official guidance for their faculty in the months following what many were calling the biggest disruption to education in decades.

The geographic breakdown is telling too. Moorhouse et al. report that 15 of the 23 guidelines came from the United States, four from the UK, two from Canada, and one each from Japan and Australia. The Global South is absent from that list entirely. The authors acknowledge this as a limitation, and it’s worth flagging because the AI-in-education conversation can’t keep centering the same handful of countries. The challenges facing universities in Sub-Saharan Africa, South Asia, or Latin America are different, and the policy responses coming out of Stanford or Oxford won’t transfer cleanly.

Most of the guidelines (15 of 23) came from university centres for teaching and learning. And here’s an important detail: 57% of the guidelines mentioned only ChatGPT by name. Moorhouse et al. point out that “by focusing on one tool predominantly, it may leave instructors with a limited view of GAI and its effect on assessments” (p. 9).

I think that’s exactly right. If your institutional guidance treats ChatGPT as a synonym for generative AI, you’re already behind. DALL-E, Copilot, Midjourney, Claude, Gemini, and dozens of other tools were already reshaping how students approach creative work, coding, research, and writing. A conversation built entirely around one chatbot was always going to age poorly, and it did.

The guidelines clustered around three main areas: academic integrity, assessment design, and communication with students.

On academic integrity, Moorhouse et al. found that 60% of the guidelines addressed plagiarism directly. They identified three forms of GAI-related plagiarism that institutions were worried about: copying and pasting AI-generated responses, running material through multiple AI paraphrasers to dodge detection, and failing to document GAI use.

About 61% mentioned detection tools like GPTZero and Turnitin, but most actually discouraged relying on them. The reasons? Inaccuracy, privacy concerns, and the risk of eroding trust between faculty and students. That’s a significant finding, because it shows that even at an early stage, many institutions recognized that detection is a dead end. I’ve written about the postplagiarism argument (Eaton, 2023) before, where the focus shifts from catching students to rethinking what we ask them to do. The fact that top universities were already backing away from detection tools in mid-2023 supports that trajectory.

Assessment design is the section that carries the most weight. Moorhouse et al. found that 74% of the 23 guidelines provided specific advice on redesigning tasks. Five themes came through: test your assessments on GAI tools to see what they produce; redesign tasks to require creativity, critical thinking, and personal reflection; focus on process through scaffolded assignment stages; incorporate GAI tools directly into the assessment; and use in-class assessments (with caveats).

Monash University gave particularly concrete guidance, Moorhouse et al. note, walking instructors through a step-by-step process: paste your assignment brief into ChatGPT, review the output, add details from your rubric and teaching materials, and test multiple prompt variations. That’s practical and useful. Carnegie Mellon, on the other hand, offered a warning I think is underappreciated: designing around current AI limitations is a temporary fix because GAI tools keep getting better. Both things can be true. You test your assessments against what AI can do now, and you build assignments that focus on the kind of thinking AI can’t replicate. But you don’t pretend that today’s limitations will still hold next semester.

Ten universities went further and suggested incorporating GAI into the assessment itself. Moorhouse et al. describe recommendations where students would generate responses using ChatGPT and then critically evaluate what the tool produced. Yale, for instance, recommended having students engage with ChatGPT as “a tool that exists in the world” (p. 7) and critically assess its outputs. I’ve seen this idea gain traction across several papers I’ve covered, including the work on AI assessment scales (Perkins et al., 2024) that map different levels of AI integration into assignments. What Moorhouse et al. are documenting is the early institutional version of that idea, before it had a formal framework.

On communication, 87% of the guidelines advised instructors to be clear and upfront with students. Moorhouse et al. found that suggested channels included syllabus statements, open classroom discussions about GAI, and collaboration with librarians. The content ranged from setting expectations about acceptable use to discussing ethics, limitations, original thinking, and the value of intellectual struggle.

I think the communication piece is undervalued in these discussions. Students aren’t going to stop using AI because a policy says they can’t. They’ll stop hiding it when they believe their instructors are thinking carefully about what AI means for learning, and when they feel safe enough to be transparent about how they’re using it.

The tension running through the whole study is this: most guidelines still lean defensive. Moorhouse et al. put it this way: “The emphasis in the guidelines still seem to focus on limiting or preventing GAI use in assessment tasks” (p. 9). And then they argue for a different direction: “Allowing or even requiring students to use GAI at various stages of the assessment process would, in fact, enhance the authenticity of assessments” (p. 9). I think they’re right about where the field needs to go. If professionals in every industry are using generative AI, then asking students to prove they can work without it doesn’t prepare them for anything real. The authenticity argument is strong.

Moorhouse et al. also propose a new competency they call “generative artificial intelligence assessment literacy.” It has three components: recognizing GAI’s implications for academic integrity, designing assessments that make room for both student learning and GAI use, and communicating with students about productive and ethical GAI use.

The authors argue that “the specific nature of assessments means that instructors need to develop GAI assessment literacy” (p. 9). I’d complicate that slightly. It’s not that instructors need a whole new literacy. It’s that the assessment literacy they’ve always needed now has a new dimension. The principles of good assessment design (alignment, authenticity, transparency) don’t change. What changes is the environment those principles have to work in.

This paper is a useful baseline. It documented what the world’s most resourced institutions were thinking in the first half of 2023. But a baseline is only useful if we measure the distance traveled since. The question now is whether these guidelines evolved or whether they calcified.

References

  • Eaton, S. E. (2023). Postplagiarism: Transdisciplinary ethics and integrity in the age of artificial intelligence and neurotechnology. International Journal for Educational Integrity, 19(23). https://doi.org/10.1007/s40979-023-00144-1
  • Moorhouse, B. L., Yeo, M. A., & Wan, Y. (2023). Generative AI tools and assessment: Guidelines of the world’s top-ranking universities. Computers and Education Open, 5, 100151. https://doi.org/10.1016/j.caeo.2023.100151
  • Perkins, M., Roe, J., & Furze, L. (2024). The AI Assessment Scale revisited: A framework for educational assessment [Preprint]. arXiv. https://arxiv.org/abs/2412.09029
