What Happens When Students Get Real AI Writing Instruction

I’ve said it before and I’ll keep saying it: giving students access to AI tools without teaching them how to use those tools is like handing someone a scalpel and calling them a surgeon. Access is cheap. Pedagogy is the hard part. And a new study from Pensky, Usdan, and Chang (2025) offers some of the cleanest evidence yet that structured genAI instruction, not the tools themselves, is what actually moves the needle on student writing.

The study comes out of Carnegie Mellon University, where 27 graduate students took a 7-week elective course on artificial intelligence. The course covered how large language models work, prompt engineering, failure modes, and hands-on practice with genAI across specific writing sub-skills like summarization, research, analysis, and policy development. Students completed professional memo writing assignments under two conditions: one with no AI at all, and one with full genAI assistance after they had received that instruction.

What AI Writing Instruction Produced

The productivity numbers are hard to ignore. Pensky, Usdan, and Chang report that students’ median time on the research and writing task dropped from 150 minutes without genAI to 65 minutes with it, a 56.7% reduction. Only one student in the entire sample reported taking longer with AI assistance. The effect size was large.
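
For readers who want to check the arithmetic, here is a minimal sketch of how the 56.7% figure follows from the two medians reported above. The function and variable names are my own, not anything from the study.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Median minutes on the task, as reported in the post.
median_without_ai = 150
median_with_ai = 65

reduction = percent_reduction(median_without_ai, median_with_ai)
print(f"{reduction:.1f}% reduction")  # prints "56.7% reduction"
```

Put another way, the median student saved 85 of the original 150 minutes.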

Writing quality improved too. The median rubric grade went from 91.67% (an A-) to 95.83% (a solid A). In graduate-level writing where grades are already compressed at the top, that shift is real. Pensky, Usdan, and Chang are careful to note that “we cannot untangle whether our instruction on or practice with genAI was specifically the reason for the observed benefits in the present study, given the swift advancement in genAI” (p. 4076). I respect that they name it outright. The study can’t isolate instruction from practice or from AI improvements, and with only 27 students at a single elite university, we’re looking at a very specific population. These aren’t your average freshmen figuring out college writing for the first time.

But the direction of the findings aligns with what the broader research base has been showing us. I previously covered a study by Fan et al. (2025) on metacognitive laziness, in which AI improved the written product but students skipped the cognitive steps that make writing a learning experience.

The Pensky study is an interesting counterpoint because the instruction explicitly taught students to recognize when the AI was wrong, to evaluate outputs critically, and to co-author with the tool. That pedagogical scaffolding is exactly what was missing in the Fan et al. design.

AI as an Equalizer for Non-Native English Speakers

The most compelling finding is what happened with non-native English speaking (NNES) students. Their writing quality jumped from a median grade of 87.5% without genAI to 95.8% with it, essentially matching their native English speaking (NES) peers. NES students were already scoring high (median 91.7% without AI, 95.8% with it), and the change for them wasn’t statistically significant. GenAI closed the gap.

The productivity data tell the same story from a different angle. Without genAI, NES students finished their assignments an average of 55.5 minutes faster than NNES students. With genAI, that gap shrank to 7 minutes. NNES students went from a median of 179.5 minutes down to 66 minutes with AI assistance.
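
The gap-closing claims in the last two paragraphs can be sketched as a few lines of arithmetic. This is only a rough check using the figures quoted in this post. The names are my own, and note that the time gap is reported as an average while the NNES times are medians, so the two results are not directly comparable.

```python
# Median rubric grades (%) as quoted in this post.
grades = {
    "NNES": {"without_ai": 87.5, "with_ai": 95.8},
    "NES": {"without_ai": 91.7, "with_ai": 95.8},
}


def grade_gap(condition: str) -> float:
    """NES median grade minus NNES median grade, in percentage points."""
    return grades["NES"][condition] - grades["NNES"][condition]


def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Average NES-vs-NNES time gap in minutes, without and with genAI.
time_gap_shrink = percent_reduction(55.5, 7)

# Median NNES time on task in minutes, without and with genAI.
nnes_time_saved = percent_reduction(179.5, 66)

print(f"Grade gap without AI: {grade_gap('without_ai'):.1f} points")
print(f"Grade gap with AI: {grade_gap('with_ai'):.1f} points")
print(f"Time gap shrank by {time_gap_shrink:.1f}%")
print(f"NNES median time fell by {nnes_time_saved:.1f}%")
```

On these numbers, the grade gap goes from about 4.2 points to zero, the time gap shrinks by roughly 87%, and the NNES median time falls by roughly 63%.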

Pensky et al. frame this through Bandura’s concept of self-efficacy, and the data backs it up: NNES students’ perceived writing competency increased significantly over the course. NES students’ self-assessment stayed flat. This connects to findings I covered by Hysaj, Dean, and Freeman (2025) on multicultural students and genAI academic writing, where non-native speakers consistently reported AI as helpful for overcoming language barriers. The pattern across studies is becoming clear: genAI may be one of the most powerful equity tools we have for students writing in a second language, if the instruction is there to support it.

That’s a big “if.” The Pensky study invested several weeks of class time in teaching students how genAI systems work, where they fail, and how to use them productively across different writing sub-tasks. Most institutions aren’t doing that. Most courses don’t have the bandwidth. And the students here were graduate students at Carnegie Mellon who already had strong baseline writing skills. The question of whether similar results would hold with undergraduates, or at institutions with fewer resources, remains open.

Where GenAI Fell Short

The quality improvements weren’t spread evenly across writing sub-tasks. Pensky et al. found that organization and style, flow, and concision showed the strongest gains with genAI. Policy logic and policy argumentation, by contrast, showed almost no change. The authors connect this to what Dell’Acqua et al. (2023) called the “jagged technological frontier,” the idea that genAI performs well on some tasks and falls apart on others, with no clear boundary marking where competence ends.

This is consistent with what I’ve been arguing across multiple posts on this blog. AI is good at the mechanics of writing: tightening prose, improving flow, fixing structure. It is far less reliable when the task requires genuine reasoning, weighing competing arguments, or building a case from evidence. Students who recognize where that boundary falls will use AI well. Students who don’t will hand over the thinking along with the typing.

What concerns me in the Pensky data is that students leaned toward valuing speed improvements over quality improvements. On a scale where 1 meant they valued quality and 10 meant speed, the average was 6.7. They attributed 67% of their improvement to AI and only 33% to repeated practice. The McKinsey report on AI fluency (Yee et al., 2025) makes a similar observation from the employer side: the workforce is moving toward AI-augmented productivity, and speed is becoming the default metric of value. But speed without critical engagement is exactly the problem that cognitive offloading research keeps flagging. The Pensky study shows students can use AI productively with the right instruction. The open question is what happens when the instruction goes away and the deadline pressure stays.

AI Writing Instruction as Curriculum

The finding I keep returning to: 59.26% of students believed a course teaching AI-assisted productivity should be required for a graduate degree at CMU. Every single student planned to use genAI in future writing tasks. The demand is already there. The question for institutions is whether they’ll meet it with serious pedagogy or leave students to figure it out alone.

Pensky, Usdan, and Chang’s instructional model covered how LLMs are built, prompt engineering, genAI failure modes, and specific techniques for using AI across research and writing sub-tasks. That’s a curriculum, not a one-off workshop. And the results suggest it makes a measurable difference, at least for this group of students at this institution.

I don’t think every course needs to become an AI course. But every program that involves writing, and that means every graduate program, needs to grapple with the reality that students are already using these tools. The Pensky study is a small one, 27 students at one institution, and the authors are upfront about that. But it points in the same direction as every other credible study I’ve read this year: pedagogy determines whether AI helps or hurts. Teach the tool. Teach its limits. And teach the judgment that no AI can provide. The writing will get better. The learning might too.

References

  • Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., & Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530. https://doi.org/10.1111/bjet.13544
  • Hysaj, A., Dean, B. A., & Freeman, M. (2025). Exploring the purposes and uses of generative artificial intelligence tools in academic writing for multicultural students. Higher Education Research & Development, 44(7), 1686–1700. https://doi.org/10.1080/07294360.2025.2488862
  • Pensky, A. E. C., Usdan, J. H., & Chang, H. (2025). Generative AI’s impact on graduate student professional writing productivity and quality. International Journal of Artificial Intelligence in Education, 35, 4057–4082. https://doi.org/10.1007/s40593-025-00528-z
  • Yee, L., Madgavkar, A., Smit, S., Krivkovich, A., Chui, M., Ramirez, M. J., & Castresana, D. (2025, November). Agents, robots, and us: Skill partnerships in the age of AI. McKinsey Global Institute.
