Every conversation about AI in education eventually lands on the same claim: we need to teach students to think critically. I’ve made that argument myself, many times on this blog. But I’ve also noticed that most of the people making it, myself included, spend far longer arguing that critical thinking matters than explaining how to actually build it in a classroom.
This paper by Abrami et al. (2015) provides a workable answer. Their meta-analysis synthesized 341 effect sizes from quasi-experimental and true-experimental studies, and the answer is both reassuring and humbling: critical thinking can be taught, but the effect is modest, and the instructional method matters enormously.

Teaching Critical Thinking Strategies
Based on Abrami et al. (2015), here are the CT teaching strategies that showed positive effects:
Dialogue strategies:
- Teacher-posed questions
- Whole-class teacher-led discussions
- Small group discussions
Authentic/Anchored Instruction strategies:
- Applied problem solving using real-world scenarios
- Role-playing
- Case studies and simulations
Mentoring strategies:
- Teacher modeling of thinking processes
- Coaching and guided practice
- One-on-one or small group tutoring
Course design approaches (Ennis’s four types):
- General: CT taught as a standalone course
- Infusion: CT goals made explicit within a content course
- Immersion: Deep content engagement with CT goals left implicit
- Mixed: Separate CT instruction combined with content integration
Critical Thinking Instruction Works, But the Effect Is Modest
The headline finding is that instructional interventions aimed at critical thinking produce a statistically significant positive effect. Abrami et al. built their definition of critical thinking on the 1990 APA Delphi consensus (Facione, 1990), which treats CT as both a set of cognitive skills (interpretation, analysis, evaluation, inference, explanation, self-regulation) and a set of dispositions (open-mindedness, inquisitiveness, flexibility). That dual emphasis is important because, as the data later shows, skills and dispositions don’t move at the same pace.
Abrami et al. developed a three-category instructional classification that I think educators will find directly useful. The categories are Dialogue (discussions, debates, teacher-posed questions), Authentic or Anchored Instruction (real-world problems, case studies, simulations, role-playing), and Mentoring (coaching, modeling, guided practice from a teacher or tutor).
All three produced significant positive effects. But the real finding is about combination. Studies where all three strategies were present generated an effect size of g+ = 0.57, nearly double the overall average. That came from just 19 studies where all three dimensions were coded as strongly present.
Mentoring is the interesting case. Abrami et al. report that it “did not generate especially strong results when analyzed on its own” (p. 302), but when layered onto dialogue and authentic instruction, it nearly doubled the effect. The authors describe it as working in a “catalytic capacity,” amplifying other strategies without being powerful in isolation. I think that’s one of the most practical insights in the entire paper. A teacher who runs strong discussions and uses real-world problems is already doing solid CT work. Intentional modeling and coaching on top of that pushes the whole effort to a different level.
Among dialogue subcategories, teacher-posed questions (g+ = 0.38), whole-class teacher-led discussions (g+ = 0.42), and small group discussions (g+ = 0.41) all showed significant effects. For authentic instruction, applied problem solving (g+ = 0.35) and role-playing (g+ = 0.61) were the strongest performers. Role-playing’s effect size is striking, though it comes from a small number of studies.
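For readers unfamiliar with the g+ notation used above, it denotes a weighted average of Hedges' g, a standardized mean difference between a treatment group and a control group. The following is a minimal sketch of how such a number is typically computed, not the authors' actual procedure. The function names and all the study numbers are invented for illustration, and the pooling step uses simple inverse-variance (fixed-effect) weights; published meta-analyses often use a random-effects model, which adds a between-study variance term that is omitted here.

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, variance) for one treatment-vs-control comparison."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    # Cohen's d: the raw standardized mean difference.
    d = (mean_t - mean_c) / pooled_sd
    # Hedges' small-sample correction factor (standard approximation).
    j = 1 - 3 / (4 * df - 1)
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d

def weighted_mean_g(effects):
    """Inverse-variance weighted average of (g, variance) pairs."""
    weights = [1 / var for _, var in effects]
    return sum(w * g for w, (g, _) in zip(weights, effects)) / sum(weights)

# Hypothetical studies (NOT from Abrami et al.):
# treatment mean, control mean, treatment SD, control SD, group sizes.
studies = [
    hedges_g(75, 70, 10, 10, 30, 30),
    hedges_g(68, 66, 12, 11, 45, 40),
    hedges_g(81, 74, 9, 10, 20, 22),
]
print(round(weighted_mean_g(studies), 2))
```

Because each study is weighted by the inverse of its variance, larger and more precise studies pull the average harder. That is also why an estimate built on a small number of studies, like role-playing here, deserves extra caution.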
Content-Specific Thinking Beats Generic Thinking
One finding struck me as particularly relevant for the AI conversation: content-specific CT outcomes, where thinking was assessed in relation to the actual subject being taught, produced an average effect size of g+ = 0.57, compared to g+ = 0.30 for generic CT skills.
It aligns with something I’ve argued on this blog before. Training students in “critical thinking” as an abstract, decontextualized skill is less effective than teaching them to think critically about specific problems in specific domains. Abrami et al. also tested Ennis’s (1989) four course types (General, Infusion, Immersion, Mixed) and found no significant differences between them. I expected Infusion or Mixed approaches to outperform standalone CT courses. The data didn’t support that, though the authors note substantial variability within each category.
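To put the g+ = 0.57 versus g+ = 0.30 contrast from this section in more concrete terms, one common reading of a standardized effect size is the percentile of the control group at which the average treated student would land (sometimes called Cohen's U3). The sketch below is a rough interpretive aid, not something reported in the paper: it assumes normally distributed scores with equal spread in both groups, and the function name is my own.

```python
from statistics import NormalDist

def control_percentile(g):
    """Percentile of the control distribution reached by the average
    treated student, assuming normal scores with equal variance."""
    return 100 * NormalDist().cdf(g)

for label, g in [("generic CT skills", 0.30), ("content-specific CT", 0.57)]:
    pct = control_percentile(g)
    print(f"{label}: g = {g:.2f}, control-group percentile ~{pct:.0f}")
```

Under those assumptions, g = 0.30 moves the average student from the 50th to roughly the 62nd percentile of the comparison group, while g = 0.57 moves them to roughly the 72nd.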
The Disposition Gap
CT dispositions, the inclination to actually use critical thinking when it counts, showed the weakest results in the entire analysis. This gap has enormous implications in the AI context. I’ve written about cognitive surrender (Shaw and Nave, 2026) and metacognitive laziness (Fan et al., 2025), and both of those phenomena are fundamentally about dispositions, not skills.
Students may know how to evaluate an argument or spot a logical flaw. The question is whether they’ll bother doing it when ChatGPT produces a polished answer in seconds. Abrami et al.’s data tells us that changing thinking habits is harder than building thinking skills, and that tracks with what the recent AI cognition research has been showing.
Gerlich’s (2025) work on cognitive offloading and critical thinking erosion fits here too. If the disposition to think critically is already the hardest outcome to develop through instruction, and AI tools create new reasons to skip the thinking entirely, then we’re looking at a compounding problem that no single pedagogical fix can solve.
A 2015 Paper in a 2026 World
This paper was published in 2015, and the studies it synthesized stretch back decades. None of the included research could have anticipated a world where students carry a reasoning engine in their pockets. That doesn’t make the findings irrelevant. If anything, it makes them more urgent.
The instructional strategies Abrami et al. identified (dialogue, authentic problem-solving, and mentoring) are exactly the kinds of activities that can’t be outsourced to AI. A classroom discussion requires presence. Role-playing demands improvisation you can’t script. And coaching only works inside a real relationship. These are human instructional moves, and the data says they work.
Abrami et al. are careful about overpromising. They write that they “regard teaching CT as a complex and multifaceted process, in which there is no magic recipe for the ‘production of learner success’” (p. 303).
Measurement limitations run beneath all of this. Most CT instruments in the reviewed studies assess what Paul (1990) called “weak sense” critical thinking: discrete, testable skills like argument analysis and logical reasoning. That’s a narrow slice of what most educators mean when they talk about critical thinking. Abrami et al. acknowledge this and argue, correctly I think, that dismissing skill-based gains because they don’t represent the full picture would be a mistake.
What This Means for Teaching Critical Thinking in the AI Age
If you’re an educator trying to build critical thinking alongside AI, Abrami et al.’s meta-analysis offers a clear evidence base. Combine dialogue, authentic instruction, and mentoring. Don’t rely on any single approach. Tie critical thinking to your content area, because that’s where the strongest effects show up. And know that building the disposition to think critically, the willingness to question and slow down, is the hardest part of the work. It was hard before AI. It’s harder now.
The technology keeps getting smarter. The pedagogical fundamentals haven’t changed.
References
- Abrami, P. C., Bernard, R. M., Borokhovski, E., Waddington, D. I., Wade, C. A., & Persson, T. (2015). Strategies for teaching students to think critically: A meta-analysis. Review of Educational Research, 85(2), 275–314. https://doi.org/10.3102/0034654314551063
- Fan, Y., Tang, L., Le, H., Shen, K., Tan, S., Zhao, Y., Shen, Y., Li, X., & Gašević, D. (2025). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530. https://doi.org/10.1111/bjet.13544
- Gerlich, M. (2025). AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1), Article 6. https://doi.org/10.3390/soc15010006
- Shaw, S. D., & Nave, G. (2026). Thinking fast, slow, and artificial: How AI is reshaping human reasoning and the rise of cognitive surrender. Working paper, The Wharton School, University of Pennsylvania. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
Important CT References cited in Abrami et al.’s paper:
- Ennis, R. H. (1987). A taxonomy of critical thinking dispositions and skills. In J. Baron & R. Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9–26). New York, NY: W. H. Freeman.
- Ennis, R. H. (1989). Critical thinking and subject specificity: Clarification and needed research. Educational Researcher, 18, 4–10. doi:10.3102/0013189X018003004
- Ennis, R. H., & Millman, J. (1985). Cornell critical thinking test. Pacific Grove, CA: Critical Thinking Books & Software.
- Facione, P. A. (1990). Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction. Research findings and recommendations. Newark, DE: American Philosophical Association. (ERIC Document Reproduction Service No. ED315423)
- Halliday, J. (2000). Critical thinking and the academic vocational divide. Curriculum Journal, 11, 159–175. doi:10.1080/09585170050045182
- Hyslop-Margison, E. J. (2003). The failure of critical thinking: Considering a virtue epistemology pedagogy. Philosophy of Education Society Yearbook, 2003, 319–326.
- McMillan, J. H. (1987). Enhancing college students’ critical thinking: A review of studies. Research in Higher Education, 26, 3–29. doi:10.1007/BF00991931
- McPeck, J. (1981). Critical thinking and education. Toronto, Ontario, Canada: Oxford University Press.
- Norris, S. P. (1985). Synthesis of research on critical thinking. Educational Leadership, 42, 40–45.
- Paul, R. W. (1985). The critical thinking movement: A historical perspective. National Forum: Phi Kappa Phi Journal, 42, 2–3.
- Paul, R. W. (1990). Critical thinking: What every person needs to survive in a rapidly changing world. Santa Rosa, CA: Foundation for Critical Thinking.
- Paul, R. W., & Binker, A. J. A. (1990). Strategies: Thirty-five dimensions of critical thinking. In A. J. A. Binker (Ed.), Critical thinking: What every person needs to survive in a rapidly changing world (pp. 305–349). Rohnert Park, CA: Centre for Critical Thinking and Moral Critique, Sonoma State University.
- Paul, R. W., Elder, L., & Bartell, T. (1997). Study of 38 public universities and 28 private universities to determine faculty emphasis on critical thinking on instruction. Retrieved from http://www.criticalthinking.org/pages/center-for-critical-thinking/401
- Pithers, R. T., & Soden, R. (2000). Critical thinking in education: A review. Educational Research, 42, 237–249. doi:10.1080/001318800440579
- Rose, M. M. (1997). Critical thinking skills instruction for postsecondary students with and without learning disabilities: The effectiveness of icons as part of a literature curriculum (Doctoral dissertation). Available from ProQuest Dissertations and Theses database. (UMI No. 9806188)
- Sá, W. C., Stanovich, K. E., & West, R. F. (1999). The domain specificity and generality of belief bias: Searching for generalizable critical thinking skill. Journal of Educational Psychology, 91, 497–510. doi:10.1037/0022-0663.91.3.497
- Scriven, M., & Paul, R. (1996). Defining critical thinking: Critical thinking as defined by the National Council for Excellence in Critical Thinking, 1987.
