The Impact of Gen AI on Human Learning: a research summary
A literature review of the most recent & important peer-reviewed studies
Many have hailed the rise of Gen AI tools like ChatGPT, Claude and Gemini as a silver bullet and turning point for human learning. Learners on the ground seem to agree; at a recent educators’ meeting that I attended with OpenAI, we were told that the number one use case of ChatGPT globally is learning. Great news, right?
Perhaps.
At the same time as the use of generic AI for learning proliferates, more and more researchers are raising concerns about the impact of AI on human learning. The TLDR is that a growing body of research suggests that generic AI models are not only suboptimal for human learning; they may actually have an actively detrimental effect on the development of knowledge and skills.
In this week’s blog post, I share a summary of five recent studies on the impact of Gen AI on learning to bring you right up to date.
TLDR:
Without the necessary pedagogical safeguards to optimise the design and delivery of AI-powered learning, generic AI could negatively impact human learning by fostering over-reliance, diminishing critical thinking and reducing engagement in self-regulated learning (SRL).
On the flip side, if we take a “pedagogy first” approach to building AI models and products, the impact of AI on human learning could be exponentially positive.
Let’s dive in! 🚀
Study 1: Systematic Review & Meta-Analysis of ChatGPT’s Effects on Student Learning
Title: How ChatGPT impacts student engagement from a systematic review and meta-analysis study (2024)
Research Question: How does ChatGPT impact student academic performance, emotional engagement, and higher-order cognitive processes like critical thinking across various educational contexts?
Findings:
A systematic review of 17 studies revealed that ChatGPT-based learning moderately improved overall student engagement, particularly cognitive engagement (Hedges’ g = 0.593) and behavioral engagement (Hedges’ g = 0.454).
Emotional engagement also showed medium improvements, especially among students struggling with traditional instruction.
However, ChatGPT had inconsistent effects on higher-order cognitive processes like critical thinking. Some studies even reported reduced critical thinking skills when ChatGPT was used without structured guidance.
Key Takeaways:
Surface-Level Gains: Generative AI tools like ChatGPT improve task-specific outcomes and engagement but have limited impact on deeper learning, such as critical thinking and analysis.
Emotional Engagement: While students feel more motivated when using ChatGPT, this does not always translate into better long-term knowledge retention or deeper understanding.
Implications for Educators and Developers
For Educators:
Combine ChatGPT with Structured Activities: Ensure AI tools are part of a structured learning process that promotes deeper engagement rather than simple task completion. Example: If students use ChatGPT to draft essays, follow up with an activity requiring them to critique the AI’s output by identifying logical gaps, factual inaccuracies, or weak arguments. This encourages analytical thinking.
Use ChatGPT as a Supplement, Not a Replacement: Integrate ChatGPT in ways that support, but do not replace, foundational skills development. Example: When teaching math, let students solve basic problems manually first. Then use ChatGPT to check their answers or explore alternative approaches, prompting reflection on why an alternative solution might work.
Promote Self-Reflection and Evaluation: Pair AI tools with activities that require students to reflect on what they’ve learned and evaluate the AI’s role in their understanding. Example: After using ChatGPT for research, ask students to write a paragraph summarising what they learned, what gaps they noticed in the AI’s output, and what additional research they needed to conduct.
For Developers:
Reimagine AI for Reflection-First Design: Create features that encourage users to think critically about the AI’s answers before accepting them. Example: Include prompts like “What do you think the solution might be, and why?” before ChatGPT provides its response to a question.
Develop Tools that Foster Critical Thinking: Design AI systems that guide users to analyze and question the provided information, rather than simply accepting it at face value. Example: When generating answers, the tool could ask users follow-up questions like “Does this explanation align with what you already know? If not, what might be missing?”
Integrate Adaptive Support: Build systems that adapt their behavior based on the user’s proficiency, ensuring appropriate levels of guidance. Example: For advanced users, ChatGPT could skip basic explanations and offer high-level insights, while for beginners, it could break tasks into smaller, manageable steps and provide context for each step.
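For developers who want to experiment with these ideas, here is a minimal Python sketch of what a reflection-first, proficiency-aware wrapper might look like. Everything here is illustrative: `ask_model` is a hypothetical stand-in for a real LLM API call, and the two proficiency levels are assumptions, not a prescription from the study.

```python
# Illustrative sketch: a reflection-first, proficiency-aware wrapper
# around a generic chat-completion function.

def ask_model(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g. an API request).
    return f"[model answer to: {prompt}]"

def reflection_first_answer(question: str, proficiency: str = "beginner") -> dict:
    """Ask the learner to commit to a guess before revealing the model's
    answer, then tailor the depth of the explanation to their proficiency."""
    # Reflection prompt shown to the learner BEFORE the model answers.
    reflection_prompt = (
        f"Before I answer: what do you think the solution to "
        f"'{question}' might be, and why?"
    )

    # Adaptive support: advanced users get concise answers,
    # beginners get step-by-step explanations.
    if proficiency == "advanced":
        system_hint = "Give a concise, high-level answer; skip basic explanations."
    else:
        system_hint = "Break the answer into small steps and explain each one."

    answer = ask_model(f"{system_hint}\nQuestion: {question}")

    # Follow-up question to keep the learner evaluating, not just accepting.
    follow_up = ("Does this explanation align with what you already know? "
                 "If not, what might be missing?")

    return {"reflect_first": reflection_prompt,
            "answer": answer,
            "follow_up": follow_up}
```

The key design choice is ordering: the learner's own hypothesis is elicited before the model's answer arrives, which is exactly the kind of structured guidance the study found missing from generic ChatGPT use.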
Study 2: Dual Impact of AI in Coding Education
Title: The Impact of Large Language Models on Programming Education and Student Learning Outcomes (2024)
Research Question: How does LLM usage affect task performance, foundational skill development, and independent problem-solving in programming education?
Findings:
LLMs improved task performance during assignments but significantly reduced students’ ability to solve similar problems independently in controlled settings.
Beginners who relied on LLMs for code generation and debugging struggled to develop foundational problem-solving skills.
Advanced learners benefited from LLMs by tackling complex problems and expanding their understanding.
Key Takeaways:
Over-reliance on AI tools hinders foundational learning, especially for beginners.
Advanced learners can better leverage AI tools to enhance skill acquisition.
Using LLMs for explanations (rather than debugging or code generation) appears less detrimental to learning outcomes.
Implications for Educators and Developers
For Educators:
Restrict AI Use in Foundational Courses: In introductory programming classes, students should be asked to write code manually and debug errors themselves before being allowed to use LLMs like ChatGPT or GitHub Copilot. Example: Assign beginners a Python exercise on basic loops or conditionals where LLMs are not permitted. Once they solve it manually, they could use AI tools to compare their approach to the AI’s suggestions.
Introduce Scaffolding Techniques: Pair students with structured tasks that encourage reflection and incremental problem-solving before turning to AI for help. Example: For a React assignment, provide prompts like “What key component might you need here?” or “How would lifting the state simplify this application?” These scaffolded steps help students plan their code instead of relying on AI-generated solutions.
Encourage Independent Problem-Solving: After using AI to debug or generate code during guided exercises, assign follow-up tasks requiring students to replicate similar outcomes without AI assistance. Example: If students debug a syntax error with an LLM’s help, they could later work on a problem where they intentionally induce and resolve similar errors themselves.
For Developers:
Design Adaptive AI Systems for Beginners: Build tools that provide step-by-step guidance tailored to novice programmers instead of delivering complete solutions. Example: When a beginner writes incomplete code, the AI could highlight specific areas for improvement and offer hints such as: “What condition might stop this loop from running indefinitely?” rather than auto-generating the entire solution.
Encourage Reflection Through Prompts: Include features that prompt users to think critically about their solutions and AI-generated suggestions. Example: After suggesting code fixes, the AI could ask, “What do you think this code will do? Why might it work or fail?” to encourage active learning and engagement.
Limit Dependency on Direct Answers: Design systems that gradually reduce the availability of full solutions as learners progress, encouraging them to tackle challenges independently. Example: A beginner’s interface might offer detailed explanations and hints, while advanced interfaces could provide only high-level guidance or feedback on user-generated code.
Integrate Debugging Guidance: Develop tools that guide learners through debugging processes rather than fixing issues outright. Example: Instead of fixing a syntax error, the AI could ask the user to identify the error by highlighting the problematic area and offering a hint like: “Have you closed all brackets and checked variable names?”
Foster Incremental Learning: Incorporate features that guide users through breaking down complex problems into smaller, manageable steps. Example: For a coding project, the AI might prompt learners to focus first on creating a function’s framework before writing the detailed logic, reinforcing problem decomposition skills.
Track and Analyse Learning Progress: Include dashboards or progress trackers that help learners and educators monitor skill development over time. Example: A tool could display metrics like “percentage of tasks solved independently” or “types of errors encountered and resolved,” providing insights into growth areas and opportunities for improvement.
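As a rough illustration of the progress-tracking idea above, here is a minimal Python sketch that computes the “percentage of tasks solved independently” metric from a simple event log. The log format and field names are assumptions invented for this example, not a real product schema.

```python
# Illustrative sketch of simple dashboard metrics for a learning tool.
# The event-log structure is an assumption made for this example.

from collections import Counter

def progress_metrics(events):
    """events: list of dicts like
    {"task": "loops-1", "solved": True, "used_ai": False, "error_type": None}.
    Returns the share of solved tasks completed without AI help, plus a
    tally of error types encountered."""
    solved = [e for e in events if e["solved"]]
    independent = [e for e in solved if not e["used_ai"]]
    pct_independent = 100 * len(independent) / len(solved) if solved else 0.0

    # Count error types across all attempts (ignoring None).
    errors = Counter(e["error_type"] for e in events if e.get("error_type"))

    return {"pct_solved_independently": round(pct_independent, 1),
            "errors_encountered": dict(errors)}
```

A metric like this would let educators see, at a glance, whether a learner's independence is growing over time or whether AI assistance is becoming a crutch.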
Study 3: High School Math Tutoring with GPT Tools
Title: Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation (2023)
Research Question: How can pedagogically enhanced GPT-based tutoring systems, using customised exercises, improve independent problem-solving and long-term skill retention in mathematics education compared to generic AI models?
Findings:
A novel approach, CEMAL, uses GPT to generate customised exercises tailored to learners’ weaknesses, improving problem-solving skills and overall performance.
Generic AI models like GPT-3 are less effective for skill retention as they lack tailored, iterative scaffolding.
Customised exercises led to significant accuracy improvements in solving math word problems, outperforming both generic AI models and traditional fine-tuned approaches.
Key Takeaways:
Scaffolding Through Customisation: Iterative feedback and tailored exercises significantly enhance learning outcomes and long-term retention.
Generic AI Risks Dependency: Relying on AI for direct solutions undermines critical problem-solving skills necessary for independent learning.
Implications for Educators and Developers
For Educators:
Integrate Structured AI Tools with Independent Tasks: Use AI tools for guided practice but ensure learners tackle similar problems independently to reinforce skills. Example: After students use AI-generated exercises for guided geometry problem-solving, assign variations of these problems for manual completion.
Identify Weaknesses Through Customised Exercises: Leverage AI to pinpoint students’ problem areas and generate specific exercises to address them. Example: If students struggle with fractions, AI tools can produce targeted problems involving fraction addition and subtraction for additional practice.
For Developers:
Design Adaptive AI Systems: Build AI tutors that evaluate learners’ weaknesses and provide targeted, step-by-step exercises instead of direct answers. Example: When a student makes a mistake in a word problem, the AI might prompt, “Can you explain why you chose this formula?” before offering corrective hints.
Emphasise Incremental Learning: Develop features that allow for progressive skill-building, ensuring exercises become more challenging as students improve. Example: An AI tutor could start with single-step math problems and gradually introduce multi-step problems once proficiency is demonstrated.
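To make the incremental-learning idea concrete, here is a minimal Python sketch of a difficulty-progression rule: promote the learner to harder problems only after a run of mostly-correct answers, and step back down if accuracy collapses. The threshold and window values are illustrative assumptions, not parameters from the CEMAL paper.

```python
# Illustrative sketch of a difficulty-progression rule for an AI tutor.
# Threshold and window values are assumptions chosen for the example.

def next_difficulty(recent_results, current_level=1, threshold=0.8, window=5):
    """recent_results: list of booleans (correct/incorrect), newest last.
    Step the level up after a window of mostly-correct answers, and back
    down if accuracy falls below 40%."""
    if len(recent_results) < window:
        # Not enough evidence yet; hold the current level.
        return current_level
    accuracy = sum(recent_results[-window:]) / window
    if accuracy >= threshold:
        return current_level + 1
    if accuracy < 0.4:
        return max(1, current_level - 1)
    return current_level
```

Rules like this keep exercises in the learner's zone of proximal development: challenging enough to build skill, but never so hard that the tutor ends up handing over full solutions.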
Study 4: Cognitive Offloading and Critical Thinking
Title: AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking (2024)
Research Question: How does AI tool usage influence critical thinking skills, and what role does cognitive offloading play as a mediating factor?
Findings:
Cognitive offloading strongly correlates with reduced critical thinking (r = -0.75), as reliance on AI tools discourages active engagement in analytical tasks.
Younger participants are more dependent on AI tools and show lower critical thinking scores compared to older, more experienced users.
Higher education levels mitigate the negative effects of AI tool usage on critical thinking.
Key Takeaways:
Offloading Reduces Cognitive Engagement: Delegating tasks to AI tools frees cognitive resources but risks diminishing engagement in complex and analytical thinking.
Age and Experience Mitigate AI Dependence: Older, more experienced users exhibit stronger critical thinking skills and are less affected by cognitive offloading.
Trust Drives Offloading: Increased trust in AI tools encourages over-reliance, further reducing cognitive engagement and critical thinking.
Implications for Educators and Developers
For Educators:
Balance AI Use with Critical Thinking Activities: Integrate exercises that promote critical engagement alongside AI use. Example: After students use AI for research summaries, have them identify potential biases or gaps in the information provided.
Teach AI Literacy: Educate students on the limitations of AI tools and how to cross-check AI outputs for accuracy. Example: Instruct students to verify AI-generated data by consulting primary sources or alternative viewpoints.
For Developers:
Incorporate Reflection Prompts: Build tools that require users to engage with outputs critically. Example: After providing a summary, an AI tool could ask, “What do you think might be missing from this summary?”
Design for Transparency and Feedback: Ensure AI tools explain their reasoning and encourage user evaluation. Example: An AI-powered decision-making tool could highlight its data sources and ask users to rate its reliability.
Study 5: Beware of Metacognitive Laziness
Title: Beware of Metacognitive Laziness: Effects of Generative Artificial Intelligence on Motivation, Self-Regulated Learning, and Performance (2024)
Research Question: How does the use of generative AI tools impact learners’ motivation, self-regulated learning (SRL) processes, and metacognitive engagement?
Findings:
Learners using AI tools showed significant short-term performance improvements (e.g., essay scores) but no significant differences in intrinsic motivation, knowledge gain, or knowledge transfer compared to other groups.
Interaction with AI reduced engagement in key SRL processes, such as reflection and self-evaluation, leading to metacognitive laziness—over-reliance on AI instead of actively regulating learning tasks.
Learners reported inflated confidence in their performance, despite minimal gains in deep learning or transferable knowledge.
Key Takeaways:
Confidence ≠ Competence: Generative AI fosters overconfidence but fails to build deeper knowledge or skills, potentially leading to long-term stagnation.
Reflection and SRL Are Crucial: Scaffolding and guided SRL strategies are needed to counteract the tendency of AI tools to replace active learning.
Implications for Educators and Developers
For Educators:
Combine AI Tools with Reflection Tasks: Pair AI usage with activities that require learners to reflect on their performance and evaluate AI outputs critically.
Example: After using ChatGPT for an essay, ask students to identify weaknesses in its suggestions and propose improvements.
Scaffold SRL Processes: Provide explicit prompts that encourage metacognitive activities such as goal setting, monitoring, and evaluation.
Example: Use checklists or rubrics that guide learners to review their work independently before consulting AI tools.
For Developers:
Promote Metacognitive Engagement: Design AI tools that ask users to reflect before and after receiving assistance. Example: Include features like: "What do you think is missing from your solution?" or "How could this answer be improved further?"
Balance Assistance with Challenge: Limit direct answers and incorporate tasks that require active learner input. Example: For a writing task, instead of generating complete paragraphs, AI could offer targeted feedback or ask learners to build arguments from outlines provided by the tool.
Conclusion
Current research overwhelmingly suggests that generic Gen AI tools do not just fail to advance human learning—they often actively hinder it. Across all five of the most recent studies on the topic, while tools like ChatGPT, Claude, and Gemini improve immediate task performance, they also reduce cognitive engagement, critical thinking, and self-regulated learning (SRL).
However, the potential of AI to transform education remains huge if we shift toward structured and pedagogically optimised systems.
To unlock AI’s transformative potential, we must prioritise learning processes over efficiency and outputs. This requires rethinking AI tools through a pedagogy-first lens, with a focus on fostering deeper learning and critical thinking. For example:
Scaffolding and Guidance: AI tools should guide users through problem-solving rather than providing direct answers. A math tutor, for instance, could ask, “What formula do you think applies here, and why?” before offering hints.
Reflection and Metacognition: Tools should prompt users to critique their reasoning or reflect on challenges encountered during tasks, encouraging self-regulated learning.
Critical Thinking Challenges: AI systems could engage learners with evaluative questions, such as “What might be missing from this summary?”
To achieve this vision, we need a concerted, interdisciplinary effort. For AI to truly transform learning, we must intentionally enable a fundamental cultural shift that prioritises human learning outcomes over efficiency and profit. Most current AI-ed products focus on speeding up tasks or driving user engagement, often at the expense of meaningful learning. We need to redefine “success” by focusing on metrics like knowledge acquisition, skill growth, and behavioural change rather than speed, accuracy, or user love.
In essence, when building any education technology our goal should be not just to build smarter tools, but to build tools that make us smarter.
Happy experimenting!
Phil 👋
PS: Want to dive deeper into AI and instructional design? Apply for a place on my AI & Learning Design Bootcamp where we explore these concepts and get hands-on to develop our AI skills.