Beyond Audio Summaries: How to Use NotebookLM to *Actually* Design Better Learning
Five methods to maximise the value of NotebookLM's features
Hello folks 👋,
At the start of my bootcamp, I always ask participants what AI tools they use most — and Google’s NotebookLM consistently comes out on top.
When I dig deeper, it soon becomes clear that most learning designers have focused on NotebookLM’s ability to generate content (video and audio overviews, mind maps, infographics) from uploaded documents.
While it’s true that NotebookLM can generate better audio summaries and infographics than other AI models, the conversation about this technology has so far massively under-sold what’s actually different and valuable about this tool for those of us who design learning experiences.
So in this post, I’m breaking down five evidence-based instructional design methods from cognitive science and performance technology — and showing exactly how NotebookLM’s unique source-grounding architecture can operationalise each one.

Each section includes:
• A research-based summary of what the method is and why it’s powerful
• How NotebookLM supports it in ways general-purpose chat LLMs typically can't
• A Try This prompt you can paste into your own workflow
Let’s dive in!
NotebookLM for Learning Design 101
Before we dive into the playbook, here’s a quick overview of what makes NotebookLM different from generic LLMs like Copilot, Claude or ChatGPT:
Most LLMs (Claude, ChatGPT, Gemini) are trained on massive internet text collections. They excel at fluent writing and creative generation, but have some significant limitations for instructional design:
👉 Training data blending: When you paste a document into ChatGPT or Claude and ask for analysis, the model blends your content with everything it was trained on: millions of generic articles, textbooks, and web pages. The result might sound authoritative, but you can't tell which insights came from your data and which came from its general training. For instructional designers, that's a problem: you need outputs you can defend to stakeholders.
👉 No persistent multi-source workspace: Most AI tools have a limit on how much text you can feed them in one session (called a context window) — and even as those limits grow, you still can't upload 50 PDFs and keep them live for weeks.
👉 No numbered citations: They might say “studies show” but can’t always point to paragraph 3 of your uploaded Mayer paper.
👉 No multimodal studio: One upload → one text response. No native audio, video, mind maps, quizzes, or slide decks from the same source set.
In practice, what makes NotebookLM different for learning designers is four things:
🔥 Answers grounded in your sources (with citations): Most AI tools answer from general internet knowledge. NotebookLM is designed to answer only from the documents you upload, and it shows you which passage it drew from. This approach is called Retrieval-Augmented Generation (RAG) if you want to go deeper, but the practical effect is simple: every output it produces can be traced back to a specific passage in your source material.
🔥 Source toggling: You can switch individual documents on or off within the same workspace. For example, you might run a task analysis with only the job description active, then switch on the SME transcript and re-run it to see how the analysis changes — without starting a new session or repasting everything.
🔥 Multi-format studio & multi-source summaries: From one set of uploads, NotebookLM lets you generate audio podcasts, video overviews, infographics, quizzes, mind maps, slide decks, and data tables.
🔥 Persistent workspace: Unlike a chat session that disappears when you close it, NotebookLM keeps your uploaded sources live indefinitely (up to ~50 documents on the free tier). You could build a workspace at the start of a project, design your course, then return three months later, upload post-training evaluation data, and ask: 'Based on the evaluation results, what does the data suggest about gaps in the original design?'
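If you like to see ideas as code, here is a toy Python sketch of two of the behaviours described above: answering only from uploaded sources with a citation, and toggling sources on and off. It is not how NotebookLM is implemented. The `Passage` class, the `retrieve` function, the word-overlap scoring, and the sample documents are all made up for illustration, and real RAG systems typically rely on embeddings rather than simple word matching.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # name of the uploaded document (hypothetical)
    line: int    # 1-based line number inside that document
    text: str


def build_index(documents: dict[str, str]) -> list[Passage]:
    """Split each document into line-level passages that can be cited."""
    index = []
    for name, body in documents.items():
        for number, line in enumerate(body.splitlines(), start=1):
            if line.strip():
                index.append(Passage(name, number, line.strip()))
    return index


def retrieve(question: str, index: list[Passage],
             active_sources: set[str], top_k: int = 2) -> list[Passage]:
    """Return the passages from active sources that share the most words
    with the question. Sources that are switched off are ignored."""
    query_words = set(question.lower().split())
    scored = []
    for passage in index:
        if passage.source not in active_sources:
            continue  # source toggled off
        overlap = len(query_words & set(passage.text.lower().split()))
        if overlap:
            scored.append((overlap, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


# Hypothetical sample sources, invented for this sketch.
documents = {
    "job_description.txt": "Resolve P1 incidents within four hours.\nMentor junior agents.",
    "sme_transcript.txt": "Most P1 incidents stall while waiting for escalation approval.",
}
index = build_index(documents)
question = "How are P1 incidents handled?"

# First pass: only the job description is switched on.
for hit in retrieve(question, index, {"job_description.txt"}):
    print(f"[{hit.source}, line {hit.line}] {hit.text}")

# Second pass: switch the SME transcript on and ask again.
for hit in retrieve(question, index, set(documents)):
    print(f"[{hit.source}, line {hit.line}] {hit.text}")
```

With only the job description active, the question returns a single cited passage. Switch the SME transcript on and the same question pulls in a second one, which is the "re-run and compare" pattern in miniature.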
TLDR: NotebookLM provides a workspace where we can:
• Upload messy inputs and get structured analysis with citations
• Compare conflicting sources systematically
• Audit designs against our own principles with traceable violations
• Generate multiple artefacts (audio, slides, quizzes) from one source set
• Build living design systems that evolve with new data
This shift, from using AI to generate plausible-sounding content to using AI to verify, audit, and trace every output back to real evidence, is what makes NotebookLM so different. It's not a writing assistant. It's a design accountability tool.
But how do you move beyond just using NotebookLM for summaries? Let’s check out five pro methods.
5 Evidence-Based Methods NotebookLM Operationalises
1. Task & Gap Analysis
What it is
Dick & Carey's hierarchical task analysis systematically decomposes complex jobs into major tasks, enabling subtasks, and terminal behaviours. Task analysis helps us produce high-quality objectives, sequencing, and assessments that match real performance demands (Dick, Carey & Carey, 2014).
Why it’s powerful
Misaligned training fails because it teaches generic skills instead of the actual job tasks. Task analysis ensures every objective traces to a real work behaviour.
How NotebookLM uniquely helps
Upload job description + SOP + SME transcripts → one prompt generates a cited task hierarchy.
Try This Prompt
Using the Dick & Carey hierarchical task analysis method, decompose the Senior Support Specialist role into major tasks, enabling subtasks, and terminal behaviours.
The Dick & Carey model systematically breaks complex jobs into:
• MAJOR TASKS: High-level responsibilities from job description
• ENABLING SUBTASKS: Procedural steps required to complete major tasks
• TERMINAL BEHAVIOURS: Observable actions indicating competence
• KEY DECISIONS: Critical judgment points requiring expertise
• COMMON ERRORS: Documented failure modes from actual performance
From the SOP, job description, and SME transcript, create a table:
Major Task | Enabling Subtasks | Required Knowledge | Key Decisions | Common Errors | Source Citation
Prioritise P1 handling tasks first, and help surface gaps in performance. Cite specific passages. Use only uploaded documents.
2. Training vs Non-Training
What it is
Thomas Gilbert’s 1978 Behaviour Engineering Model (BEM) diagnoses performance gaps by distinguishing between trainable behaviour deficits and environmental barriers across six factors: Data, Resources, Incentives, Knowledge, Capacity, Motives. A BEM‑style analysis pairs well with Kirkpatrick Level 3 (behaviour) evaluation, because it helps surface the barriers that stop training transferring to the job.
Why it’s powerful
Research using Gilbert's model consistently finds that the majority of performance problems — often cited as over 80% — stem from environmental barriers rather than individual skill deficits (Gilbert, 1978).
How NotebookLM uniquely helps
Upload metrics + process docs + manager notes → BEM‑structured table with citations showing which factors explain the gap. Source grounding ensures recommendations tie to your evidence, not generic advice.
Try This Prompt
Using Gilbert's Behaviour Engineering Model, run a performance diagnosis to distinguish behaviour gaps (trainable) from environmental barriers (non-trainable) for Kirkpatrick Level 3 evaluation.
Create a table analysing slower resolution time and higher reopen rates:
Performance Issue | Evidence | Gilbert BEM Factor | Trainable OR Non-Trainable | Recommendation | Source Citation
BEM Factors: Data/Expectations, Tools/Resources, Incentives, Knowledge/Skills, Capacity, Motives.
Use only uploaded sources. Sort by impact.
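Once NotebookLM hands you the BEM table, a few lines of Python can tally it. This is a minimal sketch under stated assumptions: the grouping of Data, Resources, and Incentives as environmental and Knowledge, Capacity, and Motives as individual follows Gilbert's model, but treating only Knowledge gaps as training candidates is a simplification made for this example, and the `summarise` function and sample findings are hypothetical.

```python
from collections import Counter

# Gilbert's six factors: three sit in the work environment,
# three sit with the individual performer.
BEM_SIDE = {
    "Data": "environment",
    "Resources": "environment",
    "Incentives": "environment",
    "Knowledge": "individual",
    "Capacity": "individual",
    "Motives": "individual",
}

# Assumption for this sketch: only Knowledge gaps are training candidates.
TRAINABLE = {"Knowledge"}


def summarise(findings: list[tuple[str, str]]) -> dict:
    """Take (performance issue, BEM factor) pairs and report how much of the
    gap is environmental and which issues look like training problems."""
    for _, factor in findings:
        if factor not in BEM_SIDE:
            raise ValueError(f"Unknown BEM factor: {factor}")
    sides = Counter(BEM_SIDE[factor] for _, factor in findings)
    share = sides["environment"] / len(findings) if findings else 0.0
    training = [issue for issue, factor in findings if factor in TRAINABLE]
    return {"environment_share": share, "training_candidates": training}


# Hypothetical rows, as they might be copied out of the BEM table.
findings = [
    ("Slower resolution time", "Resources"),
    ("Slower resolution time", "Data"),
    ("Higher reopen rates", "Knowledge"),
    ("Higher reopen rates", "Incentives"),
]
summary = summarise(findings)
print(summary)
```

In this made-up example, three of the four findings sit on the environment side, so only one row would justify building training at all.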
3. WCAG 2.1 Accessibility Checks
What it is
WCAG 2.1 is the W3C's international standard for accessible web content, which includes online learning materials; Level AA is the conformance level used here (W3C, 2018).
Why it’s powerful
Accessible design benefits all learners — and non-compliance carries real legal risk. Accessibility lawsuits against organisations are increasing year on year, and responsibility often sits with the designer who signed off on the materials.
How NotebookLM uniquely helps
Upload content + WCAG checklist → violation table with severity levels and fixes, each tied to specific WCAG criteria.
Try This Prompt
Conduct a WCAG 2.1 AA accessibility audit against the checklist.
Table: WCAG Violation | Content Location | Success Criterion | Severity | Fix | Citations
Prioritise 1.4.1 (use of colour), 1.2.2 (captions), 4.1.2 (name, role, value). Use only uploaded sources.
4. Mayer Storyboard Audit
What it is
Mayer’s 12 principles reduce extraneous cognitive load and maximise germane load through coherence, signalling, contiguity, and modality (Mayer, 2009).
Why it’s powerful
Multimedia instruction following Mayer’s principles produces 1.5x learning gains vs conventional slides (meta‑analysis of 100+ studies).
How NotebookLM uniquely helps
Upload storyboard + Mayer principles → systematic audit table showing violations with dual citations (content + principle violated).
Try This Prompt
Conduct a Mayer Multimedia Principles audit of this storyboard.
Table: Violation | Scene | Mayer Principle Broken | Why Violated | Fix | Citations (Storyboard + Mayer)
Prioritise Coherence (extraneous), Redundancy, Signalling, Spatial Contiguity. Use only uploaded sources.
5. Constructive Alignment
What it is
Constructive alignment, developed by John Biggs (1996), is the principle that every assessment item should directly reflect a stated learning objective. Put simply: if you can’t draw a straight line from your quiz question back to an objective, the question probably shouldn’t be there.
Why it’s powerful
Misaligned evaluation measures the wrong things.
How NotebookLM uniquely helps
Upload objectives + Kirkpatrick framework → instrument table with objective mappings + citations.
Try This Prompt
Using Biggs' constructive alignment, draft Level 1-3 evaluation items mapped to objectives.
Table: Item Type | Item Text | Mapped Objective | Kirkpatrick Level | Citation
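The "straight line" test is easy to automate once the table is exported. Here is a small Python sketch, assuming each item carries the ID of the objective it maps to. The `check_alignment` function, the objective IDs, and the sample items are all hypothetical and only illustrate the check.

```python
def check_alignment(objectives: dict[str, str],
                    items: list[dict]) -> tuple[list[str], list[str]]:
    """Return (orphan items, uncovered objectives).

    An orphan item maps to no known objective.
    An uncovered objective has no item mapped to it.
    """
    covered = {item["objective"] for item in items
               if item.get("objective") in objectives}
    orphans = [item["text"] for item in items
               if item.get("objective") not in objectives]
    uncovered = [oid for oid in objectives if oid not in covered]
    return orphans, uncovered


# Hypothetical objectives and evaluation items.
objectives = {
    "LO1": "Triage a P1 ticket within the agreed time",
    "LO2": "Escalate using the documented path",
}
items = [
    {"text": "Rank these three tickets by priority", "objective": "LO1"},
    {"text": "How enjoyable was the session?", "objective": None},
    {"text": "Name the founder of the company", "objective": "LO9"},
]

orphans, uncovered = check_alignment(objectives, items)
print("Items with no objective:", orphans)
print("Objectives with no item:", uncovered)
```

Anything in the first list is a candidate for cutting; anything in the second is a gap in the instrument.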

Closing Thoughts
The key message is this: NotebookLM only becomes powerful when you bring the expertise. Without a proven method behind the prompt, you’re just getting a well-organised summary. With one, you’re producing auditable, evidence-based design artefacts you can defend to any stakeholder.
These five methods — grounded in decades of cognitive science and performance technology research — are the highest-leverage place to start. NotebookLM doesn’t replace your judgment as a designer. It makes that judgment scalable, traceable, and harder to argue with.
If you haven’t opened NotebookLM before, here’s your first move: create a free Google account, go to notebooklm.google.com, and start with Method 1. Upload your job description and one SOP, paste the Task & Gap Analysis prompt, and hit go. You’ll have a cited task hierarchy in under two minutes — and a much clearer sense of what this tool can actually do in your hands.
Happy designing! Phil 👋
PS: Want to master AI‑augmented instructional design? Apply for a place on my AI & Learning Design Bootcamp, where we get hands-on and try and test methods just like these.





