Hey folks!
This week, I’ve been doing a deep-dive on the pedagogical risks and benefits of audio-based learning. Specifically I’ve been exploring two questions:
What does the research tell us about the potential benefits (and risks) of audio-based learning?
How might Gen AI impact our ability to design and develop auditory content with the potential to drive better learning outcomes?
The findings were super interesting, so I thought I’d share them with you.
Let’s dive in! 🚀
Audio-Based Learning 101
As the name suggests, audio-based learning involves using auditory content like spoken narration, music, and sound cues to drive learner engagement, deliver content and provide learning support.
While it has never been anywhere near as widely used as video-based learning, audio as a learning modality and pedagogical approach has been shown to have a number of potential benefits.
Here’s the TLDR:
Audio Enhances Comprehension, Knowledge & Skills
Auditory learning, particularly when combined with visual elements, has been shown to have the potential to enhance cognition, increase problem-solving skills, improve executive functioning and support memory retention.
The Research:
Mousavi, Low, and Sweller (1995) conducted six experiments using geometry problems which demonstrated that a mixed auditory-visual presentation significantly improves learning outcomes compared to a purely visual approach. Their research showed that using multiple sensory channels, including audio, expands effective working memory capacity.
A study by Sánchez & Sáenz (2005) showed that audio-based learning environments can significantly enhance cognition and problem-solving skills. Audio descriptions and walkthroughs were found to enhance mathematical comprehension by helping to manage cognitive load, making abstract concepts more accessible and comprehensible to all learners.
Instrumental music has been shown to enhance cognitive abilities, memorisation and recall, and motor skills in students (Peard, 2012; Strong, 2017; James et al., 2019).
Audio Enhances Engagement & Motivation
Auditory learning, particularly through podcasts and guided audio activities, has been shown to increase student engagement, reduce cognitive overload, and foster deeper learning. By creating a rich, connected learning environment, audio-based methods enhance motivation, satisfaction, and knowledge retention across various educational settings.
The Research:
Research by Dowdle (2005) found that audio-based learning activities can engage students more effectively than text-based learning, potentially reducing cognitive overload. Dowdle’s research found that audio-based activities which “talk students through” an activity enhance engagement and foster deeper learning compared to conventional teaching methods.
A study by Khechine et al. (2009) found that providing audio podcast versions of course content significantly enhanced knowledge retention and improved learning outcomes in both online and in-person learning environments.
Research by Tarmawan et al. (2021) showed that students reported significantly increased motivation and satisfaction when using podcasts as learning tools.
A study by Middleton (2016) showed that audio is the most effective means through which to create a rich learning space by meaningfully connecting tutors, students, and those beyond the existing formal study space. A key mechanism for this connection was audio-based feedback, which brings me to…
Audio Feedback Drives Deeper Learning
Audio feedback creates a more engaging and personal learning experience by effectively conveying tone, emotion, and emphasis, enhancing student motivation and comprehension. It also fosters meaningful connections between students and educators while improving feedback engagement and efficiency for instructors.
The Research:
Audio feedback can be more engaging and personal compared to written feedback, enhancing student motivation and comprehension (Nortcliffe & Middleton, 2009).
Studies show that students prefer audio feedback as it conveys tone, emotion, and emphasis more effectively than written feedback (Rodway-Dyer et al., 2011).
Audio feedback is reported to be time-efficient for instructors while improving student engagement with feedback content (Evans & Palacios, 2010).
Of course, like any modality, audio-based approaches can negatively impact learning if we don’t use them correctly. Key risks highlighted in the research include:
Insufficient Visual Context – Audio-only content may not effectively convey complex or abstract information that benefits from dynamic visualisations, leading to reduced comprehension (Sweller, 1988; Guo-you, 2004).
Limited Interactivity – Learning that relies on audio alone can lack interactive elements, which in turn limits learner participation and engagement (Middleton, 2016; Doolan & Simpson, 2010).
Challenges in Note-Taking and Retention – If learners cannot easily pause and review audio, they may struggle to take effective notes or retain information, particularly when the audio content is fast-paced (Mund et al., 2023).
Overall, while audio-based learning comes with certain challenges, the body of research reviewed here suggests that it is an under-used yet potentially highly effective tool for enhancing learning outcomes.
Despite its benefits, audio is still far less commonly used than video or text-based learning, largely due to historical biases in instructional design and the perception that audio is somehow pedagogically inferior to video.
AI-Powered Audio-Based Learning
As AI tools continue to advance, two key barriers to tapping into the potential of audio-based learning are being removed for instructional designers:
Thanks to tools like Consensus & STORM, we now have more open access than ever to research about how to use audio-based learning optimally.
Generative AI tools like ElevenLabs, NotebookLM & Google MusicFX can now assist in crafting dynamic, personalised and interactive auditory experiences more rapidly and effectively than ever before.
Below, you can see a table of the most successful experiments I ran this week, so you can try them for yourself:
Research Outputs
Here are some samples of what I produced.
Audio Signalling
https://drive.google.com/file/d/1meWJkBcTsPjdnL32tifmIghku0g2_hul/view?usp=sharing
Music for Learning with Google MusicFX
First, I worked with Google’s MusicFX to generate a focus track to enhance learners’ cognitive function, memorisation and motor skills. As you can see, I attempted to optimise what MusicFX gave me for impact by using information from the research to write the prompt:
You can listen to the track I created in under 60 seconds here:
You can remix the track or try Google MusicFX for yourself here.
ElevenLabs for Audio Signalling & Feedback
Next, I worked with ElevenLabs to rapidly generate some audio signalling & feedback content.
Feedback:
First, I worked with Consensus & STORM to define what optimal feedback looks like, e.g. structure, tone and focus.
Then, I worked with Claude to turn the research on the “how” into a feedback script.
Finally, I pasted the script into ElevenLabs and generated the audio feedback.
You can hear the result here:
Audio Signalling
Next, I took some intro text from my bootcamp and generated it in ElevenLabs. You can hear the result here and read along below.
Concluding Thoughts
At the end of all of this, my reflection is that the research paints a pretty exciting picture - audio-based learning isn't just effective, it's got some unique superpowers when it comes to boosting comprehension, ramping up engagement, and delivering feedback that really connects with learners.
While audio has been massively under-used as a mode of learning, especially compared to video and text, we're at an interesting turning point where AI tools are making it easier than ever to tap into audio's potential as a pedagogical tool.
What's super interesting is how solid the research backing audio's effectiveness is, and how well it is converging with these new AI capabilities. Tools like ElevenLabs and Google MusicFX are helping to remove a lot of the barriers that have made creating quality audio learning content challenging. It’s now easier than ever for anyone interested in innovating their pedagogical approaches to start experimenting with evidence-based audio content in their work.
Of course, there is still a lot we don’t know. How do humans respond to AI-generated audio? Is the impact on learning the same if audio is AI-generated rather than delivered by an embodied human? The research here is still emerging and so far mixed in its findings.
One example here is audio-based coaching. There's a lot of chat about coaches in general and lots of potential to explore audio coaching in particular. However, the research that has been completed so far suggests we need to proceed with both optimism and caution.
In a study comparing human and AI coaching over a 10-month period, for example, AI coaching was found to be just as effective as human coaching in helping users achieve their goals. However, AI lacked empathy, which remains a critical missing piece when it comes to learning (Terblanche et al., 2022). In 2020, Terblanche also found that AI coaching can be more scalable and cost-effective but struggles with individualised feedback and emotional support.
As ever, the optimal value lies in combining instructional design expertise with AI: only by understanding the science of learning and clearly defining the “how” can we ensure that AI-generated audio is optimised for learner motivation, learning and impact.
My top tip: always start with the research and define the “how” before you start to build with AI.
Happy experimenting!
Phil 🚀
PS: Want to get hands-on and experiment with AI in instructional design with me? Apply for a seat on one of my upcoming AI Learning Design Bootcamps.