How to Build Your First Team of AI Agents
A simple, practical blueprint for building your first team of agents for L&D — and what this means for your role
Hello folks 👋
If you’ve been anywhere near the AI conversation in the past few months, you’ll have noticed the language shifting. It’s no longer just about chatbots and prompts. Instead, talk has turned primarily to agents, specifically “agentic teams” or so-called “agentic AI ecosystems”.
Gartner reported a 1,445% surge in multi-agent system enquiries from Q1 2024 to Q2 2025. PwC just launched its “Learning Collective” — an “ecosystem for accelerated growth built for the possibilities of the AI age.” Most AI vendors and consultancies are reaching for the same vocabulary.
And yet... when I talk to instructional designers about agents, the conversation almost always starts the same way:
“I keep hearing about AI agents, but I have no idea what a team of agents would actually look like in my day-to-day work.”
The gap between the buzzwords and the concrete is enormous. Despite all the noise, we have almost nothing showing an instructional designer what an agent ecosystem for their practice would look like — what it would do, what tools it would use, and what it would feel like on a Monday morning.
That’s what this post is about.
I'm going to lay out a practical, buildable V1 agentic team for instructional designers: five agents, one for each phase of our day-to-day work, that you can build this week. Then I'll share a glimpse of where this is heading: more ambitious ecosystems, and what your role might look like alongside them.
Let’s go!
What Is an Agent?
Let’s start with a simple but important distinction. You likely already know how to prompt AI. You may have built a custom GPT or Claude Project loaded with your course materials. Those are valuable, but they wait for you. You go to them, give them an input, and get an output.
An agent is different. An agent works in the background. It’s triggered by an “event” — e.g. a survey response being submitted, a calendar date arriving, a file being saved — which sets it off on a task and eventually ends with it delivering an output to you without being actively asked. You didn’t open a chat. You didn’t paste anything. You instructed the agent once, then it watched, processed, and delivered as you instructed.
In my own practice, I think about agents in two buckets. The first is tasks that are so simple, structured, and repetitive that I’d confidently delegate them to an apprentice without needing to check the quality in depth every time. Things like: scanning a batch of learner feedback and sorting it into themes. Checking whether the items in an assessment are guessable. Flagging content assets that are past their review date. These are all jobs with clear inputs, clear rules, and predictable outputs. I know what “good” looks like, and I’d spot a bad result quickly.
The second is tasks I simply couldn’t do because of the volume of work involved. Reading every question learners ask a course assistant across a full cohort and spotting the patterns. Comparing this week’s LMS data to a baseline across five active programmes. Running a freshness audit on 200 content assets every month. These aren’t hard — they’re just too time-consuming to do manually, so they never happen.
The V1 Agentic Ecosystem for Instructional Design
After working with a tonne of orgs to build out early agentic ecosystems, here’s my hot take on what a starter agent ecosystem looks like for Instructional Designers / L&D teams.
Think: one agent for each phase of our work, chosen for the highest ratio of insight to effort. Each one is buildable this week with tools you can access (on trial) for free.
1. The Feedback Agent
The problem it solves: You have learner feedback data — surveys, NPS scores, end-of-cohort evaluations. You also don’t have time to properly analyse it. So it sits in a spreadsheet, and you skim the top-line numbers and move on. The themes, the shifts between cohorts, the emerging issues — they stay buried.
How it works: Build it with Lindy or Zapier. Set up a trigger so that when a survey closes or a new piece of feedback lands in a folder, the agent reads the data, clusters feedback by theme (relevance, pacing, assessment, UX, facilitator quality), rates sentiment per theme, compares to the previous cohort’s data (stored in a simple spreadsheet), and flags the top three issues. The output lands in your inbox or as a message in Teams or Slack: “Cohort 7 feedback is in. Pacing complaints up 15% vs Cohort 6. New theme emerging: learners say Module 4 pre-work feels irrelevant to the live session. Here’s the full breakdown.”
Starter prompt for the agent builder:
“When a new file lands in [FOLDER], analyse it as learner feedback. Group the feedback into themes (relevance, pacing, assessment, facilitator, UX). For each theme, say whether sentiment is positive, mixed, or negative, and pull out 2–3 representative quotes. Compare to the previous cohort’s data in [SPREADSHEET] and flag anything that’s shifted. Tell me the top 3 issues. Send the summary to [SLACK / EMAIL].”

To go deeper: tell the agent how to turn insights into design improvements, and have it both report on the feedback and suggest design edits.
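If you're curious what the "compare to the previous cohort" step might look like under the hood, here's a minimal Python sketch. It assumes the agent has already tagged each comment with one theme label; the function names (`theme_shares`, `top_shifts`) are hypothetical illustrations, not part of Lindy or Zapier, and it measures how much of the feedback each theme takes up rather than sentiment.

```python
from collections import Counter


def theme_shares(labels):
    """Return each theme's share of all comments, as a percentage."""
    total = len(labels)
    if total == 0:
        return {}
    return {theme: 100 * n / total for theme, n in Counter(labels).items()}


def top_shifts(previous, current, top_n=3):
    """Rank themes by how much their share moved between two cohorts.

    Inputs are lists of theme labels, one per comment (an assumed format).
    Shifts are in percentage points; positive means the theme grew.
    """
    prev, curr = theme_shares(previous), theme_shares(current)
    themes = set(prev) | set(curr)
    shifts = {t: curr.get(t, 0.0) - prev.get(t, 0.0) for t in themes}
    # Biggest movement first; ties broken alphabetically for stable output.
    return sorted(shifts.items(), key=lambda kv: (-abs(kv[1]), kv[0]))[:top_n]


cohort_6 = ["pacing"] * 2 + ["relevance"] * 8
cohort_7 = ["pacing"] * 5 + ["relevance"] * 5
print(top_shifts(cohort_6, cohort_7))
```

In this made-up example, pacing goes from 20% to 50% of comments, so it tops the list. In a real build, the agent platform handles this for you; the sketch is only there to show that the comparison itself is simple arithmetic.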
2. The QA Agent [Assessment]
The problem it solves: Assessment quality is one of the most neglected aspects of instructional design. We invest hours in content and activities, then write quiz questions in the last 30 minutes before a deadline. Nobody systematically checks whether a clever learner could game the answers — agents can add a lot of value here.
How it works: Build it with Lindy or Zapier. Set up a file-watch trigger on your assessment folder — when a new or updated document lands, the agent reads it and red-teams it automatically, playing the clever-but-unmotivated learner: trying to pass each item without genuine understanding, flagging guessable items, ambiguities, and pattern exploits. You get a vulnerability report: “12 items reviewed. 4 flagged: 2 guessable (longest answer is always correct), 1 ambiguous (options B and D both defensible), 1 exploitable (answer inferable from the stem alone). Suggested rewrites attached. Overall vulnerability score: 33%.”
Starter prompt for the agent builder:
“When a new document is added to [ASSESSMENT FOLDER], read it as a training assessment. Act as a clever but lazy learner trying to pass without real understanding. For each question, try to get the right answer using only shortcuts — answer length patterns, obvious language, process of elimination. Rate each item as GUESSABLE, AMBIGUOUS, or SOUND. For weak items, explain the shortcut and suggest a rewrite. Give me an overall vulnerability score and list the 3 worst items. Send the report to [SLACK / EMAIL].”
*You can get a 7-day free trial of Lindy, but you need to enter your bank details to do so.
To go deeper: add objective-alignment checking (does each item test what it claims to?).
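Two parts of the red-team report are plain rules rather than AI judgement: spotting the "longest answer is correct" pattern and working out the overall vulnerability score. Here's a rough Python sketch of both. The function names and the input shapes (a list of option strings, a list of ratings) are assumptions made for illustration, and the score is assumed to be the share of items not rated SOUND.

```python
def longest_option_is_correct(options, correct_index):
    """True when the correct option is strictly longer than every other one."""
    correct_len = len(options[correct_index])
    return all(
        correct_len > len(option)
        for i, option in enumerate(options)
        if i != correct_index
    )


def vulnerability_score(ratings):
    """Percentage of items rated anything other than SOUND, rounded."""
    if not ratings:
        return 0
    flagged = sum(1 for rating in ratings if rating != "SOUND")
    return round(100 * flagged / len(ratings))


options = ["Paris", "Rome", "The capital city located on the Seine"]
print(longest_option_is_correct(options, correct_index=2))

ratings = ["GUESSABLE"] * 2 + ["AMBIGUOUS"] * 2 + ["SOUND"] * 8
print(vulnerability_score(ratings))
```

With 4 of 12 items flagged, the score comes out at 33, which lines up with the example report above. A single item where the longest answer happens to be right isn't a problem on its own; the pattern matters when it shows up across most of the quiz.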
3. The Content Update Agent
The problem it solves: Content goes stale, and nobody has a system for catching it. Screenshots show interfaces that changed six months ago. Data references are from 2023. Policy citations were updated last quarter. You know this is a problem. You don’t have time to audit it.
How it works: Build it with Lindy or Zapier, scheduled to run on the first Monday of each month. The agent reads a content inventory spreadsheet that logs each asset with: title, type, date created, date last reviewed, and expiry-sensitive flags (screenshots, statistics, policy citations, external links). It checks dates against your freshness thresholds — data references older than 12 months, screenshots older than 6 months — and runs a link check on external URLs. You get a monthly digest: “7 assets flagged. 2 contain data references older than 12 months. 1 external link is broken. 3 screenshots may show outdated interfaces. Here’s the priority list.”
Starter prompt for the agent builder:
“On the first Monday of each month, read the content inventory in [SPREADSHEET]. Flag anything that looks stale: data or statistics older than 12 months, screenshots older than 6 months, policy references older than 6 months. Check any external links and flag broken ones. Sort the flagged items by risk — anything used in an assessment goes to the top. Send me the list via [SLACK / EMAIL].”
To go deeper: have it draft replacement copy for flagged items, so you review edits rather than start from scratch.
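The date check at the heart of this agent is simple enough to sketch in a few lines of Python. This is an illustration under assumptions: the inventory is a list of dictionaries with hypothetical field names (`type`, `last_reviewed`, `used_in_assessment`), and the 12-month and 6-month thresholds are approximated as 365 and 182 days. Link checking is left out because it needs network access.

```python
from datetime import date

# Assumed thresholds in days, approximating the 12-month and 6-month rules.
MAX_AGE_DAYS = {"statistic": 365, "screenshot": 182, "policy": 182}


def stale_assets(inventory, today):
    """Return assets past their freshness threshold, highest risk first."""
    flagged = []
    for asset in inventory:
        limit = MAX_AGE_DAYS.get(asset["type"])
        age = (today - asset["last_reviewed"]).days
        if limit is not None and age > limit:
            flagged.append({**asset, "age_days": age})
    # Assets used in an assessment go to the top, then oldest first.
    return sorted(
        flagged,
        key=lambda a: (not a["used_in_assessment"], -a["age_days"]),
    )


inventory = [
    {"title": "A", "type": "screenshot", "last_reviewed": date(2024, 10, 1), "used_in_assessment": False},
    {"title": "B", "type": "statistic", "last_reviewed": date(2025, 1, 15), "used_in_assessment": False},
    {"title": "C", "type": "policy", "last_reviewed": date(2024, 6, 1), "used_in_assessment": True},
]
for asset in stale_assets(inventory, today=date(2025, 6, 2)):
    print(asset["title"], asset["age_days"])
```

Here asset C is listed first because it feeds an assessment, asset A follows, and asset B is still within its 12-month window so it isn't flagged at all.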
4. The Question Agent
The problem it solves: If you’ve built a learner-facing AI course assistant (e.g. a custom GPT or Claude Project that answers learner questions from your course materials), you have a goldmine of data about what learners are confused about — and you’re probably not looking at it. This agent sits downstream of your course assistant and turns learner questions into rich design intelligence.
How it works: Build it with Lindy or Zapier, scheduled to run every Friday. The agent pulls the week’s conversation logs from your course assistant, clusters questions by theme, identifies the top confusion patterns, flags questions the assistant couldn’t answer (knowledge gaps in your materials), and suggests specific course adjustments. You get a weekly report: “47 learner questions this week. Top themes: (1) Confusion about the difference between X and Y — 12 questions, all variants of the same misconception. (2) Can’t find the assessment brief — 8 questions. (3) Module 3 pre-work instructions unclear — 6 questions. Suggested adjustments attached.”
Starter prompt for the agent builder:
“Every Friday at 5pm, pull the conversation logs from [COURSE ASSISTANT / SPREADSHEET] for the past 7 days. Group all learner questions into themes — bunch together questions about the same confusion, even if they’re worded differently. Rank the top 3–5 themes by how often they came up. For each one, give me: the theme name, the count, a couple of example questions, and one specific thing I could change in the course to fix it. Flag any questions the assistant couldn’t answer — those are gaps in my materials. Send to [SLACK / EMAIL].”
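The hard part of this agent is the AI deciding which questions belong to the same theme. Once that's done, the ranking is straightforward, as this small Python sketch shows. It assumes each question arrives already paired with a theme label; `rank_themes` and the input format are hypothetical, not something your course assistant exports by default.

```python
from collections import defaultdict


def rank_themes(labelled_questions, top_n=5, examples_per_theme=2):
    """Group (theme, question) pairs and rank themes by question count."""
    grouped = defaultdict(list)
    for theme, question in labelled_questions:
        grouped[theme].append(question)
    # Most frequent theme first; ties broken alphabetically.
    ranked = sorted(grouped.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        {"theme": theme, "count": len(qs), "examples": qs[:examples_per_theme]}
        for theme, qs in ranked[:top_n]
    ]


week = [
    ("assessment brief", "Where is the assessment brief?"),
    ("X vs Y", "Is X the same as Y?"),
    ("X vs Y", "When do I use X instead of Y?"),
    ("X vs Y", "What's the difference between X and Y?"),
    ("assessment brief", "I can't find the brief for the final task."),
]
for row in rank_themes(week, top_n=3):
    print(row["theme"], row["count"], row["examples"])
```

The output is a short list of themes with counts and a couple of example questions each, which is the raw material for the weekly report.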
5. The Course Health Tracker
The problem it solves: Most evaluation happens once — at the end of the programme, if it happens at all. By then, the problems you discover are months old. This agent replaces the end-of-programme scramble with a steady, weekly signal.
How it works: Build it with Lindy or Zapier, scheduled to run every Monday morning. The agent pulls key metrics from your LMS or assessment platform (completion rates, assessment scores, time-on-task, drop-off points), compares this week’s data to a baseline established from previous cohorts, and flags anomalies. You get a health check: “Programme A: all metrics stable. Programme B: Module 4 completion dropped 18% vs baseline — top 3 exit points flagged. Programme C: time-on-task for Module 6 is 2.3x the design estimate — learners may be struggling.”
Starter prompt for the agent builder:
“Every Monday at 8am, read this week’s programme data from [LMS DATA / SPREADSHEET] and compare it to the baselines in [BASELINE SPREADSHEET]. Check completion rates, assessment scores, time-on-task, and drop-off points for each active programme. Flag anything that’s shifted more than 10% from baseline. For each flag, tell me the programme, the metric, and how much it’s changed. Label each programme as STABLE, WATCH, or ATTENTION NEEDED. Send the health check as a CSV to [SLACK / EMAIL].”
To go deeper: cross-reference its metric flags with the Question Agent (#4 above) to explain why something dropped, not just that it dropped.
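The baseline comparison is worth seeing as arithmetic, because it's where you'll end up tuning thresholds. Below is a minimal Python sketch. The 10% flag threshold comes from the starter prompt; the 20% cut-off for ATTENTION NEEDED, the function names, and the metric names are all assumptions for illustration, so adjust them to your own programmes.

```python
def percent_change(current, baseline):
    """Relative change from baseline, as a percentage."""
    return 100 * (current - baseline) / baseline


def health_check(current, baseline, watch=10.0, attention=20.0):
    """Label one programme and list the metrics that moved past `watch`."""
    flags = []
    for metric, base in baseline.items():
        if base == 0:
            continue  # A zero baseline has no meaningful percentage change.
        change = percent_change(current[metric], base)
        if abs(change) > watch:
            flags.append((metric, round(change, 1)))
    worst = max((abs(change) for _, change in flags), default=0.0)
    if worst > attention:
        label = "ATTENTION NEEDED"
    elif flags:
        label = "WATCH"
    else:
        label = "STABLE"
    return label, flags


baseline = {"completion_rate": 80.0, "avg_score": 70.0}
this_week = {"completion_rate": 65.6, "avg_score": 71.0}
print(health_check(this_week, baseline))
```

In this example, completion has fallen 18% against baseline, so the programme is labelled WATCH, while the small movement in average score stays under the threshold and isn't flagged.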
What the V1 Agentic Ecosystem Feels Like
After building these five agents — a focused weekend’s work, or a few afternoons across a week — here’s what your Monday morning looks and feels like:
Every Monday you get a health check across all active programmes. You know what’s stable and what’s shifting before you open your LMS.
Every Friday you get a digest of what learners were confused about this week, with suggested fixes mapped to specific modules.
Every time a survey closes, a structured analysis lands in your inbox, not a raw spreadsheet waiting to be processed.
Every time someone saves a new assessment, a red-team report surfaces vulnerabilities before any learner sees them.
The first of every month you get a content freshness audit, so stale assets don’t silently undermine your programmes.
Remember: you didn’t actively ask for any of these. The system delivered them. That’s the key shift from “AI as a tool I use” to “AI as a team that works alongside me.” You’ve basically created a small AI team of “always-on” assistants.
Like any assistant, the success of AI agents depends on three things:
The quality of their training — how clearly you instruct them on what to do, when to do it, and what good output looks like.
Some initial testing and feedback — running them on real data, checking the output, and refining the instructions until they’re reliable.
Ongoing review and improvement — checking in periodically, adjusting thresholds, and expanding their scope as you trust them more.
In an agentic ecosystem, your role therefore shifts. You’re no longer the person doing all the analysis, all the auditing, all the quality checks. You’re the person who designs the system that does it for you — and who reviews, redirects, and improves it over time. Less operator, more architect.
The Bigger Agentic Picture
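The five agents above are a starting point, but it helps to see where they sit on a bigger map with four levels: plain prompting (Level 1), AI loaded with your own context (Level 2), agents that run without you (Level 3), and agents that work together (Level 4).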
The jump from Level 1 to Level 2 is about context — giving AI your materials and guidelines.
The jump from Level 2 to Level 3 is about automation — letting AI work without you initiating.
The jump from Level 3 to Level 4 is about interconnection — letting agents talk to each other.
Each jump changes the economics of your practice. But the biggest single shift is from Level 2 to Level 3 — because that’s where the work you’re not doing starts getting done. And that’s what the five agents in this post are designed to unlock.
Closing Thoughts
The AI agent conversation is moving fast — but most of it is pitched at software engineers, enterprise architects, and startup founders. Almost none of it is speaking to the people who design how humans learn.
That’s a missed opportunity. The tasks that agents are best at — monitoring, pattern-finding, quality-checking, synthesising data, closing feedback loops — are exactly the tasks that instructional designers know they should be doing but rarely have bandwidth for.
The shift isn’t from “doing the work” to “not doing the work.” It’s from “doing everything yourself” to “doing the work that only a human designer can do” — the judgement, the ethics, the politics, the contextual knowledge, the craft — while a team of agents handles the monitoring, the QA, and the pattern-finding that makes your judgement better-informed.
Build the V1. See what it feels like when Monday morning starts with five reports you didn’t ask for but genuinely need.
Happy experimenting!
Phil 👋
PS: Want to learn how to become an AI-first L&D pro? Check out my AI & Learning Design Bootcamp.





