The DOMS™️ AI-Ed Tools Evaluation Rubric

A Practical Checklist for Evaluating the Efficacy & Safety of AI Education Tools

Apr 27, 2023

The rapid growth of AI in education has led to a growing array of tools with varying levels of quality, effectiveness and ethical reliability.

In my experience, it’s hard to cut through the noise and make informed decisions about which tools to experiment with and implement in the classroom, and which to leave well alone. This is only set to become more challenging as more products come to market.

So, I set myself a challenge: is it possible to create a standardised method of comparing and scoring the value of AI technologies for educators?

Through a combination of research, consultation with educators and ed tech experts, and analysis of existing efficacy evaluation models in the education sector, I have identified seven criteria that affect the success and efficacy of AI tools in the classroom:

🎓 Pedagogical Quality

😇 Reliability & Ethics

⛔️ Data Privacy & Security

🚪Accessibility & Inclusion

📈 Scalability & Adaptability

🧱Ease of Integration

💰Cost-effectiveness

Based on these criteria, I created the AI-Ed Tools Evaluation Rubric - a tool to help both teachers and ed tech companies assess the quality, value and risks of using AI tools in educational settings.

The purpose of the framework is to provide a standardised method of comparison between AI technologies: a set of criteria for evaluating AI tools across the seven dimensions listed above.

My hypothesis is that by using this rubric, we can make informed decisions as educators about the AI tools we adopt in our classrooms.

Disclaimer: This framework is a work in progress and represents my own effort to create a useful tool for me and my community. It is not an official guide, and feedback is always welcome to improve its effectiveness!

The DOMS™️ AI-Ed Tools Evaluation Rubric

A Practical Guide for Evaluating the Efficacy & Safety of AI Education Tools

Pedagogical Quality (1-5)

1. The AI tool does not engage learners or provide active learning opportunities, problem-solving tasks, or effective feedback and assessment mechanisms.
2. The AI tool provides minimal engagement and active learning opportunities but lacks problem-solving tasks and effective feedback and assessment mechanisms.
3. The AI tool promotes engagement and active learning but has limited problem-solving tasks and moderately effective feedback and assessment mechanisms.
4. The AI tool effectively promotes engagement, active learning and problem-solving tasks, and provides mostly effective feedback and assessment mechanisms.
5. The AI tool excellently promotes engagement, active learning and problem-solving tasks, and provides highly effective feedback and assessment mechanisms.

Reliability & Ethics (1-5)

Quality of Data Used (1-5)

1. The data used to train and operate the AI tool is unreliable and inaccurate, leading to invalid and untrustworthy results.
2. The data used to train and operate the AI tool has limited reliability and accuracy, resulting in questionable results.
3. The data used to train and operate the AI tool is moderately reliable and accurate, producing mostly valid and trustworthy results.
4. The data used to train and operate the AI tool is highly reliable and accurate, with minor issues affecting the validity and trustworthiness of results.
5. The data used to train and operate the AI tool is of the highest reliability and accuracy, ensuring valid and trustworthy results.

Objectivity and Fairness (1-5)

1. The AI tool does not ensure objectivity and fairness in its content and decision-making processes, with significant biases present.
2. The AI tool offers limited objectivity and fairness in its content and decision-making processes, with some biases present.
3. The AI tool mostly ensures objectivity and fairness in its content and decision-making processes, with minor biases present.
4. The AI tool ensures high levels of objectivity and fairness in its content and decision-making processes, with only negligible biases present.
5. The AI tool ensures the highest levels of objectivity and fairness in its content and decision-making processes, effectively minimizing biases.

Transparency and Explainability (1-5)

1. The AI tool is not transparent in its decision-making processes, and the reasoning behind its recommendations or outputs is unclear.
2. The AI tool offers limited transparency in its decision-making processes, with some explanation for its recommendations or outputs.
3. The AI tool is moderately transparent in its decision-making processes and provides reasonable explanations for its recommendations or outputs.
4. The AI tool is highly transparent in its decision-making processes, with only minor issues in the explanation of recommendations or outputs.
5. The AI tool is completely transparent in its decision-making processes and provides clear and comprehensive explanations for its recommendations or outputs.

Human Agency and Oversight (1-5)

1. The AI tool has not been designed to augment human decision-making and may replace or undermine human agency.
2. The AI tool offers limited support for human decision-making and may occasionally replace or undermine human agency.
3. The AI tool is designed to moderately augment human decision-making but may still have some issues with replacing or undermining human agency.
4. The AI tool is highly focused on augmenting human decision-making with only minor issues in maintaining human agency and oversight.
5. The AI tool is designed to perfectly augment human decision-making, fully supporting human agency and oversight.

Data Privacy & Security (1-5)

1. The AI tool does not adhere to data protection regulations, lacks secure data storage and usage policies, and is not transparent about data usage and processing.
2. The AI tool partially adheres to data protection regulations but has limited security policies and little transparency about data usage and processing.
3. The AI tool mostly adheres to data protection regulations, has moderately secure data storage and usage policies, and is somewhat transparent about data usage and processing.
4. The AI tool adheres to data protection regulations, has secure data storage and usage policies, but may have minor issues with transparency about data usage and processing.
5. The AI tool fully adheres to data protection regulations, has highly secure data storage and usage policies, and is completely transparent about data usage and processing.

Accessibility & Inclusion (1-5)

1. The AI tool is not accessible to learners with diverse needs, does not comply with accessibility guidelines, and has not been trained on diverse data sets, tested for bias, or designed to avoid discrimination and promote fairness.
2. The AI tool has limited accessibility for diverse learners, partially complies with accessibility guidelines, and shows minimal effort in addressing bias and promoting diversity and fairness.
3. The AI tool is moderately accessible to learners with diverse needs, mostly complies with accessibility guidelines, and has been trained on diverse data sets and tested for bias, but may still have some issues with discrimination and fairness.
4. The AI tool is highly accessible to learners with diverse needs, complies with accessibility guidelines, and has been trained on diverse data sets and tested for bias, with minor issues in avoiding discrimination and promoting fairness.
5. The AI tool is fully accessible to learners with diverse needs, complies with accessibility guidelines, has been trained on diverse data sets and tested for bias, and is designed to avoid discrimination and promote diversity and fairness.

Scalability & Adaptability (1-5)

1. The AI tool cannot scale to different class sizes and does not adapt to various learning environments and contexts.
2. The AI tool has limited scalability for different class sizes and minimal adaptability to various learning environments and contexts.
3. The AI tool is moderately scalable to different class sizes and somewhat adaptable to various learning environments and contexts.
4. The AI tool is highly scalable to different class sizes and mostly adaptable to various learning environments and contexts.
5. The AI tool can seamlessly scale to different class sizes and adapt to various learning environments and contexts.

Ease of Integration (1-5)

1. The AI tool is difficult to integrate with existing teaching practices, learning management systems and other educational tools.
2. The AI tool has limited compatibility with existing teaching practices, learning management systems and other educational tools.
3. The AI tool is moderately easy to integrate with existing teaching practices, learning management systems and other educational tools but may require some additional effort.
4. The AI tool is highly compatible and mostly easy to integrate with existing teaching practices, learning management systems and other educational tools.
5. The AI tool is seamlessly compatible and effortlessly integrates with existing teaching practices, learning management systems and other educational tools.

Cost-effectiveness (1-5)

1. The AI tool does not offer a good return on investment, considering its total cost of ownership and its impact on learning outcomes and efficiencies.
2. The AI tool offers limited return on investment, with minimal improvements in learning outcomes and efficiencies relative to its total cost of ownership.
3. The AI tool offers moderate return on investment, with some improvements in learning outcomes and efficiencies relative to its total cost of ownership.
4. The AI tool offers a good return on investment, with significant improvements in learning outcomes and efficiencies relative to its total cost of ownership.
5. The AI tool offers an excellent return on investment, with outstanding improvements in learning outcomes and efficiencies relative to its total cost of ownership.

The DOMS™️ AI-Ed Tools Evaluation Rubric in Action

I’ve used the rubric to score around 20 of the most popular AI-ed tools, with some super interesting results. Here’s what I found when I reviewed ChatGPT 3.5:

ChatGPT 3.5

Pedagogical Quality: 4 - Effectively engages learners with active learning and problem-solving tasks but may have occasional shortcomings in feedback and assessment.
Accessibility & Inclusion: 3 - Moderately accessible, mostly complies with guidelines but may have issues in addressing bias and promoting diversity.
Data Privacy & Security: 4 - Adheres to data protection regulations and has secure policies with minor transparency issues.
Scalability & Adaptability: 4 - Highly scalable for different class sizes and mostly adaptable to various learning environments.
Ease of Integration: 4 - Highly compatible and easy to integrate with most teaching practices and educational tools.
Cost-effectiveness: 4 - Good return on investment with significant improvements in learning outcomes and efficiencies.
Reliability & Ethics:
- Quality of Data Used: 4 - Highly reliable and accurate data with minor issues in validity and trustworthiness.
- Objectivity and Fairness: 3 - Mostly ensures objectivity and fairness, but biases will be present due to a limited data set and sensitivity to prompt engineering.
- Transparency and Explainability: 4 - OpenAI are transparent about sources and provide detailed explanations of benefits and risks of the tool.
- Human Agency and Oversight: 4 - Highly focused on augmenting human decision-making with minor issues in maintaining human agency.
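
If you want to compare several tools side by side, it can help to record each review as data. The snippet below is a minimal sketch rather than part of the rubric itself: it stores the ChatGPT 3.5 scores listed above, checks that each one sits in the 1-5 range, and rolls them up into a single number. Two things in it are my own assumptions and not defined by the rubric: the four Reliability & Ethics sub-scores are averaged into one criterion score, and the overall figure is a plain unweighted mean of the seven criteria. The names `criterion_score` and `summarise` are hypothetical helpers made up for this example.

```python
from statistics import mean

# Scores copied from the ChatGPT 3.5 review above.
chatgpt_35 = {
    "Pedagogical Quality": 4,
    "Reliability & Ethics": {
        "Quality of Data Used": 4,
        "Objectivity and Fairness": 3,
        "Transparency and Explainability": 4,
        "Human Agency and Oversight": 4,
    },
    "Data Privacy & Security": 4,
    "Accessibility & Inclusion": 3,
    "Scalability & Adaptability": 4,
    "Ease of Integration": 4,
    "Cost-effectiveness": 4,
}


def criterion_score(value):
    """Return a single score for one criterion.

    Assumption: a criterion with sub-criteria is scored as the plain
    average of its sub-scores. The rubric does not prescribe this.
    """
    scores = list(value.values()) if isinstance(value, dict) else [value]
    for score in scores:
        if not 1 <= score <= 5:
            raise ValueError(f"score {score} is outside the rubric's 1-5 range")
    return mean(scores)


def summarise(tool_scores):
    """Return per-criterion scores and a hypothetical unweighted overall mean."""
    per_criterion = {
        name: criterion_score(value) for name, value in tool_scores.items()
    }
    overall = mean(list(per_criterion.values()))
    return per_criterion, overall


per_criterion, overall = summarise(chatgpt_35)
for name, score in per_criterion.items():
    print(f"{name}: {score:.2f}")
print(f"Overall (unweighted mean, an assumption): {overall:.2f}")
```

With the scores above, Reliability & Ethics averages to 3.75 and the overall unweighted mean comes out at about 3.82. If some criteria matter more in your setting (data privacy for younger learners, say), you could swap the plain mean for a weighted one; the weights would be your call, as the rubric doesn't define any.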

Download a ready-to-use version of the rubric via Gumroad here. Try it and let me know what you find!


Happy experimenting, Phil 👋

Dr Phil's Newsletter, Powered by DOMS™️ AI
