AI on Item Response Theory (IRT)

I’m learning about and working with Item Response Theory (IRT). I’d heard of it before but didn’t know much about it, so Gemini created a useful summary for me:

1. The Core Concept: The “High Jump” Analogy

Think of your math course as a stadium full of high-jump bars.

The Item (The Bar): Every math problem in your course is a bar set at a specific height. A “2+2” bar is 1 inch off the ground. A “Multivariable Calculus” bar is 7 feet high.

The Student (The Athlete): Every student has a “vertical leap” score.

The Prediction: If a student has a 5-foot leap and approaches a 4-foot bar, IRT predicts they have a very high chance of clearing it. If they approach a 6-foot bar, they will likely fail.

The IRT Magic: In old-fashioned testing, if you get 10/10 on an easy test, you look like a genius. In IRT, the system says, “You cleared ten 1-inch bars. That doesn’t mean you’re a genius; it just means we know your leap is at least 1 inch.”

2. How to “Embed” Difficulty in Your Course

In IRT, we give everything a numerical value (usually between -3 and +3, but you can use any scale).

The Step-by-Step Design:

A. Calibrate the “Items”

For every item in your math course (e.g., “Integration by Parts”), you need a Difficulty Parameter.

Analogy: Think of this as the “Gravity” of that item.

How to get it: You don’t just guess. You look at how thousands of other students performed. If students who are generally good at math keep failing this specific item, its “Gravity” (Difficulty) score goes up.
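That “auto-correcting Gravity” is, under the hood, a maximum-likelihood fit. Here is a minimal sketch under the one-parameter (Rasch) model, assuming you already have ability estimates for the students who attempted the item; the function names and the simple gradient-ascent loop are illustrative, not from any particular IRT library:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def estimate_difficulty(abilities, outcomes, steps=200, lr=0.1):
    """Fit one item's difficulty by maximum likelihood under the Rasch model.

    abilities: skill estimates of the students who attempted the item
    outcomes:  1 if the student got it right, 0 otherwise
    """
    b = 0.0  # start from an average "community guess"
    for _ in range(steps):
        # Gradient of the log-likelihood w.r.t. difficulty: sum of (p - y).
        grad = sum(sigmoid(theta - b) - y for theta, y in zip(abilities, outcomes))
        b += lr * grad / len(abilities)
    return b

# Four generally strong students (ability 2.0) all fail the item,
# so its estimated "Gravity" climbs well above 2.0:
print(estimate_difficulty([2.0, 2.0, 2.0, 2.0], [0, 0, 0, 0]))
```

If strong students keep failing, every gradient step pushes the difficulty up, which is exactly the behavior described above.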

B. Create the Student “Skill Profile”

Instead of a grade (like an A or B), the student’s profile has a single number representing their Latent Ability.

Analogy: This is the student’s “Power Level.”

How it works: Every time a student interacts with an item, their Power Level shifts. If they solve a “Difficulty 2.0” problem, their Power Level rises toward 2.0. If they fail a “Difficulty 0.5” problem, their Power Level sinks.
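A common lightweight way to implement that shifting Power Level is an Elo-style online update, which nudges the estimate after each interaction. A sketch, assuming the Rasch success probability; the learning rate of 0.3 is an arbitrary choice:

```python
import math

def update_ability(ability, difficulty, correct, lr=0.3):
    """Nudge the student's ability estimate toward each observed result.

    The step is proportional to surprise: failing a problem you were expected
    to solve moves the estimate more than failing one far above your level.
    """
    expected = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    actual = 1.0 if correct else 0.0
    return ability + lr * (actual - expected)

ability = 1.0
ability = update_ability(ability, 2.0, correct=True)   # rises toward 2.0
ability = update_ability(ability, 0.5, correct=False)  # sinks back down
```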

3. Making the Prediction (The “Probability Curve”)

IRT doesn’t just say “Yes, they will pass” or “No, they will fail.” It gives you a probability.

Imagine an “S-shaped” curve.

If the student’s Skill is exactly equal to the Item’s Difficulty, the IRT model predicts a 50% chance of success.

This is the “Learning Sweet Spot.” If a student has a 50-70% chance of getting a problem right, they are challenged but not frustrated. This is exactly where you want to point them in your course.
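That S-shaped curve is the one-parameter logistic (Rasch) model. A minimal sketch in Python (the function name is mine, not from any particular IRT library):

```python
import math

def p_correct(ability: float, difficulty: float) -> float:
    """Rasch (1PL) model: probability a student answers an item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Skill exactly equal to difficulty -> exactly 50%:
print(p_correct(1.0, 1.0))            # 0.5
# The 50-70% "sweet spot" is when skill sits a little above difficulty:
print(round(p_correct(1.5, 1.0), 2))  # 0.62
```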

4. Prerequisites Enhance IRT

Traditional IRT treats questions like they are in a vacuum. But you have “Prerequisites.”

The “Bridge” Metaphor: In a course, you know that “Basic Derivatives” is a bridge you must cross to get to “Optimization Problems.”

Recommendation: If a student fails an “Optimization” item (Difficulty 2.5), your system shouldn’t just say “You’re bad at math.” It should look back across the course to the “Derivative” item (Difficulty 1.2). If the student’s Skill is lower than the Derivative item, the system realizes the “bridge” is broken and recommends they go back and fix that specific item first.
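That “broken bridge” check is straightforward to express in code. A sketch, where the item names and the `prereqs` mapping are purely illustrative:

```python
def recommend_next(student_skill, failed_item, prereqs):
    """After a failed item, check its prerequisites before concluding the
    student simply lacks skill. `prereqs` maps item -> list of
    (prerequisite name, prerequisite difficulty) pairs."""
    for name, difficulty in prereqs.get(failed_item, []):
        if student_skill < difficulty:
            return f"Review '{name}' first (difficulty {difficulty})"
    return f"Retry '{failed_item}' with scaffolding"

# "Optimization" (difficulty 2.5) depends on "Basic Derivatives" (1.2):
prereqs = {"Optimization": [("Basic Derivatives", 1.2)]}
print(recommend_next(1.0, "Optimization", prereqs))
# A skill of 1.0 is below the 1.2 bridge, so the system points back
# to the derivative item instead of saying "you're bad at math".
```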

5. Practical Implementation Tips for a Non-Technical Designer

The “Start-up” Problem: When you first build the course, you won’t know how hard the items are.

Solution: Give them a “Community Guess” difficulty (e.g., Algebra = 1, Calculus = 3). As real students use the system, the IRT math will “auto-correct” those numbers based on actual success rates.

The Student “Warm-up”: Don’t give a new student a Calculus problem immediately.

Solution: Give them a “Calibration Test” (3 easy, 3 medium, 3 hard). This quickly finds their “Power Level” so the course can light up the items that are in their “Sweet Spot.”
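The calibration test can seed the Power Level by running a few maximum-likelihood steps over the nine warm-up responses. A sketch under the Rasch model, with illustrative difficulties of -1 (easy), 0 (medium), and +1 (hard):

```python
import math

def calibration_estimate(responses, steps=100, lr=0.2):
    """Estimate a new student's ability from (difficulty, correct) pairs
    via a few gradient-ascent steps on the Rasch log-likelihood."""
    theta = 0.0
    for _ in range(steps):
        grad = sum((1.0 if correct else 0.0)
                   - 1.0 / (1.0 + math.exp(-(theta - b)))
                   for b, correct in responses)
        theta += lr * grad / len(responses)
    return theta

# 3 easy (-1), 3 medium (0), 3 hard (+1) warm-up items:
quiz = [(-1, True), (-1, True), (-1, True),
        (0, True), (0, True), (0, False),
        (1, False), (1, False), (1, False)]
print(calibration_estimate(quiz))  # a "Power Level" a bit above average
```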

The “Discrimination” Factor (The “Telling” Question): Some questions are better at testing skill than others.

Analogy: A “Trick Question” is a bad test of skill because even geniuses get it wrong. In IRT, you can label items with a “Discrimination” score. High discrimination means: “If you get this right, it almost certainly means you’ve mastered this level.”
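Adding a discrimination parameter turns the Rasch model into the two-parameter logistic (2PL) model, where a steeper curve means a more “telling” question. A sketch:

```python
import math

def p_correct_2pl(ability, difficulty, discrimination):
    """2PL model: discrimination scales how sharply the S-curve separates
    students below the item's difficulty from students above it."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

# A highly "telling" item separates skill levels sharply...
print(round(p_correct_2pl(1.5, 1.0, 2.5), 2))  # 0.78
# ...while a low-discrimination "trick question" barely does:
print(round(p_correct_2pl(1.5, 1.0, 0.3), 2))  # 0.54
```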

Nano Banana, Close Enough

Dr. Philippa Hardman has written an excellent article on using Google’s Nano Banana AI image generation tool to support learning. The piece outlines six evidence-based use cases that go far beyond simple infographics: visualization, analogy, worked examples, contrasting cases, elaboration, and generation. Each strategy is grounded in decades of cognitive and educational research, and Hardman provides concrete prompts that instructional designers can immediately put to use.

The article also reinforces a critical lesson I’ve learned from my own experiences with AI: often it’s close enough, but it’s critical to review the outputs carefully.

Nano Banana's worked example for tying a bowline knot (from Hardman's article)

Take, for example, the worked example image that Hardman includes in their article—a 5-step visual guide for tying a bowline knot. The bowline is a fundamental knot used in countless situations, from sailing to rescue operations to everyday tasks. When tied correctly, it’s reliable and secure. When tied incorrectly, it can fail catastrophically.

The Nano Banana-generated image contains errors in the knot-tying sequence. As a sailor, I spotted the mistake immediately. This isn’t a criticism of Hardman’s work; they use the image as an example of the tool’s capabilities, not as a knot-tying tutorial. Rather, it’s a reminder that even when AI produces something that looks professional and well-organized, domain expertise and careful review remain essential.

So yes, use Nano Banana to create worked examples, visualizations, and contrasting cases. But always review the outputs with the same professional rigor you’d apply to any instructional material. Because when it comes to teaching and learning, “close enough” isn’t good enough.

AIxED Recap

I attended the AIxED conference in Boston on November 21, 2025. Here are my main takeaways.

We’re all in this together

Education is currently navigating a period of Future Shock. Institutions, administrators, and faculty are struggling to keep pace with the acceleration of AI technology. Nobody has the complete answer. The policy landscape is lagging behind student and faculty usage. However, this struggle is universal. The overwhelming message from the conference is that we are all on this journey together, learning from each other.

Courseware is about to get interesting

Artificial Intelligence has fundamentally changed the value chain of content. The lecture hall model is fading, replaced by a focus on durable skills—competencies like ethical use, critical thinking, and lifelong learning. The core job of educators is shifting: they are no longer content creators simply delivering facts but experienced content designers who craft impactful learning environments. The curriculum must evolve to teach students how to engage with this new, AI-accelerated knowledge.

Higher-ed is lagging behind

The practical adoption of AI is hampered by two key areas: administrative inertia and a lack of specific training. Students and faculty are rapidly using general-purpose tools, often far ahead of administrative policy. To close this gap, there is an urgent need for two things. First, comprehensive teacher training to ensure AI is used intentionally and ethically. Second, the development of pedagogy-first tools that integrate data and design learning experiences (like AI tutors or “guided learning modes”) rather than simply replacing existing systems.

AI's Existential Crisis in Education

I recently listened to this episode of Decoder with Nilay Patel about how AI is fueling an existential crisis in education, and it really resonated with me. The episode features interviews with teachers who are grappling with fundamental questions about their role and purpose in an AI-driven world.

One quote that particularly struck me:

“That idea that we don’t really understand AI yet, that a lot of people don’t know how it works, and that we have no long-term data about its effects in the classroom because it’s so new, well, that’s a really big point of contention that we heard from a lot of teachers.”

This captures a fundamental problem: we’re making decisions about integrating AI into education without understanding its long-term effects. We don’t have decades of research like we do with other educational interventions. We’re essentially running a massive, uncontrolled experiment on millions of students.

The podcast also highlights how we might be repeating past mistakes:

“It feels to me like we haven’t learned some key lessons, a lot of them very recent. One of those during the pandemic was the costs of unhuman teaching and learning. I worry that as we did with cell phones and over reliance on one-to-one devices, we’re going to wake up a decade or more from now and realize we jumped on a tech bandwagon that keeps kids tethered to screens, harms them and harms learning.”

The pandemic showed us the limits of screen-based learning, yet we’re now pushing AI tutors that would keep students even more tethered to devices.

The most powerful theme from the podcast is the question teachers keep asking: “What are we even doing here? What’s the point?”

When AI can write essays, solve problems, and answer questions, what’s left for human teachers? The answer, I think, is everything that matters most: understanding the student, building relationships, fostering curiosity, teaching critical thinking (not just problem-solving), and helping students to navigate the world as human beings, not just as test-takers.

1EdTech Standards

1EdTech (formerly IMS Global Learning Consortium) develops technical standards that enable interoperability between educational technology systems. These standards ensure that different platforms—learning management systems, content publishers, assessment tools, and student information systems—can communicate and share data seamlessly. This interoperability is crucial for institutions that use multiple edtech tools, as it eliminates data silos and reduces manual workarounds.

Below is a quick reference guide to the key 1EdTech standards, organized by their primary function and data flow patterns.

| Standard | Category | Layperson's Term | Key Function | Data Flow |
| --- | --- | --- | --- | --- |
| EDU-API | Foundational | Universal Language Framework 🧱 | Defines the secure, consistent structure for all data exchange APIs. | Across all 1EdTech Standards |
| Common Cartridge | Content Packaging | Digital Course Box 📦 | Packages an entire course's structure and content for portability. | Publisher → LMS |
| LTI 1.3 / Advantage | Real-time Connection | Secure Launch Button 🔗 | Securely launches external tools and returns grades immediately. | LMS ↔ External Tool |
| QTI | Assessment Format | Quiz Blueprint 📝 | Ensures assessments and questions are portable and consistently scored. | Content Bank ↔ LMS |
| Caliper Analytics | Usage Tracking | Data Sensor Language 📊 | Collects granular, standardized student activity data (time on task, clickstream). | User Activity → Data Warehouse |
| CASE | Alignment | Objective Identifier 🎯 | Provides unique IDs for skills and objectives to tag content and performance data. | State/District → All Systems |
| Open Badges | Credentialing | Digital Mini-Certificate 🏅 | Issues verifiable, digital credentials based on demonstrated skills. | LMS/Tool → Learner |
| OneRoster | Administrative Data | Administrator's Bridge 👥 | Automatically syncs student, teacher, class, and grade data across systems. | SIS ↔ LMS/Applications |
Overview of key 1EdTech standards showing their categories, simplified descriptions, functions, and data flow patterns.