Analyzing Qualitative Student Data: A Free and Safe Machine Learning Solution
When cutting-edge is overkill
Over the course of Semester 1, our MS teachers have written 3,624 notes commending students’ progress and documenting their challenges (both academic and behavioral).
An AI workflow then allowed us to:
🔒 Ensure these notes were anonymous
🏷️ Classify them into relevant categories (type, severity)
🚨 Identify cases of concern for our Monday Student Review Team
✉️ Generate synthetic, weekly parent emails
📊 Conduct data analysis
Reflecting on this at the start of Winter break, I thought: the benefits are obvious, but step 1 is a bit of a hassle, and step 2 is arguably an over-the-top workaround, since we use a state-of-the-art large language model for a simple classification task it was not specifically trained for.
This prompted me to wonder: what alternative approach could provide 100% data protection, at $0 cost, and with equal or greater accuracy?
The answer was a simple app that lets users:
📁 Define custom categories
🏷️ Label a small sample of notes
🤖 Create a machine learning model using logistic regression to classify notes based on their embeddings
📊 Analyze individual, temporal, group, and disaggregated data
Embeddings were the key here. They convert text into vectors (lists of numbers - here, 384 per note) that capture meaning, based on how words are used in the embedding model's training corpus, in a form that machines can learn to associate with given labels.
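To show the mechanism, here is a minimal sketch in plain NumPy: vectors standing in for note embeddings, and a logistic regression trained by gradient descent to separate two categories. The clusters are synthetic and illustrative, not real notes or real embeddings, and the dimensions are shrunk for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for sentence embeddings: in the real app each note would be
# a 384-dimensional vector from a locally downloaded embedding model; here
# we fabricate two well-separated clusters ("commendation" vs "concern").
dim = 16
commend = rng.normal(loc=+1.0, scale=0.5, size=(30, dim))
concern = rng.normal(loc=-1.0, scale=0.5, size=(30, dim))
X = np.vstack([commend, concern])
y = np.array([1] * 30 + [0] * 30)

# Minimal logistic regression trained by batch gradient descent.
w, b = np.zeros(dim), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid -> predicted probability
    w -= 0.5 * (X.T @ (p - y)) / len(y)     # gradient step on weights
    b -= 0.5 * np.mean(p - y)               # gradient step on bias

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
accuracy = float(np.mean(preds == y))
print(f"training accuracy: {accuracy:.2f}")
```

In practice a library classifier would replace the hand-rolled loop, but the idea is the same: the model learns one weight per embedding dimension, which is why a small labeled sample can go a long way.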
As can be seen in the video:
💻 The app runs entirely locally. All data stays on the user’s computer, where the embedding model is downloaded and the ML model is trained
✅ 60 labeled examples were enough to achieve 83.3% accuracy. On review, the only “error” was an ambivalent commendation classified as a behavior issue
🎯 Adding examples (and an “ambivalent” category) could probably achieve near-perfect classification - although it is unclear what perfection means here, as my own test-retest reliability was on par with the model.
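The workflow above - label a small sample, train on the embeddings, check accuracy on held-out notes - can be sketched with scikit-learn. Everything below is synthetic and hypothetical: the 384-dimensional vectors stand in for real note embeddings, and the three categories and cluster geometry are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-ins for note embeddings: three categories
# (e.g. commendation / academic issue / behavior issue), 24 notes each.
dim, per_class = 384, 24
centers = rng.normal(size=(3, dim))
X = np.vstack([c + 0.4 * rng.normal(size=(per_class, dim)) for c in centers])
y = np.repeat([0, 1, 2], per_class)

# Hold out 12 notes for evaluation, leaving 60 labeled training examples.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=12, random_state=0, stratify=y
)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
score = clf.score(X_te, y_te)
print(f"held-out accuracy: {score:.3f}")
```

On clean synthetic clusters the score will be near-perfect; real notes are messier (ambivalent commendations, overlapping categories), which is where adding labeled examples and extra categories pays off.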
Interestingly, analyses derived from these classifications further supported the model's accuracy, as they confirmed that:
📈 Grade 7 had the greatest number of in-class behavior challenges
📉 Grade 8 presented more academic issues (missing work) and “outside” behavior issues
While still experimental, this goes to show that:
🛠️ AI innovation is not necessarily about cutting-edge solutions (the latest, most powerful model), but rather about using the right tool for the job (even simple embedding and machine learning models)
💡 Costly and resource-intensive SOTA models (Claude Code, here) might be best used to create these “just right” workflows (rather than to replace them)

