• Playlab

Model Migration Update

We have migrated to Claude Haiku 4.5. This move provides significant speed and price improvements over Sonnet 4.5, while delivering a performance boost over Haiku 3.5 and Kimi K2.

Timeline

Chronological progression of work from October 2025, documenting key decisions, pivots, and migration milestones in the optimization of Ghana educational apps.

Initial Cost Analysis
October 1, 2025

Identified that the apps are too expensive to run. The Intervention Courses - Teacher Planning app is the most used and currently runs on Claude Sonnet 4.5. Decided to keep Sonnet 4.5 for this critical app for now given its importance, while leaving open a later move to a cheaper model.

Key Decisions:

  • Intervention Courses app stays on Sonnet 4.5 given importance
  • Skip Ghana optimization for now - data was from testing, not real-world usage

Rate Limits and Cost Concerns
October 2, 2025

Sonnet models are hitting rate limits and incurring high costs. Goal: move as many apps as possible to Kimi K2, which has the best tool-calling ability among the cheap models. Started reviewing Sonnet conversations and replaying the same conversations with Kimi K2.

Doubling Down on Kimi K2 Migration
October 3, 2025

Costs became unsustainable. Doubling down on moving all apps to Kimi K2 and improving tool calling. Also addressing an issue where Claude Sonnet consistently generates assessments in which option B is the correct answer.

Reference Structure Migration
October 6, 2025

Moving all apps from the current reference structure to Teaghan's general .md file structure. The new plan focuses on three things: cost-cutting experiments with lower-cost models, testing whether adding more inputs reduces conversation length, and optimizing the workflow so that the lesson plan is generated only once.

Key Decisions:

  • Undo all reference modifications done previously
  • Focus on cost cutting and workflow optimization

RAG Tool Migration
October 8, 2025

Migrated apps from the traditional RAG method to the 'Search References Tool'. Updated multiple subject-specific apps, including Physics, Performing Arts, PEH, Food and Nutrition, Clothing and Textiles, Government, Geography, Biology, and many Ghanaian Language apps.

Multi-Model Testing
October 11, 2025

Testing the Intervention, English, Math, and Economics apps across Qwen, Gemini 2.5 Flash Preview, and Kimi K2. Added a disclaimer that MCQs must be double-checked. Removed the subject choice requirement. Gemini was disqualified due to poor tool calling.

Watch Me Work

Deep Dive