Kimi K2
selected
Selected as the primary model for production. It has the best tool-calling ability among the low-cost models, but it requires explicit formatting instructions to match Claude's natural formatting.
Tool Calling Reliability
High - best among cheap models
Strengths
- Excellent tool calling ability
- Low cost
- Good instruction following
- Rated highly for not hallucinating
Weaknesses
- Requires explicit formatting instructions
- Can misinterpret the 'GHANA REFERENCES' section as the actual references
- May respond as if it had read the references even when it had not
- Needs lower variability to perform well
Key Notes
- Tool calls MUST be executed before any step that needs reference content
- Lower variability improves performance
- Requires more explicit prompting around formatting than Claude
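
The ordering rule in the first note can also be enforced in code rather than by prompt alone. The following is a minimal sketch, not the production implementation: it assumes an OpenAI-style chat message list (assistant messages carrying `tool_calls`, tool results carrying `tool_call_id`), and the tool name `fetch_references` and both function names are hypothetical.

```python
from typing import Any

# Hypothetical tool name; substitute whatever tool returns reference content.
REFERENCE_TOOL = "fetch_references"


def references_were_fetched(messages: list[dict[str, Any]]) -> bool:
    """Return True only if a reference tool call was requested AND answered.

    Assumes an OpenAI-style message list; this shape is an assumption,
    not something confirmed by the evaluation notes.
    """
    requested_ids = set()
    for msg in messages:
        if msg.get("role") != "assistant":
            continue
        for call in msg.get("tool_calls") or []:
            if call.get("function", {}).get("name") == REFERENCE_TOOL:
                requested_ids.add(call.get("id"))

    # A tool result counts only if it carries non-empty content.
    answered_ids = {
        msg.get("tool_call_id")
        for msg in messages
        if msg.get("role") == "tool" and msg.get("content")
    }
    return bool(requested_ids & answered_ids)


def require_references(messages: list[dict[str, Any]]) -> None:
    """Block any reference-dependent step until the tool result is present."""
    if not references_were_fetched(messages):
        raise RuntimeError(
            "Reference content has not been fetched; run the tool call first."
        )
```

Calling `require_references(history)` immediately before a reference-dependent step stops the workflow when the model answers as if it had read references that were never fetched.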
Analysis
Kimi K2 was selected as the primary model because it combines excellent tool calling with low cost. However, it needs careful prompt engineering to match Claude's natural formatting. The model also tends to respond as if it had read the references even when the tool calls have not been executed, so the prompt must include explicit workflow instructions.