Preseason

© 2026 Preseason. All rights reserved.

LLM Evals

Patronus AI vs TruLens

Patronus AI 50% vs TruLens 50%
Insufficient data
This matchup has 14 decisive cases (minimum 30 required for publication).

Statistics

Metric | Value
Patronus AI wins | 7
TruLens wins | 7
Abstains (no tool) | 90
Other tool chosen | 2340
Decisive cases | 14
Patronus AI win rate (unweighted) | 50.0%
95% CI | 26.8% - 73.2%
Patronus AI win rate (weighted) | 50.0%
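
The page does not say how the 95% CI is computed. The published bounds (26.8% - 73.2% for 7 wins out of 14 decisive cases) are consistent with a Wilson score interval, so the sketch below assumes that method. The function name `wilson_interval` and the constant `MIN_DECISIVE` are illustrative names, not taken from the site; the threshold of 30 is the one stated above.

```python
import math

MIN_DECISIVE = 30  # publication threshold stated on the page


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("need at least one decisive case")
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


patronus_wins, trulens_wins = 7, 7
decisive = patronus_wins + trulens_wins  # abstains and "other tool" cases are excluded

low, high = wilson_interval(patronus_wins, decisive)
print(f"win rate {patronus_wins / decisive:.1%}, 95% CI {low:.1%} - {high:.1%}")
print("publishable" if decisive >= MIN_DECISIVE else "insufficient data")
```

With 14 decisive cases the interval spans almost half of the possible range, which is why the matchup is flagged as insufficient data.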

Per-model breakdown

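Model | Tier | Patronus AI | TruLens | None | Other | A rate
MiMo V2 Pro | Frontier | 4 | 0 | 8 | 120 | 100%
Gemini 2.5 Pro | Frontier | 2 | 1 | 9 | 120 | 67%
DeepSeek R1 0528 | Frontier | 0 | 3 | 7 | 122 | 0%
Mistral Small 4 | Mid | 0 | 3 | 1 | 120 | 0%
GPT 5.4 | Frontier | 1 | 0 | 0 | 131 | 100%
Claude Haiku 4.5 | Small | 0 | 0 | 1 | 124 | n/a
Claude Opus 4.6 | Frontier | 0 | 0 | 0 | 132 | n/a
Claude Sonnet 4.6 | Frontier | 0 | 0 | 0 | 132 | n/a
DeepSeek V3.2 | Mid | 0 | 0 | 22 | 106 | n/a
Devstral 2 2512 | Mid | 0 | 0 | 4 | 121 | n/a
Gemini 2.5 Flash | Small | 0 | 0 | 1 | 126 | n/a
GLM 5 Turbo | Frontier | 0 | 0 | 19 | 113 | n/a
GPT 5.3 Codex | Frontier | 0 | 0 | 0 | 132 | n/a
GPT 5.4 Mini | Mid | 0 | 0 | 3 | 129 | n/a
Kimi K2.5 | Frontier | 0 | 0 | 3 | 116 | n/a
Llama 4 Maverick | Frontier | 0 | 0 | 0 | 127 | n/a
Llama 4 Scout | Small | 0 | 0 | 4 | 117 | n/a
MiniMax M2.7 | Frontier | 0 | 0 | 5 | 124 | n/a
Qwen3 Coder Next | Mid | 0 | 0 | 3 | 128 | n/a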
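
The "A rate" column is not defined on the page. The rows are consistent with A being Patronus AI and the rate being Patronus AI wins divided by decisive cases (Patronus AI wins plus TruLens wins), shown as n/a when a model produced no decisive case. The sketch below encodes that inferred rule; the function name `a_rate` is hypothetical.

```python
def a_rate(a_wins: int, b_wins: int) -> str:
    """Share of decisive cases won by tool A, or "n/a" with no decisive cases.

    Assumption: "None" and "Other" counts do not enter the rate.
    """
    decisive = a_wins + b_wins
    if decisive == 0:
        return "n/a"
    return f"{a_wins / decisive:.0%}"


# (model, Patronus AI wins, TruLens wins) copied from the breakdown above
rows = [
    ("MiMo V2 Pro", 4, 0),
    ("Gemini 2.5 Pro", 2, 1),
    ("DeepSeek R1 0528", 0, 3),
    ("Claude Opus 4.6", 0, 0),
]
for model, a, b in rows:
    print(f"{model}: {a_rate(a, b)}")
```

Only five of the nineteen models produced any decisive case, so most rows carry no rate at all.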

Per-prompt breakdown

Prompt | Tier | Patronus AI | TruLens | None | Other | A rate
ai-revenue-ops-copilot | Advanced | 4 | 1 | 2 | 393 | 80%
ai-support-agent-platform | Advanced | 2 | 1 | 5 | 402 | 67%
ai-support-agent-platform | Intermediate | 0 | 3 | 5 | 401 | 0%
ai-support-agent-platform | Beginner | 1 | 0 | 64 | 346 | 100%
ai-revenue-ops-copilot | Intermediate | 0 | 1 | 4 | 399 | 0%
ai-revenue-ops-copilot | Beginner | 0 | 1 | 10 | 399 | 0%