Preseason

© 2026 Preseason. All rights reserved.

LLM Evals

Promptfoo vs Braintrust

Promptfoo 4% vs Braintrust 96%

Leading: Braintrust (96.5%)

Statistics

| Metric | Value |
| --- | --- |
| Promptfoo wins | 11 |
| Braintrust wins | 303 |
| Abstains (no tool) | 36 |
| Other tool chosen | 650 |
| Decisive cases | 314 |
| Promptfoo win rate (unweighted) | 3.5% |
| 95% CI | 2.0% - 6.2% |
| Promptfoo win rate (weighted) | 3.5% |
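
The unweighted win rate and its interval can be reproduced from the two win counts. The sketch below is an illustration, not the site's documented method: the page does not say how the 95% CI is computed, and the Wilson score interval is used here only because it happens to reproduce the published bounds. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n == 0:
        raise ValueError("no decisive cases")
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts taken from the statistics above.
promptfoo_wins = 11
braintrust_wins = 303

# Decisive cases exclude abstains and "other tool" picks.
decisive = promptfoo_wins + braintrust_wins
win_rate = promptfoo_wins / decisive
low, high = wilson_interval(promptfoo_wins, decisive)

print(f"decisive={decisive} win_rate={win_rate:.1%} CI={low:.1%} - {high:.1%}")
```

Only 314 of the 1,000 responses are decisive, so the interval describes head-to-head picks and says nothing about the 650 cases where another tool was chosen.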

Per-model breakdown

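| Model | Tier | Promptfoo | Braintrust | None | Other | A rate (Promptfoo) |
| --- | --- | --- | --- | --- | --- | --- |
| GPT 5.3 Codex | Frontier | 0 | 51 | 0 | 3 | 0% |
| Claude Opus 4.6 | Frontier | 0 | 46 | 0 | 8 | 0% |
| Kimi K2.5 | Frontier | 1 | 44 | 0 | 3 | 2% |
| Claude Haiku 4.5 | Small | 0 | 44 | 1 | 7 | 0% |
| GPT 5.4 | Frontier | 0 | 39 | 0 | 15 | 0% |
| GLM 5 Turbo | Frontier | 4 | 32 | 7 | 11 | 11% |
| Claude Sonnet 4.6 | Frontier | 0 | 28 | 0 | 26 | 0% |
| MiniMax M2.7 | Frontier | 0 | 17 | 1 | 34 | 0% |
| Mistral Small 4 | Mid | 4 | 0 | 0 | 47 | 100% |
| MiMo V2 Pro | Frontier | 2 | 1 | 2 | 49 | 67% |
| GPT 5.4 Mini | Mid | 0 | 1 | 1 | 52 | 0% |
| DeepSeek R1 0528 | Frontier | 0 | 0 | 2 | 52 | n/a |
| DeepSeek V3.2 | Mid | 0 | 0 | 9 | 43 | n/a |
| Devstral 2 2512 | Mid | 0 | 0 | 1 | 50 | n/a |
| Gemini 2.5 Flash | Small | 0 | 0 | 0 | 52 | n/a |
| Gemini 2.5 Pro | Frontier | 0 | 0 | 6 | 48 | n/a |
| Llama 4 Maverick | Frontier | 0 | 0 | 0 | 53 | n/a |
| Llama 4 Scout | Small | 0 | 0 | 3 | 46 | n/a |
| Qwen3 Coder Next | Mid | 0 | 0 | 3 | 51 | n/a |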
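
The "A rate" column appears to be Promptfoo wins divided by decisive cases (Promptfoo plus Braintrust) for that model, shown as n/a when a model never picked either tool. A minimal sketch of that reading follows; the helper name `a_rate` and the rounding to a whole percent are assumptions inferred from the rows above, not something the page states.

```python
def a_rate(promptfoo: int, braintrust: int) -> str:
    """Promptfoo share of decisive cases, or "n/a" if there are none (assumed rule)."""
    decisive = promptfoo + braintrust
    if decisive == 0:
        return "n/a"
    return f"{round(100 * promptfoo / decisive)}%"


# (Promptfoo wins, Braintrust wins) for a few rows of the per-model table.
rows = {
    "Kimi K2.5": (1, 44),
    "GLM 5 Turbo": (4, 32),
    "Mistral Small 4": (4, 0),
    "MiMo V2 Pro": (2, 1),
    "Gemini 2.5 Flash": (0, 0),
}

rates = {model: a_rate(*counts) for model, counts in rows.items()}
for model, rate in rates.items():
    print(f"{model}: {rate}")
```

Rates built on very few decisive cases, such as 100% from 4 picks or 67% from 3, are far noisier than the aggregate figure.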

Per-prompt breakdown

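| Prompt | Tier | Promptfoo | Braintrust | None | Other | A rate (Promptfoo) |
| --- | --- | --- | --- | --- | --- | --- |
| ai-revenue-ops-copilot | Beginner | 2 | 63 | 4 | 99 | 3% |
| ai-revenue-ops-copilot | Advanced | 2 | 63 | 1 | 98 | 3% |
| ai-revenue-ops-copilot | Intermediate | 4 | 49 | 1 | 110 | 8% |
| ai-support-agent-platform | Advanced | 1 | 52 | 1 | 115 | 2% |
| ai-support-agent-platform | Intermediate | 2 | 37 | 4 | 123 | 5% |
| ai-support-agent-platform | Beginner | 0 | 39 | 25 | 105 | 0% |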
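
The per-prompt rows can be checked against the headline statistics by summing each column. The sketch below assumes the digit runs in each row are split in the one way that makes the columns reconcile (for example, 1 abstain and 98 other for the Advanced revenue-ops row); the variable names are illustrative.

```python
# (prompt, tier, Promptfoo, Braintrust, None, Other); the column split is an assumption.
per_prompt = [
    ("ai-revenue-ops-copilot", "Beginner", 2, 63, 4, 99),
    ("ai-revenue-ops-copilot", "Advanced", 2, 63, 1, 98),
    ("ai-revenue-ops-copilot", "Intermediate", 4, 49, 1, 110),
    ("ai-support-agent-platform", "Advanced", 1, 52, 1, 115),
    ("ai-support-agent-platform", "Intermediate", 2, 37, 4, 123),
    ("ai-support-agent-platform", "Beginner", 0, 39, 25, 105),
]

# Sum each count column across the six prompt/tier rows.
totals = [sum(col) for col in zip(*(row[2:] for row in per_prompt))]
labels = ["Promptfoo", "Braintrust", "None", "Other"]

for label, total in zip(labels, totals):
    print(f"{label}: {total}")
print(f"All responses: {sum(totals)}")
```

Under this split the column sums match the Statistics figures (11, 303, 36, 650), which is the main reason to prefer it over other ways of separating the digits.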