Preseason

© 2026 Preseason. All rights reserved.

LLM Evals

Braintrust vs LangSmith

Braintrust 48% vs LangSmith 52%

Leading: LangSmith (52.1%)

Statistics

Metric                             Value
Braintrust wins                    303
LangSmith wins                     330
Abstains (no tool)                 36
Other tool chosen                  331
Decisive cases                     633
Braintrust win rate (unweighted)   47.9%
95% CI                             44.0% - 51.8%
Braintrust win rate (weighted)     47.9%
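
The unweighted win rate and its interval can be reproduced from the two win counts. The sketch below is illustrative only: it assumes the rate is computed over decisive cases (Braintrust wins plus LangSmith wins) and that the interval is a normal-approximation (Wald) interval. The page does not state which interval method it uses, and the function name `win_rate_with_ci` is hypothetical.

```python
import math


def win_rate_with_ci(wins_a: int, wins_b: int, z: float = 1.959964):
    """Return (rate, low, high) for tool A over decisive cases only.

    Assumption: a Wald (normal-approximation) interval; the source page
    does not say which method produced its published 95% CI.
    """
    decisive = wins_a + wins_b
    if decisive == 0:
        raise ValueError("no decisive cases")
    rate = wins_a / decisive
    half_width = z * math.sqrt(rate * (1 - rate) / decisive)
    # Clamp to [0, 1] so extreme rates never produce an impossible bound.
    return rate, max(0.0, rate - half_width), min(1.0, rate + half_width)


# Counts taken from the Statistics table: 303 Braintrust wins, 330 LangSmith wins.
rate, low, high = win_rate_with_ci(303, 330)
print(f"{rate:.1%} (95% CI {low:.1%} - {high:.1%})")
# prints: 47.9% (95% CI 44.0% - 51.8%)
```

Abstains and "other tool" answers are excluded from the denominator here, which is why the rate is 303 / 633 rather than 303 / 1000.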

Per-model breakdown

Model              Tier      Braintrust  LangSmith  None  Other  A rate (Braintrust)
GPT 5.3 Codex      Frontier  51          3          0     0      94%
Claude Haiku 4.5   Small     44          3          1     4      94%
Claude Opus 4.6    Frontier  46          0          0     8      100%
GPT 5.4            Frontier  39          7          0     8      85%
Claude Sonnet 4.6  Frontier  28          18         0     8      61%
Kimi K2.5          Frontier  44          0          0     4      100%
DeepSeek R1 0528   Frontier  0           44         2     8      0%
GLM 5 Turbo        Frontier  32          10         7     5      76%
Gemini 2.5 Pro     Frontier  0           42         6     6      0%
MiniMax M2.7       Frontier  17          23         1     11     43%
Mistral Small 4    Mid       0           40         0     11     0%
GPT 5.4 Mini       Mid       1           36         1     16     3%
Qwen3 Coder Next   Mid       0           35         3     16     0%
DeepSeek V3.2      Mid       0           32         9     11     0%
MiMo V2 Pro        Frontier  1           26         2     25     4%
Llama 4 Maverick   Frontier  0           8          0     45     0%
Devstral 2 2512    Mid       0           2          1     48     0%
Gemini 2.5 Flash   Small     0           1          0     51     0%
Llama 4 Scout      Small     0           0          3     46     n/a
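
The "A rate" column appears to be the Braintrust share of decisive cases for each row, with "n/a" when a model never picked either tool. The sketch below is a minimal illustration under that assumption. The function name `a_rate` is hypothetical, and the half-up rounding is inferred from the MiniMax M2.7 row (17 of 40 decisive cases, shown as 43%), not stated by the page.

```python
def a_rate(braintrust: int, langsmith: int) -> str:
    """Format the Braintrust share of decisive cases as a whole percent.

    Assumptions: "None" and "Other" answers are ignored, and ties at .5
    round up (inferred from the table, not documented by the source).
    """
    decisive = braintrust + langsmith
    if decisive == 0:
        return "n/a"  # no decisive cases, so the rate is undefined
    # Integer round-half-up of 100 * braintrust / decisive,
    # avoiding Python's round(), which rounds halves to even.
    percent = (200 * braintrust + decisive) // (2 * decisive)
    return f"{percent}%"


# Rows taken from the per-model table: (Braintrust wins, LangSmith wins).
print(a_rate(51, 3))   # GPT 5.3 Codex
print(a_rate(17, 23))  # MiniMax M2.7
print(a_rate(0, 0))    # Llama 4 Scout
```

Integer arithmetic is used so that an exact half such as 42.5 rounds the same way on every platform.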

Per-prompt breakdown

Prompt                     Tier          Braintrust  LangSmith  None  Other  A rate (Braintrust)
ai-revenue-ops-copilot     Intermediate  49          73         1     41     40%
ai-support-agent-platform  Advanced      52          60         1     56     46%
ai-support-agent-platform  Intermediate  37          74         4     51     33%
ai-revenue-ops-copilot     Beginner      63          47         4     54     57%
ai-revenue-ops-copilot     Advanced      63          45         1     55     58%
ai-support-agent-platform  Beginner      39          31         25    74     56%