Preseason

© 2026 Preseason. All rights reserved.

LLM Evals
Methodology

LangSmith vs Braintrust

LangSmith 52% vs Braintrust 48%

Leading: LangSmith (51.7%)

Statistics

Metric                            Value
LangSmith wins                    291
Braintrust wins                   272
Abstains (no tool)                31
Other tool chosen                 299
Decisive cases                    563
LangSmith win rate (unweighted)   51.7%
95% CI                            47.6% - 55.8%
LangSmith win rate (weighted)     51.7%
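
The unweighted win rate and its interval can be reproduced from the two win counts. The sketch below is a minimal check, assuming a normal-approximation (Wald) interval over decisive cases only; the page does not state which interval method it uses, and the function name `win_rate_with_ci` is hypothetical.

```python
import math


def win_rate_with_ci(wins_a, wins_b, z=1.96):
    """Return (rate, low, high) for tool A over decisive cases.

    Assumption: a normal-approximation (Wald) interval. The page does
    not say which method it uses; this is only a plausibility check.
    """
    decisive = wins_a + wins_b
    if decisive == 0:
        raise ValueError("no decisive cases")
    rate = wins_a / decisive
    half_width = z * math.sqrt(rate * (1 - rate) / decisive)
    return rate, rate - half_width, rate + half_width


# Counts taken from the Statistics table: 291 LangSmith wins, 272 Braintrust wins.
rate, low, high = win_rate_with_ci(291, 272)
print(f"{rate:.1%} (95% CI {low:.1%} - {high:.1%})")
# prints: 51.7% (95% CI 47.6% - 55.8%)
```

Abstains and "other tool" picks are excluded from the denominator, which is why 291 + 272 = 563 matches the decisive-case count. Because the interval spans 50%, the lead is not statistically distinguishable from a tie under this assumption.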

Per-model breakdown

Model               Tier      LangSmith  Braintrust  None  Other  LangSmith rate
GPT 5.3 Codex       Frontier  3          45          0     0      6%
Claude Haiku 4.5    Small     3          39          1     3      7%
Claude Sonnet 4.6   Frontier  16         25          0     7      39%
GPT 5.4             Frontier  6          35          0     7      15%
Claude Opus 4.6     Frontier  0          41          0     7      0%
Kimi K2.5           Frontier  0          40          0     4      0%
DeepSeek R1 0528    Frontier  39         0           2     7      100%
Gemini 2.5 Pro      Frontier  38         0           5     5      100%
GLM 5 Turbo         Frontier  8          29          6     5      22%
Mistral Small 4     Mid       35         0           0     11     100%
MiniMax M2.7        Frontier  19         16          1     10     54%
GPT 5.4 Mini        Mid       33         1           1     13     97%
Qwen3 Coder Next    Mid       31         0           3     14     100%
DeepSeek V3.2       Mid       29         0           8     9      100%
MiMo V2 Pro         Frontier  22         1           1     24     96%
Llama 4 Maverick    Frontier  7          0           0     41     100%
Devstral 2 2512     Mid       1          0           1     43     100%
Gemini 2.5 Flash    Small     1          0           0     46     100%
Llama 4 Scout       Small     0          0           2     43     n/a
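
The last column of each row appears to be LangSmith's share of that model's decisive picks, with n/a when the model never chose either tool. A minimal sketch of that calculation, assuming this reading of the column (the helper name `langsmith_rate` is hypothetical):

```python
def langsmith_rate(langsmith, braintrust):
    """LangSmith's share of decisive picks as a whole percent, or 'n/a'.

    Assumption: 'None' and 'Other' picks are ignored, matching the
    figures shown in the per-model table.
    """
    decisive = langsmith + braintrust
    if decisive == 0:
        return "n/a"  # no head-to-head picks, so no rate is defined
    return f"{langsmith / decisive:.0%}"


# (LangSmith, Braintrust) counts copied from three rows of the table.
rows = {
    "Claude Sonnet 4.6": (16, 25),
    "MiniMax M2.7": (19, 16),
    "Llama 4 Scout": (0, 0),
}
for model, (ls, bt) in rows.items():
    print(model, langsmith_rate(ls, bt))
```

Rows with only one or two decisive picks, such as Devstral 2 2512, still show 100%, so the rate is best read alongside the raw counts.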

Per-prompt breakdown

Prompt                      Tier          LangSmith  Braintrust  None  Other  LangSmith rate
ai-revenue-ops-copilot      Intermediate  64         44          1     38     59%
ai-support-agent-platform   Intermediate  65         34          3     46     66%
ai-support-agent-platform   Advanced      52         46          1     52     53%
ai-revenue-ops-copilot      Beginner      42         56          3     49     43%
ai-revenue-ops-copilot      Advanced      40         56          1     50     42%
ai-support-agent-platform   Beginner      28         36          22    64     44%