| Metric | Value |
|---|---|
| PostHog wins | 719 |
| Mixpanel wins | 517 |
| Abstains (no tool) | 151 |
| Other tool chosen | 1145 |
| Decisive cases | 1236 |
| PostHog win rate (unweighted) | 58.2% |
| 95% CI | 55.4% - 60.9% |
| PostHog win rate (weighted) | 58.2% |
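
The headline figures above can be re-derived from the two win counts. The sketch below is illustrative only: it assumes "decisive cases" means PostHog wins plus Mixpanel wins (719 + 517 = 1236) and that the 95% CI is a Wilson score interval. The source does not say which interval method was used, and a normal approximation gives the same bounds at this precision. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the summary table.
posthog_wins = 719
mixpanel_wins = 517

# Assumption: abstains and "other tool" responses are excluded from the denominator.
decisive = posthog_wins + mixpanel_wins
win_rate = posthog_wins / decisive
low, high = wilson_interval(posthog_wins, decisive)

print(f"decisive={decisive} win_rate={win_rate:.1%} ci={low:.1%} - {high:.1%}")
```

Excluding abstains and other-tool responses from the denominator is what makes this a head-to-head rate rather than a share of all responses.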

| Model | Tier | PostHog | Mixpanel | None | Other | PostHog win rate |
|---|---|---|---|---|---|---|
| Llama 4 Maverick | Frontier | 1 | 95 | 0 | 39 | 1% |
| GPT 5.4 Mini | Mid | 83 | 4 | 2 | 46 | 95% |
| DeepSeek V3.2 | Mid | 39 | 46 | 8 | 41 | 46% |
| GPT 5.4 | Frontier | 71 | 8 | 0 | 55 | 90% |
| Devstral 2 2512 | Mid | 2 | 74 | 0 | 58 | 3% |
| GLM 5 Turbo | Frontier | 73 | 2 | 24 | 36 | 97% |
| Gemini 2.5 Flash | Small | 2 | 73 | 9 | 49 | 3% |
| Gemini 2.5 Pro | Frontier | 65 | 3 | 9 | 56 | 96% |
| Claude Sonnet 4.6 | Frontier | 41 | 27 | 0 | 67 | 60% |
| Claude Haiku 4.5 | Small | 26 | 42 | 1 | 61 | 38% |
| GPT 5.3 Codex | Frontier | 61 | 5 | 0 | 68 | 92% |
| Qwen3 Coder Next | Mid | 54 | 12 | 2 | 66 | 82% |
| Mistral Small 4 | Mid | 45 | 17 | 1 | 68 | 73% |
| Claude Opus 4.6 | Frontier | 40 | 22 | 0 | 73 | 65% |
| Kimi K2.5 | Frontier | 60 | 1 | 12 | 50 | 98% |
| MiniMax M2.7 | Frontier | 46 | 10 | 14 | 64 | 82% |
| MiMo V2 Pro | Frontier | 4 | 33 | 10 | 88 | 11% |
| DeepSeek R1 0528 | Frontier | 6 | 30 | 49 | 49 | 17% |
| Llama 4 Scout | Small | 0 | 13 | 10 | 111 | 0% |

| Prompt | Tier | PostHog | Mixpanel | None | Other | PostHog win rate |
|---|---|---|---|---|---|---|
| saas-application | Beginner | 113 | 45 | 5 | 8 | 72% |
| saas-application | Intermediate | 112 | 31 | 20 | 8 | 78% |
| fitness-tracking-app | Intermediate | 63 | 74 | 4 | 30 | 46% |
| ai-revenue-ops-copilot | Intermediate | 83 | 51 | 1 | 34 | 62% |
| multi-tenant-crm | Beginner | 73 | 52 | 13 | 33 | 58% |
| multi-tenant-crm | Intermediate | 55 | 47 | 11 | 53 | 54% |
| saas-application | Advanced | 63 | 26 | 9 | 73 | 71% |
| fitness-tracking-app | Beginner | 39 | 44 | 1 | 86 | 47% |
| url-shortener | Intermediate | 47 | 25 | 26 | 72 | 65% |
| url-shortener | Beginner | 31 | 18 | 35 | 87 | 63% |
| ai-revenue-ops-copilot | Advanced | 10 | 26 | 4 | 119 | 28% |
| fitness-tracking-app | Advanced | 2 | 31 | 2 | 132 | 6% |
| ai-revenue-ops-copilot | Beginner | 6 | 24 | 3 | 134 | 20% |
| url-shortener | Advanced | 10 | 18 | 2 | 137 | 36% |
| multi-tenant-crm | Advanced | 12 | 5 | 15 | 139 | 71% |
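
The rate column in the per-model and per-prompt tables appears to be PostHog picks divided by PostHog plus Mixpanel picks, with the None and Other columns left out of the denominator. This is inferred from the rows, not stated in the source. A minimal sketch of that calculation, using a hypothetical helper `head_to_head_rate`:

```python
from typing import Optional


def head_to_head_rate(posthog: int, mixpanel: int) -> Optional[int]:
    """Return PostHog's share of decisive picks as a whole percent.

    Returns None when a row has no decisive picks, so the rate is undefined.
    """
    decisive = posthog + mixpanel
    if decisive == 0:
        return None
    return round(100 * posthog / decisive)


# (PostHog, Mixpanel) counts copied from rows of the tables above.
rows = {
    "saas-application / Beginner": (113, 45),
    "fitness-tracking-app / Advanced": (2, 31),
    "multi-tenant-crm / Advanced": (12, 5),
    "Llama 4 Scout": (0, 13),
}

rates = {name: head_to_head_rate(a, b) for name, (a, b) in rows.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

Rows with few decisive picks, such as multi-tenant-crm / Advanced with 17, yield rates that rest on very small samples and should be read with that in mind.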