Dashboard

Statistics and metrics for Hard Prompts

Overview

Total Prompts: 70
Total Responses: 1978
AI Models: 7
Total Cost: $60.35
Total Generation Time: 21h 56m 32s
Total Tokens: 7.33M (3.48M in / 3.85M out)
Avg Cost per Response: $0.031
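
The derived figures in this overview can be checked from the totals. The sketch below is only a sanity check, assuming the average cost is total cost divided by total responses and that the token total is the sum of input and output tokens; the variable names are illustrative and not part of the dashboard.

```python
# Totals as reported in the Overview section.
total_cost_usd = 60.35
total_responses = 1978
tokens_in_millions = 3.48
tokens_out_millions = 3.85

# Assumption: average cost is a plain mean over all responses.
avg_cost_usd = total_cost_usd / total_responses

# Assumption: the token total is input plus output.
total_tokens_millions = tokens_in_millions + tokens_out_millions

print(f"avg cost per response: ${avg_cost_usd:.3f}")
print(f"total tokens: {total_tokens_millions:.2f}M")
```

Both values agree with the reported $0.031 and 7.33M at the dashboard's rounding.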

Model Analysis Rankings

Distribution of ranking positions across all analyzed prompts

[Chart: ranking distribution. For each model, the share of prompts (0% to 100%) on which it placed 1st through 7th.]
Models in chart order, with the value shown beside each: Gemini 2.5 Pro (2.79), Sonnet 4.5 (3.08), GPT-5 (3.11), Kimi K2 (3.77), Opus 4.1 (4.10), Gemini 3 Pro (4.52), Grok 4 (5.82)

Model Breakdown

Model           Provider    Responses  Total Cost  Avg Cost  Total Time  Avg Time  Avg Tokens (in / out)
GPT-5           OpenAI      296        $9.24       $0.031    4h 21m 52s  53s       2.1K / 2.9K
Grok 4          xAI         296        $8.45       $0.029    3h 37m 24s  44s       2.1K / 1.5K
Sonnet 4.5      Anthropic   292        $5.37       $0.018    1h 53m 42s  23s       2.0K / 820
Gemini 2.5 Pro  Google      292        $10.01      $0.034    2h 57m 46s  36s       1.6K / 3.2K
Opus 4.1        Anthropic   276        $17.38      $0.063    1h 27m 10s  18s       1.9K / 459
Kimi K2         Moonshotai  275        $1.53       $0.006    5h 11m 26s  1m 7s     1.2K / 2.2K
Gemini 3 Pro    Google      251        $8.38       $0.033    2h 27m 9s   35s       1.2K / 2.6K
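
The per-model figures can be cross-checked against the overview totals. The following is a minimal sketch, assuming the overview totals are plain sums over the seven models; the `to_seconds` helper and the `models` dictionary are illustrative names, not part of the dashboard. Summing the rounded per-model values gives 1978 responses, $60.36, and 21h 56m 29s, which matches the overview to within rounding ($60.35 and 21h 56m 32s).

```python
import re

def to_seconds(text):
    """Convert a duration such as '4h 21m 52s' or '1m 7s' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([hms])", text))

# (responses, total cost in USD, total time) as reported per model.
models = {
    "GPT-5": (296, 9.24, "4h 21m 52s"),
    "Grok 4": (296, 8.45, "3h 37m 24s"),
    "Sonnet 4.5": (292, 5.37, "1h 53m 42s"),
    "Gemini 2.5 Pro": (292, 10.01, "2h 57m 46s"),
    "Opus 4.1": (276, 17.38, "1h 27m 10s"),
    "Kimi K2": (275, 1.53, "5h 11m 26s"),
    "Gemini 3 Pro": (251, 8.38, "2h 27m 9s"),
}

total_responses = sum(r for r, _, _ in models.values())
total_cost = sum(c for _, c, _ in models.values())
total_seconds = sum(to_seconds(t) for _, _, t in models.values())

# Per-model averages, assuming each is total divided by response count.
for name, (responses, cost, duration) in models.items():
    avg_cost = cost / responses
    avg_seconds = to_seconds(duration) / responses
    print(f"{name}: ${avg_cost:.3f} and {avg_seconds:.0f}s per response")

hours, rest = divmod(total_seconds, 3600)
minutes, seconds = divmod(rest, 60)
print(f"totals: {total_responses} responses, ${total_cost:.2f}, "
      f"{hours}h {minutes}m {seconds}s")
```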

Prompts Requiring Most AI Effort

Ranked by total output tokens, including both visible responses and internal reasoning/thinking tokens.

1. Financial Suitability Report: 224.7K output tokens
2. Complex Program Analysis: 208.3K output tokens
3. Multiply 8-Digit Numbers: 147.3K output tokens
4. Macbeth Characters: 114.5K output tokens
5. AI Deployment: Meridian Bank: 107.0K output tokens
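
To put these five prompts in context, their combined output can be compared with the overall output-token total. This is a rough sketch that assumes the 3.85M output tokens reported in the overview use the same counting as this list (visible plus reasoning tokens), which the dashboard does not state explicitly; the variable names are illustrative.

```python
# Output tokens (in thousands) for the five prompts listed above.
top_prompts_k = {
    "Financial Suitability Report": 224.7,
    "Complex Program Analysis": 208.3,
    "Multiply 8-Digit Numbers": 147.3,
    "Macbeth Characters": 114.5,
    "AI Deployment: Meridian Bank": 107.0,
}

# Assumption: the overview's 3.85M output tokens is the matching grand total.
total_output_k = 3850.0

top_five_k = sum(top_prompts_k.values())
share = top_five_k / total_output_k

print(f"top five prompts: {top_five_k:.1f}K output tokens")
print(f"share of all output tokens: {share:.1%}")
```

Under that assumption, 5 of the 70 prompts account for roughly a fifth of all output tokens.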

Most Expensive Responses

1. Reaction to Anthropic's 'Introspective Awareness' paper: Opus 4.1, $1.18 (other models avg: $0.13)
2. Next 50 years: Opus 4.1, $0.76 (other models avg: $0.09)
3. Reaction to Anthropic's 'Introspective Awareness' paper: Gemini 3 Pro, $0.51 (other models avg: $0.24)
4. Complex Program Analysis: Opus 4.1, $0.41 (other models avg: $0.15)
5. Financial Suitability Report: Sonnet 4.5, $0.41 (other models avg: $0.11)