Inference Cost Dashboard

1,016 Real Prompts

Mar 2 – Mar 29, 2026

$11.84

Total Savings

vs all-Claude baseline

77.7%

Savings Rate

routed $3.40 vs actual $15.24

0.294

Avg Complexity

out of 1.0 scale

$15.24

Actual Spend

what was charged (all Claude)

$3.40

Routed Cost

what gateway would charge

Cost & Volume Timeline

Daily requests and cost — actual vs routed (Mar 2–29 2026)

Series: actual cost, routed cost, request count. X-axis: daily ticks from 03-02 to 03-29 (no ticks for 03-06, 03-11, 03-17, 03-18).

Cumulative savings ($0 → $11.84)

Complexity score distribution (0.0 → 1.0 · 20 buckets)

Bar labels shown: 265, 156, 240, 171. X-axis: 0.0, 0.25, 0.50, 0.75, 1.0.

Model Routing

Routing distribution (n=1,016)

Haiku 807 prompts (79.4%) $0.0008/req

Sonnet 32 prompts (3.1%) $0.003/req

Claude 177 prompts (17.4%) $0.015/req

Complexity range per model tier

Haiku avg 0.138

Sonnet avg 0.370

Claude avg 0.991

X-axis: 0.0, 0.30, 0.60, 0.85, 1.0
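The dashboard does not state which complexity cutoffs separate the tiers, so the following is only a minimal sketch of threshold-based routing. The cutoffs 0.30 and 0.60 are hypothetical values borrowed from the axis ticks, and the function name `route` is illustrative, not the gateway's API.

```python
from bisect import bisect_right

# Hypothetical cutoffs: the dashboard does not publish the real ones.
ASSUMED_CUTOFFS = (0.30, 0.60)
TIERS = ("Haiku", "Sonnet", "Claude")


def route(complexity, cutoffs=ASSUMED_CUTOFFS, tiers=TIERS):
    """Pick a model tier for a complexity score in [0.0, 1.0].

    A score equal to a cutoff goes to the higher tier.
    """
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0.0 and 1.0")
    return tiers[bisect_right(cutoffs, complexity)]


# The per-tier average scores reported above, used as sample inputs.
for score in (0.138, 0.370, 0.991):
    print(score, route(score))
```

Under these assumed cutoffs, each tier's reported average complexity lands in its own tier, which is a sanity check on the sketch rather than evidence for the actual thresholds.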

Routed cost split by model tier

Haiku $0.6456 (19%)

Sonnet $0.096 (3%)

Claude $2.655 (78%)
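The split above follows from the routing counts and the per-request prices shown in the routing distribution. A short sketch of that arithmetic, assuming a flat price per request for each tier (the names `TIERS` and `routed_cost_split` are illustrative):

```python
# Prompt counts and per-request prices as displayed on the dashboard.
TIERS = {
    "Haiku": (807, 0.0008),
    "Sonnet": (32, 0.003),
    "Claude": (177, 0.015),
}


def routed_cost_split(tiers):
    """Return ({tier: (cost, share_of_total)}, total_routed_cost)."""
    costs = {name: count * price for name, (count, price) in tiers.items()}
    total = sum(costs.values())
    split = {name: (cost, cost / total) for name, cost in costs.items()}
    return split, total


split, routed_total = routed_cost_split(TIERS)
for name, (cost, share) in split.items():
    print(f"{name}: ${cost:.4f} ({share:.0%})")
print(f"Routed total: ${routed_total:.2f}")
```

Claude handles 17.4% of prompts but accounts for most of the routed cost, because its per-request price is nearly 19 times Haiku's.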

Cost efficiency — what you paid vs what you needed to pay

All-Claude (actual) $15.24

Routed (gateway) $3.40

$11.84 saved (77.7%)
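The headline numbers can be reproduced from the figures on this page. The sketch below assumes the all-Claude baseline is simply 1,016 requests at $0.015 each and takes the routed cost as the displayed $3.40. The function name `savings` is illustrative.

```python
def savings(n_requests, baseline_price, routed_cost):
    """Return (actual_cost, amount_saved, savings_rate) against a flat baseline."""
    actual = n_requests * baseline_price
    saved = actual - routed_cost
    return actual, saved, saved / actual


actual, saved, rate = savings(n_requests=1016, baseline_price=0.015, routed_cost=3.40)
print(f"actual ${actual:.2f}, saved ${saved:.2f}, rate {rate:.1%}")
```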

Request Type Analysis

Type	Share	Prompts	Saved
summarization	29.8%	303	$4.18 (92%)
general	26.8%	272	$3.86 (94%)
extraction	22.2%	226	$0.95 (28%)
reasoning	12.5%	127	$1.69 (88%)
generation	8.7%	88	$1.17 (89%)

Savings by request type

summarization $4.18 (92%)

general $3.86 (94%)

reasoning $1.69 (88%)

generation $1.17 (89%)

extraction $0.95 (28%)
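Each percentage is consistent with dividing the dollars saved by what that request type cost at the all-Claude price. The sketch below assumes the flat $0.015 per request baseline. Because the displayed dollar amounts are already rounded, the recomputed rates agree with the dashboard only to within about one percentage point.

```python
BASELINE_PRICE = 0.015  # all-Claude price per request, from the dashboard

# type: (prompt count, dollars saved, displayed savings percent)
BY_TYPE = {
    "summarization": (303, 4.18, 92),
    "general": (272, 3.86, 94),
    "reasoning": (127, 1.69, 88),
    "generation": (88, 1.17, 89),
    "extraction": (226, 0.95, 28),
}


def savings_rate(prompts, saved, price=BASELINE_PRICE):
    """Fraction of the all-Claude cost for this type that routing avoids."""
    return saved / (prompts * price)


rates = {name: savings_rate(n, saved) for name, (n, saved, _) in BY_TYPE.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%} (dashboard shows {BY_TYPE[name][2]}%)")
```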

Average complexity by type

extraction 0.756

summarization 0.281

generation 0.133

reasoning 0.132

general 0.052

Key insight: Extraction has the highest average complexity (0.756) because its prompts include full clinical notes. General queries average just 0.052, so almost all of them are routed to Haiku.
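As a consistency check, the per-type averages can be weighted by prompt count to recover the overall average complexity. This sketch uses only the counts and averages displayed in this section; the helper name `weighted_mean` is illustrative.

```python
# type: (prompt count, average complexity), as displayed on the dashboard
TYPE_STATS = {
    "extraction": (226, 0.756),
    "summarization": (303, 0.281),
    "generation": (88, 0.133),
    "reasoning": (127, 0.132),
    "general": (272, 0.052),
}


def weighted_mean(stats):
    """Average of the per-group means, weighted by group size."""
    total = sum(count for count, _ in stats.values())
    return sum(count * mean for count, mean in stats.values()) / total


overall = weighted_mean(TYPE_STATS)
print(f"overall average complexity: {overall:.3f}")
```

The result rounds to the 0.294 shown in the Avg Complexity card.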

Top 20 highest-cost prompts — actual vs what gateway would charge

#	Prompt	Type	Model	Actual $	Routed $	Would Save
1	Evaluate this You are a senior product engineer and UX systems designer working	extraction	Claude	$0.0150	$0.0150	0%
2	Stop hook feedback: [Preview Required] Code was edited but no dev server is runn	general	Haiku	$0.0150	$0.0008	95%
3	bt13qbmv0 toolu_01GatfEdqu7E	extraction	Sonnet	$0.0150	$0.0030	80%
4	Analyze this You are a senior full-stack engineer working on SignalCare, a healt	extraction	Claude	$0.0150	$0.0150	0%
5	Stop hook feedback: [Verification Required] Code was edited while a preview serv	general	Haiku	$0.0150	$0.0008	95%
6	Do we have auto policy rules in the platform that moves referral state based on	general	Haiku	$0.0150	$0.0008	95%
7	You are a senior full-stack engineer working on SignalCare, a healthcare workflo	extraction	Claude	$0.0150	$0.0150	0%
8	We currently move cases to a queue that belons to a group. So if anyone working	general	Haiku	$0.0150	$0.0008	95%
9	You are a senior full-stack engineer working on SignalCare, a healthcare workflo	extraction	Claude	$0.0150	$0.0150	0%
10	You are a senior full-stack engineer working on SignalCare, a healthcare workflo	summarization	Claude	$0.0150	$0.0150	0%
11	b75cm5rzb toolu_01GsX5DbMuBH	summarization	Haiku	$0.0150	$0.0008	95%
12	I want to enable or disable the various features we last implemented in the UI.	general	Haiku	$0.0150	$0.0008	95%
13	Are you creating a web page in config to do this?	general	Haiku	$0.0150	$0.0008	95%
14	Yes, I wan to make it possible without a redeploy. Is that doable	generation	Haiku	$0.0150	$0.0008	95%
15	Yes, Add the page under System Config. Clearly explain what each feature does wh	reasoning	Haiku	$0.0150	$0.0008	95%
16	b95gsb569 toolu_01L2dUZ4yaWn	summarization	Haiku	$0.0150	$0.0008	95%
17	Update the product rating matrix you built earlier based on what we have now. Sh	reasoning	Haiku	$0.0150	$0.0008	95%
18	Agreed. Create 10 new referrals using the latest template. Create the referrals	generation	Haiku	$0.0150	$0.0008	95%
19	Can you log in to __https://app.signalcare.ai____ what is app__	general	Haiku	$0.0150	$0.0008	95%
20	Yes. Just give me your opinion. About, like, what can we do in terms of layering	general	Sonnet	$0.0150	$0.0030	80%
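The Would Save column is consistent with one minus the ratio of routed to actual cost. A minimal sketch using the three per-request prices that appear in the table (the function name `would_save` is illustrative):

```python
def would_save(actual, routed):
    """Fraction of the actual charge avoided by routing."""
    if actual <= 0:
        raise ValueError("actual cost must be positive")
    return 1.0 - routed / actual


# Every row above was charged $0.0150; the routed price depends on the tier.
for model, routed in [("Claude", 0.0150), ("Sonnet", 0.0030), ("Haiku", 0.0008)]:
    print(f"{model}: {would_save(0.0150, routed):.0%}")
```

Rows routed to Claude save nothing because the routed and actual prices are identical.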

Activity Patterns

Requests by hour of day (UTC) — peak 8h–20h highlighted

X-axis: 0h, 6h, 12h, 18h, 23h

Prompt length distribution

<100: 297

100-500: 297

500-1k: 242

1k-5k: count not shown

5k+: 158

297 prompts (<100 chars) are simple queries routed directly to Haiku. 158 prompts (>5k chars) include full clinical notes — these drive extraction cost.
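No count is displayed for the 1k-5k bucket. If the five buckets partition all 1,016 prompts and the displayed counts are correct (both are assumptions, not something the dashboard states), the missing count can be estimated as the remainder:

```python
TOTAL_PROMPTS = 1016

# Counts displayed in the prompt length chart; the 1k-5k bar has no label.
SHOWN_COUNTS = {"<100": 297, "100-500": 297, "500-1k": 242, "5k+": 158}


def residual_count(total, shown):
    """Count left over after subtracting the displayed buckets from the total."""
    remainder = total - sum(shown.values())
    if remainder < 0:
        raise ValueError("displayed buckets exceed the total")
    return remainder


implied_1k_5k = residual_count(TOTAL_PROMPTS, SHOWN_COUNTS)
print(f"implied 1k-5k count: {implied_1k_5k}")
```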

24-hour activity heatmap

All 1,016 Prompts

#	Prompt	Type	Date	Complexity	Chars	Model	Routed $	Actual $	Saved

ROI Calculator

Monthly prompt volume

Savings rate (77.7% from your real data)

$75.00

Current cost (all Claude)

$16.73

Optimized (with routing)

$58.27

Monthly savings

Based on your real 77.7% savings · Haiku $0.0008 · Sonnet $0.003 · Claude $0.015 per request
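The calculator's outputs are consistent with a simple projection: price every prompt at the all-Claude rate, then apply the measured savings rate. The monthly volume is not shown on the page; the 5,000 used below is inferred from $75.00 / $0.015 and should be treated as an assumption, as should the names `RoiEstimate` and `estimate_roi`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoiEstimate:
    current_cost: float    # all prompts at the baseline (all-Claude) price
    optimized_cost: float  # projected cost with routing
    monthly_savings: float


def estimate_roi(monthly_prompts, baseline_price=0.015, savings_rate=0.777):
    """Project monthly cost with and without routing at a fixed savings rate."""
    if monthly_prompts < 0:
        raise ValueError("monthly_prompts must be non-negative")
    if not 0.0 <= savings_rate <= 1.0:
        raise ValueError("savings_rate must be between 0 and 1")
    current = monthly_prompts * baseline_price
    saved = current * savings_rate
    return RoiEstimate(current, current - saved, saved)


# 5,000 prompts per month is an inferred volume, not a displayed one.
roi = estimate_roi(5000)
print(roi)
```

With that volume the projection matches the displayed $75.00, $16.73, and $58.27 to within a cent. The projection holds only if future traffic has the same mix of request types and complexity as the measured month.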