Billing model
Transparent, per-request pricing. We take 40% of your savings; you keep 60%. If STONE saves nothing, our fee is zero and you pay only the provider cost.
Pricing formula (H.1)
We charge list price for what actually reaches the provider, plus 40% of the gross savings STONE creates. The customer always nets 60% of the savings.
provider_cost = actual_input_tokens × P_in + output_tokens × P_out
gross_savings = max(0, (original_tokens − actual_input_tokens)) × P_in
customer_savings = gross_savings × 0.60
lexico_profit = gross_savings × 0.40
customer_charge = provider_cost + lexico_profit (credit customer)
// P_in, P_out are list prices per 1M tokens for the selected model, so token counts are divided by 1,000,000 before multiplying.
Turn 1 is always passive
The first request in a new conversation (turnNumber = 1) has no accumulated context to restructure. STONE forwards your payload unchanged and the bomb-proof first-turn guard forces gross_savings = 0, lexico_profit = 0. You pay provider_cost only.
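The pricing formula (H.1) and the first-turn guard can be sketched together in a few lines. This is a minimal illustration, not Lexico's implementation: the function name `compute_charge`, the `Charge` container, and the example prices of $3 and $15 per 1M tokens are all assumptions made for the example.

```python
from dataclasses import dataclass

TOKENS_PER_PRICE_UNIT = 1_000_000  # list prices are quoted per 1M tokens
LEXICO_SHARE = 0.40
CUSTOMER_SHARE = 0.60


@dataclass(frozen=True)
class Charge:
    provider_cost: float
    gross_savings: float
    customer_savings: float
    lexico_profit: float
    customer_charge: float


def compute_charge(original_tokens, actual_input_tokens, output_tokens,
                   p_in, p_out, turn_number):
    """Apply formula H.1; p_in and p_out are USD per 1M tokens."""
    per_token_in = p_in / TOKENS_PER_PRICE_UNIT
    per_token_out = p_out / TOKENS_PER_PRICE_UNIT

    provider_cost = (actual_input_tokens * per_token_in
                     + output_tokens * per_token_out)

    # Savings are clamped at zero so a payload that grew never creates a fee.
    saved_tokens = max(0, original_tokens - actual_input_tokens)
    if turn_number == 1:
        # First-turn guard: no savings and no fee on turn 1.
        saved_tokens = 0

    gross_savings = saved_tokens * per_token_in
    lexico_profit = gross_savings * LEXICO_SHARE
    return Charge(
        provider_cost=provider_cost,
        gross_savings=gross_savings,
        customer_savings=gross_savings * CUSTOMER_SHARE,
        lexico_profit=lexico_profit,
        customer_charge=provider_cost + lexico_profit,
    )


# Hypothetical prices: $3 per 1M input tokens, $15 per 1M output tokens.
example = compute_charge(10_000, 4_000, 500, p_in=3.0, p_out=15.0, turn_number=2)
print(f"charge: ${example.customer_charge:.4f}")
```

With these assumed numbers, the provider cost is about $0.0195, gross savings about $0.0180, and the customer charge about $0.0267. Without STONE the same request would cost about $0.0375 at list price, so the customer nets roughly $0.0108, which is 60% of the gross savings.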
Passive mode
When STONE cannot reduce tokens (e.g. a short single-turn prompt, or a payload that grew after restructuring), gross_savings is clamped to 0 and no Lexi fee applies. The response carries X-Lexi-Mode: passive.
Provider errors
Streaming abort
If your client disconnects mid-stream, billing runs on the tokens actually delivered. The invariant customer_charge ≥ provider_cost always holds. You are never undercharged for what the provider produced, never overcharged for what you didn't receive.
invariant: customer_charge ≥ provider_cost (always)
Refunds
EU/EEA consumers have a 14-day right of withdrawal from signup. Manage refund requests at lexico.no/refund. Usage consumed before the refund request is non-refundable per EU Directive 2011/83/EU Art. 16(m).
Rate limits per tier
Limits auto-scale with lifetime spend. No tiers to pick.
| Lifetime spend | Requests / minute | Max API keys |
|---|---|---|
| Free | 120 | 10,000 |
| $5+ | 2,000 | 10,000 |
| $50+ | 10,000 | 10,000 |
| $500+ | 30,000 | 10,000 |
| $1,000+ | 60,000 | 10,000 |
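A client that throttles itself can look up its limit from the table above. The sketch below is illustrative only: `requests_per_minute` is a hypothetical helper, and it assumes that "$5+" means a lifetime spend of at least $5 (inclusive) and that "Free" covers everything below that.

```python
# (minimum lifetime spend in USD, requests per minute), highest tier first.
RATE_LIMIT_TIERS = [
    (1000.0, 60_000),
    (500.0, 30_000),
    (50.0, 10_000),
    (5.0, 2_000),
    (0.0, 120),  # Free
]
MAX_API_KEYS = 10_000  # the same at every spend level


def requests_per_minute(lifetime_spend_usd):
    """Return the requests-per-minute limit for a given lifetime spend."""
    if lifetime_spend_usd < 0:
        raise ValueError("lifetime spend cannot be negative")
    for threshold, rpm in RATE_LIMIT_TIERS:
        # Assumption: thresholds are inclusive, so exactly $5 reaches the $5+ row.
        if lifetime_spend_usd >= threshold:
            return rpm
    raise AssertionError("unreachable: the 0.0 tier matches any valid spend")


print(requests_per_minute(72.50))
```

Ordering the tiers from highest to lowest lets the first match win, so adding a tier is a one-line change.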
Response headers (16)
All 16 canonical X-Lexi headers are exposed on every /v1/chat/completions and /v1/messages response. Full type and unit reference in the headers section of /docs.
| Header | Description |
|---|---|
| X-Lexi-Original-Tokens | Pre-STONE input tokens. |
| X-Lexi-Compressed-Tokens | Tokens sent to provider. |
| X-Lexi-Tokens-Saved | original − compressed. |
| X-Lexi-Compression-Ratio | saved / original (0–1, 4 dp). |
| X-Lexi-Provider-Cost-Usd | List-price USD for tokens actually sent. |
| X-Lexi-Gross-Savings-Usd | tokens_saved × input_price. |
| X-Lexi-Customer-Savings-Usd | gross × 0.60 (your share). |
| X-Lexi-Customer-Charge-Usd | provider_cost + lexico_profit. |
| X-Lexi-Lexico-Profit-Usd | gross × 0.40 (our share). |
| X-Lexi-Customer-Type | credit or api. |
| X-Lexi-Balance-After | Credit balance after this charge (USD). |
| X-Lexi-Memory-Usage | Tokens in session memory. |
| X-Lexi-Recall-Count | Recalled memory count. |
| X-Lexi-Mode | first-turn, passive, normal, or passthrough. |
| X-Lexi-First-Turn | Only when turn 1: true. |
| X-Lexi-Request-Id | Correlation UUID; matches X-Request-Id. |
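Because every response carries its own billing breakdown, a client can cross-check the numbers against the pricing formula. The following sketch is one way to do that. It assumes the USD headers arrive as plain decimal strings and uses a small tolerance because the rounding applied to header values is not specified here; `check_billing_headers` and the sample values are hypothetical.

```python
import math


def check_billing_headers(headers, tolerance=1e-6):
    """Return a list of inconsistencies found in the X-Lexi billing headers."""
    h = {name.lower(): value for name, value in headers.items()}
    provider_cost = float(h["x-lexi-provider-cost-usd"])
    gross = float(h["x-lexi-gross-savings-usd"])
    customer_savings = float(h["x-lexi-customer-savings-usd"])
    profit = float(h["x-lexi-lexico-profit-usd"])
    charge = float(h["x-lexi-customer-charge-usd"])
    mode = h.get("x-lexi-mode", "")

    problems = []
    if not math.isclose(customer_savings, gross * 0.60, abs_tol=tolerance):
        problems.append("customer savings is not 60% of gross savings")
    if not math.isclose(profit, gross * 0.40, abs_tol=tolerance):
        problems.append("lexico profit is not 40% of gross savings")
    if not math.isclose(charge, provider_cost + profit, abs_tol=tolerance):
        problems.append("charge is not provider_cost + lexico_profit")
    if charge < provider_cost - tolerance:
        problems.append("invariant violated: charge below provider cost")
    if mode in ("first-turn", "passive") and abs(gross) > tolerance:
        problems.append("gross savings should be 0 in " + mode + " mode")
    return problems


# Hypothetical header values for a normal-mode response.
sample = {
    "X-Lexi-Provider-Cost-Usd": "0.0195",
    "X-Lexi-Gross-Savings-Usd": "0.018",
    "X-Lexi-Customer-Savings-Usd": "0.0108",
    "X-Lexi-Lexico-Profit-Usd": "0.0072",
    "X-Lexi-Customer-Charge-Usd": "0.0267",
    "X-Lexi-Mode": "normal",
}
print(check_billing_headers(sample))
```

Header names are lowercased before lookup because HTTP header names are case-insensitive and client libraries differ in how they normalize them.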