Overview
At a technical level, Lexi is a direct replacement for the OpenAI and Anthropic base URLs. It is fully API-compatible — streaming, tool calls, structured output, and function calling all work without modification.
Lexi uses a share-of-savings billing model. It earns a percentage of what it reduces on each request. On requests where there is nothing to reduce, there is no Lexi fee.
STONE
Lexi sits between your application and the AI provider. Each request passes through STONE (Semantic Token Optimization and Natural Encoding), which restructures the conversation context to use fewer tokens while preserving meaning.
For a full explanation, see the How It Works page.
Getting Started
Three steps to lower token costs.
Create a free account at lexisaas.com and create an API key from the dashboard.
lx_live_abc123...
Point your existing OpenAI or Anthropic SDK at api.lexisaas.com instead of the provider endpoint.
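import OpenAI from 'openai';

// Route requests through Lexi instead of the provider endpoint.
const openai = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  // Lexi key and provider key, joined by a colon.
  apiKey: 'lx_live_yourkey:sk-your-openai-key',
});
Set apiKey to your Lexi key followed by your provider key, not the provider key alone.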
That's it. Every request is restructured by STONE, and you only pay 40% of what Lexi saves you.
Authentication
All API requests require your Lexi key and your provider key, joined by a colon and passed in the Authorization header as a Bearer token.
Authorization: Bearer {lexi_key}:{provider_key}
// Example with OpenAI:
Authorization: Bearer lx_live_abc123:sk-openai-abc456
// Example with Anthropic:
Authorization: Bearer lx_live_abc123:sk-ant-abc456
Endpoints
Lexi mirrors the OpenAI and Anthropic API shapes exactly. Any code written for those APIs works through Lexi without changes.
Proxy
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat completions (OpenAI-compatible) |
| POST | /v1/messages | Messages (Anthropic-compatible) |
| GET | /v1/models | List models |
| GET | /v1/models/:model | Model info |
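As a rough illustration of how the Authorization format and the chat completions path fit together, the sketch below builds a request without sending it. The helper name `buildChatRequest` and the `LexiRequest` shape are hypothetical; the host, path, and key format are the ones given on this page, and the request body follows the usual OpenAI chat shape.

```typescript
// Host taken from this page; the helper and its types are illustrative only.
const LEXI_BASE_URL = "https://api.lexisaas.com";

interface LexiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: joins the keys as {lexi_key}:{provider_key}
// and targets the OpenAI-compatible chat completions path.
function buildChatRequest(
  lexiKey: string,
  providerKey: string,
  body: unknown
): LexiRequest {
  if (!lexiKey || !providerKey) {
    throw new Error("both the Lexi key and the provider key are required");
  }
  return {
    url: `${LEXI_BASE_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${lexiKey}:${providerKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Placeholder keys copied from the Authentication examples.
const chatRequest = buildChatRequest("lx_live_abc123", "sk-openai-abc456", {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});
```

Passing `chatRequest.url` and `chatRequest.init` to `fetch` would send the request. Keeping construction separate from sending makes the header format easy to check in isolation.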
Account
| Method | Path | Description |
|---|---|---|
| GET | /account/overview | Account overview |
| GET | /account/usage | Usage records |
| GET | /account/keys | List API keys |
| POST | /account/keys | Create API key |
| DELETE | /account/keys/:id | Revoke API key |
| POST | /account/billing/topup | Create top-up session |
Health
| Method | Path | Description |
|---|---|---|
| GET | /healthz | Liveness check |
| GET | /readyz | Readiness check |
Supported Models
Lexi detects which provider to route to from the model name in your request. Use the same model names you already pass to OpenAI or Anthropic today.
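Name-based routing of this kind can be pictured as a prefix lookup. The sketch below is only a guess at the idea, not Lexi's actual routing logic: the prefix rules and provider labels are assumptions inferred from public model naming conventions, and `guessProvider` is a hypothetical helper.

```typescript
type Provider = "openai" | "anthropic" | "google" | "xai" | "deepseek" | "meta";

// Assumed prefix rules; Lexi's real detection rules are not documented here.
const PREFIX_RULES: Array<[RegExp, Provider]> = [
  [/^(gpt-|o\d)/, "openai"],
  [/^claude-/, "anthropic"],
  [/^gemini-/, "google"],
  [/^grok-/, "xai"],
  [/^deepseek-/, "deepseek"],
  [/^llama-/, "meta"],
];

// Returns the first provider whose prefix matches, or null if none does.
function guessProvider(model: string): Provider | null {
  for (const [pattern, provider] of PREFIX_RULES) {
    if (pattern.test(model)) return provider;
  }
  return null;
}
```

A client could use a check like this to pick which provider key to pair with the Lexi key before sending a request.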
gpt-4o
gpt-4o-mini
gpt-4.1
gpt-4.1-mini
gpt-4.1-nano
gpt-5
gpt-5.2
gpt-5-mini
gpt-5-nano
o1
o3
o3-pro
claude-opus-4-6
claude-sonnet-4-6
claude-sonnet-4-5-20250929
claude-haiku-4-5-20251001
gemini-3-pro
gemini-3-flash
gemini-2.5-pro
gemini-2.5-flash
gemini-2.0-flash
grok-4
grok-4.1-fast
grok-3
grok-3-mini
deepseek-chat
deepseek-reasoner
llama-4
Billing
Lexi charges 40% of the savings it creates. If no tokens are saved, there is no Lexi fee.
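To make the share-of-savings rule concrete, the sketch below applies the billing formula from this section to one request. The token counts are borrowed from the header examples on this page; the input price of 3.00 USD per million tokens is a hypothetical figure, `computeBilling` is an illustrative helper, and clamping negative savings to zero is an assumption based on the "no savings, no fee" rule.

```typescript
interface BillingBreakdown {
  providerCostUsd: number; // billed by the provider for the restructured tokens
  savingsUsd: number;      // value of the tokens removed
  lexiFeeUsd: number;      // Lexi's 40% share of the savings
}

const LEXI_SHARE = 0.40;

function computeBilling(
  originalTokens: number,
  restructuredTokens: number,
  inputPricePerMillionUsd: number
): BillingBreakdown {
  // Assumption: a request that saves nothing is treated as zero savings.
  const savedTokens = Math.max(0, originalTokens - restructuredTokens);
  const savingsUsd = (savedTokens / 1_000_000) * inputPricePerMillionUsd;
  return {
    providerCostUsd: (restructuredTokens / 1_000_000) * inputPricePerMillionUsd,
    savingsUsd,
    lexiFeeUsd: LEXI_SHARE * savingsUsd,
  };
}

// 10,240 original tokens restructured to 1,847, at a hypothetical $3.00 per million.
const billingExample = computeBilling(10_240, 1_847, 3.0);
```

Because the fee is a fixed fraction of the savings, the customer keeps the remaining 60% on any request where tokens are removed.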
Billing formula
// Your provider account is billed:
provider_cost = (restructured_tokens / 1_000_000) × model_input_price
// Your Lexi balance is charged:
lexi_fee = 0.40 × savings
where:
savings = (original_tokens − restructured_tokens) / 1_000_000 × model_input_price
// No savings → lexi_fee = 0. Turn 1 is always passive.
Credits
Response Headers
Every proxied response includes 16 canonical X-Lexi-* headers showing exactly what happened with that request. Lexi sets Access-Control-Expose-Headers automatically, so browser clients can read them.
Canonical billing + telemetry headers (16)
| Header | Type | Unit | When set | Example |
|---|---|---|---|---|
| X-Lexi-Original-Tokens | integer | tokens | always | 10240 |
| X-Lexi-Compressed-Tokens | integer | tokens | always | 1847 |
| X-Lexi-Tokens-Saved | integer | tokens | always | 8393 |
| X-Lexi-Compression-Ratio | decimal | 0–1 (4 dp) | always | 0.8197 |
| X-Lexi-Provider-Cost-Usd | decimal | USD (F4/F6/F8) | always; 0.0000 if pricing unknown | 0.0185 |
| X-Lexi-Gross-Savings-Usd | decimal | USD (F4/F6/F8) | always; 0 on first-turn or passive | 0.0252 |
| X-Lexi-Customer-Savings-Usd | decimal | USD (F4/F6/F8) | always; customer's 60% share | 0.0151 |
| X-Lexi-Customer-Charge-Usd | decimal | USD (F4/F6/F8) | always; value is ≥ provider_cost | 0.0286 |
| X-Lexi-Lexico-Profit-Usd | decimal | USD (F4/F6/F8) | always; LexiCo's 40% share | 0.0101 |
| X-Lexi-Customer-Type | string | enum | always; 'credit' \| 'api' | credit |
| X-Lexi-Balance-After | decimal | USD (8 dp) | only when VM1 sync succeeded | 9.97140000 |
| X-Lexi-Memory-Usage | integer | tokens | always; tokens held in session memory | 1847 |
| X-Lexi-Recall-Count | integer | memories | always; recalled memory count this turn | 4 |
| X-Lexi-Mode | string | enum | always; 'first-turn' \| 'passive' \| 'normal' \| 'passthrough' | normal |
| X-Lexi-First-Turn | boolean | 'true' | only on turn 1 of a session | true |
| X-Lexi-Request-Id | string | UUID v4 | always; correlation id for this request | 6b1e… |
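One way to consume these headers is to parse a handful of them into a typed object. The sketch below is illustrative only: `readLexiHeaders`, `LexiSummary`, and `HeaderSource` are hypothetical names, and the sample values are the examples from the table. `HeaderSource` matches the `get` method of the Fetch API `Headers` class, so a `response.headers` object can be passed in directly.

```typescript
// Minimal shape shared with the Fetch API Headers class.
interface HeaderSource {
  get(name: string): string | null;
}

interface LexiSummary {
  mode: string | null;
  originalTokens: number | null;
  compressedTokens: number | null;
  tokensSaved: number | null;
  grossSavingsUsd: number | null;
  customerChargeUsd: number | null;
  balanceAfterUsd: number | null; // absent unless the header was set
}

// Parses a numeric header; returns null when it is missing or malformed.
function numericHeader(h: HeaderSource, name: string): number | null {
  const raw = h.get(name);
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

function readLexiHeaders(h: HeaderSource): LexiSummary {
  return {
    mode: h.get("X-Lexi-Mode"),
    originalTokens: numericHeader(h, "X-Lexi-Original-Tokens"),
    compressedTokens: numericHeader(h, "X-Lexi-Compressed-Tokens"),
    tokensSaved: numericHeader(h, "X-Lexi-Tokens-Saved"),
    grossSavingsUsd: numericHeader(h, "X-Lexi-Gross-Savings-Usd"),
    customerChargeUsd: numericHeader(h, "X-Lexi-Customer-Charge-Usd"),
    balanceAfterUsd: numericHeader(h, "X-Lexi-Balance-After"),
  };
}

// Stand-in for response.headers, using the example values from the table.
const sampleHeaders = new Map([
  ["x-lexi-mode", "normal"],
  ["x-lexi-original-tokens", "10240"],
  ["x-lexi-compressed-tokens", "1847"],
  ["x-lexi-tokens-saved", "8393"],
  ["x-lexi-gross-savings-usd", "0.0252"],
  ["x-lexi-customer-charge-usd", "0.0286"],
]);
const headerStub: HeaderSource = {
  get: (name) => sampleHeaders.get(name.toLowerCase()) ?? null,
};
const lexiSummary = readLexiHeaders(headerStub);
```

Headers that are only set conditionally, such as X-Lexi-Balance-After, come back as `null` here rather than as a default value, so callers can tell "not reported" apart from zero.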
Legacy headers (DEPRECATED — migrate to canonical *-Usd set)
| Header | Value |
|---|---|
| X-Lexi-Savings-Cents | Token savings |
| X-Lexi-Margin-Cents | Lexi fee for this request |
| X-Lexi-Request-Cost-Cents | Total cost for this request |
| X-Lexi-Balance-Remaining | Remaining account balance |
Use the X-Lexi-* headers to track savings per request. Browser clients can read them because they are listed in Access-Control-Expose-Headers.