Overview
At a technical level, Lexi is a direct replacement for the OpenAI and Anthropic base URLs. It is fully API-compatible — streaming, tool calls, structured output, and function calling all work without modification.
Lexi uses a share-of-savings billing model: it earns a percentage of the token cost it saves on each request. On requests where there is nothing to reduce, there is no Lexi fee.
STONE
Lexi sits between your application and the AI provider. Each request passes through STONE (Semantic Token Optimization and Natural Encoding), which restructures the conversation context to use fewer tokens while preserving meaning.
For a full explanation, see the How It Works page.
Getting Started
Three steps to lower token costs.
Create a free account at lexisaas.com and create an API key from the dashboard.
lx_live_abc123...
Point your existing OpenAI or Anthropic SDK at api.lexisaas.com instead of the provider endpoint.
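import OpenAI from 'openai';
const openai = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  // Lexi key, then a colon, then your provider key
  apiKey: 'lx_live_yourkey:sk-your-openai-key',
});
Use your Lexi API key joined to your provider key with a colon, not the provider key on its own.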
That's it. Every request is restructured by STONE, and you only pay 40% of what Lexi saves you.
Authentication
All API requests require your Lexi API key and your provider key, joined by a colon and passed in the Authorization header as a Bearer token.
Authorization: Bearer {lexi_key}:{provider_key}
// Example with OpenAI:
Authorization: Bearer lx_live_abc123:sk-openai-abc456
// Example with Anthropic:
Authorization: Bearer lx_live_abc123:sk-ant-abc456
Endpoints
Lexi mirrors the OpenAI and Anthropic API shapes exactly. Any code written for those APIs works through Lexi without changes.
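To make the mirrored shape concrete, here is a minimal sketch that assembles the URL and fetch options for an OpenAI-style chat completion routed through Lexi. Only the base URL, the /v1/chat/completions path, and the {lexi_key}:{provider_key} format come from this page; the helper names (bearerToken, buildChatRequest) and the sample keys are hypothetical.

```typescript
// Hypothetical helper names; the base URL, path, and key format are taken from this page.
const LEXI_BASE_URL = "https://api.lexisaas.com";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Join the two keys in the documented {lexi_key}:{provider_key} form.
function bearerToken(lexiKey: string, providerKey: string): string {
  return `Bearer ${lexiKey}:${providerKey}`;
}

// Build the URL and fetch options for an OpenAI-style chat completion.
function buildChatRequest(
  lexiKey: string,
  providerKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    url: `${LEXI_BASE_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Authorization": bearerToken(lexiKey, providerKey),
        "Content-Type": "application/json",
      },
      // Same body shape you would send to the provider directly.
      body: JSON.stringify({ model, messages }),
    },
  };
}

const exampleRequest = buildChatRequest(
  "lx_live_abc123",
  "sk-openai-abc456",
  "gpt-4o-mini",
  [{ role: "user", content: "Hello" }],
);
// To send it: const response = await fetch(exampleRequest.url, exampleRequest.init);
```

Keeping the request construction separate from the network call makes it easy to check the header and body before anything is sent.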
Proxy
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat completions (OpenAI-compatible) |
| POST | /v1/messages | Messages (Anthropic-compatible) |
| GET | /v1/models | List models |
| GET | /v1/models/:model | Model info |
Account
| Method | Path | Description |
|---|---|---|
| GET | /account/overview | Account overview |
| GET | /account/usage | Usage records |
| GET | /account/keys | List API keys |
| POST | /account/keys | Create API key |
| DELETE | /account/keys/:id | Revoke API key |
| POST | /account/billing/topup | Create top-up session |
Health
| Method | Path | Description |
|---|---|---|
| GET | /healthz | Liveness check |
| GET | /readyz | Readiness check |
Supported Models
Lexi detects which provider to route to from the model name in your request. Use the same model names you already pass to OpenAI or Anthropic today.
gpt-4o
gpt-4o-mini
gpt-4.1
gpt-4.1-mini
gpt-4.1-nano
gpt-5
gpt-5.2
gpt-5-mini
gpt-5-nano
o1
o3
o3-pro
claude-opus-4-6
claude-sonnet-4-6
claude-sonnet-4-5-20250929
claude-haiku-4-5-20251001
gemini-3-pro
gemini-3-flash
gemini-2.5-pro
gemini-2.5-flash
gemini-2.0-flash
grok-4
grok-4.1-fast
grok-3
grok-3-mini
deepseek-chat
deepseek-reasoner
llama-4
Billing
Lexi charges 40% of the savings it creates. If no tokens are saved, there is no Lexi fee.
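As a worked illustration of this rule, the sketch below applies the per-million-token formula given in this section. The function name estimateBilling, the 2.50 price, and the token counts are hypothetical. The price is assumed to be per million input tokens in whatever currency your provider bills, and clamping negative savings to zero is an assumption based on the "no savings, no fee" statement.

```typescript
// Share of the savings that Lexi charges, as stated on this page.
const LEXI_SHARE = 0.40;

interface BillingBreakdown {
  providerCost: number; // billed to your provider account
  savings: number;      // input cost avoided by restructuring
  lexiFee: number;      // charged to your Lexi balance
}

// inputPricePerMillion: model input price per 1,000,000 tokens (assumed unit).
function estimateBilling(
  originalTokens: number,
  restructuredTokens: number,
  inputPricePerMillion: number,
): BillingBreakdown {
  const providerCost = (restructuredTokens / 1_000_000) * inputPricePerMillion;
  // Assumption: savings never go below zero, so the fee is zero when nothing is saved.
  const savedTokens = Math.max(0, originalTokens - restructuredTokens);
  const savings = (savedTokens / 1_000_000) * inputPricePerMillion;
  return { providerCost, savings, lexiFee: LEXI_SHARE * savings };
}

// Hypothetical request: 10,000 tokens restructured to 6,000 at a price of 2.50 per million.
const billingExample = estimateBilling(10_000, 6_000, 2.5);
console.log(billingExample);

// A request with nothing to reduce carries no Lexi fee.
const passiveExample = estimateBilling(1_200, 1_200, 2.5);
console.log(passiveExample.lexiFee);
```

With these made-up numbers the provider cost comes to about 0.015, the savings to about 0.01, and the Lexi fee to about 0.004, so the net saving after the fee is roughly 0.006.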
Billing formula
// Your provider account is billed:
provider_cost = (restructured_tokens / 1_000_000) × model_input_price
// Your Lexi balance is charged:
lexi_fee = 0.40 × savings
where:
savings = (original_tokens − restructured_tokens) / 1_000_000 × model_input_price
// No savings → lexi_fee = 0. Turn 1 is always passive.
Credits
Response Headers
Every proxied response includes headers showing exactly what happened with that request.
| Header | Value |
|---|---|
| X-Lexi-Original-Tokens | Tokens before restructuring |
| X-Lexi-Compressed-Tokens | Tokens after restructuring |
| X-Lexi-Compression-Ratio | Compression ratio |
| X-Lexi-Savings-Cents | Token savings |
| X-Lexi-Margin-Cents | Lexi fee for this request |
| X-Lexi-Request-Cost-Cents | Total cost for this request |
| X-Lexi-Balance-Remaining | Remaining account balance |
| X-Lexi-Turn | Conversation turn number in this session |
| X-Lexi-Provider | Detected upstream provider |
| X-Lexi-Model | Model name used for this request |
The X-Lexi-* headers are listed in Access-Control-Expose-Headers, so browser clients can read them to track savings per request.
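One way to track savings per request is to pull a few of these headers into a typed summary. The sketch below is hypothetical throughout: the helper names, the HeaderReader interface (a stand-in for the get method of a fetch response's headers), and the sample values are illustrative, and it assumes the token and cent headers carry plain numeric strings, which this page does not specify.

```typescript
// Minimal stand-in for response.headers; a fetch Headers object satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

interface LexiSavings {
  originalTokens: number | null;
  compressedTokens: number | null;
  tokensSaved: number | null;
  savingsCents: number | null;
  marginCents: number | null;
}

// Assumption: header values are plain numeric strings; anything else becomes null.
function readNumber(headers: HeaderReader, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

function readLexiSavings(headers: HeaderReader): LexiSavings {
  const originalTokens = readNumber(headers, "X-Lexi-Original-Tokens");
  const compressedTokens = readNumber(headers, "X-Lexi-Compressed-Tokens");
  return {
    originalTokens,
    compressedTokens,
    tokensSaved:
      originalTokens !== null && compressedTokens !== null
        ? originalTokens - compressedTokens
        : null,
    savingsCents: readNumber(headers, "X-Lexi-Savings-Cents"),
    marginCents: readNumber(headers, "X-Lexi-Margin-Cents"),
  };
}

// Made-up values standing in for a real response (lookup is case-insensitive, like Headers).
const sampleHeaderValues: Record<string, string> = {
  "x-lexi-original-tokens": "1000",
  "x-lexi-compressed-tokens": "600",
  "x-lexi-savings-cents": "0.1",
  "x-lexi-margin-cents": "0.04",
};
const sampleHeaders: HeaderReader = {
  get: (name) => sampleHeaderValues[name.toLowerCase()] ?? null,
};

const savingsSummary = readLexiSavings(sampleHeaders);
console.log(savingsSummary);
```

With a real response, passing response.headers to readLexiSavings should work the same way, since the function only relies on a get method.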