Overview

Lexi is a proxy — it sits between your application and the AI service you use. Requests pass through Lexi on the way out; responses come back the same way. Your application doesn't notice the difference. The bill does — and long conversations stay coherent for much longer.

At a technical level, Lexi is a direct replacement for the OpenAI and Anthropic base URLs. It is fully API-compatible — streaming, tool calls, structured output, and function calling all work without modification.

Lexi uses a share-of-savings billing model. It earns a percentage of what it reduces on each request. On requests where there is nothing to reduce, there is no Lexi fee.

Transparency
Every request returns detailed headers showing tokens saved, cost, and compression ratio.

STONE

Lexi sits between your application and the AI provider. Each request passes through STONE (Semantic Token Optimization and Natural Encoding), which restructures the conversation context to use fewer tokens while preserving meaning.

For a full explanation, see the How It Works page.

Getting Started

Three steps to lower token costs.

01
Get your API key

Create a free account at lexisaas.com, then create an API key from the dashboard. Lexi keys look like lx_live_abc123...

02
Change the base URL

Point your existing OpenAI or Anthropic SDK at api.lexisaas.com instead of the provider endpoint.

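import OpenAI from 'openai';

// Route requests through Lexi instead of the provider endpoint.
const openai = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  // Combined key in the form {lexi_key}:{provider_key}
  apiKey: 'lx_live_yourkey:sk-your-openai-key',
});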

Use the combined key format, {lexi_key}:{provider_key}, not your provider key alone.

03
Send requests

That's it. Every request is restructured by STONE, and you only pay 40% of what Lexi saves you.

Authentication

All API requests require an API key passed in the Authorization header as a Bearer token. The token combines your Lexi key and your provider key, separated by a colon.

Authorization: Bearer {lexi_key}:{provider_key}

// Example with OpenAI:
Authorization: Bearer lx_live_abc123:sk-openai-abc456

// Example with Anthropic:
Authorization: Bearer lx_live_abc123:sk-ant-abc456
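
As a rough sketch, the combined token can be assembled with a small helper. The name lexiAuthHeader is hypothetical and not part of any Lexi SDK. The rule that the Lexi key must not contain a colon is an assumption made here so that the two halves can be split unambiguously.

```typescript
// Hypothetical helper (not part of any Lexi SDK): builds the Authorization
// header value in the documented {lexi_key}:{provider_key} form.
function lexiAuthHeader(lexiKey: string, providerKey: string): string {
  if (lexiKey.length === 0 || providerKey.length === 0) {
    throw new Error('Both a Lexi key and a provider key are required.');
  }
  // Assumption: the first colon separates the two keys, so the Lexi key
  // itself must not contain one.
  if (lexiKey.indexOf(':') !== -1) {
    throw new Error('The Lexi key must not contain a colon.');
  }
  return `Bearer ${lexiKey}:${providerKey}`;
}

// Headers for a JSON request sent through Lexi, using the example keys above.
const requestHeaders: Record<string, string> = {
  'Content-Type': 'application/json',
  'Authorization': lexiAuthHeader('lx_live_abc123', 'sk-openai-abc456'),
};
```

The same requestHeaders object can be passed to fetch or to any HTTP client that accepts a plain header map.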

Endpoints

Lexi mirrors the OpenAI and Anthropic API shapes exactly. Code written for those APIs works through Lexi once the base URL and API key are changed.

Proxy

Method | Path | Description
POST | /v1/chat/completions | Chat completions (OpenAI-compatible)
POST | /v1/messages | Messages (Anthropic-compatible)
GET | /v1/models | List models
GET | /v1/models/:model | Model info

Account

Method | Path | Description
GET | /account/overview | Account overview
GET | /account/usage | Usage records
GET | /account/keys | List API keys
POST | /account/keys | Create API key
DELETE | /account/keys/:id | Revoke API key
POST | /account/billing/topup | Create top-up session

Health

Method | Path | Description
GET | /healthz | Liveness check
GET | /readyz | Readiness check

Supported Models

Lexi detects which provider to route to from the model name in your request. Use the same model names you already pass to OpenAI or Anthropic today.

OpenAI: gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5, gpt-5.2, gpt-5-mini, gpt-5-nano, o1, o3, o3-pro
Anthropic: claude-opus-4-6, claude-sonnet-4-6, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001
Google: gemini-3-pro, gemini-3-flash, gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
xAI: grok-4, grok-4.1-fast, grok-3, grok-3-mini
DeepSeek: deepseek-chat, deepseek-reasoner
Meta: llama-4
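
Lexi's actual routing rules are not documented on this page. The sketch below is only a client-side guess at the provider, based on the name prefixes visible in the list above. It may be useful when an application holds several provider keys and needs to pick the one to pair with its Lexi key. The names Provider, PROVIDER_PATTERNS, and guessProvider are hypothetical.

```typescript
type Provider = 'openai' | 'anthropic' | 'google' | 'xai' | 'deepseek' | 'meta';

// Assumption: prefixes inferred from the model names listed above,
// not from any documented Lexi routing table.
const PROVIDER_PATTERNS: Array<[RegExp, Provider]> = [
  [/^(gpt-|o\d)/, 'openai'],
  [/^claude-/, 'anthropic'],
  [/^gemini-/, 'google'],
  [/^grok-/, 'xai'],
  [/^deepseek-/, 'deepseek'],
  [/^llama-/, 'meta'],
];

// Returns the guessed provider, or undefined when no prefix matches.
function guessProvider(model: string): Provider | undefined {
  for (const [pattern, provider] of PROVIDER_PATTERNS) {
    if (pattern.test(model)) {
      return provider;
    }
  }
  return undefined;
}
```

Returning undefined for unknown names keeps the caller in control of the fallback, rather than silently defaulting to one provider.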

Billing

Lexi charges 40% of the savings it creates. If no tokens are saved, there is no Lexi fee.

Billing formula

// Your provider account is billed:
provider_cost = (restructured_tokens / 1_000_000) × model_input_price

// Your Lexi balance is charged:
lexi_fee = 0.40 × savings

where:
  savings = (original_tokens − restructured_tokens) / 1_000_000 × model_input_price

// No savings → lexi_fee = 0. Turn 1 is always passive.
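
The formula above can be worked through in a few lines. This is a minimal sketch rather than Lexi's billing code: computeBill and LexiBill are hypothetical names, the input price is supplied by the caller, and clamping negative savings to zero is an assumption based on the rule that no savings means no fee.

```typescript
interface LexiBill {
  providerCostUsd: number; // billed by the provider
  savingsUsd: number;      // value of the tokens removed
  lexiFeeUsd: number;      // charged to the Lexi balance
}

// Lexi's share of the savings, as stated in this section.
const LEXI_SHARE = 0.40;

function computeBill(
  originalTokens: number,
  restructuredTokens: number,
  inputPricePerMillionUsd: number,
): LexiBill {
  const providerCostUsd = (restructuredTokens / 1_000_000) * inputPricePerMillionUsd;
  // Assumption: savings never go negative, so the fee is never negative.
  const savedTokens = Math.max(0, originalTokens - restructuredTokens);
  const savingsUsd = (savedTokens / 1_000_000) * inputPricePerMillionUsd;
  return { providerCostUsd, savingsUsd, lexiFeeUsd: LEXI_SHARE * savingsUsd };
}

// Hypothetical request: 10,000 original tokens, 4,000 after restructuring,
// at an assumed input price of $2.50 per million tokens.
const exampleBill = computeBill(10_000, 4_000, 2.5);
```

For this hypothetical request the provider cost is about $0.0100, the savings are about $0.0150, and the Lexi fee is about $0.0060. A request with no token reduction, such as a passive first turn, produces a fee of zero.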

Credits

You can top up your balance. The POST /account/billing/topup endpoint creates a top-up session.

Response Headers

Every proxied response includes 16 canonical X-Lexi-* headers showing exactly what happened with that request. Lexi sets Access-Control-Expose-Headers automatically, so browser code can read them.

Canonical billing + telemetry headers (16)

Header | Type | Unit | When set | Example
X-Lexi-Original-Tokens | integer | tokens | always | 10240
X-Lexi-Compressed-Tokens | integer | tokens | always | 1847
X-Lexi-Tokens-Saved | integer | tokens | always | 8393
X-Lexi-Compression-Ratio | decimal | 0–1 (4 dp) | always | 0.8197
X-Lexi-Provider-Cost-Usd | decimal | USD (F4/F6/F8) | always; 0.0000 if pricing unknown | 0.0185
X-Lexi-Gross-Savings-Usd | decimal | USD (F4/F6/F8) | always; 0 on first-turn or passive | 0.0252
X-Lexi-Customer-Savings-Usd | decimal | USD (F4/F6/F8) | always; customer's 60% share | 0.0151
X-Lexi-Customer-Charge-Usd | decimal | USD (F4/F6/F8) | always; ≥ provider_cost | 0.0286
X-Lexi-Lexico-Profit-Usd | decimal | USD (F4/F6/F8) | always; LexiCo's 40% share | 0.0101
X-Lexi-Customer-Type | string | enum | always; 'credit' or 'api' | credit
X-Lexi-Balance-After | decimal | USD (8 dp) | only when VM1 sync succeeded | 9.97140000
X-Lexi-Memory-Usage | integer | tokens | always; tokens held in session memory | 1847
X-Lexi-Recall-Count | integer | memories | always; recalled memory count this turn | 4
X-Lexi-Mode | string | enum | always; 'first-turn', 'passive', 'normal', or 'passthrough' | normal
X-Lexi-First-Turn | boolean | | 'true' only on turn 1 of a session | true
X-Lexi-Request-Id | string | UUID v4 | always; correlation id for this request | 6b1e…
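
To illustrate how a client might consume these headers, here is a minimal sketch that collects a subset of them into a typed object. HeaderReader, LexiUsage, numericHeader, and readLexiUsage are hypothetical names. The helper accepts anything with a get method, which the Headers object on a fetch Response provides. Returning NaN for a missing numeric header is a choice made for this sketch.

```typescript
// Minimal structural type: satisfied by the Headers object on a fetch
// Response, and easy to stub in tests.
interface HeaderReader {
  get(name: string): string | null;
}

interface LexiUsage {
  originalTokens: number;
  compressedTokens: number;
  tokensSaved: number;
  compressionRatio: number;
  grossSavingsUsd: number;
  customerChargeUsd: number;
  mode: string | null;
  requestId: string | null;
}

// Parses a numeric header; returns NaN when the header is missing or empty.
function numericHeader(headers: HeaderReader, name: string): number {
  const raw = headers.get(name);
  return raw === null || raw.trim() === '' ? NaN : Number(raw);
}

// Hypothetical helper: gathers a subset of the canonical X-Lexi-* headers.
function readLexiUsage(headers: HeaderReader): LexiUsage {
  return {
    originalTokens: numericHeader(headers, 'X-Lexi-Original-Tokens'),
    compressedTokens: numericHeader(headers, 'X-Lexi-Compressed-Tokens'),
    tokensSaved: numericHeader(headers, 'X-Lexi-Tokens-Saved'),
    compressionRatio: numericHeader(headers, 'X-Lexi-Compression-Ratio'),
    grossSavingsUsd: numericHeader(headers, 'X-Lexi-Gross-Savings-Usd'),
    customerChargeUsd: numericHeader(headers, 'X-Lexi-Customer-Charge-Usd'),
    mode: headers.get('X-Lexi-Mode'),
    requestId: headers.get('X-Lexi-Request-Id'),
  };
}

// Usage with fetch (requires network access and valid keys):
//   const response = await fetch(url, init);
//   const usage = readLexiUsage(response.headers);
```

Logging requestId alongside the token counts makes it easier to correlate a given saving with a specific request.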

Legacy headers (DEPRECATED — migrate to canonical *-Usd set)

These headers will be removed after LDC clients migrate. Do not build new integrations on them.
Header | Value
X-Lexi-Savings-Cents (DEPRECATED) | Token savings
X-Lexi-Margin-Cents (DEPRECATED) | Lexi fee for this request
X-Lexi-Request-Cost-Cents (DEPRECATED) | Total cost for this request
X-Lexi-Balance-Remaining (DEPRECATED) | Remaining account balance
Tip
Use the X-Lexi-* headers to track savings per request. Browser clients can read them because Access-Control-Expose-Headers is set automatically.