Overview

Lexi is a proxy — it sits between your application and the AI service you use. Requests pass through Lexi on the way out; responses come back the same way. Your application doesn't notice the difference. The bill does — and long conversations stay coherent for much longer.

At a technical level, Lexi is a direct replacement for the OpenAI and Anthropic base URLs. It is fully API-compatible — streaming, tool calls, structured output, and function calling all work without modification.

Lexi uses a share-of-savings billing model. It earns a percentage of what it reduces on each request. On requests where there is nothing to reduce, there is no Lexi fee.

Transparency

Every response includes detailed headers showing tokens saved, cost, and compression ratio.

How it works

STONE

Lexi sits between your application and the AI provider. Each request passes through STONE (Semantic Token Optimization and Natural Encoding), which restructures the conversation context to use fewer tokens while preserving meaning.

For a full explanation, see the How It Works page.

Getting Started

Three steps to lower token costs.

01

Get your API key

Create a free account at lexisaas.com, then create an API key from the dashboard. Keys look like lx_live_abc123...

02

Change the base URL

Point your existing OpenAI or Anthropic SDK at api.lexisaas.com instead of the provider endpoint.

const openai = new OpenAI({
  baseURL: 'https://api.lexisaas.com/v1',
  apiKey:  'lx_live_yourkey:sk-your-openai-key',
});

Pass your Lexi API key and your provider key together in apiKey, separated by a colon: the Lexi key first, then the provider key.

03

Send requests

That's it. Every request is restructured by STONE, and you only pay 40% of what Lexi saves you.

Authentication

All API requests require an API key passed in the Authorization header as a Bearer token.

Authorization: Bearer {lexi_key}:{provider_key}

// Example with OpenAI:
Authorization: Bearer lx_live_abc123:sk-openai-abc456

// Example with Anthropic:
Authorization: Bearer lx_live_abc123:sk-ant-abc456
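
The header format above can be assembled with a small helper. This is a minimal sketch, not part of any Lexi SDK: the function name `buildLexiAuthHeader` and its validation rules are assumptions made for illustration, and only the `{lexi_key}:{provider_key}` layout comes from this page.

```typescript
// Hypothetical helper (not a Lexi API): joins the two keys in the documented
// "Bearer {lexi_key}:{provider_key}" layout.
function buildLexiAuthHeader(lexiKey: string, providerKey: string): string {
  if (lexiKey === "" || providerKey === "") {
    throw new Error("Both a Lexi key and a provider key are required");
  }
  // Assumption: the first colon separates the two keys, so the Lexi key
  // itself must not contain one.
  if (lexiKey.indexOf(":") !== -1) {
    throw new Error("Lexi key must not contain ':'");
  }
  return "Bearer " + lexiKey + ":" + providerKey;
}

// Produces "Bearer lx_live_abc123:sk-openai-abc456"
const exampleAuthHeader = buildLexiAuthHeader("lx_live_abc123", "sk-openai-abc456");
```

Keeping the concatenation in one place makes it harder to send a provider key on its own by mistake.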

Endpoints

Lexi mirrors the OpenAI and Anthropic API shapes exactly. Any code written for those APIs works through Lexi once the base URL and API key are changed.

Proxy

Method	Path	Description
POST	`/v1/chat/completions`	Chat completions (OpenAI-compatible)
POST	`/v1/messages`	Messages (Anthropic-compatible)
GET	`/v1/models`	List models
GET	`/v1/models/:model`	Model info
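
For callers that do not use an SDK, a request to the chat completions route in the table above might be prepared as follows. This is a sketch under stated assumptions: `prepareChatCompletion` and the `PreparedRequest` type are hypothetical names, and the body uses the standard OpenAI chat completions shape (`model` plus `messages`), which this page says Lexi mirrors.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical container for everything fetch() needs; not a Lexi type.
interface PreparedRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Base URL taken from the Getting Started example on this page.
const LEXI_BASE_URL = "https://api.lexisaas.com/v1";

function prepareChatCompletion(
  lexiKey: string,
  providerKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    url: LEXI_BASE_URL + "/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Combined key format from the Authentication section.
      Authorization: "Bearer " + lexiKey + ":" + providerKey,
    },
    body: JSON.stringify({ model: model, messages: messages }),
  };
}

const preparedChat = prepareChatCompletion(
  "lx_live_abc123",
  "sk-openai-abc456",
  "gpt-4o-mini",
  [{ role: "user", content: "Hello" }],
);
// To send it:
// fetch(preparedChat.url, { method: preparedChat.method,
//                           headers: preparedChat.headers,
//                           body: preparedChat.body })
```

Separating request construction from sending keeps the sketch free of network calls and makes the URL and headers easy to inspect.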

Account

Method	Path	Description
GET	`/account/overview`	Account overview
GET	`/account/usage`	Usage records
GET	`/account/keys`	List API keys
POST	`/account/keys`	Create API key
DELETE	`/account/keys/:id`	Revoke API key
POST	`/account/billing/topup`	Create top-up session

Health

Method	Path	Description
GET	`/healthz`	Liveness check
GET	`/readyz`	Readiness check

Supported Models

Lexi detects which provider to route to from the model name in your request. Use the same model names you already pass to your provider today.

OpenAI

gpt-4o gpt-4o-mini gpt-4.1 gpt-4.1-mini gpt-4.1-nano gpt-5 gpt-5.2 gpt-5-mini gpt-5-nano o1 o3 o3-pro

Anthropic

claude-opus-4-6 claude-sonnet-4-6 claude-sonnet-4-5-20250929 claude-haiku-4-5-20251001

Google

gemini-3-pro gemini-3-flash gemini-2.5-pro gemini-2.5-flash gemini-2.0-flash

xAI

grok-4 grok-4.1-fast grok-3 grok-3-mini

DeepSeek

deepseek-chat deepseek-reasoner
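
This page does not describe how Lexi maps a model name to a provider. The sketch below shows one plausible client-side approximation based only on the name prefixes in the lists above. The function `guessProvider` and the pattern table are assumptions for illustration and may not match Lexi's actual routing.

```typescript
type Provider = "openai" | "anthropic" | "google" | "xai" | "deepseek";

// Assumed prefix rules, derived from the model names listed on this page.
const PROVIDER_PATTERNS: Array<[RegExp, Provider]> = [
  [/^gpt-/, "openai"],
  [/^o\d/, "openai"], // o1, o3, o3-pro
  [/^claude-/, "anthropic"],
  [/^gemini-/, "google"],
  [/^grok-/, "xai"],
  [/^deepseek-/, "deepseek"],
];

// Returns the first provider whose pattern matches, or null if none does.
function guessProvider(model: string): Provider | null {
  for (const [pattern, provider] of PROVIDER_PATTERNS) {
    if (pattern.test(model)) {
      return provider;
    }
  }
  return null;
}
```

A check like this can catch a mistyped model name before the request is sent, for example by refusing to call the proxy when `guessProvider` returns `null`.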

Billing

Lexi charges 40% of the savings it creates. If no tokens are saved, there is no Lexi fee.

Billing formula

// Your provider account is billed:
provider_cost = (restructured_tokens / 1_000_000) × model_input_price

// Your Lexi balance is charged:
lexi_fee = 0.40 × savings

where:
  savings = (original_tokens − restructured_tokens) / 1_000_000 × model_input_price

// No savings → lexi_fee = 0. Turn 1 is always passive.
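
The formula can be checked with a short calculation. This is a sketch: the function name `computeLexiBilling`, the clamping of negative savings to zero, and the example price of $2.50 per million input tokens are illustrative assumptions, not published values.

```typescript
interface LexiBillingBreakdown {
  providerCost: number; // billed by the provider, in USD
  savings: number;      // value of the tokens removed, in USD
  lexiFee: number;      // charged to the Lexi balance, in USD
}

const LEXI_SHARE = 0.40; // Lexi's share of savings, per this page

function computeLexiBilling(
  originalTokens: number,
  restructuredTokens: number,
  inputPricePerMillionUsd: number,
): LexiBillingBreakdown {
  const providerCost = (restructuredTokens / 1_000_000) * inputPricePerMillionUsd;
  // Assumption: if restructuring saved nothing, savings are treated as zero,
  // which matches "No savings -> lexi_fee = 0".
  const savedTokens = Math.max(0, originalTokens - restructuredTokens);
  const savings = (savedTokens / 1_000_000) * inputPricePerMillionUsd;
  return { providerCost, savings, lexiFee: LEXI_SHARE * savings };
}

// 10,000 original tokens restructured to 4,000 at a hypothetical $2.50 / 1M:
// providerCost is about 0.0100, savings about 0.0150, lexiFee about 0.0060.
const billingExample = computeLexiBilling(10_000, 4_000, 2.5);
```

In this example the request costs about $0.016 in total (provider cost plus Lexi fee) against $0.025 without restructuring.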

Credits

You can top up your balance.

Response Headers

Every proxied response includes 16 canonical X-Lexi-* headers showing exactly what happened with that request. Lexi sets Access-Control-Expose-Headers automatically, so browser code can read them too.

Canonical billing + telemetry headers (16)

Header	Type	Unit	When set	Example
`X-Lexi-Original-Tokens`	integer	tokens	always	`10240`
`X-Lexi-Compressed-Tokens`	integer	tokens	always	`1847`
`X-Lexi-Tokens-Saved`	integer	tokens	always	`8393`
`X-Lexi-Compression-Ratio`	decimal	0–1 (4 dp)	always	`0.8197`
`X-Lexi-Provider-Cost-Usd`	decimal	USD (F4/F6/F8)	always — 0.0000 if pricing unknown	`0.0185`
`X-Lexi-Gross-Savings-Usd`	decimal	USD (F4/F6/F8)	always — 0 on first-turn or passive	`0.0252`
`X-Lexi-Customer-Savings-Usd`	decimal	USD (F4/F6/F8)	always — customer's 60% share	`0.0151`
`X-Lexi-Customer-Charge-Usd`	decimal	USD (F4/F6/F8)	always — always ≥ provider_cost	`0.0286`
`X-Lexi-Lexico-Profit-Usd`	decimal	USD (F4/F6/F8)	always — LexiCo's 40% share	`0.0101`
`X-Lexi-Customer-Type`	string	enum	always — 'credit' \| 'api'	`credit`
`X-Lexi-Balance-After`	decimal	USD (8 dp)	only when VM1 sync succeeded	`9.97140000`
`X-Lexi-Memory-Usage`	integer	tokens	always — tokens held in session memory	`1847`
`X-Lexi-Recall-Count`	integer	memories	always — recalled memory count this turn	`4`
`X-Lexi-Mode`	string	enum	always — 'first-turn' \| 'passive' \| 'normal' \| 'passthrough'	`normal`
`X-Lexi-First-Turn`	boolean	'true'	only on turn 1 of a session	`true`
`X-Lexi-Request-Id`	string	UUID v4	always — correlation id for this request	`6b1e…`
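
A client might collect a few of these headers into a typed object for logging. The sketch below is an illustration only: `parseLexiHeaders`, `LexiTelemetry`, and `HeaderSource` are hypothetical names, and just a subset of the 16 headers is read. Header names and types are taken from the table above.

```typescript
// Minimal shape shared by the fetch Headers class and simple test stubs.
interface HeaderSource {
  get(name: string): string | null;
}

interface LexiTelemetry {
  originalTokens: number | null;
  compressedTokens: number | null;
  tokensSaved: number | null;
  compressionRatio: number | null;
  customerChargeUsd: number | null;
  balanceAfterUsd: number | null; // not set on every response, per the table
  mode: string | null;
  firstTurn: boolean;
  requestId: string | null;
}

// Returns null when the header is missing or not numeric.
function readNumericHeader(headers: HeaderSource, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") {
    return null;
  }
  const value = Number(raw);
  return isNaN(value) ? null : value;
}

function parseLexiHeaders(headers: HeaderSource): LexiTelemetry {
  return {
    originalTokens: readNumericHeader(headers, "X-Lexi-Original-Tokens"),
    compressedTokens: readNumericHeader(headers, "X-Lexi-Compressed-Tokens"),
    tokensSaved: readNumericHeader(headers, "X-Lexi-Tokens-Saved"),
    compressionRatio: readNumericHeader(headers, "X-Lexi-Compression-Ratio"),
    customerChargeUsd: readNumericHeader(headers, "X-Lexi-Customer-Charge-Usd"),
    balanceAfterUsd: readNumericHeader(headers, "X-Lexi-Balance-After"),
    mode: headers.get("X-Lexi-Mode"),
    // The header is only present on turn 1, with the literal value 'true'.
    firstTurn: headers.get("X-Lexi-First-Turn") === "true",
    requestId: headers.get("X-Lexi-Request-Id"),
  };
}
```

With fetch, `parseLexiHeaders(response.headers)` should work directly, since `Headers.get` is case-insensitive and returns `null` for missing headers.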

Legacy headers (DEPRECATED — migrate to canonical *-Usd set)

These headers will be removed after LDC clients migrate. Do not build new integrations on them.

Header	Value
`X-Lexi-Savings-Cents` DEPRECATED	Token savings
`X-Lexi-Margin-Cents` DEPRECATED	Lexi fee for this request
`X-Lexi-Request-Cost-Cents` DEPRECATED	Total cost for this request
`X-Lexi-Balance-Remaining` DEPRECATED	Remaining account balance

Tip

Use the X-Lexi-* headers, exposed via Access-Control-Expose-Headers, to track savings per request.
