Documentation
Smart AIPI provides both OpenAI and Anthropic-compatible APIs. Use our service with any existing OpenAI or Anthropic SDK, tool, or app by simply changing the base URL. Zero code changes required.
OpenAI Base URL
https://api.smartaipi.com/v1
Anthropic Base URL
https://api.smartaipi.com
CLI & MCP Tools
Smart AIPI provides two npm packages to help developers work with their accounts:
Install Both
npm install -g smart-aipi @smart-aipi/mcp
Install Individually
npm install -g smart-aipi
npm install -g @smart-aipi/mcp
Authentication
All API requests require an API key. The same key works with both authentication methods:
Authorization: Bearer YOUR_API_KEY
x-api-key: YOUR_API_KEY
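Both header styles carry the same key, so you can pick whichever your tooling expects. A minimal sketch (the `build_headers` helper is ours, not part of any SDK):

```python
def build_headers(api_key: str, style: str = "bearer") -> dict:
    """Return auth headers in either OpenAI ("bearer") or
    Anthropic ("x-api-key") style. Same key works for both."""
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown auth style: {style!r}")

# OpenAI-style endpoints expect the Bearer form, Anthropic-style the x-api-key form
print(build_headers("YOUR_API_KEY"))
print(build_headers("YOUR_API_KEY", style="x-api-key"))
```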
Migration Guide
Migrating takes less than a minute. Our API is 100% compatible with both OpenAI and Anthropic endpoints.
From OpenAI
Get your Smart AIPI API key
Sign up and create an API key from your dashboard.
Change the base URL
Replace https://api.openai.com/v1 with https://api.smartaipi.com/v1
Update your API key
Use your Smart AIPI key instead of your OpenAI key. That's it!
From Anthropic
Get your Smart AIPI API key
Sign up and create an API key from your dashboard.
Change the base URL
Replace https://api.anthropic.com with https://api.smartaipi.com
Update your API key
Use your Smart AIPI key instead of your Anthropic key. Your existing Claude model names work as-is.
Your existing code, SDKs, and apps using the Anthropic Messages API will work without any code changes. Claude model names (claude-opus-4-6, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001) are automatically routed to high-performance GPT-5.4 models. Every response includes an x-actual-model header showing the real backend model for full transparency.
Chat Completions
Create a chat completion for the provided messages and model. This is the main endpoint for interacting with language models.
Request Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID to use (e.g., "gpt-5.4", "gpt-5.3-codex") |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature (0-2). Higher = more random. Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate in the response |
| top_p | number | No | Nucleus sampling. Consider tokens with top_p probability. Default: 1 |
| frequency_penalty | number | No | Penalize tokens based on frequency (-2 to 2). Default: 0 |
| presence_penalty | number | No | Penalize tokens based on presence (-2 to 2). Default: 0 |
| stop | string/array | No | Stop sequences. Up to 4 sequences where generation stops. |
| stream | boolean | No | Enable streaming responses via SSE. Default: false |
Message Roles
- system - Sets the behavior/persona of the assistant
- user - Messages from the user
- assistant - Previous responses from the assistant
Code Examples
from openai import OpenAI
client = OpenAI(
base_url="https://api.smartaipi.com/v1",
api_key="your-api-key"
)
response = client.chat.completions.create(
model="gpt-5.4",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
temperature=0.7,
max_tokens=1000
)
print(response.choices[0].message.content)
Streaming
Enable streaming to receive tokens as they're generated via Server-Sent Events (SSE). This provides a better user experience for long responses.
Set "stream": true in your request to enable streaming.
Streaming Example
from openai import OpenAI
client = OpenAI(
base_url="https://api.smartaipi.com/v1",
api_key="your-api-key"
)
# Enable streaming
stream = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Write a poem"}],
stream=True
)
# Process chunks as they arrive
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="")
Reasoning Effort
Control the depth of reasoning with the reasoning_effort parameter for GPT-5 models.
Smart AIPI defaults to reasoning.effort = "high" for all Responses API requests.
After extensive testing, we found that high is the best reasoning effort for practical usage - it provides deep, reliable tool-use and analysis without the latency cost of xhigh. You can override this by explicitly setting reasoning effort in your request.
Smart AIPI defaults store = false for all Responses API and WebSocket requests.
The upstream API requires store: false for GPT-5.4 and newer models. If you explicitly set store: true in your requests, you will receive a "Store must be set to false" error. Remove it or set it to false.
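Since the upstream API rejects `store: true`, it can be worth sanitizing payloads defensively before sending them. A sketch under that assumption (the helper name is ours, not part of any SDK):

```python
def enforce_store_false(payload: dict) -> dict:
    """Return a copy of a Responses API payload with store forced to False,
    avoiding the upstream "Store must be set to false" error."""
    fixed = dict(payload)
    fixed["store"] = False
    return fixed

req = {"model": "gpt-5.4", "input": "Hello!", "store": True}
print(enforce_store_false(req))  # store is now False
```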
Supported Models
Responses API
gpt-5.4, gpt-5.3, gpt-5.1, gpt-5, and all Codex variants
Chat Completions only
gpt-5.4-nano — reasoning support is only available on /v1/chat/completions.
Reasoning Effort Levels
- none - No reasoning. Skips thinking entirely.
- low - Minimal reasoning. Great for simple tasks.
- medium - Balanced speed and depth.
- high - Deep analysis for complex problems. (Smart AIPI default)
- xhigh - Extra high. Maximum reasoning depth for the hardest problems.
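The levels above can be checked client-side before a request goes out. A small validator sketch (the function is ours, not part of any SDK; the default follows the Smart AIPI behavior described above):

```python
from typing import Optional

# Valid effort levels from the list above; "high" is the Smart AIPI default.
EFFORT_LEVELS = ("none", "low", "medium", "high", "xhigh")

def resolve_effort(effort: Optional[str]) -> str:
    """Fall back to the Smart AIPI default and reject unknown levels early,
    rather than letting the API return an error."""
    if effort is None:
        return "high"
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    return effort
```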
Fast Mode (Priority Processing)
Use service_tier: "priority" for priority processing with lower latency. This is what the /fast command in Codex CLI uses. It does not change reasoning depth - you get the same quality, faster.
Pricing: Priority processing is billed at 1.5x the standard rate. For example, GPT-5.4 output normally costs $3.75/1M tokens - with priority it costs $5.625/1M tokens.
response = client.responses.create(
model="gpt-5.4",
input="Refactor this function",
service_tier="priority" # priority processing, lower latency
)
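The 1.5x multiplier makes priority costs easy to estimate up front. A sketch using the GPT-5.4 output rate quoted in the pricing note above (the helper is ours, for illustration only):

```python
STANDARD_OUTPUT_RATE = 3.75  # USD per 1M output tokens for GPT-5.4 (standard tier)
PRIORITY_MULTIPLIER = 1.5    # priority processing is billed at 1.5x

def output_cost(tokens: int, priority: bool = False) -> float:
    """Estimate output-token cost in USD for GPT-5.4."""
    rate = STANDARD_OUTPUT_RATE * (PRIORITY_MULTIPLIER if priority else 1.0)
    return tokens / 1_000_000 * rate

print(output_cost(1_000_000))                 # 3.75
print(output_cost(1_000_000, priority=True))  # 5.625
```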
Chat Completions API
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Analyze this code for bugs..."}],
reasoning_effort="high" # or "xhigh" for maximum depth
)
Responses API
# Reasoning is set via the "reasoning" object
response = client.responses.create(
model="gpt-5.4",
input=[{"role": "user", "content": "Refactor this function..."}],
reasoning={"effort": "high"} # defaults to "high" on Smart AIPI
)
Image Generation
Generate images from text prompts using our image generation endpoint.
Image variations (/v1/images/variations) are not yet supported. Image edits are available via /v1/images/edits.
Request Body Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | Text description of the image to generate (required) |
| model | string | "gpt-image-1.5" (frontier), "gpt-image-1", or "gpt-image-1-mini". Default: "gpt-image-1.5" |
| n | integer | Number of images to generate (1-10). Default: 1 |
| size | string | "1024x1024", "1536x1024", or "1024x1536". Default: "1024x1024" |
| quality | string | "low", "medium", "high", or "auto". Default: "medium" |
Available Models
- gpt-image-1.5 - Frontier model, highest quality (default)
- gpt-image-latest - Alias for gpt-image-1.5
- gpt-image-1 - Full quality image generation
- gpt-image-1-mini - Faster, smaller images
Example
response = client.images.generate(
model="gpt-image-1.5",
prompt="A futuristic city at sunset, cyberpunk style",
size="1024x1024",
n=1
)
# Response contains base64-encoded image
image_b64 = response.data[0].b64_json
# Save to file
import base64
with open("output.png", "wb") as f:
f.write(base64.b64decode(image_b64))
Image Edits
Edit existing images using a text prompt. Accepts both JSON (base64 string) and multipart/form-data (file upload), so OpenAI SDKs work out of the box.
Request Body Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | Text description of the desired edit (required) |
| image | string/file | Image to edit: base64-encoded string (JSON) or file upload (multipart). Required |
| model | string | "gpt-image-1.5" (default), "gpt-image-1", or "gpt-image-1-mini" |
| n | integer | Number of edited images to generate (1-10). Default: 1 |
| size | string | "1024x1024", "1536x1024", or "1024x1536". Default: "1024x1024" |
| quality | string | "low", "medium", "high", or "auto". Default: "medium" |
Example
import base64
response = client.images.edit(
model="gpt-image-1.5",
image=open("input.png", "rb"),
prompt="Change the background to a sunset beach",
size="1024x1024",
n=1
)
# Save the edited image
edited_b64 = response.data[0].b64_json
with open("edited.png", "wb") as f:
f.write(base64.b64decode(edited_b64))
Realtime API with WebSocket
WebSockets are a broadly supported API for realtime data transfer, and a great choice for connecting to the Smart AIPI Realtime API in server-to-server applications.
In a server-to-server integration, your backend system connects via WebSocket directly to the Realtime API. Use a standard API key to authenticate the connection, since the token is only available on your secure backend server.
Connect via WebSocket
Below are several examples of connecting via WebSocket. In addition to using the WebSocket URL, you will need to pass an authentication header using your API key.
import WebSocket from "ws";
const url = "wss://api.smartaipi.com/v1/responses";
const ws = new WebSocket(url, {
headers: {
Authorization: "Bearer " + process.env.SMARTAIPI_API_KEY,
},
});
ws.on("open", function open() {
console.log("Connected to server.");
// Send a response.create event
ws.send(JSON.stringify({
type: "response.create",
response: {
model: "gpt-5.4",
store: false,
input: [{ role: "user", content: "Hello!" }],
stream: true,
},
}));
});
ws.on("message", function incoming(message) {
const event = JSON.parse(message.toString());
console.log(event.type, event);
});
Sending and Receiving Events
Sessions are managed using client-sent and server-sent JSON events over the WebSocket connection. Wrap your request in a response.create envelope:
| Event | Direction | Description |
|---|---|---|
| response.create | Client | Send a request (wraps your model, input, and parameters) |
| response.created | Server | Session accepted, processing started |
| response.output_text.delta | Server | Streaming text chunk |
| response.completed | Server | Terminal event with full response and usage |
| response.failed | Server | Terminal event indicating an error |
Always set "store": false in the response envelope. The connection persists across multiple turns - send additional response.create frames without reconnecting.
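The same response.create envelope shown in the JavaScript example can be built in Python before sending it over any WebSocket client. A sketch (the helper name is ours; the event shape follows the table above):

```python
import json

def response_create(model: str, text: str) -> str:
    """Serialize a response.create envelope for a Smart AIPI WebSocket turn.
    store must be false and stream true, per the notes above."""
    return json.dumps({
        "type": "response.create",
        "response": {
            "model": model,
            "store": False,
            "input": [{"role": "user", "content": text}],
            "stream": True,
        },
    })

# Send the resulting string as one WebSocket frame, e.g. ws.send(frame)
frame = response_create("gpt-5.4", "Hello!")
print(frame)
```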
List Models
Retrieve a list of all available models. Use this endpoint to dynamically discover which models are available for your account.
Response Format
{
"object": "list",
"data": [
{
"id": "gpt-5.4",
"object": "model",
"created": 1700000000,
"owned_by": "smart-aipi"
},
{
"id": "gpt-5.3-codex",
"object": "model",
"created": 1700000000,
"owned_by": "smart-aipi"
},
// ... more models
]
}
Example
from openai import OpenAI
client = OpenAI(
base_url="https://api.smartaipi.com/v1",
api_key="your-api-key"
)
# List all available models
models = client.models.list()
for model in models.data:
print(model.id)
Available Models
GPT-5 Series
- gpt-5.4-pro
- gpt-5.4
- gpt-5.4-mini
- gpt-5.4-nano (Chat Completions only)
Codex Series
- gpt-5.3-codex
- gpt-5.2-codex
- gpt-5.2
- gpt-5.1
Anthropic Messages API
Full Anthropic Messages API compatibility. Use any Anthropic SDK, Claude Code, or app that speaks the Anthropic protocol - just change the base URL.
Supported Features
- ✓ Streaming (full Anthropic SSE event protocol)
- ✓ Tool use / function calling
- ✓ System messages (string and array formats)
- ✓ Image inputs (base64 and URL)
- ✓ Token counting (/v1/messages/count_tokens)
- ✓ Extended thinking / reasoning effort
Code Examples
from anthropic import Anthropic
client = Anthropic(
base_url="https://api.smartaipi.com",
api_key="your-api-key"
)
response = client.messages.create(
model="claude-sonnet-4-5-20250929",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.content[0].text)
Model Mapping
Claude model names are accepted and automatically routed to GPT-5.4 with tiered reasoning effort. Every response includes an x-actual-model header showing the real backend model.
| Claude Model | Backend | Reasoning | Best For |
|---|---|---|---|
| claude-opus-4-6 | gpt-5.4 | High | Complex reasoning, architecture |
| claude-sonnet-4-5-20250929 | gpt-5.4 | Medium | Daily coding, balanced quality |
| claude-haiku-4-5-20251001 | gpt-5.4 | Low | Fast responses, simple tasks |
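The routing table above can be expressed as a simple lookup if your application needs to predict which backend and effort level a Claude model name will resolve to (the helper is ours, for illustration; the mappings come from the table):

```python
# Mapping from the table above: every Claude name routes to gpt-5.4
# with a tiered reasoning effort.
CLAUDE_ROUTING = {
    "claude-opus-4-6": ("gpt-5.4", "high"),
    "claude-sonnet-4-5-20250929": ("gpt-5.4", "medium"),
    "claude-haiku-4-5-20251001": ("gpt-5.4", "low"),
}

def route(model: str):
    """Return (backend model, reasoning effort) for a Claude model name."""
    try:
        return CLAUDE_ROUTING[model]
    except KeyError:
        raise ValueError(f"unmapped Claude model: {model!r}") from None
```

At runtime, the authoritative answer is always the x-actual-model response header, so treat this lookup as a convenience, not a source of truth.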
Claude Code
Use Claude Code with Smart AIPI as the backend. Full tool use, streaming, and agentic capabilities supported.
Automatic Setup
npx smart-aipi claude
This configures ~/.claude/settings.json and ~/.claude.json automatically. If you already have Claude Code set up with an Anthropic account, it will show you the manual config instead of overwriting.
Manual Setup
Add to ~/.claude/settings.json:
{
"env": {
"ANTHROPIC_BASE_URL": "https://api.smartaipi.com",
"ANTHROPIC_API_KEY": "sk-your-key",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-5-20251001"
},
"model": "opus"
}
Then add "hasCompletedOnboarding": true to ~/.claude.json to skip the setup wizard.
After editing settings, restart Claude Code for changes to take effect. If you already have Claude Code connected to a real Anthropic account, setting these env vars will override that connection - use a project-level .claude/settings.json to keep both.
What Each Setting Does
- "model": "opus" — Main model. Uses claude-opus-4-6 (high reasoning) for all primary tasks.
- ANTHROPIC_DEFAULT_HAIKU_MODEL — Background model. Uses claude-haiku-4-5 (low reasoning) for fast background tasks like file indexing.
- Switch between models mid-session with /model sonnet, /model opus, or /model haiku.
| Alias | Claude Model | Backend | Reasoning |
|---|---|---|---|
| opus | claude-opus-4-6 | gpt-5.4 | High |
| sonnet | claude-sonnet-4-5-20250929 | gpt-5.4 | Medium |
| haiku | claude-haiku-4-5-20251001 | gpt-5.4 | Low |
OpenCode
Use Smart AIPI as your OpenCode backend via the OpenAI-compatible SDK.
Config Setup
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"smart-aipi": {
"npm": "@ai-sdk/openai-compatible",
"name": "Smart AIPI",
"options": {
"baseURL": "https://api.smartaipi.com/v1",
"apiKey": "YOUR_API_KEY"
},
"models": {
"gpt-5.4": {
"name": "GPT-5.4",
"reasoning": true,
"limit": { "context": 400000, "output": 128000 }
},
"gpt-5.3-codex": {
"name": "GPT-5.3 Codex",
"reasoning": true,
"limit": { "context": 400000, "output": 128000 }
},
"gpt-5.2-codex": {
"name": "GPT-5.2 Codex",
"reasoning": true,
"limit": { "context": 400000, "output": 128000 }
},
"gpt-5-codex": {
"name": "GPT-5 Codex",
"reasoning": true,
"limit": { "context": 400000, "output": 128000 }
}
}
}
},
"model": "smart-aipi/gpt-5.4"
}
Codex CLI
Use Smart AIPI with OpenAI's Codex CLI tool. Three files need to be configured.
Automatic Setup
npx smart-aipi codex
This configures all three files below automatically. After running, add model_reasoning_effort to your config (see step 2).
Manual Setup
Configure these three files:
1. API Key — ~/.codex/auth.json
{
"auth_mode": "apikey",
"OPENAI_API_KEY": "sk-your-key"
}
2. Model & Reasoning — ~/.codex/config.toml
Always include model_reasoning_effort - required for custom models.
Without it, Codex defaults to no reasoning and will not work properly. Use "high" for best results.
model = "gpt-5.4"
model_reasoning_effort = "high"
Valid reasoning levels: low, medium, high (recommended), xhigh.
3. Environment Variables — ~/.zshrc
# Smart AIPI
export OPENAI_API_KEY="sk-your-key"
export OPENAI_BASE_URL="https://api.smartaipi.com/v1"
After editing your shell profile, restart your terminal or run source ~/.zshrc for changes to take effect.
Then use Codex normally:
codex "fix this bug"
Codex WebSocket Speed Guide
WebSocket mode is usually faster for agentic coding flows with lots of tool calls. Instead of repeatedly reconnecting over HTTP and re-sending full request envelopes, Codex keeps one live connection and sends incremental turns, which reduces continuation overhead.
Enable WebSockets in Codex
Enable the WebSocket v2 feature flag in ~/.codex/config.toml:
model = "gpt-5.4"
model_reasoning_effort = "high"
[features]
responses_websockets_v2 = true
CLI equivalent:
codex --enable responses_websockets_v2
Older builds may use the legacy flag:
[features]
responses_websockets = true
Known Homebrew Behavior
On some Homebrew-distributed builds, WebSocket turns can stall during long-running tasks and then silently fall back to HTTP. We saw this especially in complex tool-call loops. Rebuilding from the open-source code with the fixes below improved both stability and speed.
Post-Merge: Fix the Open-Source WebSocket Bug
After merging your branch, use this checklist to ensure the fix is in your local binary:
# 1) Pull merged main
git checkout main
git pull --ff-only
# 2) Confirm websocket flags exist
codex features list | rg responses_websockets
# 3) Build and install
cd codex-rs
cargo build --release
install -m 0755 target/release/codex ~/.local/bin/codex-beta
# 4) Run with websocket enabled
CODEX_RS_RESPONSES_WS=true \
OPENAI_BASE_URL=https://api.smartaipi.com/v1 \
~/.local/bin/codex-beta --enable responses_websockets_v2
If you need to patch manually, verify these code-level fixes are present:
// A) Build the websocket request with IntoClientRequest so required headers are set
let mut request = url.as_str().into_client_request()?;
request.headers_mut().extend(headers);

# B) Ensure TLS roots are enabled for tokio-tungstenite in Cargo.toml
tokio-tungstenite = { version = "...", features = ["rustls-tls-native-roots"] }
Cursor & Cline
Use Smart AIPI with Cursor IDE or Cline VS Code extension.
Cursor
- Open Cursor Settings
- Go to the Models tab
- Click + Add Model
- Set Base URL: https://api.smartaipi.com/v1
- Enter your API key
- Model: gpt-5.4
Cline (VS Code)
- Open Cline settings in VS Code
- Select OpenAI Compatible
- Base URL: https://api.smartaipi.com/v1
- Enter your API key
- Model: gpt-5.4
Chat
Use Smart AIPI Chat at chat.smartaipi.com for text, image, video, code, search, and voice workflows.
Features
- ✓ Chat — General-purpose text conversations with Smart AIPI models.
- ✓ Images — Generate and edit images from prompts.
- ✓ Video — Generate videos from text or image inputs.
- ✓ Code — Code assistance and editing in the browser.
- ✓ Search — Use web search for grounded answers.
- ✓ Voice — Talk to the model with voice input and responses.
Getting Started
- 1. Open chat.smartaipi.com
- 2. Sign in or create an account.
- 3. Start chatting with text, images, video, code, search, or voice.
Billing: Usage is billed against your Smart AIPI credits.
API Tester
Test the API directly from your browser.