Documentation

Smart AIPI provides both OpenAI and Anthropic-compatible APIs. Use our service with any existing OpenAI or Anthropic SDK, tool, or app by simply changing the base URL. Zero code changes required.

OpenAI Base URL

https://api.smartaipi.com/v1

Anthropic Base URL

https://api.smartaipi.com

CLI & MCP Tools

Smart AIPI provides two npm packages to help developers work with their accounts:

smart-aipi — A CLI tool to programmatically manage your account, API keys, and usage from the terminal.
@smart-aipi/mcp — An MCP server that gives AI agents direct access to your Smart AIPI account.

Install Both

Terminal
npm install -g smart-aipi @smart-aipi/mcp

Install Individually

CLI only
npm install -g smart-aipi
MCP only
npm install -g @smart-aipi/mcp

Authentication

All API requests require an API key. The same key works with both authentication methods:

OpenAI style (Authorization header)
Authorization: Bearer YOUR_API_KEY
Anthropic style (x-api-key header)
x-api-key: YOUR_API_KEY
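
As a quick illustration, both header styles can be built like this in Python (the key value is a placeholder):

```python
API_KEY = "YOUR_API_KEY"  # placeholder - use your real Smart AIPI key

# OpenAI style: bearer token in the Authorization header
openai_style = {"Authorization": f"Bearer {API_KEY}"}

# Anthropic style: raw key in the x-api-key header
anthropic_style = {"x-api-key": API_KEY}

# Either dict can be passed as request headers; both carry the same key.
```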

Migration Guide

Migrating takes less than a minute. Our API is 100% compatible with both OpenAI and Anthropic endpoints.

From OpenAI

1

Get your Smart AIPI API key

Sign up and create an API key from your dashboard.

2

Change the base URL

Replace https://api.openai.com/v1 with https://api.smartaipi.com/v1

3

Update your API key

Use your Smart AIPI key instead of your OpenAI key. That's it!

From Anthropic

1

Get your Smart AIPI API key

Sign up and create an API key from your dashboard.

2

Change the base URL

Replace https://api.anthropic.com with https://api.smartaipi.com

3

Update your API key

Use your Smart AIPI key instead of your Anthropic key. Your existing Claude model names work as-is.

Your existing code, SDKs, and apps using the Anthropic Messages API will work without any code changes. Claude model names (claude-opus-4-6, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001) are automatically routed to high-performance GPT-5.4 models. Every response includes an x-actual-model header showing the real backend model for full transparency.

Chat Completions

Create a chat completion for the provided messages and model. This is the main endpoint for interacting with language models.

POST /v1/chat/completions

Request Body Parameters

Parameter Type Required Description
model string Yes Model ID to use (e.g., "gpt-5.4", "gpt-5.3-codex")
messages array Yes Array of message objects with role and content
temperature number No Sampling temperature (0-2). Higher = more random. Default: 1
max_tokens integer No Maximum tokens to generate in the response
top_p number No Nucleus sampling. Consider tokens with top_p probability. Default: 1
frequency_penalty number No Penalize tokens based on frequency (-2 to 2). Default: 0
presence_penalty number No Penalize tokens based on presence (-2 to 2). Default: 0
stop string/array No Stop sequences. Up to 4 sequences where generation stops.
stream boolean No Enable streaming responses via SSE. Default: false
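
Putting the parameters together, a minimal valid request body looks like this (sketched here as a Python dict serialized to JSON):

```python
import json

# Minimal request body for POST /v1/chat/completions
body = {
    "model": "gpt-5.4",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,   # optional, defaults to 1
    "max_tokens": 1000,   # optional
    "stream": False,      # optional, defaults to false
}
payload = json.dumps(body)
```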

Message Roles

  • system - Sets the behavior/persona of the assistant
  • user - Messages from the user
  • assistant - Previous responses from the assistant

Code Examples

from openai import OpenAI

client = OpenAI(
    base_url="https://api.smartaipi.com/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    max_tokens=1000
)

print(response.choices[0].message.content)

Streaming

Enable streaming to receive tokens as they're generated via Server-Sent Events (SSE). This provides a better user experience for long responses.

Set "stream": true in your request to enable streaming.

Streaming Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.smartaipi.com/v1",
    api_key="your-api-key"
)

# Enable streaming
stream = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)

# Process chunks as they arrive
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
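
If you consume the stream without an SDK, each SSE frame is a data: line ending with a [DONE] sentinel. A minimal parsing sketch (frame shape assumed from the OpenAI-style streaming protocol):

```python
import json

def collect_deltas(sse_lines):
    """Sketch: extract content deltas from raw OpenAI-style SSE frames.

    Assumes each frame is a 'data: <json>' line and the stream ends
    with the 'data: [DONE]' sentinel.
    """
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between frames
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)

# Example frames (shapes are illustrative, not captured output):
frames = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_deltas(frames))  # -> Hello
```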

Reasoning Effort

Control the depth of reasoning with the reasoning_effort parameter for GPT-5 models.

Smart AIPI defaults to reasoning.effort = "high" for all Responses API requests.

After extensive testing, we found that high is the best reasoning effort for practical usage - it provides deep, reliable tool-use and analysis without the latency cost of xhigh. You can override this by explicitly setting reasoning effort in your request.

Smart AIPI defaults to store = false for all Responses API and WebSocket requests.

The upstream API requires store: false for GPT-5.4 and newer models. If you explicitly set store: true in your requests, you will receive a "Store must be set to false" error. Remove it or set it to false.
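
If your calling code might set store, a small guard (a sketch, not part of any SDK) normalizes the payload before sending:

```python
def enforce_store_false(payload: dict) -> dict:
    """Return a copy of the request payload with store forced to False.

    The upstream API rejects store=true for GPT-5.4 and newer models,
    so we normalize it rather than forwarding the error to callers.
    """
    cleaned = dict(payload)
    cleaned["store"] = False
    return cleaned
```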

Supported Models

Responses API

gpt-5.4, gpt-5.3, gpt-5.1, gpt-5, and all Codex variants

Chat Completions only

gpt-5.4-nano — reasoning support is only available on /v1/chat/completions.

Reasoning Effort Levels

  • none - No reasoning. Skips thinking entirely.
  • low - Minimal reasoning. Great for simple tasks.
  • medium - Balanced speed and depth.
  • high - Deep analysis for complex problems. (Smart AIPI default)
  • xhigh - Extra high. Maximum reasoning depth for the hardest problems.

Fast Mode (Priority Processing)

Use service_tier: "priority" for priority processing with lower latency. This is what the /fast command in Codex CLI uses. It does not change reasoning depth - you get the same quality, faster.

Pricing: Priority processing is billed at 1.5x the standard rate. For example, GPT-5.4 output normally costs $3.75/1M tokens - with priority it costs $5.625/1M tokens.
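
The surcharge arithmetic, as a quick sketch (rates taken from the figures above):

```python
STANDARD_OUTPUT_RATE = 3.75  # USD per 1M output tokens for GPT-5.4
PRIORITY_MULTIPLIER = 1.5    # priority processing surcharge

def output_cost(tokens: int, priority: bool = False) -> float:
    # Cost in USD for the given number of output tokens
    rate = STANDARD_OUTPUT_RATE * (PRIORITY_MULTIPLIER if priority else 1.0)
    return tokens / 1_000_000 * rate

print(output_cost(1_000_000, priority=True))  # -> 5.625
```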

response = client.responses.create(
    model="gpt-5.4",
    input="Refactor this function",
    service_tier="priority"              # priority processing, lower latency
)

Chat Completions API

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Analyze this code for bugs..."}],
    reasoning_effort="high"  # or "xhigh" for maximum depth
)

Responses API

# Reasoning is set via the "reasoning" object
response = client.responses.create(
    model="gpt-5.4",
    input=[{"role": "user", "content": "Refactor this function..."}],
    reasoning={"effort": "high"}  # defaults to "high" on Smart AIPI
)

Image Generation

Generate images from text prompts using our image generation endpoint.

Current Limitations: Image variations (/v1/images/variations) are not yet supported. Image edits are available via /v1/images/edits.

POST /v1/images/generations

Request Body Parameters

Parameter Type Description
prompt string Text description of the image to generate (required)
model string "gpt-image-1.5" (frontier), "gpt-image-1", or "gpt-image-1-mini". Default: "gpt-image-1.5"
n integer Number of images to generate (1-10). Default: 1
size string "1024x1024", "1536x1024", or "1024x1536". Default: "1024x1024"
quality string "low", "medium", "high", or "auto". Default: "medium"

Available Models

  • gpt-image-1.5 - Frontier model, highest quality (default)
  • gpt-image-latest - Alias for gpt-image-1.5
  • gpt-image-1 - Full quality image generation
  • gpt-image-1-mini - Faster, smaller images

Example

response = client.images.generate(
    model="gpt-image-1.5",
    prompt="A futuristic city at sunset, cyberpunk style",
    size="1024x1024",
    n=1
)

# Response contains base64-encoded image
image_b64 = response.data[0].b64_json

# Save to file
import base64
with open("output.png", "wb") as f:
    f.write(base64.b64decode(image_b64))

Image Edits

Edit existing images using a text prompt. Accepts both JSON (base64 string) and multipart/form-data (file upload), so OpenAI SDKs work out of the box.

Note: Only single-image edits are supported. Multi-image inputs and mask fields are not currently available.

POST /v1/images/edits

Request Body Parameters

Parameter Type Description
prompt string Text description of the desired edit (required)
image string Base64-encoded image to edit (required)
model string "gpt-image-1.5" (default), "gpt-image-1", or "gpt-image-1-mini"
n integer Number of edited images to generate (1-10). Default: 1
size string "1024x1024", "1536x1024", or "1024x1536". Default: "1024x1024"
quality string "low", "medium", "high", or "auto". Default: "medium"
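
Since the endpoint also accepts plain JSON, the base64 variant of the body can be built by hand. A stdlib-only sketch:

```python
import base64
import json

def edits_json_body(image_bytes: bytes, prompt: str,
                    model: str = "gpt-image-1.5") -> str:
    """Build the JSON variant of a /v1/images/edits request body.

    The image travels as a base64 string rather than a multipart upload.
    """
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "n": 1,
    })
```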

Example

import base64

response = client.images.edit(
    model="gpt-image-1.5",
    image=open("input.png", "rb"),
    prompt="Change the background to a sunset beach",
    size="1024x1024",
    n=1
)

# Save the edited image
edited_b64 = response.data[0].b64_json
with open("edited.png", "wb") as f:
    f.write(base64.b64decode(edited_b64))

Realtime API with WebSocket

WebSockets are a broadly supported API for realtime data transfer, and a great choice for connecting to the Smart AIPI Realtime API in server-to-server applications.

In a server-to-server integration, your backend system connects via WebSocket directly to the Realtime API. Use a standard API key to authenticate the connection, since the token is only available on your secure backend server.

WSS wss://api.smartaipi.com/v1/responses

Connect via WebSocket

Below are several examples of connecting via WebSocket. In addition to using the WebSocket URL, you will need to pass an authentication header using your API key.

import WebSocket from "ws";

const url = "wss://api.smartaipi.com/v1/responses";
const ws = new WebSocket(url, {
  headers: {
    Authorization: "Bearer " + process.env.SMARTAIPI_API_KEY,
  },
});

ws.on("open", function open() {
  console.log("Connected to server.");

  // Send a response.create event
  ws.send(JSON.stringify({
    type: "response.create",
    response: {
      model: "gpt-5.4",
      store: false,
      input: [{ role: "user", content: "Hello!" }],
      stream: true,
    },
  }));
});

ws.on("message", function incoming(message) {
  const event = JSON.parse(message.toString());
  console.log(event.type, event);
});

Sending and Receiving Events

Sessions are managed using client-sent and server-sent JSON events over the WebSocket connection. Wrap your request in a response.create envelope:

Event Direction Description
response.create Client Send a request (wraps your model, input, and parameters)
response.created Server Session accepted, processing started
response.output_text.delta Server Streaming text chunk
response.completed Server Terminal event with full response and usage
response.failed Server Terminal event indicating an error

Important: WebSocket requests must include "store": false in the response envelope. The connection persists across multiple turns - send additional response.create frames without reconnecting.
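
The envelope itself is plain JSON, so it can be built from any language. A Python sketch, with the field shape taken from the table and example above:

```python
import json

def response_create_envelope(model: str, user_text: str) -> str:
    """Build a response.create frame for the Realtime WebSocket.

    store is forced to False (required by the API), and stream enables
    incremental response.output_text.delta events.
    """
    return json.dumps({
        "type": "response.create",
        "response": {
            "model": model,
            "store": False,
            "input": [{"role": "user", "content": user_text}],
            "stream": True,
        },
    })
```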

List Models

Retrieve a list of all available models. Use this endpoint to dynamically discover which models are available for your account.

GET /v1/models

Response Format

{
    "object": "list",
    "data": [
        {
            "id": "gpt-5.4",
            "object": "model",
            "created": 1700000000,
            "owned_by": "smart-aipi"
        },
        {
            "id": "gpt-5.3-codex",
            "object": "model",
            "created": 1700000000,
            "owned_by": "smart-aipi"
        },
        // ... more models
    ]
}

Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.smartaipi.com/v1",
    api_key="your-api-key"
)

# List all available models
models = client.models.list()

for model in models.data:
    print(model.id)

Available Models

GPT-5 Series

  • gpt-5.4-pro
  • gpt-5.4
  • gpt-5.4-mini
  • gpt-5.4-nano completions only

Codex Series

  • gpt-5.3-codex
  • gpt-5.2-codex
  • gpt-5.2
  • gpt-5.1

Anthropic Messages API

Full Anthropic Messages API compatibility. Use any Anthropic SDK, Claude Code, or app that speaks the Anthropic protocol - just change the base URL.

POST /v1/messages

Supported Features

  • Streaming (full Anthropic SSE event protocol)
  • Tool use / function calling
  • System messages (string and array formats)
  • Image inputs (base64 and URL)
  • Token counting (/v1/messages/count_tokens)
  • Extended thinking / reasoning effort

Code Examples

from anthropic import Anthropic

client = Anthropic(
    base_url="https://api.smartaipi.com",
    api_key="your-api-key"
)

response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.content[0].text)

Model Mapping

Claude model names are accepted and automatically routed to GPT-5.4 with tiered reasoning effort. Every response includes an x-actual-model header showing the real backend model.

Claude Model Backend Reasoning Best For
claude-opus-4-6 gpt-5.4 High Complex reasoning, architecture
claude-sonnet-4-5-20250929 gpt-5.4 Medium Daily coding, balanced quality
claude-haiku-4-5-20251001 gpt-5.4 Low Fast responses, simple tasks
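
The table above can be expressed as a simple lookup, e.g. for logging which backend and effort tier a Claude model name will hit (a sketch; the actual routing happens server-side):

```python
# Routing table from the docs: Claude name -> (backend model, reasoning effort)
CLAUDE_ROUTING = {
    "claude-opus-4-6": ("gpt-5.4", "high"),
    "claude-sonnet-4-5-20250929": ("gpt-5.4", "medium"),
    "claude-haiku-4-5-20251001": ("gpt-5.4", "low"),
}

def resolve(model: str) -> tuple:
    # Names not in the table are returned as-is (sketch behavior only,
    # not a statement about how the service handles unknown names)
    return CLAUDE_ROUTING.get(model, (model, None))

print(resolve("claude-opus-4-6"))  # -> ('gpt-5.4', 'high')
```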

Claude Code

Use Claude Code with Smart AIPI as the backend. Full tool use, streaming, and agentic capabilities supported.

Automatic Setup

One command
npx smart-aipi claude

This configures ~/.claude/settings.json and ~/.claude.json automatically. If you already have Claude Code set up with an Anthropic account, it will show you the manual config instead of overwriting.

Manual Setup

Add to ~/.claude/settings.json:

~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.smartaipi.com",
    "ANTHROPIC_API_KEY": "sk-your-key",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-5-20251001"
  },
  "model": "opus"
}

Then add "hasCompletedOnboarding": true to ~/.claude.json to skip the setup wizard.

After editing settings, restart Claude Code for changes to take effect. If you already have Claude Code connected to a real Anthropic account, setting these env vars will override that connection - use a project-level .claude/settings.json to keep both.

What Each Setting Does

  • "model": "opus" — Main model. Uses claude-opus-4-6 (high reasoning) for all primary tasks.
  • ANTHROPIC_DEFAULT_HAIKU_MODEL — Background model. Uses claude-haiku-4-5 (low reasoning) for fast background tasks like file indexing.
  • Switch between models mid-session with /model sonnet, /model opus, or /model haiku.

Alias Claude Model Backend Reasoning
opus claude-opus-4-6 gpt-5.4 High
sonnet claude-sonnet-4-5-20250929 gpt-5.4 Medium
haiku claude-haiku-4-5-20251001 gpt-5.4 Low

OpenCode

Use Smart AIPI as your OpenCode backend via the OpenAI-compatible SDK.

Config Setup

~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "smart-aipi": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Smart AIPI",
      "options": {
        "baseURL": "https://api.smartaipi.com/v1",
        "apiKey": "YOUR_API_KEY"
      },
      "models": {
        "gpt-5.4": {
          "name": "GPT-5.4",
          "reasoning": true,
          "limit": { "context": 400000, "output": 128000 }
        },
        "gpt-5.3-codex": {
          "name": "GPT-5.3 Codex",
          "reasoning": true,
          "limit": { "context": 400000, "output": 128000 }
        },
        "gpt-5.2-codex": {
          "name": "GPT-5.2 Codex",
          "reasoning": true,
          "limit": { "context": 400000, "output": 128000 }
        },
        "gpt-5-codex": {
          "name": "GPT-5 Codex",
          "reasoning": true,
          "limit": { "context": 400000, "output": 128000 }
        }
      }
    }
  },
  "model": "smart-aipi/gpt-5.4"
}

Codex CLI

Use Smart AIPI with OpenAI's Codex CLI tool. Three files need to be configured.

Automatic Setup

One command
npx smart-aipi codex

This configures all three files below automatically. After running, add model_reasoning_effort to your config (see step 2).

Manual Setup

Configure these three files:

1. API Key — ~/.codex/auth.json

~/.codex/auth.json
{
  "auth_mode": "apikey",
  "OPENAI_API_KEY": "sk-your-key"
}

2. Model & Reasoning — ~/.codex/config.toml

Always include model_reasoning_effort - required for custom models.

Without it, Codex defaults to no reasoning and will not work properly. Use "high" for best results.

~/.codex/config.toml
model = "gpt-5.4"
model_reasoning_effort = "high"

Valid reasoning levels: low, medium, high (recommended), xhigh.

3. Environment Variables — ~/.zshrc

~/.zshrc (or ~/.bashrc)
# Smart AIPI
export OPENAI_API_KEY="sk-your-key"
export OPENAI_BASE_URL="https://api.smartaipi.com/v1"

After editing your shell profile, restart your terminal or run source ~/.zshrc for changes to take effect.

Then use Codex normally:

terminal
codex "fix this bug"

Codex WebSocket Speed Guide

WebSocket mode is usually faster for agentic coding flows with lots of tool calls. Instead of repeatedly reconnecting over HTTP and re-sending full request envelopes, Codex keeps one live connection and sends incremental turns, which reduces continuation overhead.

Observed improvement: In our tool-heavy coding runs, a rebuilt open-source Codex with WebSocket fixes reduced end-to-end run time by roughly 30-40% compared to HTTP continuation mode.

Enable WebSockets in Codex

Enable the WebSocket v2 feature flag in ~/.codex/config.toml:

~/.codex/config.toml
model = "gpt-5.4"
model_reasoning_effort = "high"

[features]
responses_websockets_v2 = true

CLI equivalent:

terminal
codex --enable responses_websockets_v2

Older builds may use the legacy flag:

legacy fallback
[features]
responses_websockets = true

Known Homebrew Behavior

On some Homebrew-distributed builds, WebSocket turns can stall during long-running tasks and then silently fall back to HTTP. We saw this especially in complex tool-call loops. Rebuilding from the open-source code with the fixes below improved both stability and speed.

Post-Merge: Fix the Open-Source WebSocket Bug

After merging your branch, use this checklist to ensure the fix is in your local binary:

terminal
# 1) Pull merged main
git checkout main
git pull --ff-only

# 2) Confirm websocket flags exist
codex features list | rg responses_websockets

# 3) Build and install
cd codex-rs
cargo build --release
install -m 0755 target/release/codex ~/.local/bin/codex-beta

# 4) Run with websocket enabled
CODEX_RS_RESPONSES_WS=true \
OPENAI_BASE_URL=https://api.smartaipi.com/v1 \
~/.local/bin/codex-beta --enable responses_websockets_v2

If you need to patch manually, verify these code-level fixes are present:

websocket client fixes
// A) Build the websocket request with IntoClientRequest so required headers are set
let mut request = url.as_str().into_client_request()?;
request.headers_mut().extend(headers);

# B) Ensure TLS roots are enabled for tokio-tungstenite in Cargo.toml
tokio-tungstenite = { version = "...", features = ["rustls-tls-native-roots"] }

Cursor & Cline

Use Smart AIPI with Cursor IDE or Cline VS Code extension.

Cursor

  1. Open Cursor Settings
  2. Go to Models tab
  3. Click + Add Model
  4. Set Base URL: https://api.smartaipi.com/v1
  5. Enter your API key
  6. Model: gpt-5.4

Cline (VS Code)

  1. Open Cline settings in VS Code
  2. Select OpenAI Compatible
  3. Base URL: https://api.smartaipi.com/v1
  4. Enter your API key
  5. Model: gpt-5.4

Chat

Use Smart AIPI Chat at chat.smartaipi.com for text, image, video, code, search, and voice workflows.

Features

  • Chat — General-purpose text conversations with Smart AIPI models.
  • Images — Generate and edit images from prompts.
  • Video — Generate videos from text or image inputs.
  • Code — Code assistance and editing in the browser.
  • Search — Use web search for grounded answers.
  • Voice — Talk to the model with voice input and responses.

Getting Started

  1. Open chat.smartaipi.com
  2. Sign in or create an account.
  3. Start chatting with text, images, video, code, search, or voice.

Billing: Usage is billed against your Smart AIPI credits.

API Tester

Test the API directly from your browser:

Contact Support

Have a question or need help? Send us a message and we'll get back to you within 2 business days.