What is GPT-5.4 Mini?

GPT-5.4 Mini is OpenAI's latest efficient model. It scores 60% on Terminal-Bench 2.0 for coding tasks while costing only $0.75 per million input tokens — making it the best price-to-performance small model available.

What is GPT-5.4 Nano?

GPT-5.4 Nano is OpenAI's smallest and cheapest frontier model at $0.20 per million input tokens. It's designed for high-throughput, low-cost tasks like classification, summarization, routing, and lightweight agentic workflows.

How much does GPT-5.4 Mini cost on Smart AIPI?

On Smart AIPI, GPT-5.4 Mini costs $0.1875/1M input, $0.01875/1M cached input, and $1.125/1M output — 75% less than OpenAI direct pricing.

How much does GPT-5.4 Nano cost on Smart AIPI?

On Smart AIPI, GPT-5.4 Nano costs $0.05/1M input, $0.005/1M cached input, and $0.3125/1M output — 75% less than OpenAI direct pricing.

Can GPT-5.4 Mini do coding?

Yes. GPT-5.4 Mini scores 60% on Terminal-Bench 2.0, outperforming Gemini 3 Flash (47.7%) by a wide margin. It's strong enough for coding subagents, code review, and parallelized development tasks.

Should I use GPT-5.4 Mini or Nano for agents?

Use Mini for tasks that require reasoning, coding, or tool use — like parallelized subagents in Codex. Use Nano for lightweight tasks like routing, classification, summarization, and simple agentic loops that don't involve code generation.

Can GPT-5.4 Mini handle computer use tasks?

Yes. GPT-5.4 Mini delivers decent accuracy on computer use and browser automation tasks at much faster speeds than full-size models, making it ideal for UI testing and web scraping agents.

GPT-5.4 Mini and Nano: The New Kings of Small Models

Does GPT-5.4 Nano support the Responses API?

Not yet. GPT-5.4 Nano currently only works with the Chat Completions endpoint (/v1/chat/completions). The Responses API is not yet supported for this model.

TL;DR: GPT-5.4 Mini and Nano are live on Smart AIPI. Mini scores 60% on Terminal-Bench 2.0 at a list price of $0.75/1M input. Nano lists at just $0.20/1M input. Both are available at 75% off on Smart AIPI. Use gpt-5.4-mini or gpt-5.4-nano as the model ID in your API calls.

OpenAI just dropped two new models in the 5.4 family, and they change the economics of AI development. GPT-5.4 Mini brings near-frontier coding ability at a fraction of the price. GPT-5.4 Nano is the cheapest frontier model ever released.

Both are live on Smart AIPI right now.

The Numbers

Here's how the new models compare on price and performance:

Figure: Price vs. coding ability (Terminal-Bench 2.0 score). GPT-5.4 Mini at 60.0%, GPT-5.4 Nano at 46.2%, Gemini 3 Flash at 47.7%.

Model	Terminal-Bench 2.0	List Input /1M	List Output /1M	Smart AIPI Output /1M
GPT-5.4 Mini	60.0%	$0.75	$4.50	$1.125
GPT-5.4 Nano	46.2%	$0.20	$1.25	$0.3125
Gemini 3 Flash	47.7%	$0.50	$3.00	N/A
GPT-5.4 (full)	72.1%	$2.50	$15.00	$3.75

GPT-5.4 Mini beats Gemini 3 Flash on coding by 12+ points (60.0% vs 47.7%) while costing only 50% more on input at list price. On Smart AIPI the price gap reverses: you pay $0.1875/1M input, less than half of Gemini's $0.50.
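The comparison above is plain arithmetic, so a short sketch makes it easy to re-run if prices change. The figures are copied from the table. The dictionary keys are only labels for this example (`gemini-3-flash` in particular is not a confirmed API model ID), and the sketch assumes the 75% discount applies uniformly to every list price.

```python
# Scores (Terminal-Bench 2.0, %) and list prices (USD per 1M tokens) from the table above.
# Keys are illustrative labels for this sketch, not confirmed API model IDs.
MODELS = {
    "gpt-5.4-mini": {"score": 60.0, "input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"score": 46.2, "input": 0.20, "output": 1.25},
    "gemini-3-flash": {"score": 47.7, "input": 0.50, "output": 3.00},
    "gpt-5.4": {"score": 72.1, "input": 2.50, "output": 15.00},
}

SMART_AIPI_DISCOUNT = 0.75  # assumed flat 75% off list price


def smart_aipi_price(list_price: float) -> float:
    """Apply the flat discount to a list price."""
    return list_price * (1 - SMART_AIPI_DISCOUNT)


def percent_difference(price: float, baseline: float) -> float:
    """How much more (positive) or less (negative) `price` is than `baseline`, in percent."""
    return (price / baseline - 1) * 100


mini = MODELS["gpt-5.4-mini"]
flash = MODELS["gemini-3-flash"]

score_gap = mini["score"] - flash["score"]
list_gap = percent_difference(mini["input"], flash["input"])
discounted_gap = percent_difference(smart_aipi_price(mini["input"]), flash["input"])

print(f"Score gap: {score_gap:.1f} points")
print(f"Mini input vs Flash at list price: {list_gap:+.1f}%")
print(f"Mini input on Smart AIPI vs Flash list price: {discounted_gap:+.1f}%")
```

A positive percentage means Mini is more expensive than Gemini 3 Flash on input; a negative one means it is cheaper.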

When to Use Which Model

The 5.4 family now gives you three tiers to match your workload: full GPT-5.4, Mini, and Nano. Here is where the two new models fit:

GPT-5.4 Mini: The Subagent Workhorse

This is the model you want for parallelized coding tasks. If you're running Codex, Claude Code, or any agentic coding tool that spawns subagents, Mini is the sweet spot.

The subagent cost play:

Run your main agent on gpt-5.4 with high reasoning effort for architecture decisions
Spawn subagents on gpt-5.4-mini with high reasoning effort for implementation tasks
Each subagent costs ~70% less than running on the full model, while still scoring 60% on Terminal-Bench
10 parallel subagents on Mini cost about the same as 3 on GPT-5.4 at equal token usage (10 × $0.75 = 3 × $2.50 per 1M input tokens), and finish faster
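To sanity-check the subagent math for your own workload, a rough cost model like the one below may help. It uses the list prices from this article; the per-subagent token counts are a made-up workload, and the sketch assumes a Mini subagent consumes the same number of tokens as a GPT-5.4 subagent on the same task, which may not hold in practice.

```python
# List prices in USD per 1M tokens, from the pricing table in this article.
PRICES = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
}


def fleet_cost(model: str, agents: int, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost for `agents` subagents that each use the given token counts."""
    price = PRICES[model]
    per_agent = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
    return agents * per_agent


# Hypothetical workload: each subagent reads 200k tokens and writes 20k tokens.
INPUT_TOKENS, OUTPUT_TOKENS = 200_000, 20_000

mini_fleet = fleet_cost("gpt-5.4-mini", 10, INPUT_TOKENS, OUTPUT_TOKENS)
full_fleet = fleet_cost("gpt-5.4", 3, INPUT_TOKENS, OUTPUT_TOKENS)
per_agent_savings = 1 - (
    fleet_cost("gpt-5.4-mini", 1, INPUT_TOKENS, OUTPUT_TOKENS)
    / fleet_cost("gpt-5.4", 1, INPUT_TOKENS, OUTPUT_TOKENS)
)

print(f"10 Mini subagents: ${mini_fleet:.2f}")
print(f"3 GPT-5.4 subagents: ${full_fleet:.2f}")
print(f"Per-subagent savings on Mini: {per_agent_savings:.0%}")
```

Because Mini is priced at 30% of GPT-5.4 on both input and output, the per-subagent savings do not depend on the input/output mix you plug in.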

Mini also excels at computer use and browser automation. It delivers solid accuracy on UI interaction tasks at much faster speeds than GPT-5.4, making it the right choice for web scraping agents, automated testing, and any workflow where you need the model to navigate screens and click buttons without burning through your budget.

Codex CLI tip: Set model = "gpt-5.4-mini" in ~/.codex/config.toml with model_reasoning_effort = "high" for a fast, capable coding assistant that won't drain your credits.

GPT-5.4 Nano: The Everyday Workhorse

At $0.20/1M input (or $0.05 on Smart AIPI), Nano is practically free. It's not a coding model — but it doesn't need to be. Use it for everything else:

Classification and routing — decide which model or tool to invoke
Summarization — condense documents, conversations, search results
Data extraction — pull structured fields from unstructured text
Basic agentic tasks — simple multi-step workflows that don't involve code generation
Conversation titles — we use it ourselves for generating chat titles
Content moderation — fast, cheap content filtering at scale

Nano handles tasks that previously required GPT-4.1 Mini or GPT-5 Mini but at a fraction of the cost. For high-throughput pipelines processing millions of requests, the savings are massive.

Note: GPT-5.4 Nano currently only supports the Chat Completions endpoint (/v1/chat/completions). The Responses API is not yet available for this model. If you're using tools like Codex CLI that require the Responses endpoint, use Mini instead.

The Smart Model Stack

Here's the pattern we recommend for production AI applications:

Task	Model	Why
Architecture, complex reasoning	gpt-5.4	Best-in-class quality for critical decisions
Coding subagents, code review, computer use	gpt-5.4-mini	60% Terminal-Bench at 70% less cost than 5.4
Routing, classification, summaries, extraction	gpt-5.4-nano	Near-free at $0.05/1M input on Smart AIPI
Image generation and editing	gpt-image-1.5	Frontier image quality
Video generation	sora-2	Up to 20s video from text or image
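In code, the table above can be as simple as a lookup from task type to model ID. The task names below are hypothetical keys chosen for this sketch; the model IDs are the ones listed in the table.

```python
# Task-to-model mapping mirroring the table above. Task names are illustrative.
MODEL_STACK = {
    "architecture": "gpt-5.4",
    "complex_reasoning": "gpt-5.4",
    "coding_subagent": "gpt-5.4-mini",
    "code_review": "gpt-5.4-mini",
    "computer_use": "gpt-5.4-mini",
    "routing": "gpt-5.4-nano",
    "classification": "gpt-5.4-nano",
    "summarization": "gpt-5.4-nano",
    "extraction": "gpt-5.4-nano",
    "image_generation": "gpt-image-1.5",
    "video_generation": "sora-2",
}


def pick_model(task: str) -> str:
    """Return the model ID for a task, failing loudly on unknown task names."""
    try:
        return MODEL_STACK[task]
    except KeyError:
        raise ValueError(f"No model configured for task {task!r}") from None


print(pick_model("code_review"))
print(pick_model("classification"))
```

Raising on unknown tasks, rather than silently defaulting to one tier, is a design choice of this sketch: it avoids accidentally sending cheap work to the most expensive model or hard work to the cheapest one.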

Quick Start

Both models work with any OpenAI SDK. Point the client at the Smart AIPI base URL and set the model ID:

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.smartaipi.com/v1",
    api_key="your-api-key"
)

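# Fast coding with Mini
coding_response = client.chat.completions.create(
    model="gpt-5.4-mini",
    messages=[{"role": "user", "content": "Refactor this function to use async/await"}]
)
print(coding_response.choices[0].message.content)

# Cheap classification with Nano (Chat Completions is the only endpoint Nano supports)
classification_response = client.chat.completions.create(
    model="gpt-5.4-nano",
    messages=[{"role": "user", "content": "Classify this support ticket: ..."}]
)
print(classification_response.choices[0].message.content)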

Pricing on Smart AIPI

As always, 75% off OpenAI direct pricing:

Model	Input /1M	Cached Input /1M	Output /1M
GPT-5.4 Mini	$0.1875	$0.01875	$1.125
GPT-5.4 Nano	$0.05	$0.005	$0.3125

With prompt caching (automatic, no setup needed), long-running agents with consistent system prompts see 30-50% cache hit rates — making Mini even cheaper for agentic workloads.
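To see what those hit rates mean in dollars, the blended input price can be estimated as a weighted average of the cached and uncached rates. This sketch assumes the hit rate is measured as the fraction of input tokens served from cache, which is an assumption about how the figure above is defined; `effective_input_price` is a hypothetical helper name.

```python
def effective_input_price(input_price: float, cached_price: float, cache_hit_rate: float) -> float:
    """Blended USD price per 1M input tokens for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0.0 and 1.0")
    return cache_hit_rate * cached_price + (1 - cache_hit_rate) * input_price


# Smart AIPI prices for GPT-5.4 Mini from the table above (USD per 1M tokens).
MINI_INPUT, MINI_CACHED = 0.1875, 0.01875

for rate in (0.0, 0.3, 0.5):
    price = effective_input_price(MINI_INPUT, MINI_CACHED, rate)
    print(f"{rate:.0%} cache hits -> ${price:.4f} per 1M input tokens")
```

Since cached input is priced at one tenth of regular input, each percentage point of cache hits trims the blended input price by 0.9 points: roughly 27% lower at a 30% hit rate and 45% lower at 50%. Output tokens are unaffected.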

Get Started

Both models are available right now. No waitlist, no special access needed.

Sign up for a free Smart AIPI account (includes $5 free credits)
Set your base URL to https://api.smartaipi.com/v1
Use model gpt-5.4-mini or gpt-5.4-nano
Or try them instantly at chat.smartaipi.com

The era of expensive AI is over. Build more, spend less.