Pi-Mono on Smart AIPI: The Cheapest Way to Run a Lean Coding Agent — With Grill-Me, Superpowers, and Caveman

Pi-Mono is Mario Zechner's minimalist coding harness — four tools (read, write, edit, bash), full YOLO mode, MIT licensed, and provider-agnostic. Point it at Smart AIPI for pay-as-you-go GPT-5.5 access, pair with grill-me for task creation, superpowers for full methodology, and the caveman skill to crush output token costs by ~75%. The lean agent + lean API combination shipping today.

S
Smart AIPI Team
8 min read ·
Pi-Mono on Smart AIPI: The Cheapest Way to Run a Lean Coding Agent — With Grill-Me, Superpowers, and Caveman

TL;DR: Pi-Mono is Mario Zechner's minimalist coding agent — four tools (read, write, edit, bash), MIT licensed, deliberately tiny core. It speaks OpenAI-compatible, so point it at https://api.smartaipi.com/v1, run GPT-5.5 on low reasoning, and pair it with three skills that fix the gaps Pi intentionally leaves: grill-me for task creation, superpowers for engineering methodology, and the caveman skill to cut output tokens by roughly 75%. Stacked on Smart AIPI's pay-as-you-go pricing and 95–98% cache hit rate, the per-turn bill drops by a compounding margin you can verify against your own usage page.

The lean-agent movement has a flagship. Pi-Mono is the monorepo behind pi, a terminal coding harness from Mario Zechner (creator of libGDX) that strips a coding agent back to the essentials and pushes everything else into extensions. As Zechner put it in his Mastra talk "I Hated Every Coding Agent, So I Built My Own": established harnesses are spaceships with 80% of functionality most developers never touch, and their system prompts mutate every release — breaking the personal workflows you carefully built on top.

Pi's response is radical minimalism: four tools, full YOLO mode, no batteries included. Hackability over completeness. On Smart AIPI's pay-as-you-go OpenAI-compatible endpoint, that minimalism translates directly into the cheapest serious agentic coding setup shipping today.

What Pi-Mono Actually Is

Pi-mono is the monorepo (github.com/badlogic/pi-mono, mirrored at earendil-works/pi); the CLI binary is just pi. The repo decomposes into a handful of focused packages:

  • pi-agent-core — the runtime: tool calling, state, the actual agent loop.
  • pi-ai — a unified provider library that talks Anthropic, OpenAI-compatible, Bedrock, Vertex, Mistral, Groq, Cerebras, xAI, DeepSeek, OpenRouter, Hugging Face, Ollama, and roughly a dozen others. Crucially, it can rewrite context to switch providers mid-session without losing state.
  • pi-tui — a differential-rendering terminal UI.
  • pi-coding-agent — the user-facing CLI that wires the pieces together.

The default tool surface is four tools, period:

Tool What it does Why this is enough
readRead a file from diskFrontier models are already trained on this exact schema
writeWrite or create a fileMechanical I/O — no reasoning required
editApply an in-place patch to a fileSearch/replace + diffs are the universal edit primitive
bashRun any shell commandEverything else (grep, tests, git, package managers) is a shell command

Anything more — subagents, plan mode, permission gates, MCP servers, sandboxing — lives in pi-skills or third-party extensions, opt-in. Pi runs in full YOLO mode by default: no permission prompts, unrestricted filesystem and shell. That's a deliberate choice. If you want safety rails, you install them.

Why Pi-Mono and Smart AIPI Are a Perfect Pair

Pi's pi-ai library treats every provider as an OpenAI-shaped or Anthropic-shaped endpoint behind a uniform interface. Smart AIPI's gateway exposes the entire OpenAI surface — chat, responses, embeddings, images, audio, anthropic-shaped messages — through a single base URL with a single API key. The two designs slot together cleanly:

  • One key, every model. The same Smart AIPI API key gets you GPT-5.5, GPT-5.4, GPT-5.4-mini, GPT-5.3-codex, GPT-Image-2, and Anthropic-compatible Claude routing — and Pi can switch between them mid-session.
  • Pay-as-you-go pricing. No subscription, no quota, no throttle. You pay for the tokens you actually compute. Pi's tiny system prompt means every turn pays less for standing context than a feature-heavy harness.
  • End-to-end prompt cache. Smart AIPI preserves OpenAI's prompt-cache key verbatim through the gateway, so your repo + agents.md + history stay in cache across long sessions. Real-world cache hit rates are 95–98%, dropping cached input to $0.125 per 1M tokens — 10× cheaper than uncached.
  • WebSocket session preservation. Long pi-mono sessions stay on one warm cache instead of paying full input price each turn.

The Three Skills That Complete Pi-Mono

Pi deliberately ships nothing for task planning, no opinionated engineering methodology, and no output-compression layer. That's not a bug — it's the whole point. The community has filled all three gaps with composable skills that snap into Pi via the /skill:name invocation pattern. The three worth installing on day one:

1. Grill-Me — High-Reasoning Task Creation

Where: github.com/mattpocock/skills, by Matt Pocock.

Grill-me does exactly what the name suggests: the model interrogates you about your task until every constraint is unambiguous. Files in scope. Success criteria. Edge cases. What "done" means. Typically three to five short turns. You answer; the skill returns a tight, executable spec.

The economic move is to run grill-me on high reasoning for the spec phase only — typically three to five short turns — then drop to low reasoning for the execution phase (30 to 100 turns of cheap, mechanical work). High reasoning pays off where it's load-bearing; low reasoning carries the rest. Pi-mono's four-tool loop is uniquely well-suited to this pattern because there's nothing in the way: once the spec is locked in, the model has a clear path through read/write/edit/bash to ship it.

2. Superpowers — Engineering Methodology as Skills

Where: github.com/obra/superpowers, by Jesse Vincent.

Superpowers is "a complete software development methodology for your coding agents, built on top of a set of composable skills." TDD scaffolding, code-review playbooks, debugging procedures, refactoring patterns, branch-management discipline. It's the explicit engineering process that Pi deliberately doesn't bake into core.

It's also harness-agnostic — the same skill pack works in Claude Code, Cursor, Copilot CLI, Gemini, Codex, and Pi. Install once, take it with you across harnesses. Pi's plug-and-play skill loader means you can layer the entire methodology onto the minimal core without touching pi's source.

3. Caveman — The Output-Token Killer

Where: Matt Pocock's skills repo (productivity section).

Caveman switches the agent into ultra-compressed communication mode. Drop articles. Drop filler. Reply in telegraphic bullets. No formal English, no apologies, no preamble. Reported real-world token reduction on prose-heavy turns is roughly 75%.

Why this dominates the bill: on GPT-5.5, output tokens cost $7.50 per 1M versus $0.125 per 1M for cached input — a 60× ratio. Compressing output is the single highest-leverage thing you can do to a coding-agent invoice. Grill-me + low reasoning already reduced output. Caveman compresses what's left. Pair this with Smart AIPI's cache and the marginal cost of an additional turn approaches the floor of the model's compute envelope.

The Compounding Cost Story

Pricing on Smart AIPI for GPT-5.5:

GPT-5.5 (per 1M tokens) Smart AIPI Pi-Mono effect
Input (uncached)$1.25Tiny system prompt means lower first-turn cost
Input (cached prefix)$0.12595–98% hit rate across long sessions
Output$7.50Caveman cuts ~75%; low reasoning compounds

The shape of a pi-mono turn on GPT-5.5 low reasoning, with grill-me spec already done and caveman engaged:

  • Cached input prefix (repo + agents.md + history) — typically the largest token bucket per turn — runs at the cached rate, not the uncached rate, every time the cache hits.
  • New uncached input (tool results, the latest user message) — usually a tiny fraction of the input, charged at the full rate.
  • Output — compressed by caveman, minimal reasoning tokens because effort is low. This is the line item most worth shrinking because it's billed at the highest per-token rate.

Pay-as-you-go means you only pay for what you compute, and these three levers compound. Your own dashboard will show you the actual per-turn numbers for your repo and prompt shape — no cap, no throttle, no monthly minimum.

Mid-Session Provider Switching: Pi's Sleeper Feature

Pi-ai keeps a portable context representation and can rewrite it to fit whichever provider you're currently pointed at. Practical translation: when GPT-5.5 hits a hard ambiguity, you can /model claude-sonnet-4.6 mid-task for a second opinion without losing the conversation. Both endpoints sit behind your single Smart AIPI key — Anthropic-compatible Claude responses come through /v1/messages, OpenAI-compatible flows through /v1/responses (or /v1/chat/completions), and Pi handles the translation transparently.

For long, high-stakes refactors this pattern is genuinely useful: grill-me on GPT-5.5 high for the spec, GPT-5.5 low for execution, Claude Sonnet 4.6 for a single architectural review pass, back to GPT-5.5 low to finish. One harness, one API key, one bill.

Setup in 60 Seconds

# 1. install pi
curl -fsSL https://pi.dev/install.sh | sh
# or:  npm install -g @earendil-works/pi-coding-agent

# 2. point at smart aipi
export OPENAI_BASE_URL=https://api.smartaipi.com/v1
export OPENAI_API_KEY=sk-your-smartaipi-key

# 3. run pi in your repo
cd ~/code/your-project
pi

# 4. inside pi, install skills
/skill:install matt-pocock/skills/grill-me
/skill:install obra/superpowers
/skill:install matt-pocock/skills/caveman

# 5. spec your task
/skill grill-me

# 6. drop to low reasoning and execute
/model gpt-5.5 --reasoning-effort low
/skill caveman

Exact skill-install syntax depends on which skill loader you're running — check the README of each skill repo for current invocation. The shape is the same: drop the skill folder into your config and invoke it.

Why Lean Agent + Lean API Compounds

Feature-rich harnesses load a big system prompt on every turn, run extra deliberation passes, and emit verbose explanations the user didn't ask for. Each of those is a tax. Pi-mono cuts the harness side of the tax (small system prompt, four-tool loop, no over-engineered planning layer); caveman + low reasoning cut the model side (compressed output, minimal reasoning tokens); Smart AIPI cuts the provider side (10× cached input, no subscription floor). None of these alone is a 10× win, but multiplied together they meaningfully change what an extended agentic project costs to run.

Pi's worldview — "adapt the agent to your workflow, not the other way around" — pairs with Smart AIPI's worldview — "pay only for what you actually compute." If you've been hitting subscription caps or paying for harness features you never touch, the combination is worth a single afternoon of trial. Sign up, generate a key, set OPENAI_BASE_URL, install pi, install the three skills, ship something.

Get Started — $5 Free On Sign-Up

Every new account gets $5 in free API credits — no credit card required.

$5 is enough headroom to install pi, hook up the three skills, grill-me a real task end-to-end, and run an extended pi-mono session before deciding whether to top up.

  1. Sign up at smartaipi.com/signup (free $5, no credit card)
  2. Create an API key in the dashboard
  3. Install pi: curl -fsSL https://pi.dev/install.sh | sh
  4. Set OPENAI_BASE_URL=https://api.smartaipi.com/v1 and OPENAI_API_KEY to your new key
  5. Install grill-me + caveman and superpowers
  6. Run pi, grill-me your next task on high reasoning, drop to low + caveman, ship it

Same minimalist harness you'd run on a subscription — now on a pricing model that scales with what you actually compute, with the cache hit rate that makes long agentic sessions almost free.

Pi-Mono Coding Agent Lean Agent GPT-5.5 Skills Grill-Me Superpowers Caveman Prompt Caching OpenAI Compatible
S
Written by
Smart AIPI

OpenAI-compatible API gateway. Access frontier AI models at 75% less cost.

Start for free

Message sent

We'll get back to you within 2 business days.

Contact Support

Have a question or need help? Send us a message and we'll get back to you within 2 business days.