
AI Spending Cap Simulator: Test 3-Level Budget Controls

FairStack Team February 13, 2026

An AI agent with API access and no spending limit is a billing event waiting to happen. One misconfigured loop, one runaway prompt chain, and your balance drops by hundreds of dollars before the error surfaces. Most AI platforms offer no protection against this. FairStack’s 3-level spending cap system lets you set hard budget limits at the organization, project, and API key level — and this simulator lets you test those limits before you deploy.

Simulator

[Interactive component: Set org cap, project cap, API key cap. Run scenarios that trigger cap enforcement. See exactly which cap fires and when.]

The simulator models FairStack’s 3-level spending cap system with four pre-built scenarios and a custom mode. Set your caps, run the scenario, and watch how budget enforcement stops overspending at the exact threshold you define.

Why Spending Caps Matter for AI Agents

AI agents are autonomous. That is the point. But autonomy without budget constraints is a design flaw, not a feature.

The agent billing problem has three parts:

1. Agents generate in loops. A content pipeline that generates 10 images per article, processes 50 articles, and retries failures can execute 500+ generation calls in a single run. At $0.05 per image, that is $25. At $0.40 per video clip, the same loop costs $200. Without caps, the agent has no reason to stop.

2. Errors compound silently. A bug that sends malformed prompts to a video model still incurs GPU charges. The model runs, produces garbage output, and the agent retries — generating more garbage and more charges. By the time a human notices, the balance is drained.

3. Shared API keys share risk. If three developers share one API key, one developer’s runaway script affects everyone’s budget. Without per-key caps, there is no isolation between workloads.
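To make the arithmetic in point 1 concrete, here is a minimal sketch of how per-call prices multiply across a loop. The function name and the `retry_rate` parameter are illustrative assumptions, not part of any FairStack API; the prices are the ones quoted above.

```python
def loop_cost(articles: int, calls_per_article: int, price_per_call: float,
              retry_rate: float = 0.0) -> tuple[int, float]:
    """Return (total calls, total dollars) for one pipeline run.

    retry_rate is the assumed fraction of calls that get retried once.
    """
    base_calls = articles * calls_per_article
    total_calls = base_calls + round(base_calls * retry_rate)
    return total_calls, round(total_calls * price_per_call, 2)


print(loop_cost(50, 10, 0.05))  # (500, 25.0)   images
print(loop_cost(50, 10, 0.40))  # (500, 200.0)  video clips
```

Raising `retry_rate` shows how quickly a failing loop inflates the bill: every retried call is charged like a first attempt.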

No other AI generation platform solves all three problems. Replicate has no spending caps. fal.ai has no spending caps. ElevenLabs has usage limits tied to subscription tiers, not configurable caps. Runway has no programmatic budget enforcement.

FairStack built spending caps as a core infrastructure feature, not an afterthought.

How FairStack’s 3-Level System Works

FairStack enforces spending limits at three nested levels. Each level is checked independently and a request must pass all of them, so the tightest cap at any level wins.

Organization Cap ($500/month)
└── Project Cap ($100/month)
    └── API Key Cap ($20 total)

Level 1: Organization Cap

The top-level safety net. Applies to all spending across the entire organization — web app, API, MCP server, every project, every key.

# Set organization spending cap via API
curl -X PATCH https://api.fairstack.ai/v1/org/settings \
  -H "Authorization: Bearer fs_admin_key" \
  -H "Content-Type: application/json" \
  -d '{"spending_cap_monthly_micro": 500000000}'
  # 500,000,000 microdollars = $500/month

When the org cap is hit, every generation request across the organization returns a 429 SPENDING_CAP_REACHED error. No generation proceeds. No credits are deducted.

Use case: Company-wide maximum. “We will never spend more than $500/month on AI generation, regardless of which team or project is generating.”
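Cap values are sent as integer microdollars, which is easy to get wrong by a factor of ten when typing zeros by hand. The helpers below are a small sketch for converting in both directions; the function names are hypothetical and not part of the FairStack API. `Decimal` is used so that amounts like 0.046 convert exactly.

```python
from decimal import Decimal

MICRO_PER_DOLLAR = 1_000_000


def dollars_to_micro(amount: str) -> int:
    """Convert a dollar string such as "500" or "0.046" to integer microdollars."""
    micro = Decimal(amount) * MICRO_PER_DOLLAR
    if micro != micro.to_integral_value():
        raise ValueError(f"{amount} is finer than one microdollar")
    return int(micro)


def micro_to_dollars(micro: int) -> str:
    """Format integer microdollars as a dollar string with two decimals."""
    return f"${Decimal(micro) / MICRO_PER_DOLLAR:,.2f}"


print(dollars_to_micro("500"))        # 500000000
print(dollars_to_micro("0.046"))      # 46000
print(micro_to_dollars(500_000_000))  # $500.00
```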

Level 2: Project Cap

Isolates budget by workload. Each project within an organization can have its own monthly spending limit.

# Create a project with a $100/month cap
curl -X POST https://api.fairstack.ai/v1/projects \
  -H "Authorization: Bearer fs_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Marketing Content Pipeline",
    "cap_monthly_micro": 100000000,
    "cap_per_request_micro": 500000
  }'
  # $100/month cap + $0.50 max per single generation

Project caps include two controls:

Monthly cap: Total spending limit for the project per billing cycle
Per-request cap: Maximum cost for any single generation. Prevents one expensive model call (for example, a $1.20 Sora 2 Pro generation) from eating the budget in one shot.

Use case: Budget isolation between teams. “Marketing can spend $100/month. Engineering R&D can spend $200/month. Neither team’s spending affects the other’s cap.”

Level 3: API Key Cap

The most granular control. Each API key can have its own spending limits, independent of the project and org caps.

# Create an API key with a $20 total (lifetime) cap
curl -X POST https://api.fairstack.ai/v1/api-keys \
  -H "Authorization: Bearer fs_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "content-agent-prod",
    "cap_total_micro": 20000000,
    "cap_per_request_micro": 100000,
    "allowed_modalities": ["image", "voice"]
  }'
  # $20 total cap + $0.10 max per request + only image and voice allowed

API key caps offer three controls. Unlike project caps, the spend limit is a lifetime total rather than a monthly amount, and modality restrictions are added:

Total cap: Lifetime spending limit for this key (not monthly — total)
Per-request cap: Maximum cost per generation
Modality restrictions: Limit which generation types the key can access (image, voice, video, music)

Use case: Agent-specific limits. “This agent can spend $20 total on image and voice generation. It cannot generate video. No single generation can cost more than $0.10.”
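The three key-level controls can be mirrored on the client side so an agent rejects a doomed request before sending it. This is a sketch under assumptions: the field names copy the request body above, but `spent_micro`, the order of the checks, and the reason strings other than `per_request_cap_exceeded` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApiKeyPolicy:
    """Local model of one key's limits (not a FairStack client class)."""
    cap_total_micro: int
    cap_per_request_micro: int
    allowed_modalities: set[str] = field(default_factory=set)
    spent_micro: int = 0  # hypothetical running total, tracked locally

    def rejection_reason(self, modality: str, cost_micro: int) -> Optional[str]:
        """Return why a request would be refused, or None if it fits."""
        if modality not in self.allowed_modalities:
            return "modality_not_allowed"
        if cost_micro > self.cap_per_request_micro:
            return "per_request_cap_exceeded"
        if self.spent_micro + cost_micro > self.cap_total_micro:
            return "total_cap_exceeded"
        return None


key = ApiKeyPolicy(20_000_000, 100_000, {"image", "voice"})
print(key.rejection_reason("image", 46_000))   # None
print(key.rejection_reason("video", 345_000))  # modality_not_allowed
print(key.rejection_reason("image", 120_000))  # per_request_cap_exceeded
```

A local check like this only saves a round trip. The server-side cap remains the source of truth.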

How the Three Levels Interact

Caps are evaluated in order: API key cap first, then project cap, then org cap. The tightest limit at any level blocks the request.
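The evaluation order can be sketched as a loop over the levels that stops at the first failure. This is an illustrative model, not FairStack's implementation: the `Level` type, the `spend_cap_exceeded` reason, and the rule that a request is blocked when already-spent plus request cost would exceed the limit are all assumptions. The figures are the ones used in the worked trace in this section.

```python
from typing import NamedTuple, Optional


class Level(NamedTuple):
    name: str                                # "api_key", "project", or "org"
    limit_micro: int                         # monthly or lifetime spend ceiling
    spent_micro: int                         # already spent against that ceiling
    per_request_micro: Optional[int] = None  # None means no per-request cap


def first_blocking_cap(cost_micro: int,
                       levels: list[Level]) -> Optional[tuple[str, str]]:
    """Check levels in order; return (cap_type, reason) for the first failure."""
    for level in levels:
        if level.per_request_micro is not None and cost_micro > level.per_request_micro:
            return level.name, "per_request_cap_exceeded"
        if level.spent_micro + cost_micro > level.limit_micro:
            return level.name, "spend_cap_exceeded"
    return None


levels = [
    Level("api_key", 20_000_000, 15_200_000, per_request_micro=100_000),
    Level("project", 100_000_000, 87_300_000, per_request_micro=500_000),
    Level("org", 500_000_000, 412_500_000),
]
print(first_blocking_cap(46_000, levels))   # None
print(first_blocking_cap(345_000, levels))  # ('api_key', 'per_request_cap_exceeded')
```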

Request: Generate an image ($0.046)

Check 1: API key cap
  └── Per-request: $0.046 < $0.10 limit ✅
  └── Total spent: $15.20 < $20.00 limit ✅

Check 2: Project cap
  └── Per-request: $0.046 < $0.50 limit ✅
  └── Monthly spent: $87.30 < $100.00 limit ✅

Check 3: Org cap
  └── Monthly spent: $412.50 < $500.00 limit ✅

Result: Generation proceeds. Credits deducted after completion.

If any check fails:

Request: Generate a video ($0.345)

Check 1: API key cap
  └── Per-request: $0.345 > $0.10 limit ❌ BLOCKED

Result: 429 SPENDING_CAP_REACHED
{
  "error": "per_request_cap_exceeded",
  "cap_type": "api_key",
  "cap_limit_micro": 100000,
  "request_cost_micro": 345000
}

The response tells the agent exactly which cap was hit and why, so it can adjust its behavior programmatically.
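One way an agent might act on that response is to map the error fields to a next step. The sketch below assumes the body has the fields shown in the blocked request above; the function name and the three return values are hypothetical choices, and a real agent would likely have more branches.

```python
def plan_after_cap_error(status_code: int, body: dict) -> str:
    """Decide an agent's next step from a generation response.

    Returns "not_a_cap_error", "use_cheaper_model", or "stop".
    """
    if status_code != 429:
        return "not_a_cap_error"
    if body.get("error") == "per_request_cap_exceeded":
        # One call was too expensive; a cheaper model may still fit the cap.
        return "use_cheaper_model"
    # Any other cap error is treated as exhausted budget: retrying cannot help.
    return "stop"


rejection = {
    "error": "per_request_cap_exceeded",
    "cap_type": "api_key",
    "cap_limit_micro": 100000,
    "request_cost_micro": 345000,
}
print(plan_after_cap_error(429, rejection))  # use_cheaper_model
```

The important design choice is that a cap rejection never triggers a blind retry, which is exactly the loop the caps are meant to stop.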

Pre-Built Scenarios

Try these in the simulator to see cap enforcement in action.

Scenario 1: Content Pipeline

Setup: Marketing team running a blog content pipeline that generates hero images, social cards, and voice narration for each article.

Cap	Limit
Org	$500/month
Project: “Blog Content”	$150/month
API Key: “content-bot”	$30 total, $0.10/request

Run: Agent processes 20 articles, generating 3 images ($0.023 each) and 1 voice narration ($0.05) per article. Total cost: 20 × (3 × $0.023 + $0.05) = $2.38. All caps hold comfortably.

Then: Agent encounters a bug and enters a retry loop on video generation at $0.345/clip. The per-request API key cap ($0.10) blocks the video call immediately. Zero budget wasted on the bug.

Scenario 2: Multi-Team Isolation

Setup: Engineering org with three teams sharing one FairStack account.

Cap	Limit
Org	$1,000/month
Project: “Product Photos”	$300/month
Project: “Voice Assistant”	$200/month
Project: “R&D Experiments”	$100/month

Run: Product Photos team generates 5,000 images in a batch ($115 total). Voice Assistant runs normally ($45/month). R&D experiments hit their $100 cap mid-month. R&D stops generating; the other two projects continue unaffected.
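The isolation in this scenario can be reproduced with a few lines of bookkeeping. This is a toy model, not the simulator's code: the function name is made up, amounts are in microdollars, and the R&D workload (150 requests at $1.00) and the voice workload (900 requests at $0.05) are hypothetical numbers chosen to reach the totals in the scenario.

```python
def run_month(project_caps: dict[str, int], org_cap: int,
              requests: list[tuple[str, int]]) -> tuple[dict, dict]:
    """Apply (project, cost) requests in order and skip any that break a cap.

    Returns (spent per project, blocked request count per project).
    """
    spent = {name: 0 for name in project_caps}
    blocked = {name: 0 for name in project_caps}
    for project, cost in requests:
        over_project = spent[project] + cost > project_caps[project]
        over_org = sum(spent.values()) + cost > org_cap
        if over_project or over_org:
            blocked[project] += 1
            continue
        spent[project] += cost
    return spent, blocked


caps = {"Product Photos": 300_000_000,
        "Voice Assistant": 200_000_000,
        "R&D Experiments": 100_000_000}
requests = (
    [("R&D Experiments", 1_000_000)] * 150  # hypothetical R&D workload
    + [("Product Photos", 23_000)] * 5_000  # the $115 image batch
    + [("Voice Assistant", 50_000)] * 900   # $45 of narration
)
spent, blocked = run_month(caps, 1_000_000_000, requests)
print(spent["R&D Experiments"], blocked["R&D Experiments"])  # 100000000 50
print(spent["Product Photos"], spent["Voice Assistant"])     # 115000000 45000000
```

R&D stops at exactly $100 with 50 requests refused, while the other two projects complete every request.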

Scenario 3: Agent Budget Guardrails

Setup: AI agent with MCP access processing customer requests autonomously.

Cap	Limit
Org	$200/month
Project: “Customer Service”	$50/month
API Key: “cs-agent”	$5/day (via total cap rotation), $0.05/request, image-only

Run: Agent handles 100 customer image requests per day at $0.005-0.04 each. The per-request cap prevents the agent from accidentally calling expensive models: Imagen 4 at $0.046 still fits under the $0.05 cap, but Nano Banana Pro at $0.10 would be blocked. The daily rotation cap limits exposure to $5/day regardless of volume.
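Because key caps are lifetime totals, a daily limit has to come from issuing a fresh key each day. The sketch below only builds the request body for such a key, reusing the fields from the API key example earlier. The date-based naming scheme is an assumption, and so is the surrounding process: a scheduler would POST this body to /v1/api-keys and retire the previous day's key, and this article does not describe how keys are revoked.

```python
from datetime import date


def daily_key_payload(agent: str, day: date,
                      daily_budget_micro: int = 5_000_000) -> dict:
    """Build the request body for one day's key (naming scheme is illustrative)."""
    return {
        "name": f"{agent}-{day.isoformat()}",
        "cap_total_micro": daily_budget_micro,  # lifetime cap acting as a daily cap
        "cap_per_request_micro": 50_000,        # $0.05 per request
        "allowed_modalities": ["image"],
    }


payload = daily_key_payload("cs-agent", date(2026, 2, 13))
print(payload["name"])             # cs-agent-2026-02-13
print(payload["cap_total_micro"])  # 5000000
```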

Scenario 4: Cost Simulation Before Execution

Setup: Developer building a pipeline that estimates costs before running.

# Step 1: Estimate the batch cost
curl -X POST https://api.fairstack.ai/v1/estimate \
  -H "Authorization: Bearer fs_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-schnell",
    "params": {"prompt": "product photo"},
    "batch_size": 500
  }'

# Response
{
  "per_request": {
    "estimated_cost_micro": 3450,
    "estimated_cost_display": "$0.0035"
  },
  "batch": {
    "total_cost_micro": 1725000,
    "total_cost_display": "$1.73"
  },
  "caps": {
    "api_key_remaining_micro": 18275000,
    "project_remaining_micro": 98275000,
    "org_remaining_micro": 498275000,
    "would_exceed_cap": false
  }
}

The estimate endpoint returns remaining budget at all three cap levels, so the agent knows before executing whether the batch will fit within its limits.
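An agent can go one step further and size its batch from the estimate. The helper below is a sketch that assumes the response shape shown above and treats each `*_remaining_micro` value as budget still available for new requests; the article does not say whether those figures already subtract the estimated batch. The function name is hypothetical.

```python
def max_affordable_requests(estimate: dict) -> int:
    """Largest number of requests that fits under the tightest remaining cap."""
    per_request = estimate["per_request"]["estimated_cost_micro"]
    if per_request <= 0:
        raise ValueError("estimated cost must be positive")
    caps = estimate["caps"]
    tightest = min(
        caps["api_key_remaining_micro"],
        caps["project_remaining_micro"],
        caps["org_remaining_micro"],
    )
    return tightest // per_request


sample = {
    "per_request": {"estimated_cost_micro": 3450},
    "caps": {
        "api_key_remaining_micro": 18275000,
        "project_remaining_micro": 98275000,
        "org_remaining_micro": 498275000,
    },
}
print(max_affordable_requests(sample))  # 5297
```

Here the API key is the binding constraint, so the agent could plan at most 5,297 further generations at this price before the key cap fires.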

No Other Platform Does This

Feature	FairStack	Replicate	fal.ai	ElevenLabs	Runway
Organization-level cap	Yes	No	No	No	No
Project-level cap	Yes	No	No	No	No
API key-level cap	Yes	No	No	No	No
Per-request cap	Yes	No	No	No	No
Modality restrictions per key	Yes	No	No	No	No
Cost estimation before execution	Yes	No	No	No	No
Cap hit details in error response	Yes	N/A	N/A	N/A	N/A

Replicate, fal.ai, ElevenLabs, and Runway all rely on the same model: set a credit card limit and hope for the best. None offer programmatic, hierarchical budget enforcement. None return remaining budget in cost estimates. None let you restrict which modalities a specific API key can access.

For developers building autonomous AI systems, this is not a convenience feature. It is a production requirement.

API Code Example: Simulate Before You Execute

The complete flow for an agent that checks its budget before generating:

import requests

FAIRSTACK_API = "https://api.fairstack.ai/v1"
API_KEY = "fs_your_api_key"
headers = {"Authorization": f"Bearer {API_KEY}"}

def generate_with_budget_check(model: str, prompt: str, max_cost: float = 0.10):
    """Generate only if estimated cost is within budget."""

    # Step 1: Estimate cost (raise on HTTP errors instead of parsing an error body)
    response = requests.post(f"{FAIRSTACK_API}/estimate", headers=headers, json={
        "model": model,
        "params": {"prompt": prompt}
    }, timeout=30)
    response.raise_for_status()
    estimate = response.json()

    # Costs are reported in microdollars; convert to dollars
    cost = estimate["per_request"]["estimated_cost_micro"] / 1_000_000

    # Step 2: Check against local budget
    if cost > max_cost:
        return {"error": f"Estimated cost ${cost:.4f} exceeds local limit ${max_cost:.4f}"}

    # Step 3: Check against platform caps
    if estimate["caps"]["would_exceed_cap"]:
        return {"error": "Would exceed platform spending cap",
                "details": estimate["caps"]}

    # Step 4: Generate. A 429 here carries the cap details in its JSON body,
    # so the parsed body is returned as-is rather than raised.
    result = requests.post(f"{FAIRSTACK_API}/image/generate", headers=headers, json={
        "model": model,
        "prompt": prompt
    }, timeout=120).json()

    return result

# Usage
result = generate_with_budget_check("flux-schnell", "a product photo of headphones")

This pattern — estimate, check, generate — is the standard for any agent that needs cost control. FairStack provides the infrastructure; your agent provides the logic.

FAQ

Can I change caps after setting them?

Yes. Caps can be updated at any time via the API or the web dashboard. Changes take effect immediately for the next generation request.

What happens when a cap resets?

Organization and project caps reset at the start of each billing cycle (monthly). API key total caps do not reset — they are lifetime limits. To create rolling daily or weekly caps, rotate API keys programmatically or use the project monthly cap as a proxy.

Do caps apply to the web app too?

Organization and project caps apply to all generation methods: web app, REST API, and MCP server. API key caps apply only to API and MCP requests (the web app uses session auth, not API keys).

Can an agent query its remaining budget?

Yes. The /v1/estimate endpoint returns remaining budget at all three cap levels in every response. Agents can also call /v1/org/usage for a full usage summary.

What is the MCP server?

MCP (Model Context Protocol) is a standard for AI agents to interact with external tools. FairStack’s MCP server lets Claude, GPT, and other AI agents generate images, voice, video, and music with full spending cap enforcement. Read the agent documentation for setup instructions.

Try the simulator with your own cap configuration. Set your org, project, and API key limits, then run scenarios to see exactly when and how caps fire. When you are ready, create an account and configure real caps in under 5 minutes. Or read the agent documentation for the full MCP and API setup guide.