Track, secure, and optimize every agent run with absolute visibility and control — not just individual LLM calls.
pip install stateloom
Tools that only track individual LLM calls miss the bigger picture. Your agent makes 50+ calls per run, each invisible to the last.
Session-aware middleware groups calls into meaningful workflows. Resume crashed runs, enforce budgets, contain rogue agents.
Runs on your laptop or inside your VPC. Your data never passes through a third-party proxy.
Free and open source for individual developers. Enterprise features for teams at scale.
Session-scoped cost tracking with per-model breakdown. Hard budget enforcement stops runaway agents before they drain your wallet.
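As a rough illustration of the idea (not StateLoom's actual implementation), hard budget enforcement can be sketched as a per-session ledger that rejects any call that would push spend past the cap. The names `SessionBudget` and `BudgetExceeded` are hypothetical, and the costs are made-up numbers.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when recording a cost would push the session past its limit."""


@dataclass
class SessionBudget:
    # Hypothetical ledger; StateLoom's own data model may differ.
    limit_usd: float
    spent_usd: float = 0.0
    per_model: dict = field(default_factory=dict)

    def record(self, model: str, cost_usd: float) -> None:
        # Reject the cost before counting it, so recorded spend never exceeds the cap.
        projected = self.spent_usd + cost_usd
        if projected > self.limit_usd:
            raise BudgetExceeded(
                f"{model}: ${projected:.2f} would exceed ${self.limit_usd:.2f}"
            )
        self.spent_usd = projected
        # Keep a per-model breakdown alongside the session total.
        self.per_model[model] = self.per_model.get(model, 0.0) + cost_usd


budget = SessionBudget(limit_usd=2.0)
budget.record("model-a", 1.25)
budget.record("model-b", 0.50)
```

A further `budget.record("model-a", 0.50)` would raise `BudgetExceeded`, because the projected total of $2.25 is above the $2.00 limit.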
Detect emails, credit cards, SSNs, API keys. 32 heuristic injection patterns, NLI classifier, and Llama-Guard support.
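To give a feel for pattern-based PII detection, here is a minimal sketch using only the standard library. The regexes, the Luhn check on card numbers, and the `scan` helper are illustrative assumptions, not StateLoom's detectors, and a production scanner would need far broader coverage.

```python
import re

# Illustrative patterns only; real detectors handle many more formats.
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to discard digit runs that are not card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(text: str) -> list:
    """Return (kind, matched_text) pairs for each PII candidate found."""
    findings = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group()
            if kind == "credit_card" and not luhn_valid(value):
                continue
            findings.append((kind, value))
    return findings


sample = "Contact jane@example.com, SSN 123-45-6789, card 4111 1111 1111 1111."
findings = scan(sample)
```

Here `findings` holds one email, one SSN, and one card number; a 16-digit run that fails the Luhn check is skipped.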
Temporal-like checkpointing. Agent crashes mid-run? Restart and resume from cache — no repeated API calls, no wasted spend.
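The resume-from-cache behavior can be pictured as a small on-disk store keyed by step and request. This is a simplified sketch under that assumption: `StepCache` and `fake_llm` are hypothetical names, and StateLoom's durable sessions may persist state quite differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class StepCache:
    """Persists each step's result so a restarted run can skip completed calls."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load results left behind by an earlier (possibly crashed) run.
        self.results = json.loads(path.read_text()) if path.exists() else {}

    def run(self, step_name: str, request: dict, call):
        key = hashlib.sha256(
            json.dumps([step_name, request], sort_keys=True).encode()
        ).hexdigest()
        if key not in self.results:
            # Only reach the provider when no checkpoint exists for this step.
            self.results[key] = call(request)
            self.path.write_text(json.dumps(self.results))
        return self.results[key]


calls = []


def fake_llm(request: dict) -> str:
    # Stand-in for a real provider call; records how often it is invoked.
    calls.append(request)
    return f"answer to {request['prompt']}"


checkpoint = Path(tempfile.mkdtemp()) / "session.json"
first = StepCache(checkpoint).run("research", {"prompt": "trends"}, fake_llm)
# Simulate a restart: a fresh StepCache reads the same checkpoint file.
second = StepCache(checkpoint).run("research", {"prompt": "trends"}, fake_llm)
```

After the simulated restart, `fake_llm` has still been invoked only once, which is the "no repeated API calls" property in miniature.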
Run Ollama models locally. Intelligent routing sends simple requests to local models and complex ones to the cloud.
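One way to picture complexity-based routing is a cheap heuristic score with a threshold. The scoring rule, keyword list, and function names below are invented for illustration and are not StateLoom's routing logic.

```python
# Hypothetical keywords that hint at a harder request.
COMPLEX_HINTS = ("analyze", "prove", "refactor", "step by step")


def estimate_complexity(prompt: str) -> int:
    """Crude score: one point per 50 words plus one per complexity hint."""
    lowered = prompt.lower()
    score = len(prompt.split()) // 50
    score += sum(hint in lowered for hint in COMPLEX_HINTS)
    return score


def route(prompt: str, threshold: int = 1) -> str:
    """Send low-scoring prompts to a local model, the rest to a cloud model."""
    return "cloud" if estimate_complexity(prompt) >= threshold else "local"


simple_target = route("What is 2+2?")
complex_target = route("Analyze this contract step by step")
```

A real router would likely weigh more signals (token counts, tool use, past failures), but the local-versus-cloud decision has the same shape.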
Exact-match and semantic caching. Loop detection catches spinning agents. Circuit breaker with automatic provider failover.
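Loop detection can be approximated by fingerprinting each request and counting repeats within a session. The sketch below assumes exact-match fingerprints and a hypothetical `LoopDetector` class. It does not cover semantic similarity or the circuit breaker, and it is not StateLoom's implementation.

```python
import hashlib
import json
from collections import Counter


class LoopDetector:
    """Flags a session once the same request repeats too many times."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self.seen = Counter()

    def check(self, model: str, messages: list) -> bool:
        """Return True when this exact request has been seen more than max_repeats times."""
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        fingerprint = hashlib.sha256(payload.encode()).hexdigest()
        self.seen[fingerprint] += 1
        return self.seen[fingerprint] > self.max_repeats


detector = LoopDetector(max_repeats=2)
request = [{"role": "user", "content": "Retry the search"}]
flags = [detector.check("model-a", request) for _ in range(4)]
```

With `max_repeats=2`, the first two identical requests pass and the third and fourth are flagged. The same fingerprint could also serve as an exact-match cache key.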
Shadow-test candidates against production models. A/B experiments with built-in metrics. Multi-agent consensus (vote, debate, self-consistency).
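Of the consensus strategies listed, voting is the simplest to sketch: collect answers from several agents, normalize them, and keep the most common one. The `majority_vote` helper is a hypothetical illustration rather than StateLoom's API.

```python
from collections import Counter


def majority_vote(answers: list) -> tuple:
    """Return the most common normalized answer and its share of the votes."""
    if not answers:
        raise ValueError("at least one answer is required")
    # Normalize so trivial differences in case or whitespace do not split votes.
    normalized = [a.strip().lower() for a in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)


winner, agreement = majority_vote(["Paris", "paris ", "Lyon"])
```

The agreement ratio gives a rough confidence signal. A low ratio might be a reason to escalate to a debate round or a stronger model.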
Auto-patches OpenAI, Anthropic, Gemini, Cohere, Mistral, and LiteLLM. Mix providers freely in a single session.
Session viewer with waterfall traces, cost breakdown, security controls, observability charts, and WebSocket streaming — all at localhost:4782.
Point Claude Code or Gemini CLI at StateLoom. Get full session tracking, PII scanning, and budget enforcement with zero code changes.
import stateloom
import anthropic

stateloom.init()
claude = anthropic.Anthropic()

with stateloom.session("customer-report", budget=2.0, durable=True) as s:
    research = claude.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Key trends in AI governance 2025"}],
    )
    # If this crashes, restart → resumes from cache. No repeated calls.
    # Budget enforcement stops the run if it exceeds $2.
    print(f"Cost: ${s.total_cost:.2f} | {s.total_tokens} tokens")
A centralized control plane to govern, secure, and optimize your entire AI workforce.
See how StateLoom can secure and optimize your AI infrastructure. We'll walk you through a live demo tailored to your use case.