Anthropic's launch-your-agent: Claude Code Agent Deployment From Zero to Live in One Session

Three months after you see the demo, most AI agents are still running on localhost.
That gap — between "I built an agent" and "my agent is running in production" — is where most AI SaaS ambitions stall. Infrastructure choices pile up. The eval pipeline never gets built. The scheduled deployment stays on the roadmap.
Anthropic's launch-your-agent is a direct attack on that pattern. It's a Claude Code skill: a structured capability package you clone, open in Claude Code, and kick off with /launch-your-agent. What comes out isn't a prototype or a proof-of-concept. It's a live Claude Managed Agent running in your own Anthropic Console account, with deployment handled end-to-end from Claude Code, an eval against success criteria you defined, and a scheduled run configured if your task happens more than once.
It's one of the most direct documented paths from agent idea to agent in production. Here's the full picture.
What Claude Managed Agents Are (Before You Dive In)
Context first, because "managed" is doing real work in that name.
Anthropic launched Claude Managed Agents (CMA) on April 9, 2026, alongside Claude Cowork GA. The short version: Anthropic runs the infrastructure. You define the agent — model, system prompt, tools, success criteria. Sandboxing, credential isolation, session persistence, and failure recovery are handled.
A CMA has four building blocks:
| Concept | What It Is |
|---|---|
| Agent | The model, system prompt, tools, MCP servers, and skills |
| Environment | Where sessions run — Anthropic-managed cloud sandbox or self-hosted |
| Session | A running agent instance executing a specific task |
| Events | Messages exchanged between your app and the agent (user turns, tool results, status updates) |
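To make the table concrete, here is a minimal sketch of the four building blocks as plain Python data classes. The class and field names are illustrative assumptions derived from the table above, not the actual CMA schema.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class Agent:
    # Field names are hypothetical; they mirror the "Agent" row of the table.
    model: str
    system_prompt: str
    tools: list[str] = field(default_factory=list)
    mcp_servers: list[str] = field(default_factory=list)
    skills: list[str] = field(default_factory=list)


@dataclass
class Environment:
    # Where sessions run: an Anthropic-managed sandbox or your own infrastructure.
    kind: Literal["anthropic_cloud", "self_hosted"]


@dataclass
class Event:
    kind: str      # e.g. "user_turn", "tool_result", "status_update"
    payload: str


@dataclass
class Session:
    # A running agent instance executing one specific task.
    agent: Agent
    environment: Environment
    task: str
    events: list[Event] = field(default_factory=list)


agent = Agent(model="example-model", system_prompt="Summarize the daily sales report.")
session = Session(agent, Environment("anthropic_cloud"), task="Summarize today's report")
session.events.append(Event("user_turn", "Here is today's CSV."))
```

The point of the sketch is the relationships: one agent definition can back many sessions, and events belong to a session rather than to the agent.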
The API is in beta, gated behind the managed-agents-2026-04-01 header. Every request needs that beta header; the official SDK sets it automatically. The ant CLI (installable via brew install anthropics/tap/ant) gives you a command-line path to the same primitives.
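If you ever call the API without the SDK, you would have to set the beta header yourself. The sketch below only builds a header dictionary. It assumes CMA follows the header conventions of Anthropic's other HTTP APIs (x-api-key, anthropic-version, anthropic-beta); that is an assumption, and no endpoint paths are shown because the article doesn't give any.

```python
import os

# The beta value comes from the article; the header names are assumed to match
# the conventions used by Anthropic's other HTTP APIs.
BETA_HEADER_VALUE = "managed-agents-2026-04-01"


def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers with the beta flag attached."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": BETA_HEADER_VALUE,
        "content-type": "application/json",
    }


# Read the key from the environment so it never appears in source code.
headers = build_headers(os.environ.get("ANTHROPIC_API_KEY", "sk-placeholder"))
```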
What makes CMA structurally different from rolling your own harness is the failure model. Container crashes become retriable tool errors. Session state persists independently from the harness. Credentials live in vaults external to the sandbox — the code Claude generates never touches API keys directly. That's weeks of infrastructure work you skip entirely.
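The "crashes become retriable tool errors" idea can be sketched in a few lines. This is a conceptual illustration, not CMA's implementation: SandboxCrashed, ToolResult, and run_with_retry are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


class SandboxCrashed(Exception):
    """Hypothetical exception standing in for a container that died."""


@dataclass
class ToolResult:
    content: str
    is_error: bool = False
    retriable: bool = False


def run_tool(tool: Callable[[str], str], tool_input: str) -> ToolResult:
    # A crash is reported as an ordinary tool error instead of killing the harness.
    try:
        return ToolResult(content=tool(tool_input))
    except SandboxCrashed as exc:
        return ToolResult(content=f"sandbox crashed: {exc}", is_error=True, retriable=True)


def run_with_retry(tool: Callable[[str], str], tool_input: str, max_attempts: int = 3) -> ToolResult:
    result = ToolResult(content="not attempted", is_error=True)
    for _ in range(max_attempts):
        result = run_tool(tool, tool_input)
        if not (result.is_error and result.retriable):
            break  # success, or an error that retrying won't fix
    return result


attempts = {"count": 0}


def flaky_tool(tool_input: str) -> str:
    # Crashes on the first call, succeeds afterwards.
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise SandboxCrashed("container exited")
    return f"processed {tool_input}"


result = run_with_retry(flaky_tool, "report.csv")
```

Because the failure surfaces as data rather than as an unhandled exception, the loop that drives the agent keeps running and can simply try again.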
How the launch-your-agent Skill Works
At the time of writing, the repo has 380 stars and 84 forks, and it is Apache 2.0 licensed. Clone it, open it in Claude Code, and run /launch-your-agent. The skill runs four phases:
Phase 1: Interview
Claude asks structured questions: What task should this agent handle? What does success look like? How often does it run? What tools does it need?
The interview output is a build sheet — a human-readable spec that maps your answers to CMA primitives. Nothing deploys until you've reviewed it. The companion file interview-to-config.md in the repo shows exactly how each interview answer maps to an API parameter, so you're not guessing.
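As a rough picture of what "mapping answers to primitives" means, here is a sketch that turns interview answers into a build-sheet dictionary. The keys on both sides are hypothetical; the real mapping lives in interview-to-config.md and may look quite different.

```python
from typing import Any


def to_build_sheet(answers: dict[str, Any]) -> dict[str, Any]:
    """Map interview answers to a build-sheet dict (all key names are illustrative)."""
    recurring = answers["frequency"] != "once"
    return {
        "agent": {
            "system_prompt": f"You handle this task: {answers['task']}",
            "tools": list(answers["tools"]),
        },
        # The success criteria become the rubric used for grading later.
        "rubric": answers["success_criteria"],
        # Only recurring tasks get a schedule.
        "schedule": answers["frequency"] if recurring else None,
    }


build_sheet = to_build_sheet(
    {
        "task": "Summarize the daily sales CSV",
        "success_criteria": "Mentions total revenue and stays under 100 words",
        "frequency": "daily",
        "tools": ["code_sandbox"],
    }
)
```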
Phase 2: Stage & Launch
The skill translates the build sheet into exact API payloads and deploys your agent to your own Console account. One agent, one environment, one first session — live, not local.
After this step, your my-agent/ folder contains:
- Build sheet (human-readable spec)
- Exact API payloads used for deployment
- Resumable launch script (if something interrupted, pick up where you left off)
- Eval scaffold for Phase 3
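A resumable script usually boils down to recording which steps finished and skipping them on the next run. The sketch below shows that pattern with a JSON state file. The step names and the state-file name are hypothetical, and the skill's own script may work differently.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_resumable(steps: list[tuple[str, Callable[[], None]]], state_file: Path) -> list[str]:
    """Run steps in order, skipping any already recorded in state_file."""
    done: list[str] = json.loads(state_file.read_text()) if state_file.exists() else []
    executed = []
    for name, action in steps:
        if name in done:
            continue  # finished in an earlier run
        action()
        done.append(name)
        # Persist after every step so an interruption loses at most one step.
        state_file.write_text(json.dumps(done))
        executed.append(name)
    return executed


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "launch_state.json"
    calls: list[str] = []
    steps = [
        ("create_agent", lambda: calls.append("agent")),
        ("create_environment", lambda: calls.append("environment")),
        ("start_session", lambda: calls.append("session")),
    ]
    first_run = run_resumable(steps[:2], state)  # pretend we were interrupted here
    second_run = run_resumable(steps, state)     # picks up at the third step
```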
Phase 3: Grade & Iterate
This is where most agent projects skip steps and regret it later.
The skill runs your agent against a real task and evaluates the output against the rubric you wrote in Phase 1. It connects directly to Outcomes, a CMA feature that shipped in May 2026, which assigns a separate grader agent to score sessions against your success criteria.
If the agent underperforms, the skill guides iteration: tighten the prompt, add a tool, narrow the scope. You're not staring at logs trying to infer what went wrong. The grader tells you.
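The grading loop is easier to reason about with a toy version in hand. In the sketch below, simple Python predicates stand in for the grader agent; Outcomes uses a model to judge a written rubric, so treat this as an illustration of the loop's shape only. All names are hypothetical.

```python
from typing import Callable

# Each criterion is a name plus a check; a real grader agent would judge these with a model.
Rubric = dict[str, Callable[[str], bool]]


def grade(output: str, rubric: Rubric) -> dict[str, bool]:
    """Score one agent output against every criterion in the rubric."""
    return {criterion: check(output) for criterion, check in rubric.items()}


rubric: Rubric = {
    "mentions total revenue": lambda text: "total revenue" in text.lower(),
    "under 100 words": lambda text: len(text.split()) < 100,
}

report = grade("Sales rose 4% week over week.", rubric)
# The failed criteria tell you what to change: prompt, tools, or scope.
failed = [name for name, passed in report.items() if not passed]
```

Here the output is short enough but never mentions total revenue, so the next iteration would tighten the prompt on that point.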
Phase 4: Run Without You
If your task recurs, the skill configures a scheduled deployment. Daily report, weekly data pull, recurring customer workflow — the agent runs, logs its session to the event stream, and surfaces results. You check results; you don't babysit execution.
The companion /wrap-up command gives you a status recap and upgrade suggestions when you're done.
The Architecture Underneath: Brain, Hands, Session
Understanding why CMA performs the way it does requires looking at what Anthropic pulled apart when building it.
Traditional agent frameworks couple three things that should be independent:
The brain — Claude plus the harness logic that decides what to do next.
The hands — the execution environment: code sandbox, bash, file system, MCP servers, web search.
The session — the durable record of everything that happened.
When those are coupled in one container, a crash in the hands takes down the brain and loses the session. An environment update requires redeploying everything. Scaling means spinning up more monoliths.
CMA decouples them. The brain is a stateless service. The hands are any execution environment reachable through a uniform execute(name, input) → string interface — Anthropic-managed cloud sandboxes or your own self-hosted infrastructure. The session is an append-only event log that lives outside both, accessible via getSession, emitEvent, and getEvents primitives.
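Here is a small in-memory sketch of that separation. The method names are snake_case versions of the primitives named above, and the signatures are assumptions; the real interfaces are not specified in this article.

```python
from typing import Callable


class SessionLog:
    """Append-only event log that lives outside both the brain and the hands."""

    def __init__(self) -> None:
        self._events: list[dict[str, str]] = []

    def emit_event(self, kind: str, content: str) -> int:
        self._events.append({"kind": kind, "content": content})
        return len(self._events) - 1  # index of the new event

    def get_events(self, since: int = 0) -> list[dict[str, str]]:
        # Return copies so callers cannot rewrite history.
        return [dict(event) for event in self._events[since:]]


# The hands: any set of tools reachable through one uniform call shape.
Hands = dict[str, Callable[[str], str]]


def execute(hands: Hands, name: str, tool_input: str) -> str:
    return hands[name](tool_input)


def brain_step(log: SessionLog, hands: Hands, name: str, tool_input: str) -> None:
    """Stateless step: everything worth remembering is written to the log."""
    log.emit_event("tool_call", f"{name}({tool_input})")
    log.emit_event("tool_result", execute(hands, name, tool_input))


log = SessionLog()
hands: Hands = {"word_count": lambda text: str(len(text.split()))}
brain_step(log, hands, "word_count", "three short words")
```

Since brain_step keeps nothing between calls, a replacement brain process could resume from log.get_events() alone, and the hands could be swapped without touching either of the other two parts.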
The performance impact from Anthropic's own rollout of this architecture:
- p50 time-to-first-token: ~60% reduction
- p95 time-to-first-token: >90% reduction
That p95 number matters. Tail latency in agent workflows compounds: every slow step delays the reasoning that depends on it. A >90% cut at the 95th percentile means the slowest sessions, the ones that previously looked hung, now start responding far sooner.
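A quick back-of-the-envelope calculation shows why the tail matters more as workflows get longer. It assumes each step's latency is independent of the others, which is a simplification, and the step counts are arbitrary examples.

```python
def chance_of_tail_hit(num_steps: int, percentile: float = 0.95) -> float:
    """Probability that at least one of num_steps lands in the slow tail.

    Assumes steps are independent, which real workloads only approximate.
    """
    return 1 - percentile ** num_steps


for steps in (1, 5, 10, 20):
    print(steps, round(chance_of_tail_hit(steps), 3))
# prints:
# 1 0.05
# 5 0.226
# 10 0.401
# 20 0.642
```

Under that assumption, a ten-step session has roughly a 40% chance of hitting p95-or-worse latency at least once, so shrinking the tail improves a large share of multi-step runs, not just one in twenty.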
What Shipped in May 2026: Three New Primitives
Five weeks after the April launch, Anthropic added three features that change what founders can build:
Dreaming (research preview): The agent reviews its own past sessions, extracts patterns, and curates memories. Anthropic describes it as: "Together, memory and dreaming form a robust memory system for self-improving agents." For recurring workflows, this means an agent that gets better at your specific task without you re-prompting it.
Outcomes: Write a rubric describing what success looks like. A separate grader agent evaluates every session against it. This is what launch-your-agent's Phase 3 uses under the hood. It's the built-in eval loop that production agents need and most DIY frameworks skip.
Multiagent Orchestration: A lead agent breaks a job into pieces and delegates to specialist agents, each with its own model, prompt, and tools. Netflix has deployed this for its platform team. For SaaS builders, this is the difference between an agent that does one task and an agent that runs a workflow.
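The delegation pattern itself is simple to sketch. In this toy version, plain functions stand in for specialist agent sessions, and every name (Specialist, lead_agent, the roles, the model labels) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Specialist:
    name: str
    model: str                      # each specialist can use a different model
    handle: Callable[[str], str]    # stand-in for running a real agent session


def lead_agent(job: dict[str, str], specialists: dict[str, Specialist]) -> dict[str, str]:
    """Delegate each subtask to the specialist registered for its role."""
    return {role: specialists[role].handle(subtask) for role, subtask in job.items()}


specialists = {
    "research": Specialist("researcher", "example-large-model", lambda task: f"notes on {task}"),
    "writing": Specialist("writer", "example-small-model", lambda task: f"draft of {task}"),
}

results = lead_agent(
    {"research": "competitor pricing", "writing": "pricing summary"},
    specialists,
)
```

In a real deployment the lead agent would also decide how to split the job; here the split is hard-coded to keep the example short.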
What This Actually Means for Founders Building SaaS
Three things worth saying plainly:
The gap between concept and production is now a skill file. A technical founder with Claude Code and an Anthropic API key can go from zero to a graded, deployed, scheduled agent in a single session. The infrastructure decisions — sandbox, credentials, session persistence, eval loop — are already made. You start at the differentiation layer.
Evals are no longer optional. The launch-your-agent skill builds the grading step into the deployment workflow. That's not accidental. Most agent products fail not at launch, but at the iteration that should come right after. Outcomes gives you a structured signal on whether your agent is working before your users find out it isn't.
Multiagent orchestration changes the scope of what's buildable. Single-agent products do one thing. A lead agent that delegates to specialists does a workflow. Netflix deployed this for platform-scale operations. The same primitive is available to the solo founder building an automated research tool or a customer-facing AI worker.
If you're newer to Claude Code and haven't worked with skills before, the Claude Code for beginners guide covers authentication, your first project, and how the slash command system works. If you're looking for concrete product ideas to build on this infrastructure, 15 Claude Code project ideas with copy-paste prompts covers the builds that map well to CMA's capabilities. And if you're working out the full cost picture for agentic workflows — API keys, token plans, usage limits — the AI agent coding plan comparison breaks down what you'll spend month-to-month.
The actual quickstart is three shell commands and one slash command:

```shell
git clone https://github.com/anthropics/launch-your-agent
cd launch-your-agent
claude
# then type: /launch-your-agent
```
Cost: cents per run. Requirements: Claude Code authenticated, an API key from platform.claude.com, a local .env file. The key never enters the chat.
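Keeping the key in a local .env file means scripts can read it without anyone pasting it into a conversation. Below is a minimal standard-library sketch of that pattern. ANTHROPIC_API_KEY is the variable name the official SDK looks for, but whether the skill's .env uses exactly that name is an assumption, and load_env and redact are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def load_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    values: dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def redact(secret: str) -> str:
    """Show just enough of a secret to confirm it loaded."""
    return secret[:4] + "****"


# Demo with a throwaway file; in practice this would be the .env next to your project.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text('# local secrets\nANTHROPIC_API_KEY="sk-example-not-a-real-key"\n')
    env = load_env(env_path)

print("key loaded:", redact(env["ANTHROPIC_API_KEY"]))
```

Add .env to your .gitignore so the key stays out of version control as well as out of the chat.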
Where This Is Heading
The session event log — the append-only record that persists across harness restarts and environment changes — is designed as an OS-style abstraction. Anthropic's engineering docs describe the goal as "stable interfaces that outlast specific implementations, enabling programs as yet unthought of."
That phrase is doing more work than it looks like. The session log that feeds a single scheduled agent today is the same surface a multiagent orchestrator will read tomorrow and a dreaming agent will mine for patterns next month. The primitives aren't feature releases. They're the accumulating substrate for agents that will handle things we haven't scoped yet.
The founders moving now aren't picking between frameworks. They're deciding which workflow to give their agent first — then grading it, iterating, and scheduling it. launch-your-agent is the template for that loop. The rest is product work.
SaaSCity.io covers AI tools, developer infrastructure, and the founder ecosystem. Explore the SaaSCity directory to discover what's shipping right now — or list your own product.


