# Code with Claude SF: Managed Agents and the Build-vs-Buy Call
Anthropic shipped Managed Agents, Dreaming, and Outcomes at Code with Claude SF on May 6, 2026. Should you rent their harness or build your own? Here is the decision matrix by team size and use case.
Hrishikesh Baidya
May 6, 2026 · 14 min read
Anthropic ran Code with Claude San Francisco on May 6, 2026. Five things shipped: Managed Agents (the harness becomes the product), Dreaming (between-session memory consolidation), Outcomes (self-grading rubrics), multi-agent orchestration, and Claude Finance with 10 pre-built agents. The honest read is that Anthropic has decided the harness is the product — not the model. For Indian teams that have spent the last 18 months building their own agent harness on top of the API, this is a build-vs-buy fork. Here is the matrix we are using with clients this week to decide.
## TL;DR — When does Anthropic Managed Agents make sense?
Buy Managed Agents if you have 8 or fewer engineers AND your use case is well-bounded (customer support, finance ops, document workflows) AND the runtime fee ($0.08/session-hour) costs you less than the infrastructure and engineering time a DIY harness would. Build your own harness if you have ≥10 engineers, multi-cloud requirements, sub-second latency targets, or a need for vendor-portable abstractions (running the same harness on Anthropic, OpenAI, and Gemini).
- $0.08: Per Session-Hour Runtime Fee
- 10: Pre-Built Claude Finance Agents
- 5h × 2: Pro/Max Rate Limit Doubled
- 220K+: SpaceX Colossus 1 GPUs Partnership
## Why this matters now (May 2026)
Three product moves landed in the same week. First, Anthropic Managed Agents bills on three axes — tokens (same as before), session runtime ($0.08/hour billed to the millisecond), and tool-triggered costs (web search $10/1,000 queries) — which is a new billing dimension for any team that was on pure-token pricing. Second, Dreaming (hippocampal memory consolidation between sessions) makes long-running agents materially more capable, but only if you stay inside Anthropic's harness. Third, OpenAI countered the same week with their open-source Agents SDK supporting seven sandbox providers — no first-party runtime fee. The market just split into "buy the integrated stack" vs "build the portable stack."
## What Managed Agents actually is
Managed Agents is a beta API where you define an agent (tools, prompts, guardrails), and Anthropic runs the execution environment — long-running sessions, sandboxed code execution, scoped permissions, end-to-end tracing, and MCP-based tool connections. You write the agent spec; they run the loop.
🏗️ Managed runtime: Anthropic runs the agent loop, not you. No EC2, no Lambda, no container orchestration. Trade-off: you don't control the runtime layer.

💤 Dreaming (memory consolidation): Between sessions, the agent reviews past work, pulls patterns, writes new memory entries. Like a brain replaying the day during sleep. Materially better on repeating jobs.

🎯 Outcomes (self-grading rubric): A separate evaluator agent scores the worker agent's output against a written rubric and tells it what to fix. Closed-loop quality improvement.

🔀 Multi-agent orchestration: A lead agent fans work out to specialist subagents running in parallel. Useful for research, multi-document analysis, code refactors across many files.
## The build-vs-buy decision matrix
We use this with clients after a 60-minute discovery call. Pick the row that matches your team size, then check whether the use case is bounded.
| Team size | Bounded use case | Multi-vendor needed | Verdict |
| --- | --- | --- | --- |
| Solo founder / 1-3 eng | Yes | No | Buy Managed Agents |
| Solo founder / 1-3 eng | No (custom) | No | Buy + customize via MCP |
| 4-8 engineers | Yes | No | Buy Managed Agents |
| 4-8 engineers | Yes | Yes (also GPT) | Build portable harness |
| 9-25 engineers | Mixed | Likely | Build (with MCP for portability) |
| 25+ engineers | No | Yes | Build (multi-cloud, multi-model) |
| Any size (finance/treasury) | Yes | No | Buy Claude Finance (pre-built) |
| Any size (<5 sec latency SLA) | Any | Any | Build (managed adds latency) |
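For teams that want to run the matrix over several candidate projects at once, the rows can be encoded as a small lookup. This is only a sketch: the function name and parameters are hypothetical, the precedence of the two "any size" rows is an assumption (the matrix does not rank its rows), and combinations the matrix does not list return `None` rather than a guessed verdict.

```python
from typing import Optional


def matrix_verdict(engineers: int, bounded: bool, multi_vendor: bool,
                   finance: bool = False, tight_latency: bool = False) -> Optional[str]:
    """Look up the verdict for one team; None means no row of the matrix matches."""
    # The two "any size" rows are checked first. That precedence is an
    # assumption of this sketch; the matrix itself does not rank its rows.
    if tight_latency:
        return "Build (managed adds latency)"
    if finance and bounded and not multi_vendor:
        return "Buy Claude Finance (pre-built)"
    if 1 <= engineers <= 3 and not multi_vendor:
        return "Buy Managed Agents" if bounded else "Buy + customize via MCP"
    if 4 <= engineers <= 8 and bounded:
        return "Build portable harness" if multi_vendor else "Buy Managed Agents"
    if 9 <= engineers <= 25:
        # This row reads "Mixed" / "Likely", so any combination maps here.
        return "Build (with MCP for portability)"
    if engineers > 25 and multi_vendor and not bounded:
        return "Build (multi-cloud, multi-model)"
    return None  # combination not covered by the matrix


print(matrix_verdict(engineers=4, bounded=True, multi_vendor=False))
# Buy Managed Agents
```

A `None` result is a prompt for a conversation, not a bug: it marks a team shape the matrix does not list.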
## The cost math: when does $0.08/session-hour actually matter
A "session" in Managed Agents is the time a workflow is alive — from invocation to completion. If your agent runs for 12 minutes per session, that is $0.016 (~₹1.35) per session in runtime, on top of token costs. Multiply by your daily session count.
For our Hyderabad SaaS client running ~3,000 sessions/day averaging 8 minutes each: 400 session-hours/day × $0.08 = $32/day = ~₹2,720/day = ~₹81,600/month in runtime fees alone, on top of ~₹4.2L/month in token costs.
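The arithmetic in the two paragraphs above is easy to rerun with your own numbers. A minimal sketch, assuming the $0.08/session-hour rate quoted here and an exchange rate of about ₹85 per dollar (an assumption, inferred from the article's own $32 ≈ ₹2,720 conversion); the function name is illustrative.

```python
RUNTIME_USD_PER_HOUR = 0.08  # session-hour rate quoted in the article
INR_PER_USD = 85.0           # assumption: implied by $32/day being ~₹2,720/day


def runtime_fee_usd(sessions: int, minutes_per_session: float) -> float:
    """Runtime fee only; token and tool costs are billed separately."""
    session_hours = sessions * minutes_per_session / 60
    return session_hours * RUNTIME_USD_PER_HOUR


single = runtime_fee_usd(1, 12)       # one 12-minute session
daily = runtime_fee_usd(3_000, 8)     # 3,000 sessions/day at 8 minutes each
monthly_inr = daily * 30 * INR_PER_USD

print(f"per session: ${single:.3f}")          # per session: $0.016
print(f"per day:     ${daily:.2f}")           # per day:     $32.00
print(f"per month:   ₹{monthly_inr:,.0f}")    # per month:   ₹81,600
```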
For comparison, their self-built harness on AWS ECS with the same workload costs ~₹14,000/month in compute + storage. Managed Agents is ₹67,000/month more expensive for this workload — but it removes ~30 hours/month of engineering time on container ops, retries, monitoring. At their engineer cost (~₹3,500/hour fully loaded), that is ₹1.05L/month of recovered engineering time.
Net: Managed Agents costs ₹67K more in infrastructure but saves ₹1.05L in engineering time. The buy decision wins by ~₹38K/month for this client. Different clients flip the other way at different scales.
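The same comparison can be written as one function, which also gives the break-even point: how many ops hours per month Managed Agents has to save before buying pays off. This sketch uses the Hyderabad client's figures from above and assumes token costs are identical on both paths; the function names are hypothetical. With unrounded inputs the net comes to ₹37,400, which the text rounds to ~₹38K.

```python
def monthly_net_inr(managed_runtime_inr: float, diy_infra_inr: float,
                    ops_hours_saved: float, engineer_inr_per_hour: float) -> float:
    """Positive result: buying wins. Token costs are assumed equal on both paths."""
    extra_infra = managed_runtime_inr - diy_infra_inr
    recovered_time = ops_hours_saved * engineer_inr_per_hour
    return recovered_time - extra_infra


def break_even_ops_hours(managed_runtime_inr: float, diy_infra_inr: float,
                         engineer_inr_per_hour: float) -> float:
    """Ops hours per month that must be saved for buy and build to tie."""
    return (managed_runtime_inr - diy_infra_inr) / engineer_inr_per_hour


net = monthly_net_inr(81_600, 14_000, 30, 3_500)
hours = break_even_ops_hours(81_600, 14_000, 3_500)

print(f"net benefit of buying: ₹{net:,}/month")       # net benefit of buying: ₹37,400/month
print(f"break-even: {hours:.1f} ops hours/month")     # break-even: 19.3 ops hours/month
```

If your own estimate of saved ops time falls below the break-even figure, the same inputs argue for building.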
The crossover point in our experience sits between 9 and 12 engineers. Below that, buy wins. Above that, build wins because per-engineer ops savings stop compounding.
## The 3 hidden gotchas before you buy
Things we have learned across 5 client deployments on Managed Agents beta.
Gotcha 1: Session billing is granular but unpredictable. A "session" can stretch from 90 seconds to 6 hours depending on what the agent does. Budget projections need P95 session length, not average. We've seen 30% overruns on initial estimates because finance teams plan on averages.
Gotcha 2: Tool costs stack fast. Web search at $10/1,000 queries adds up when an agent does 8–15 searches per session. A 1,000-session/day workload with 10 searches each is $100/day in web search alone: about $3,000/month, or roughly ₹2.55L at ~₹85 to the dollar, in addition to token and runtime costs.
Gotcha 3: Vendor lock-in is real. The Dreaming memory consolidation, the Outcomes rubric loop, the multi-agent orchestration — these are Anthropic-specific. You cannot easily port them to GPT-5.5 or Gemini. If you decide in 18 months to multi-vendor, you'll rewrite the agent layer. Build your portable abstractions early or accept the lock-in honestly.
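To make Gotcha 1 concrete, here is a sketch that compares a runtime budget planned on mean session length with one planned on P95. The duration sample and the monthly session count are invented for illustration (spanning the 90-second to 6-hour range mentioned above), and the nearest-rank percentile is one of several valid definitions. The P95 figure is a deliberately conservative ceiling, not a forecast of actual spend.

```python
import math
from statistics import mean

RUNTIME_USD_PER_HOUR = 0.08  # session-hour rate quoted in the article


def percentile_nearest_rank(values, pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def monthly_runtime_usd(minutes_per_session: float, sessions: int) -> float:
    return minutes_per_session / 60 * sessions * RUNTIME_USD_PER_HOUR


# Hypothetical sample of session durations in minutes, with a long tail.
durations_min = [1.5, 2, 3, 4, 5, 6, 8, 8, 10, 12,
                 15, 20, 30, 45, 60, 90, 120, 180, 240, 360]
sessions_per_month = 30_000  # hypothetical volume

mean_min = mean(durations_min)
p95_min = percentile_nearest_rank(durations_min, 95)

print(f"mean {mean_min:.1f} min -> ${monthly_runtime_usd(mean_min, sessions_per_month):,.0f}/month")
print(f"P95  {p95_min:.1f} min -> ${monthly_runtime_usd(p95_min, sessions_per_month):,.0f}/month")
```

The gap between the two lines is the cushion a finance team gives up when it plans on averages alone.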
The OpenAI counter: OpenAI's Agents SDK supports seven sandbox providers, has no first-party runtime fee, and remains open source. Released the same week as Code with Claude SF, it is the explicit "build" alternative. For multi-vendor strategies, OpenAI's SDK + your own runtime is the path. Per The New Stack's analysis: "Anthropic, OpenAI, Google and Microsoft agree the harness is the product. They disagree on the price."
## When to buy Claude Finance specifically
Claude Finance ships with 10 pre-built agents — accounts payable, accounts receivable, financial close, treasury, expense review, and others. For an Indian SMB or mid-market company running a small finance team, this is genuinely valuable. The agents come with audit logging, dual-control workflows, and compliance hooks that you would otherwise build yourself.
When does it make sense? If you process ≥1,000 invoices/month, run a finance team of 3–8 people, and use Tally or Zoho Books, the pre-built agents save weeks of integration work. The cost runs ~₹85,000–₹2.4L/month per agent depending on volume, but replaces 0.5–1 full-time analyst per agent.
When does it NOT make sense? If your accounting stack is unusual (Marg, Busy, or custom ERPs common in Indian manufacturing), the pre-built integrations don't help. If your volume is <500 invoices/month, the FTE math doesn't work.
## The 8-step path to a buy/build decision
This is the workflow we run with clients in week one of an engagement.
1. Count engineers who would touch the agent layer. Not total team size: engineers who would actually build/maintain/operate the agent harness. Most teams overcount here.
2. Estimate sessions/day and avg session duration. Pull from logs if you have an existing system. For greenfield, estimate from user volume × interactions/user/day. Use P95 duration for cost math, not mean.
3. Project tool-use costs separately. Web search, code execution, file system, custom tools. Each has its own billing. Project monthly cost per tool type.
4. Check multi-vendor requirement honestly. Are you actually multi-vendor today, or do you say you are? If GPT-5.5 is <15% of total spend, you're not really multi-vendor; single-vendor is OK.
Pick one workflow. Build it both ways if you have time. Most teams pilot Managed Agents because it's faster to ship — that bias is real.
7. Measure on quality, latency, cost, engineer time. All four. Most decisions go wrong because someone optimized for just one (usually cost).
8. Commit to the choice for at least 6 months. Switching is expensive. If you choose Managed Agents, commit. If you build, commit. Decision-flipping kills more agent projects than wrong-initial-choice.
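The multi-vendor check in step 4 can be run against a month of invoices. A minimal sketch, assuming the 15% threshold applies to everything spent outside your largest vendor (the step states it for GPT-5.5 specifically); the function names and the spend figures are hypothetical.

```python
from typing import Mapping


def secondary_vendor_share(monthly_spend: Mapping[str, float]) -> float:
    """Fraction of model spend that goes to anyone other than the largest vendor."""
    total = sum(monthly_spend.values())
    if total <= 0:
        raise ValueError("total spend must be positive")
    return 1 - max(monthly_spend.values()) / total


def is_multi_vendor(monthly_spend: Mapping[str, float], threshold: float = 0.15) -> bool:
    """True only when secondary vendors reach the threshold share of spend."""
    return secondary_vendor_share(monthly_spend) >= threshold


spend = {"anthropic": 430_000, "openai": 40_000}  # hypothetical INR/month
print(f"{secondary_vendor_share(spend):.1%}", is_multi_vendor(spend))
# 8.5% False
```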
## When NOT to use Managed Agents (3 honest cases)
Sub-second latency budgets. Managed Agents adds ~200–800ms of orchestration overhead vs a tightly tuned self-hosted harness. For voice agents or real-time UX, this is a deal-breaker. Build.
Heavy custom tooling outside MCP. If you depend on dozens of internal tools that don't have MCP servers and never will, you'll spend more time writing MCP adapters than you'd save on managed runtime. Build.
Data sovereignty / on-premise requirement. Some BFSI and government clients in India require all execution inside their own VPC. Managed Agents executes in Anthropic's cloud. Hard no for those workloads. Build.
## Real example: a Pune CFO's call we just took
A 110-person Pune SaaS company processes ~1,800 customer-success conversations a month. They wanted an agent to draft follow-up emails, log activities in their CRM, and flag at-risk accounts. 4-engineer team. No multi-vendor today.
Decision: buy Managed Agents. Reasons: (a) the team is too small to operate a harness, (b) the use case is bounded, with three clear workflows, (c) Anthropic's Dreaming feature was genuinely useful for the "remember the customer's history" use case, (d) the cost projection was ₹2.4L/month vs ₹3.8L/month all-in for DIY (engineer time dominated).
3-week pilot validated the call. Now in production at ~₹2.2L/month, ~92% draft quality, ~30 hrs/month of CSM time freed. Worth it. Different client at 25-engineer size with multi-vendor strategy = different decision.
## The checklist before signing the Managed Agents contract
- You have under 10 engineers AND a bounded use case
- You don't need sub-second latency
- You're not committed to multi-vendor (single-vendor is the honest answer)
- Your data residency requirements allow Anthropic's cloud execution
- You've projected costs including session runtime AND tool-use AND tokens
- You've allocated 15–25% budget cushion for first-month overruns
- You have a fallback plan if the beta has reliability issues
- Your finance team understands "session-hour" billing (not per-API-call)
## FAQ
### What is the actual price of Managed Agents in May 2026?
Standard model token rates (Opus 4.7: $5/M input, $25/M output), plus $0.08/session-hour runtime, plus tool costs (web search $10/1,000 queries). Custom tools vary. Budget 15-30% above raw token cost for managed overhead.
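A rough way to check the 15-30% guideline against your own workload is to add up the three billing axes. The sketch below uses the rates quoted in this answer as defaults; the token volumes, session hours, and search count in the example are hypothetical, and custom-tool pricing is left out because it varies.

```python
def monthly_cost_usd(input_mtok: float, output_mtok: float,
                     session_hours: float, web_searches: int,
                     input_usd_per_mtok: float = 5.0,
                     output_usd_per_mtok: float = 25.0,
                     runtime_usd_per_hour: float = 0.08,
                     search_usd_per_1000: float = 10.0) -> dict:
    """Split a month's bill into tokens, runtime, and web-search tool costs."""
    tokens = input_mtok * input_usd_per_mtok + output_mtok * output_usd_per_mtok
    runtime = session_hours * runtime_usd_per_hour
    tools = web_searches / 1000 * search_usd_per_1000
    return {"tokens": tokens, "runtime": runtime, "tools": tools,
            "total": tokens + runtime + tools}


# Hypothetical month: 600M input tokens, 78M output tokens,
# 12,000 session-hours, 20,000 web searches.
cost = monthly_cost_usd(input_mtok=600, output_mtok=78,
                        session_hours=12_000, web_searches=20_000)
overhead = (cost["runtime"] + cost["tools"]) / cost["tokens"]

print(f"total ${cost['total']:,.0f}, managed overhead {overhead:.0%} of token cost")
# total $6,110, managed overhead 23% of token cost
```

Search-heavy agents can push the overhead well past that range, so it is worth rerunning with your own search counts.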
### Can I use Managed Agents with GPT-5.5?
No. Managed Agents is Anthropic-specific. For multi-vendor, use OpenAI's Agents SDK with your own runtime (or use both side-by-side for different workloads).
### What's Dreaming actually doing?
A scheduled process that runs between agent sessions: it reviews past sessions, pulls out patterns, and writes new memory entries. It materially improves quality on repeating jobs, so it is only useful if you run the same agent repeatedly on similar tasks.
### Is the Outcomes self-grading rubric worth it?
For high-stakes workflows (finance, compliance, customer-facing), yes — the additional eval cost (~30% more tokens per task) catches quality regressions. For exploratory workflows, no — adds latency without proportional value.
### Does Managed Agents support MCP?
Yes — MCP is the connection layer for tools. You define your tools as MCP servers; Managed Agents calls them. This is the only portable piece — your MCP servers work the same way if you later migrate to a DIY harness.
### What's the SLA?
Anthropic publishes 99.9% for the API. Managed Agents is beta as of May 6, 2026; no formal SLA yet. Build retry logic, expect occasional regressions.
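What "build retry logic" can look like in practice: a generic wrapper with exponential backoff and jitter around whatever function starts or polls your agent session. This is a sketch, not Anthropic's client behavior. `TransientError` and `flaky_agent_call` are hypothetical stand-ins, and in real code you would map your SDK's timeout and 5xx errors onto the retryable case.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx responses)."""


def call_with_retry(fn: Callable[[], T], attempts: int = 4, base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on TransientError with exponential backoff and full jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last failure
            # Wait a random time up to base_delay * 2^attempt before retrying.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")


# Demo: a stand-in for an agent call that fails twice, then succeeds.
calls = {"count": 0}


def flaky_agent_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("simulated outage")
    return "done"


result = call_with_retry(flaky_agent_call, sleep=lambda _seconds: None)
print(result, calls["count"])  # done 3
```

Only errors you know to be transient should be retried; retrying a request that failed validation just burns session time.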
### How does this compare to LangChain or CrewAI?
LangChain/CrewAI are open-source frameworks you self-host. Managed Agents is a fully hosted product. If you're already on LangChain and it works, switching to Managed Agents trades portability for managed-ops convenience. Most LangChain teams should evaluate but not rush to switch.