Build momentum is real. So are the costs nobody's modeling.
Issue #6


The build wave is legitimate. Operators are shipping faster than vendors can sell. But three hidden trade-offs are catching teams off guard: maintenance economics, token subsidies with expiration dates, and model-version fragility that just got concrete.

By Victor Sowers — 15 years scaling B2B SaaS GTM

AI Build Economics · Opus 4.7 · Token Costs · Context Portability · Model Risk · Maintenance · 3 deep dives · ~6 min read

The Signal

    The Shift

    Last Wednesday, Anthropic shipped a model update to Opus. Workflows that worked on Tuesday stopped working on Thursday. Most power users are now re-architecting guardrails and rewriting prompts. Anthropic didn't try to maintain backwards compatibility (subsequent apologies aside, this is a major vendor choice and a major vendor risk).

    The scramble is acute precisely because the build momentum is real. It's a good moment to sanity-check the collective assumptions behind the "SaaS is dead / build everything" movement.

    Three things to think harder about:

    • One: maintenance is infrastructure-grade, not software-grade. Jason Lemkin's Agents #001 post-mortem: vibe-coded apps need daily maintenance, not quarterly, not just when something breaks.
    • Two: the token subsidy has an expiration date. OpenAI lost ~$5 billion on $3.7B in revenue last year. Some power users generated $35K in compute costs on $200/month plans, a 175x subsidy. WEKA calls it: the subsidized agent era ends by close of 2026.
    • Three: the contracts already know. Sub-1-year contracts have tripled from 4% to 13% since 2023. The market is hedging even as builders accelerate.

    The teams pulling ahead budget for what comes after the build: maintenance, context infrastructure, and resilience when the model underneath you changes overnight.

    1

    Opus 4.7 — What Actually Broke

    Based on: Vibe Coding / Medium

    Key takeaway: Fully autonomous agents in production right now are a bet on stability that doesn't exist.

    On April 16, Anthropic deliberately broke backwards compatibility in Opus 4.7. They removed thinking.budget_tokens, temperature, top_p, and top_k from the API entirely; existing code that passes those parameters doesn't degrade gracefully, it throws errors. The new tokenizer consumes up to 1.35x more input tokens. And the model behaves differently: it pushes back on instructions, hedges where it used to comply, and resists corrections. Developers were calling it "legendarily bad" within 24 hours. The DAAF Guide called it "a crucial reality check for anyone building with AI in 2026."
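    One way to blunt this class of breakage is a parameter-compatibility shim between your workflow and the API. Below is a minimal sketch, assuming a generic client that takes keyword arguments; the parameter names come from the article, but the capability map and `safe_request` function are illustrative, not Anthropic's SDK.

```python
# Parameters each model version is known to accept. Your team maintains this
# map; the entries below are illustrative assumptions, not vendor-published.
SUPPORTED_PARAMS = {
    "opus-4.6": {"max_tokens", "temperature", "top_p", "top_k", "thinking"},
    "opus-4.7": {"max_tokens"},  # sampling knobs removed in this version
}

def safe_request(model: str, prompt: str, **params) -> dict:
    """Drop parameters the target model no longer accepts instead of erroring."""
    allowed = SUPPORTED_PARAMS.get(model, set())
    dropped = {k for k in params if k not in allowed}
    if dropped:
        # Log loudly: silently changed sampling behavior is itself a risk.
        print(f"warning: dropping unsupported params for {model}: {sorted(dropped)}")
    kept = {k: v for k, v in params.items() if k in allowed}
    return {"model": model, "prompt": prompt, **kept}

# A call written for Opus 4.6 keeps working against 4.7, minus the dropped knob.
request = safe_request("opus-4.7", "Summarize Q3 pipeline.",
                       temperature=0.2, max_tokens=1024)
```

    The shim doesn't make the model behave the same; it just converts a hard crash into a logged, reviewable degradation you can catch in staging.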

    I run my entire operation on Claude Code. I'm not arguing against building on Claude. I'm arguing for a resilience layer between your workflow and whatever model powers it.

    If you built your workflow around Opus 4.6's specific behavior (how it handled long contexts, the cadence of its responses, how it reasoned through multi-step tasks), then this update is a migration, not a feature drop. Your prompts are coupled to a version, and your team's sense of "good output" is calibrated to a model that just changed underneath them.

    The teams that adapted in hours had three things: version-controlled prompts, eval suites they could run against the new model before deploying, and a human who could tell them whether the output actually changed.

    What broke for me personally? I run long-running agents where a planning skill maps out a multi-phase plan, a human reviews it, and an execution agent chains together dozens of skills and tools over hours of unsupervised work. Tight guardrails, tight exit criteria. These ran reliably for months on Opus 4.6. On Opus 4.7, the agents stopped following instructions. They assumed they already knew how to do things instead of reading the plan. They argued about approach mid-execution instead of building. Hallucination increased, output quality dropped, and workflows that completed autonomously for months became unusable overnight. We couldn't run our core operation for days.

    If your AI workflows broke last week, check these three things:

    • Prompt versioning: Are your prompts version-controlled, or coupled to a model you don't control?
    • Eval suite: Can you run your critical workflows against a new model *before* deploying?
    • Human QA checkpoint: Is there a person who can tell you whether the output changed before your customers can?
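    The eval-suite item above can be as small as a golden-set gate you run before routing traffic to a new model. A minimal sketch, assuming a `call_model(model, prompt)` function you own; the golden cases and pass criterion here are illustrative placeholders.

```python
# Golden cases: critical workflows with a cheap, checkable expectation.
GOLDEN_CASES = [
    {"prompt": "Extract the ARR figure from: 'We closed FY24 at $12M ARR.'",
     "must_contain": "$12M"},
    {"prompt": "List three risks of vendor lock-in, one per line.",
     "must_contain": "lock-in"},
]

def call_model(model: str, prompt: str) -> str:
    # Stand-in for your real model client; swap in the actual API call.
    return f"[{model}] response to: {prompt}"

def eval_gate(candidate_model: str) -> bool:
    """Run critical workflows against a candidate model before deploying it."""
    failures = []
    for case in GOLDEN_CASES:
        output = call_model(candidate_model, case["prompt"])
        if case["must_contain"] not in output:
            failures.append(case["prompt"][:40])
    for f in failures:
        print(f"FAIL: {f}")
    return not failures
```

    Substring checks are a floor, not a ceiling; the point is that the gate exists and runs automatically, so a model swap can't reach production unexamined.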
    2

    The Hidden Costs of Building

    Based on: Revenue Operations Alliance

    Key takeaway: Most teams stop at speed and call it done. The teams that succeed move through speed, then effectiveness, then operating rhythm.

    Widely cited MIT research puts generative AI pilot failure rates at 95%. Purpose-built solutions, where teams scope tightly before building, succeed roughly two-thirds of the time. The gap is in how teams scope before they build, not which tool they use.

    The Revenue Operations Alliance framed it: "Executives don't fund time. They fund outcomes." The teams that succeed move through three phases: speed, then effectiveness, then operating rhythm where the solution is embedded enough that adoption becomes unavoidable. Most teams stop at speed and call it done.

    Most teams can't see what their agents are doing. One developer built a 3D visualization of agent cognition after his agents ran up $200 in a single afternoon. Most teams have no equivalent. If your AI vendor can't tell you when a sequence goes quiet and why it stopped, you have a liability problem, not just an observability one.

    Then there's the token cost illusion. Per-token costs are falling, but total inference spend is exploding. Agentic usage turned thousands of tokens per session into millions. Ben Thompson's compute economics piece explains why: reasoning models reintroduce real marginal costs into a stack everyone assumed would trend toward zero.
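    The arithmetic behind that illusion fits in a few lines. All numbers below are illustrative assumptions for a back-of-envelope comparison, not quoted prices or benchmarks.

```python
# Assumed prices: per-token cost falls 3x year over year.
price_per_m_tokens_2024 = 15.00   # $ per million tokens (assumed)
price_per_m_tokens_2025 = 5.00    # $ per million tokens (assumed)

# Assumed usage: a chat exchange vs. an agent chaining tools for hours.
chat_tokens_per_session  = 5_000
agent_tokens_per_session = 2_000_000
sessions_per_month = 1_000

chat_spend  = sessions_per_month * chat_tokens_per_session  / 1e6 * price_per_m_tokens_2024
agent_spend = sessions_per_month * agent_tokens_per_session / 1e6 * price_per_m_tokens_2025

print(f"chat era:  ${chat_spend:,.0f}/month")   # $75/month
print(f"agent era: ${agent_spend:,.0f}/month")  # $10,000/month
```

    Under these assumptions, a 3x price drop is swamped by a 400x jump in tokens per session: the per-token curve falls while the invoice climbs two orders of magnitude.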

    Meanwhile, the incumbents aren't standing still. Kyle Poyar's four strategic paths for SaaS maps where legacy vendors are heading. Median public SaaS trades at 4.1x revenue, the lowest in a decade. But those same companies sit on 80-95% revenue retention and the data assets AI startups need. The build-vs-buy question has a second half most teams skip: will your vendor build it faster, with your data already inside?

    3

    Context Portability

    Based on: GTM Engineer School

    Key takeaway: Building is staying distributed, but context management is recentralizing. The data lake concept is re-emerging.

    Who owns the ICP definition that every AI tool in your stack depends on? When the market shifts, who updates it, and who notifies the four people who built workflows on the old version? Nobody has figured out the governance model yet. Including us.

    Zach Vidibor at GTM Engineer School calls this the "strategy compression problem." Leadership builds nuanced strategy. By the time it reaches the frontline through enablement decks and tribal knowledge, the signal has degraded to a lossy copy of the original intent. Context engineering is an infrastructure answer to what most companies treat as a training problem.

    Every team has the same models. You can swap Claude for GPT for Gemini. What you can't swap is your first-party data: ICP definitions, competitive positioning, deal history, institutional knowledge about how your buyers actually buy. Infrastructure vendors already see it. Semantic and context layers are becoming a category, not a feature.

    You want individual contributors building at the edges: a PM pushing copy changes, a RevOps analyst prototyping a scoring model, an AE building their own research workflow. That's good. But who governs the context those builds depend on?

    Building is staying distributed, but context management is recentralizing. The data lake concept is re-emerging. AI data teams are forming. If someone on your leadership team is about to propose one, ask them what "context governance" means concretely before greenlighting it.
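    What "context governance" could mean concretely: shared definitions live in one versioned store with a named owner, and every workflow built on a definition is notified when it changes. A minimal sketch; the registry, keys, and role names are illustrative, not a product.

```python
from dataclasses import dataclass, field

@dataclass
class ContextEntry:
    key: str
    value: str
    owner: str                                        # who may update this definition
    version: int = 1
    subscribers: list = field(default_factory=list)   # workflows built on it

class ContextRegistry:
    def __init__(self):
        self._entries = {}

    def publish(self, key, value, owner):
        self._entries[key] = ContextEntry(key, value, owner)

    def subscribe(self, key, workflow):
        self._entries[key].subscribers.append(workflow)

    def update(self, key, value, by):
        entry = self._entries[key]
        if by != entry.owner:
            raise PermissionError(f"{by} does not own '{key}'")
        entry.value, entry.version = value, entry.version + 1
        # Return every dependent workflow so it can be told the ICP moved.
        return [(w, entry.version) for w in entry.subscribers]

registry = ContextRegistry()
registry.publish("icp", "Series B SaaS, 50-500 employees", owner="head_of_gtm")
registry.subscribe("icp", "outbound_scoring_agent")
notified = registry.update("icp", "Series B-C SaaS, 100-1000 employees",
                           by="head_of_gtm")
```

    The mechanics are trivial; the governance decision is the hard part: who the owner is, and whether builders at the edges are required to subscribe instead of pasting a stale ICP into their own prompts.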

    Reading Corner

    • AEO: How to Make AI Recommend Your Product · 89% of B2B buyers are using AI during purchasing (Forrester). "The new homepage is a ChatGPT prompt." GTM Strategist lays out Answer Engine Optimization.
    • The Slow Decay of Growth · Hello Operator maps the PLG decay curve that caught Ramp, Notion, Airtable, Figma, Miro, and Canva. The useful part: companies that bent the curve back up and how.
    • The SaaS Empire Strikes Back · Kyle Poyar's four strategic paths for legacy SaaS responding to AI. Pair with the build economics deep dive above.
    • Klaviyo at $1.2B ARR · Trading at 4-5x forward revenue with 32% growth, 110% NRR, and expanding margins. The market is pricing in AI disruption that hasn't materialized.
    • Hybrid Outbound Governance Chaos · A sales manager documents the governance mess in real time: ownership gaps, conflicting metrics, disputed attribution.
    • Salesforce Posted a $350K CS Leadership Role · The job reads less like customer success and more like revenue engineering: platform telemetry fluency, consumption-based P&L ownership.

    Get the verdict every Wednesday.

    The AI x GTM briefing for operators. Free forever.

    One email per week. Unsubscribe anytime. No spam, ever.