The pilot era is over. Agentic AI is shipping to production across travel, insurance, and fintech — and the companies moving fast on orchestration are pulling ahead. Here's what the architecture shift really means.
Agentic AI Is Going to Production — And Most Companies Aren't Ready
For the past two years, the dominant narrative around AI in enterprise has been the pilot. Controlled experiments. Sandboxed demos. Carefully scoped proofs-of-concept that never quite made it past the innovation committee.
That era is ending.
In the past 90 days alone: Revolut launched AIR, a financial assistant that freezes cards, manages eSIMs, and checks insurance coverage in a single conversation. Almosafer deployed autonomous travel booking agents in production. BlaBlaCar and Trainline embedded directly into ChatGPT as live booking surfaces. C&R Software shipped an AI-native debt collection platform built from the ground up around agent orchestration.
These aren't pilots. They're products. And the architecture behind them is fundamentally different from the chatbots that came before.
The Architecture Shift Nobody Talks About Enough
A traditional chatbot is a lookup engine with a personality. It maps an intent to a response. The loop ends there.
An agent is different. It maps an intent to a plan — a sequence of tool calls, decisions, and actions that execute across systems until the goal is reached.
The technical difference seems subtle. The operational difference is enormous.
When Revolut's AIR handles "I'm in Tokyo next week, set up my eSIM and check my insurance," it's not returning a help article. It's parsing intent across two separate domains, querying the user's current plan and destination, triggering eSIM provisioning via an internal API, cross-referencing the travel insurance policy for coverage gaps, and surfacing a consolidated response. In some cases it also takes action.
That's a multi-step, multi-system workflow executed in a single conversational turn. The model isn't the product. The orchestration layer is.
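The intent-to-plan loop described above can be sketched in a few lines. This is a minimal illustration, not Revolut's implementation: the tool names (provision_esim, check_travel_cover), the hard-coded country, and the keyword-based planner are all assumptions standing in for internal APIs and a model-generated plan.

```python
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical tools standing in for internal APIs (not any vendor's real interface).
def provision_esim(country: str) -> dict[str, Any]:
    return {"tool": "provision_esim", "country": country, "status": "active"}


def check_travel_cover(country: str) -> dict[str, Any]:
    return {"tool": "check_travel_cover", "country": country, "covered": True}


# Registry the orchestrator uses to resolve a step's tool name to a callable.
TOOLS: dict[str, Callable[..., dict[str, Any]]] = {
    "provision_esim": provision_esim,
    "check_travel_cover": check_travel_cover,
}


@dataclass
class Step:
    tool: str
    args: dict[str, Any]


def plan_for(intent: str) -> list[Step]:
    """Stand-in planner: a real agent would ask a model to produce this plan."""
    text = intent.lower()
    steps: list[Step] = []
    if "esim" in text:
        steps.append(Step("provision_esim", {"country": "JP"}))  # country assumed
    if "insurance" in text:
        steps.append(Step("check_travel_cover", {"country": "JP"}))
    return steps


def run(plan: list[Step]) -> list[dict[str, Any]]:
    """Execute each step in order and collect the tool results."""
    return [TOOLS[step.tool](**step.args) for step in plan]


results = run(plan_for("I'm in Tokyo next week, set up my eSIM and check my insurance"))
print(results)
```

The point of the sketch is where the work sits: the planner and the executor, not the model call, decide which systems get touched and in what order.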
Why Production Is Harder Than the Demo
Every team that has shipped an agentic system in production will tell you the same thing: the demo works. The edge cases don't.
The core challenges aren't model quality — they're reliability engineering:
Tool call failure handling. When step 3 of a 7-step agent workflow fails, what happens? Does the system retry? Roll back? Surface a graceful error? Most agent frameworks have weak answers to this.
State management across turns. Session continuity is trivial in demos and brutal in production. Maintaining coherent context across multiple turns, user interruptions, and system timeouts requires explicit architecture decisions — not default behavior.
Latency under load. A single LLM call takes 1-3 seconds. An agentic workflow with five tool calls can take 10-30 seconds, whether those calls run in parallel or in sequence. At scale, this is a UX problem and an infrastructure problem at the same time.
Hallucinated actions. The scarier failure mode isn't a wrong answer — it's a wrong action. An agent that confidently books the wrong flight, freezes the wrong account, or submits an incorrect claim is a liability, not a feature. Production-grade agents need guardrails, confirmation steps, and audit trails baked in by design.
Compliance surface expansion. Every tool call is a new data flow. In regulated industries (insurance, fintech, healthcare), each API integration potentially triggers GDPR, PSD2, HIPAA, or sector-specific obligations. The compliance surface of an agentic system is an order of magnitude larger than that of a chatbot.
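Two of the challenges above, failed tool calls and wrong actions, can be made concrete with a small sketch. It assumes a saga-style design: each action may declare a compensating undo, high-impact actions require confirmation, and every outcome lands in an audit log. The Action type, the execute function, and the seat-and-payment workflow are hypothetical, chosen only for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class StepFailed(Exception):
    """Raised when a tool call still fails after all retries."""


@dataclass
class Action:
    name: str
    do: Callable[[], object]
    undo: Optional[Callable[[], None]] = None  # compensation for reversible actions
    needs_confirmation: bool = False           # gate for high-impact actions


def call_with_retry(fn: Callable[[], object], attempts: int = 3,
                    base_delay: float = 0.0) -> object:
    """Call fn, retrying with exponential backoff; raise StepFailed if all attempts fail."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # a real system would catch narrower error types
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))
    raise StepFailed(str(last_error))


def execute(actions: list[Action], confirm: Callable[[str], bool],
            audit: list[tuple[str, str]]) -> bool:
    """Run actions in order; on failure, undo completed actions in reverse order."""
    completed: list[Action] = []
    for action in actions:
        if action.needs_confirmation and not confirm(action.name):
            audit.append(("declined", action.name))
            continue  # never act without explicit approval
        try:
            call_with_retry(action.do)
        except StepFailed:
            audit.append(("failed", action.name))
            for prev in reversed(completed):
                if prev.undo is not None:
                    prev.undo()
                    audit.append(("rolled_back", prev.name))
            return False
        audit.append(("ok", action.name))
        completed.append(action)
    return True


# Hypothetical workflow: hold a seat, then charge a card whose gateway always fails.
def charge_card() -> None:
    raise RuntimeError("payment gateway timeout")


audit_log: list[tuple[str, str]] = []
succeeded = execute(
    [
        Action("hold_seat", do=lambda: "held", undo=lambda: None),
        Action("charge_card", do=charge_card, needs_confirmation=True),
    ],
    confirm=lambda name: True,  # stand-in for a real user confirmation prompt
    audit=audit_log,
)
print(succeeded, audit_log)
```

In this run the charge fails after three attempts, the seat hold is released, and the audit log records all three events. Whether a declined confirmation should skip the step, as here, or abort the whole workflow is a product decision the sketch does not settle.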
The Orchestration Layer Is the Moat
The model is a commodity. GPT-4o, Claude 3.5, Gemini Flash — at this point, the capability delta between frontier models for most enterprise tasks is marginal. What isn't commoditised is the orchestration layer sitting above them.
This is why Microsoft shipped Agent Framework. Why Google open-sourced Scion. Why Anthropic invested heavily in tool use and function calling. The model wars are largely won. The orchestration wars are just starting.
The companies building defensible positions right now are doing three things:
1. Designing for composability. Agents that can call other agents. Workflows that can be assembled from modular, tested components. The monolithic "AI assistant" is being replaced by agent graphs — directed systems where specialised sub-agents handle discrete tasks and a coordinator manages the overall flow.
2. Building vertical context into the orchestration. A generic agent framework knows nothing about insurance policy structure, travel booking constraints, or debt collection compliance. The companies winning in specific verticals are encoding that domain knowledge into the orchestration layer itself — not relying on prompt engineering to carry it.
3. Instrumenting everything. You can't improve what you can't observe. Production agentic systems need tracing at the tool call level, not just the conversation level. Which tools failed? Which plans were abandoned mid-execution? Which agent paths correlate with high user satisfaction? This telemetry is the product roadmap.
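Tracing at the tool call level, as point 3 describes, might look like the following sketch. It assumes an in-memory list as the trace sink and uses hypothetical tools (policy_lookup, submit_claim); a production system would more likely export spans to a tracing backend.

```python
import functools
import time
from typing import Any, Callable

TRACE: list[dict[str, Any]] = []  # in-memory sink, for illustration only


def traced(tool_name: str) -> Callable:
    """Decorator that records one trace entry per tool call, including failures."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            status = "ok"
            try:
                return fn(*args, **kwargs)
            except Exception:
                status = "error"
                raise  # tracing must not swallow the failure
            finally:
                TRACE.append({
                    "tool": tool_name,
                    "status": status,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@traced("policy_lookup")
def policy_lookup(policy_id: str) -> dict[str, Any]:
    return {"policy_id": policy_id, "active": True}


@traced("claims_api")
def submit_claim(amount: float) -> None:
    raise TimeoutError("upstream timeout")  # simulated failure


def failure_rate(tool: str) -> float:
    """Share of recorded calls to a tool that ended in an error."""
    calls = [span for span in TRACE if span["tool"] == tool]
    if not calls:
        return 0.0
    return sum(span["status"] == "error" for span in calls) / len(calls)


policy_lookup("P-1")
try:
    submit_claim(100.0)
except TimeoutError:
    pass

print(failure_rate("policy_lookup"), failure_rate("claims_api"))
```

Per-tool failure rates like these are the kind of signal that answers "which tools failed?" without reading conversation transcripts.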
The Vertical Race Is On
The most important thing happening right now isn't at the model layer — it's the race to own agentic infrastructure in specific verticals before the market consolidates.
In travel: autonomous booking agents are moving from experimental to expected. The companies that own the agentic layer between traveller intent and supplier inventory will capture distribution in a way that OTAs couldn't.
In insurance: claims handling, policy lookup, and underwriting support are being rebuilt around agent workflows. The carriers moving fast are cutting handling time by 60% or more. The ones waiting for the "right moment" are watching their combined ratios diverge.
In fintech: Revolut AIR is the obvious example, but TesaPay, 9fin, and a wave of Series B companies are building conversational financial action into the core product — not as a feature, but as the interface.
The window to establish a defensible position in any of these verticals is measured in months, not years.
What Separates the Companies That Will Win
It's not the model they're using. It's not even the orchestration framework.
It's the feedback loop.
The agents shipping to production today are learning from every execution. Which plans succeed? Which tool sequences fail? Where do users drop off? Where do they re-engage? This data — at scale, in production — is the training signal that makes the next version of the agent dramatically better.
Companies still in pilot mode aren't generating this data. They're falling further behind not just in deployment, but in the compounding advantage that production data creates.
The pilot era gave everyone roughly equal standing. The production era will create winners and losers fast — and the gap will be very difficult to close.
The question isn't whether agentic AI will transform your vertical. It is already doing so.
The question is whether you're the one building the agent — or the one the agent is replacing.
Antoine Paillusseau
CEO, FCB.ai
