Agentic Balance Sheet: A2A, API Policy, and Your AI Budget

Introduction

Most enterprise AI budgets are built around a single metric: tokens. Input tokens. Output tokens. Multiply by price. Done. That model worked when AI meant a chatbot sending one prompt and receiving one reply. It doesn't work when a single user request triggers fifteen downstream agent calls across SAP, Google Cloud, and three other systems.

Global enterprise agentic AI spend is forecast at $1.4 trillion for 2027 (IDC via Digital Applied, 2026). That number hides three very different cost structures — and if your team is only tracking one of them, your next quarterly budget review will be uncomfortable.

This post maps the full agentic balance sheet: what Google Cloud's Gemini Enterprise Agent Platform charges for orchestration, how SAP Joule abstracts business logic into AI Units, and how the Agent2Agent (A2A) Protocol reduces the "integration debt" that turned the last generation of RPA into a maintenance nightmare. SAP and Google Cloud agentic architecture overview

Key Takeaways

Global agentic AI spend is projected at $1.4 trillion for 2027 (IDC), with orchestration infrastructure growing from 17-22% to 26-32% of enterprise AI budgets by year-end.

Google Cloud's Gemini Enterprise Agent Platform charges $0.25 per 1,000 session events (as of January 2026), meaning cost now scales with agent complexity, not just query volume.

SAP Joule's AI Unit abstraction incentivizes outcome-based pricing—custom agents via SAP AI Core cost 2-3x more than native Joule skills for equivalent business outcomes.

The A2A Protocol (v1.0, Linux Foundation, 50+ enterprise partners) eliminates the bespoke integration tax, reducing agent-to-agent integration costs by 25-35%.

How Is the Agentic Economy Different from Traditional LLM Billing?

Forty percent of enterprise applications will embed task-specific AI agents by 2026, up from less than 5% in 2025 (Gartner, 2025). That acceleration changes the cost structure entirely. Token billing was designed for stateless, point-to-point LLM calls. Agentic billing must account for the web of sessions, identities, policies, and cross-system calls that sit underneath every meaningful business outcome.

Enterprise data dashboard showing AI cost metrics and multi-layer orchestration flows across cloud systems — Agentic cost visibility requires dashboards that track all three tiers simultaneously — not just model token consumption.

When Google rebranded Vertex AI as the Gemini Enterprise Agent Platform in early 2026, the signal was unmistakable: the platform is now the product. You're not buying model access. You're buying an orchestration fabric with built-in identity, policy enforcement, runtime governance, and session management.

That shift has three concrete budget implications:

Token costs become secondary. Model pricing is still there — $0.00003 per 1,000 input characters, $0.00009 per 1,000 output characters for standard tiers — but it's rarely the largest line item in a production agent fleet.
Session and event metrics dominate. Google's $0.25 per 1,000 session events structure means your bill scales with how complex your agents are, not just how much they're used.
Integration patterns determine TCO. Legacy system integration increases project costs 40-60% (NTT DATA, 2025). The pattern you choose — A2A versus bespoke glue — either compounds that expense or eliminates it.

Our finding: In multi-agent deployments, the orchestration and integration cost layers (session events, gateway policy enforcement, integration middleware) typically exceed raw token generation costs within 90 days of production launch. Teams that budget only for tokens are blindsided at their first billing cycle.

For a practical starting point on estimating session event volume before you commit to a platform configuration, see our agentic cost forecasting guide.

What's Inside SAP Joule's AI Unit Abstraction?

SAP's minimum commitment for Joule Premium is 100 AI Units at €7 per unit — €700 per year — but the economic design runs much deeper than a minimum order (SAP AI Pricing, 2025). AI Units abstract away compute, model calls, and inference overhead into a single consumption metric tied to business outcomes, not infrastructure events.

Financial balance sheet and cost analysis visualization representing SAP AI Unit consumption and enterprise budget planning — SAP AI Units map directly to business process completions — a fundamentally different pricing philosophy than token-per-query models.

This is the critical design choice. SAP knows how many tokens and compute cycles a "Sales Order Created" event consumes internally. They've amortized that cost across millions of executions and priced the outcome accordingly. You pay for the result, not the machinery.

Native vs. Custom: The hidden cost gap.

Native Joule skills — pre-built automations for HR, Finance, Supply Chain, and Procurement — consume 2-3 AI Units per completed task. Custom agents built via SAP AI Core consume 2-3x more for equivalent outcomes, because they use dedicated compute, carry separate governance overhead, and can't share infrastructure with other tenants. The RISE with SAP 2025 package includes a base Joule allotment, but production scale always requires additional units.

Document Grounding adds another consumption layer: 0.005 AI Units per record (SAP Joule Documentation, 2025). For a Joule agent that must ground its answers in your vendor master or product catalog, a lookup against 10,000 records costs 50 AI Units before any business logic executes. Forecast that.

According to SAP's published pricing structure, Joule Premium tiers range from 8 to 1 AI Units per user per month depending on volume, which means high-volume customers pay substantially less per unit than pilot deployments. Budget accordingly and negotiate at contract time, not renewal time.

SAP's incentive structure favors outcome billing because it lets them optimize the backend invisibly. When you're charged per "Sales Order Created," SAP can improve inference efficiency without passing those gains to you as a per-token reduction. You get price stability; they get margin flexibility.

According to research from SAP's Joule pricing documentation, enterprise customers with RISE with SAP subscriptions received Joule at no additional license cost in 2025, but base allotments were consistently insufficient for production-scale agentic workflows. The gap between included and required units is where procurement negotiations happen.

For a deeper breakdown of which OData endpoints align with each AI Unit consumption tier, see our SAP Joule AI Unit consumption guide.

Source: SAP Joule Metering Documentation, 2025. Illustrative estimates based on published tier pricing.

What Does Google Cloud's Agent Platform Actually Cost at Scale?

Google's Gemini Enterprise Agent Platform uses a multi-layer billing model that obscures the real cost drivers behind the headline token price (Google Cloud Pricing, 2026). Understanding each layer is non-negotiable before you deploy a production agent fleet.

The four billing components are:

Model tokens: $0.00003 per 1,000 input characters / $0.00009 per 1,000 output characters for standard models. Gemini 3 Pro and Claude on Vertex carry their own rates.
Agent Engine runtime: $0.0864 per vCPU-hour + $0.0090 per GB-hour of memory. Runtime costs scale with agent concurrency, not just query count.
Session event storage: $0.25 per 1,000 events (effective January 28, 2026). This is the new variable.
Vertex AI Search: $1.50–$6.00 per 1,000 queries depending on tier — often the surprise line item when agents use RAG grounding.

The session event cost deserves a dedicated paragraph. It didn't exist before 2026, and it changes the economics of complex agents substantially. A simple chatbot logs one event per interaction. A multi-agent system coordinating across Google, SAP, and a third-party data service might log 8-12 events per transaction — authentication handshakes, policy checks, intermediate results, audit logs. At 1,000 transactions per day, 30 days, 10 events per transaction, the monthly session cost alone reaches $75. That's modest. Scale to 100,000 transactions and it's $7,500 — before you've spent a cent on tokens.

The Agent Gateway is where identity, authorization, and rate limiting happen. Google doesn't charge a separate policy evaluation fee, but the compute overhead is embedded in runtime costs. Treat API policy enforcement as a fixed 15-20% overhead on your Agent Engine bill.

According to research by Digital Applied, organizations are projecting an average AI budget of $207 million for the next 12 months — nearly double the figure from the same period last year — and the proportion investing over $100,000 per month jumped from 20% in 2024 to 45% in 2025. Most of that increase isn't tokens; it's infrastructure.

To model this against a real Vertex AI deployment, walk through the worked example in our Google Cloud agent cost calculator.

Source: Google Cloud Pricing, January 2026. Azilen / Cleveroad enterprise benchmarks, 2026.

How Do A2A Protocols Cut Integration Costs in Half?

The Agent2Agent (A2A) Protocol, announced by Google in April 2025, standardizes how agents from different vendors communicate (Google Developers Blog, 2025). Donated to the Linux Foundation in June 2025 and now at v1.0, A2A has over 50 enterprise partners — including SAP, Salesforce, ServiceNow, Workday, Box, Cohere, and LangChain. Version 0.3 (July 2025) added gRPC support, signed security cards, and extended Python SDK capabilities.

Before A2A, every cross-vendor agent integration required bespoke glue code. A Gemini agent needing inventory data from Joule meant writing a one-off REST adapter, handling OAuth flows manually, managing error recovery, and forwarding context by hand. That custom code required maintenance every time either platform updated its API contract. This "bespoke integration tax" historically increased enterprise AI project costs by 40-60%.

Network topology diagram illustrating A2A agent handshake protocol connections versus custom point-to-point integration patterns — A2A replaces a mesh of custom connectors with a single standardized handshake protocol — dramatically reducing integration surface area.

A2A replaces that mess with Agent Cards — JSON-RPC 2.0 capability descriptors published at a known HTTP endpoint. A Google agent queries the card, reads the SAP agent's capabilities, authenticates using credentials specified in the card, and invokes the function. No tickets. No API documentation reviews. No custom serialization.

Dimension	Custom Bespoke Integration	A2A Protocol
One-time dev cost (per agent pairing)	~$65,000	~$12,000
Annual maintenance	~$28,000/yr	~$5,000/yr
Time to first production handshake	6–12 weeks	1–3 weeks
Breaks on platform update?	Often	Rarely (version negotiation built in)
Vendor lock-in risk	High	Low (open standard, 50+ partners)
Governance / audit trail	Manual	Built into Agent Gateway

The SAP and Google Cloud partnership announcement confirmed SAP as a founding contributor to A2A in April 2025, with SAP Business Data Cloud using A2A connectors to preserve context across cloud boundaries. This isn't vendor marketing; it's a technical commitment. SAP's agents publish A2A-compliant cards. Google's agents can discover and invoke them without intermediary middleware.

According to the A2A specification, organizations standardizing on A2A handshakes instead of bespoke integration code report 25-35% lower integration costs and 50% faster time-to-production for new agent workflows. The ROI compounds: each additional agent added to an A2A-governed fleet requires zero new integration effort; each agent added to a bespoke-integration fleet requires full custom development.

Source: NTT DATA integration cost benchmarks, 2025. A2A Protocol community adoption data, 2025-2026. Illustrative estimates per agent pairing.

For a deeper look at how Agent Cards work as the foundation of this protocol, see complete guide to Agent Cards.

Ready to implement? The A2A Protocol adoption checklist walks through the specific API Policy and agent card configuration steps for SAP and Google environments.

What Is the Three-Tier Agentic Cost Model?

Agentic enterprise deployments span three cost tiers controlled by different vendors — and most budgets account for only one of them. Understanding the full stack is the difference between a cost center and a controlled investment. multi-vendor agentic architecture guide

Tier 1 — Generation (Google side): Model tokens for reasoning and generation. Gemini 3 Pro, Gemini 3.1 Flash, and Claude on Vertex each carry distinct per-token rates. This layer is fully commoditized; you can swap models at will, and token prices are declining predictably. Generation typically represents 40-50% of total spend in a mature deployment.

Tier 2 — Orchestration (Google side): The Agent Gateway, policy enforcement, session management, runtime infrastructure, and Vertex AI Search. This layer is not commoditized — it's platform-specific and increasingly where Google charges for value. Session event fees ($0.25 per 1,000 events) make orchestration costs proportional to agent complexity, not just volume. Orchestration typically represents 25-35% of mature deployment spend.

Tier 3 — Execution (SAP side): AI Unit consumption for triggering business logic in Joule. This tier includes Document Grounding, native skill execution, and custom AI Core agent runtime. Execution costs are outcome-priced (per task completion, not per token) and typically represent 20-30% of total spend.

Source: Azilen / Cleveroad enterprise deployment benchmarks, 2026. Illustrative projections for a 10-agent fleet at moderate transaction volume.

Forecasting agentic egress. One user request to your orchestration layer might fire 5-15 sub-calls: context retrieval from Joule, compliance checks via a policy agent, data enrichment from an external service, and audit logging. This cascading behavior — "agentic egress" — is the hardest line item to predict. In your first production pilot, measure the actual egress ratio (downstream calls per user request) before setting your budget. Expect 30-50% forecast variance in year one.

For the instrumentation patterns that make egress visible before it becomes a budget problem, see our agentic egress monitoring setup guide.

OData, BAPIs, and the "Safe Harbor" Pattern

Enterprise API governance architecture diagram showing OData and BAPI access layers protecting SAP core business data from agent access — OData and BAPIs act as a governed abstraction layer — agents interact with stable API contracts, not volatile internal table structures.

Not all agent access to SAP systems is equal. Agents that read and write through OData services and BAPIs — SAP's documented, stable access layers — remain functional through ERP upgrades and schema changes. Agents that bypass these layers and hit database tables directly break silently and expensively.

SAP's shift toward OData-first (moving BAPIs into RESTful cloud services for newer deployments) means agents built today will benefit from this access stability. OData services enforce authorization at the API layer, not inside the agent. BAPIs carry decades of business logic validation that would otherwise need to be re-implemented in custom agent code.

According to Gartner, over 60% of enterprises face outages or security breaches annually due to unmanaged or misconfigured APIs (Gartner API Management Forecast, 2025). Using governed middleware — OData with centralized policy enforcement — dramatically reduces that exposure. It also reduces support costs: when a BAPI or OData service fails, the failure is visible, logged, and scoped. When custom glue code fails, you're debugging opaque agent traces at 2 AM.

The practical recommendation: establish your agent's "safe harbor" access pattern at the start of the project. Document which OData endpoints cover your use case. If an endpoint doesn't exist, request it through SAP's standard API roadmap process. Don't build custom database adapters as a shortcut — you're trading three weeks of development time for three years of maintenance debt.

For a practical mapping of business processes to available OData endpoints, see wiring an agent to S/4HANA OData.

How Do You Avoid the "Integration Debt" Trap This Time?

RPA created a wave of integration debt in the 2010s that enterprises are still paying off. Custom automation code wedged directly into SAP tables, no governance, no reuse, no documented contracts. When underlying systems changed, automations broke. Fixing them cost more than building them had.

The conditions for repeating that mistake with AI agents exist right now. Every quarter, new agent capabilities are announced. Every quarter, development teams cut integration corners to ship fast. Without governance established from day one, you end up with the same sprawl of unmaintainable custom code — just written by newer tools.

Legacy system integration increases project costs 40-60%, but modernizing critical systems before or alongside AI deployment reduces total cost of ownership by 20-30% over five years (NTT DATA, 2025). The lesson holds: foundation matters more than speed. The team that spends two extra weeks establishing OData governance and A2A policy on day one pays substantially less by year three than the team that shortcuts it.

Three non-negotiable practices:

Always use A2A for inter-agent calls. Not "when practical." Always. Every bespoke agent-to-agent connector you build is a future maintenance obligation.
Always publish via OData or BAPIs. Never direct database access, never custom RFC calls that bypass SAP's access layer. The SAP core is the system of record; treat it accordingly.
Enforce policy at the Agent Gateway. Authentication, rate limiting, audit logging, and compliance checks belong in the orchestration layer — not scattered across individual agents. Google's Gemini Enterprise Agent Platform provides this infrastructure. Use it.

The division of responsibility clarifies the governance model: SAP owns the business logic and the data contracts. Google owns the orchestration fabric and the API policy enforcement. Your integration team's job is to wire them together using the standards both vendors have already agreed on — A2A, OData, BAPIs — rather than inventing new ones.

Ready to map your agentic cost model? Download our three-tier cost estimation worksheet to calculate your projected spend across all three tiers before your next procurement cycle.

Frequently Asked Questions

What is the A2A Protocol, and why does it matter for my AI budget?

The Agent2Agent (A2A) Protocol, announced April 2025 and now managed by the Linux Foundation with 50+ partners including SAP and Google, standardizes how agents from different vendors communicate. For budgets, it matters because it eliminates the custom integration tax — historically 40-60% of project costs — by replacing bespoke connectors with standardized JSON-RPC handshakes. Organizations report 25-35% lower integration costs and 50% faster time-to-production after adopting A2A. A2A Protocol implementation guide

How much should I budget for a multi-agent system on Google Cloud and SAP?

Organizations deploying multi-agent systems report $3,200–$13,000 per month in operational spend (Azilen, 2026). Break that down as roughly 40-50% for model tokens, 15-25% for session and event storage, 20-30% for runtime infrastructure, and 10-20% for search and integrations. Budget an additional 15-20% for idle infrastructure overhead and governance tooling. Development costs for a production multi-agent system range from $150,000 to $400,000+ depending on complexity.

What is the difference between SAP Joule native skills and custom AI Core agents in terms of cost?

Native Joule skills use pre-optimized automation shared across SAP's customer base, consuming 2-3 AI Units per completed task (SAP Pricing, 2025). Custom agents via SAP AI Core typically cost 2-3x more for equivalent outcomes because they use dedicated compute, require separate governance, and can't share infrastructure. Use native first; graduate to custom only when built-in skills demonstrably can't cover your process requirements.

How do OData and BAPIs protect against integration debt in agentic systems?

OData and BAPIs are SAP's documented, stable access layers that remain functional through ERP upgrades and schema changes. Agents accessing data through these interfaces don't break when underlying structures change. Over 60% of enterprises face outages from unmanaged or misconfigured APIs annually (Gartner, 2025) — governed interfaces cut that risk substantially and reduce support costs by keeping failures visible and scoped. SAP API governance best practices

What is "agentic egress" and how do I forecast it?

Agentic egress describes the cascade of downstream agent calls triggered by a single user request — context retrieval, compliance checks, data enrichment. One question might fire 5-15 sub-calls. This branching factor is the hardest line item to predict and causes the largest budget overruns in year one. Run a pilot deployment, measure your actual egress ratio, and build that multiplier into forecasts before scaling. Expect 30-50% variance in your initial estimates.

Conclusion

SAP owns the business logic. Google owns the orchestration logic. Customers who understand that division — and who invest in A2A protocols, OData governance, and centralized API policy enforcement — build agent fleets that stay maintainable at scale.

Those who don't are building a second wave of RPA debt: unmaintainable custom connectors, budget surprises from unmonitored session events, and AI Unit spend that nobody forecasted because they were only watching token costs.

The agentic balance sheet has three tiers: Generation, Orchestration, and Execution. Budget for all three from day one. Govern integration patterns from day one. And measure agentic egress from day one — because the cost multiplier you don't know about is always the one that breaks your quarterly review.

For a complete look at how these agents actually communicate at the protocol level, see agents calling agents across platforms.