The Death of the Sidecar: Implementing Ztunnels in Istio Ambient Mesh

Are proxy sidecars eating your Kubernetes compute budget? Step into the sidecar-less future with Ambient Mesh Ztunnels, which use the HBONE protocol for node-level, high-performance zero-trust routing.
In the rapidly evolving ecosystem of cloud-native infrastructure, few technologies have seen as dramatic a shift in operational philosophy as the service mesh. For years, the industry relied heavily on the “sidecar” model—a dedicated proxy injected into every single Kubernetes pod. This paradigm brought essential capabilities like mutual TLS (mTLS), observability, and granular traffic control. However, as cluster sizes grew and enterprise adoption accelerated, the architectural flaws of the sidecar model became impossible to ignore: it consumed massive amounts of CPU and memory, complicated application lifecycles, and forced infrastructure and application teams into an uncomfortable, tightly coupled marriage.
The response to these problems has arrived. Istio Ambient Mesh reached General Availability in Istio 1.24 in November 2024, with ztunnel, waypoints, and all APIs marked Stable by the Istio Technical Oversight Committee, and is now a production-ready choice for new Kubernetes service mesh deployments. At the heart of this transformation is the Ztunnel (Zero Trust Tunnel), a node-level proxy that fundamentally changes how we secure and route Kubernetes traffic.
In this guide we explore the mechanics of Istio Ambient Mesh and the Ztunnel, the specifics of the HBONE protocol, how the Istio CNI handles traffic redirection, and where the project is heading through 2026.
The Era of the Sidecar: Why It Had to Die
To understand the design of Ambient Mesh, we first need to understand the pain of the model it replaces. The traditional service mesh data plane relied on a proxy—typically Envoy—running as a secondary container inside every application pod.
While this provided excellent isolation and per-pod context, the overhead it imposed was steep:
Compute Resource Bloat. Every sidecar requires its own baseline CPU and memory allocation. In a microservices environment with hundreds or thousands of pods, these idle proxy resources accumulate quickly. Organizations found that their service mesh infrastructure was consuming as much compute budget as the actual business logic it was serving.
Lifecycle Coupling and Operational Friction. Because the sidecar is physically injected into the pod, the mesh lifecycle is tied to the application lifecycle. Upgrading the proxy or rotating certificates typically required a rolling restart of the entire application fleet, forcing platform engineers to coordinate mesh upgrades with application developers.
The “First Packet” Problem. During pod initialization, race conditions occurred where the application container would start before the sidecar proxy was ready, resulting in dropped initial connections and complex init-container workarounds.
The Layer 7 Tax. Traditional sidecars process both Layer 4 (TCP/mTLS) and Layer 7 (HTTP/gRPC routing). Many applications need only the baseline security of mTLS, yet they were forced to pay the computational overhead of full L7 parsing on every connection.
The conclusion was clear: the sidecar model was not sustainable at scale. The solution required separating the foundational infrastructure requirement (security and identity) from the application-specific requirement (advanced L7 traffic management).
Istio Ambient Mesh: Architecture Overview
Istio Ambient Mesh addresses these pain points through a philosophy of transparency and non-intrusiveness. It removes sidecars entirely, splitting the service mesh data plane into two distinct, independently scalable layers:
- The Secure Transport Layer (Layer 4): Handled by Ztunnel, which provides mTLS, SPIFFE-based workload identity, L4 authorization policies, and network observability.
- The L7 Traffic Management Layer: Handled by optional Waypoint Proxies, deployed only when complex HTTP routing, retries, per-route RBAC, JWT validation, or rate-limiting are required.
By moving mTLS into a shared infrastructure component, Ambient Mesh allows platform teams to enable cluster-wide zero-trust security without modifying application pods, without injecting sidecars, and without forcing application restarts. Enrolling a workload is a single command:
kubectl label namespace default istio.io/dataplane-mode=ambient
This label triggers the Istio CNI node agent to configure redirection for all pods in that namespace—no pod restart, no mutation webhook, no init containers.
Deconstructing the Ztunnel: The Node-Level Zero Trust Proxy
The cornerstone of Ambient Mesh is the Ztunnel (Zero Trust Tunnel). Unlike the feature-rich Envoy sidecars of the past, Ztunnel is a purpose-built, highly optimized Layer 4 proxy written in Rust. The initial ambient mesh implementation used Envoy for ztunnel, but the Istio team found that Envoy’s rich L7 feature set—exactly what makes it great for gateways and waypoints—was wasted in the L4-only ztunnel role, and that bending Envoy to ztunnel’s specific requirements was impractical. The purpose-built Rust implementation, announced in February 2023, resolved this.
Architecture and Deployment
Ztunnel runs as a Kubernetes DaemonSet, meaning exactly one instance is deployed per node, shared by all workloads on that node. Its responsibilities are confined to the foundational elements of zero trust:
- Mutual TLS (mTLS): Encrypting traffic in transit between workloads using AES-GCM, a cipher optimized for modern hardware.
- SPIFFE Identity Management: Ztunnel acts as a CA client, requesting and managing short-lived X.509 certificates with SPIFFE identities from Istiod on behalf of every co-located workload. Each workload’s identity follows the format spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>.
- L4 Authorization Policies: Enforcing “who can talk to whom” rules based on workload identity.
- L4 Observability: Exporting standard TCP metrics and access logs, including per-connection source and destination SPIFFE identities.
Ztunnel also communicates with Istiod as an xDS client, receiving purpose-built xDS configuration specifically tailored for its L4-only role—distinct from the full xDS configuration pushed to Envoy sidecars or waypoints.
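The identity format described above can be made concrete with a small sketch. The `WorkloadIdentity` type and `parse_spiffe_id` function are hypothetical helpers written for illustration, not part of Istio or any SPIFFE library, and `cluster.local` is used only as an example trust domain.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkloadIdentity:
    """Hypothetical helper type; not part of any Istio API."""
    trust_domain: str
    namespace: str
    service_account: str

    def uri(self) -> str:
        # spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
        return (f"spiffe://{self.trust_domain}"
                f"/ns/{self.namespace}/sa/{self.service_account}")


def parse_spiffe_id(uri: str) -> WorkloadIdentity:
    """Split a workload SPIFFE URI into its three components."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE URI: {uri!r}")
    # The path must be exactly /ns/<namespace>/sa/<service-account>.
    parts = parsed.path.split("/")[1:]
    if (len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa"
            or not parts[1] or not parts[3]):
        raise ValueError(f"unexpected SPIFFE path: {parsed.path!r}")
    return WorkloadIdentity(parsed.netloc, parts[1], parts[3])
```

Parsing identities this way is handy when post-processing access logs or writing policy tests, since namespace and service account are the two fields L4 rules usually match on.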
Performance and Efficiency
Because Ztunnel operates strictly at Layer 4 and is written in Rust, its resource profile is remarkably lean. According to official Istio performance benchmarks, a single Ztunnel processing 1,000 requests per second consumes approximately 0.06 vCPU and 12 MB of memory. A typical Ztunnel instance uses 30–50 MB of memory at idle, with minimal CPU.
The Ztunnel team has shipped continuous performance improvements in each quarterly release, including migration to rustls (a high-performance, safety-focused TLS library), reduction of data copying on outbound traffic, dynamic tuning of buffer sizes for active connections, and migration to AWS-LC—a cryptography library optimized for modern hardware.
These improvements have compounded significantly. Compared to the sidecar model, ztunnel-only ambient mode uses approximately 1% of the CPU and 1% of the memory of an equivalent sidecar deployment in production benchmarks. Even with Waypoint proxies deployed for L7-needing services, total CPU drops to around 15% of the sidecar baseline and memory to around 10%.
In a 1,000-pod cluster running on 100 nodes:
| Model | Proxy Count | CPU Overhead | Memory Overhead |
|---|---|---|---|
| Sidecar (Envoy per pod) | 1,000 | ~100 vCPU | ~128 GB |
| Ambient L4 only (Ztunnel DaemonSet) | 100 | ~6–7 vCPU | ~1.2–5 GB |
| Ambient with Waypoints (20% of services) | 100 + ~few waypoints | ~15–20 vCPU | ~13–15 GB |
The reduction in allocated resource requests alone—before measuring actual utilization—represents a 90% reduction for L4-only ambient and approximately 80% when waypoints are deployed for a subset of services.
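The table rows can be reproduced with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a benchmark: the per-sidecar requests (100 millicores, 128 MB) are assumptions inferred from the table totals, while the per-ztunnel figures are the ones quoted earlier in this section.

```python
# Cluster shape from the example in the text.
PODS = 1_000
NODES = 100

# Assumed per-sidecar requests, inferred from the table totals
# (not quoted directly in the text).
SIDECAR_MILLICPU = 100
SIDECAR_MEM_MB = 128

# Per-ztunnel figures quoted in the text.
ZTUNNEL_MILLICPU = 60       # 0.06 vCPU at 1,000 RPS
ZTUNNEL_MEM_MB_LOW = 12     # at 1,000 RPS
ZTUNNEL_MEM_MB_HIGH = 50    # upper end of the quoted 30-50 MB range


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage saved relative to the baseline, rounded to one decimal."""
    return round(100 * (1 - value / baseline), 1)


# Totals in vCPU and (decimal) GB.
sidecar_cpu = PODS * SIDECAR_MILLICPU / 1000
sidecar_mem = PODS * SIDECAR_MEM_MB / 1000
ztunnel_cpu = NODES * ZTUNNEL_MILLICPU / 1000
ztunnel_mem_low = NODES * ZTUNNEL_MEM_MB_LOW / 1000
ztunnel_mem_high = NODES * ZTUNNEL_MEM_MB_HIGH / 1000

print(f"sidecar: {sidecar_cpu} vCPU, {sidecar_mem} GB")
print(f"ztunnel: {ztunnel_cpu} vCPU, {ztunnel_mem_low}-{ztunnel_mem_high} GB")
print(f"CPU reduction: {reduction_pct(sidecar_cpu, ztunnel_cpu)}%")
```

Under these assumptions the L4-only CPU saving works out to roughly 94%, in the same range as the approximately 90% figure quoted above; the exact number depends on the per-proxy requests your cluster actually sets.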
In terms of latency, official Istio 1.23 benchmarks show that two ztunnel hops (client-side and server-side) add approximately 0.17 ms at the 90th percentile and 0.20 ms at the 99th percentile over baseline for HTTP/1.1 traffic at 1,000 requests per second with mTLS enabled.
Preserving Per-Workload Identity
A common concern about node-level proxies is that they collapse per-pod identity into a single “node identity.” Ztunnel explicitly avoids this. Even though it is shared across all pods on a node, it performs certificate management on behalf of each individual workload.
When Pod A on Node 1 communicates with Pod B on Node 2:
- The Ztunnel on Node 1 requests a unique X.509 certificate for Pod A’s Service Account from Istiod, presenting Pod A’s SPIFFE identity.
- Istiod verifies that the Ztunnel is authorized to act on behalf of Pod A (via Kubernetes RBAC on the pod’s service account).
- The mTLS handshake uses Pod A’s specific SPIFFE certificate. The destination Ztunnel on Node 2 validates this exact identity against Pod B’s L4 AuthorizationPolicy.
The result is strict, cryptographic zero-trust identity at the per-workload level—without the per-pod proxy overhead.
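The exchange above can be modeled as a toy simulation. This is a simplified sketch, not Istiod's logic: the `ToyCA` class is hypothetical, the "certificate" is just a string, and the authorization rule (a node may only request identities of workloads registered on that node) is an assumed simplification of the check described in the steps above.

```python
class ToyCA:
    """Toy stand-in for the CA role described above; not Istio code."""

    def __init__(self):
        # node name -> set of SPIFFE identities of workloads on that node
        self._identities_by_node = {}

    def register_pod(self, node: str, identity: str) -> None:
        """Record that a workload with this identity runs on the node."""
        self._identities_by_node.setdefault(node, set()).add(identity)

    def issue(self, requesting_node: str, identity: str) -> str:
        """Issue a per-workload credential to the ztunnel on requesting_node.

        Assumed rule: the request is refused unless a workload with that
        identity is registered on the requesting node.
        """
        allowed = self._identities_by_node.get(requesting_node, set())
        if identity not in allowed:
            raise PermissionError(
                f"{requesting_node} may not act on behalf of {identity}")
        return f"cert-for:{identity}"


ca = ToyCA()
pod_a = "spiffe://cluster.local/ns/shop/sa/frontend"  # example identity
ca.register_pod("node-1", pod_a)
cert = ca.issue("node-1", pod_a)  # the ztunnel on node-1 acts for Pod A
```

The point of the model is that the credential is scoped to the workload, not the node: a ztunnel on `node-2` asking for Pod A's identity would be refused.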
The HBONE Protocol: Standards-Based Secure Tunneling
To securely transport raw TCP traffic across nodes without exposing complexity to applications, Istio Ambient Mesh uses HBONE (HTTP-Based Overlay Network Environment)—an Istio-specific tunneling protocol built from three open standards composed together:
- HTTP CONNECT (RFC 9110, Section 9.3.6; its HTTP/2 form is defined in RFC 7540, Section 8.3) to establish the tunnel connection
- HTTP/2 to multiplex multiple application connection streams over a single secured tunnel and carry stream-level metadata
- mTLS to encrypt and mutually authenticate the tunnel
By convention, ztunnel and other HBONE-aware proxies listen on TCP port 15008. This has a practical implication for operators: if you have existing NetworkPolicy objects that restrict inbound ports on ambient-enrolled pods, you must add an explicit exception allowing port 15008 or HBONE traffic will be blocked.
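One way to audit existing policies for this is to scan each NetworkPolicy manifest for an ingress rule that covers TCP 15008. The sketch below is a minimal, assumption-laden check: it works on manifests already loaded as Python dicts, ignores `from` peers and pod selectors, and cannot resolve named ports. The function names are hypothetical.

```python
HBONE_PORT = 15008


def rule_allows_port(rule: dict, port: int = HBONE_PORT) -> bool:
    """Return True if one ingress rule admits the given TCP port."""
    ports = rule.get("ports")
    if not ports:
        return True  # a rule without ports applies to all ports
    for entry in ports:
        if entry.get("protocol", "TCP") != "TCP":
            continue
        start = entry.get("port")
        if start is None:
            return True  # protocol only: every port for that protocol
        if not isinstance(start, int):
            continue  # named port; cannot be resolved in this sketch
        end = entry.get("endPort", start)
        if start <= port <= end:
            return True
    return False


def policy_allows_hbone(policy: dict) -> bool:
    """Return True if the policy does not block inbound TCP 15008."""
    spec = policy.get("spec", {})
    policy_types = spec.get("policyTypes") or ["Ingress"]
    if "Ingress" not in policy_types:
        return True  # the policy does not restrict ingress at all
    return any(rule_allows_port(r) for r in spec.get("ingress") or [])
```

A policy that only lists an application port such as 8080 would fail this check, flagging it as one that needs an explicit 15008 exception before the namespace is enrolled.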
Why Not Raw mTLS?
Using HTTP/2 CONNECT as the carrier rather than a raw mTLS connection provides specific technical advantages. HTTP/2 multiplexing allows a single mTLS connection between two ztunnel instances to carry traffic for many different pod-to-pod connections, substantially reducing connection overhead at scale. The HTTP CONNECT request also carries the destination pod’s IP and port in the :authority header, and the source workload’s SPIFFE identity is conveyed in the TLS client certificate presented during the handshake.
HBONE is also interoperable: sidecar-mode Envoy proxies can speak HBONE, which enables coexistence of ambient and sidecar workloads in the same cluster during a gradual migration.
The HBONE Packet Flow
When Pod A sends a plaintext TCP connection to a service:
- Interception: The Istio CNI redirects outbound traffic from Pod A’s network namespace to the local Ztunnel before it reaches standard routing tables.
- Encapsulation: The local Ztunnel wraps the TCP stream inside an HTTP/2 CONNECT request carrying the destination IP:port in :authority.
- mTLS Handshake: The local Ztunnel initiates an mTLS connection to the destination node’s Ztunnel on port 15008, presenting Pod A’s SPIFFE certificate.
- Decapsulation: The destination Ztunnel verifies Pod A’s identity against Pod B’s L4 AuthorizationPolicy, unwraps the HTTP/2 envelope, and delivers the original TCP stream to Pod B.
From Pod A’s perspective: it sent a normal TCP connection. From the network’s perspective: the traffic traversed a multiplexed, mTLS-encrypted, identity-authenticated HBONE tunnel.
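The multiplexing described in this section can be illustrated with a small model. This is a sketch of the idea only, with no real networking or TLS: `TunnelPool` is a hypothetical class, and keying tunnels by (source identity, destination node) is an assumption that follows from the client certificate carrying the source workload's identity. The header layout reflects HTTP/2 CONNECT, which carries `:method` and `:authority` and omits `:scheme` and `:path`.

```python
def connect_headers(dest_ip: str, dest_port: int) -> dict:
    """Pseudo-headers for an HTTP/2 CONNECT request to dest_ip:dest_port."""
    return {":method": "CONNECT", ":authority": f"{dest_ip}:{dest_port}"}


class TunnelPool:
    """Models one mTLS connection per (source identity, destination node)."""

    def __init__(self):
        # tunnel key -> next free client-initiated stream ID
        self._next_stream_id = {}

    def open_stream(self, src_identity: str, dest_node: str,
                    dest_ip: str, dest_port: int) -> dict:
        """Open a new stream on the tunnel, reusing the tunnel if it exists."""
        key = (src_identity, dest_node)
        stream_id = self._next_stream_id.get(key, 1)
        # Client-initiated HTTP/2 streams use odd IDs: 1, 3, 5, ...
        self._next_stream_id[key] = stream_id + 2
        return {"tunnel": key, "stream_id": stream_id,
                "headers": connect_headers(dest_ip, dest_port)}

    def tunnel_count(self) -> int:
        return len(self._next_stream_id)


pool = TunnelPool()
frontend = "spiffe://cluster.local/ns/shop/sa/frontend"  # example identity
first = pool.open_stream(frontend, "node-2", "10.0.2.5", 8080)
second = pool.open_stream(frontend, "node-2", "10.0.2.6", 9090)
```

Both calls above share one tunnel and differ only in stream ID and `:authority`, which is the connection-count saving that HBONE gets from HTTP/2.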
Traffic Redirection: Istio CNI, iptables, GENEVE, and eBPF
Getting traffic from an application pod into the node-level Ztunnel without modifying the application is a significant engineering challenge. Istio solves it through the Istio CNI node agent, which monitors pod lifecycle events and configures redirection rules dynamically.
The Default Mechanism: iptables + GENEVE
In the default configuration, the Istio CNI uses a combination of iptables rules and GENEVE (Generic Network Virtualization Encapsulation) overlay tunnels to bridge pod network namespaces to the Ztunnel. The Ztunnel pod exposes pistioin and pistioout interfaces connected to the node’s istioin and istioout interfaces via these tunnels. Traffic received on the inbound tunnel is directed to ztunnel port 15008 (HBONE) or 15006 (plaintext); outbound traffic from pods is directed to port 15001.
TPROXY (Linux transparent proxy) marks incoming packets from the tunnels and routes them to ztunnel’s inbound and outbound ports, preserving the original source IP and port so that upstream policy enforcement sees the real workload addresses.
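The port assignments above reduce to a small lookup. The sketch below only restates the three ports named in this section; the function name and its string arguments are hypothetical and chosen for illustration.

```python
ZTUNNEL_OUTBOUND_PORT = 15001           # traffic leaving a local pod
ZTUNNEL_INBOUND_PLAINTEXT_PORT = 15006  # inbound traffic that is not HBONE
ZTUNNEL_INBOUND_HBONE_PORT = 15008      # inbound HBONE tunnels


def ztunnel_listener_port(direction: str, hbone: bool = False) -> int:
    """Return the ztunnel port that redirected traffic is delivered to."""
    if direction == "outbound":
        return ZTUNNEL_OUTBOUND_PORT
    if direction == "inbound":
        return (ZTUNNEL_INBOUND_HBONE_PORT if hbone
                else ZTUNNEL_INBOUND_PLAINTEXT_PORT)
    raise ValueError(f"unknown direction: {direction!r}")
```

Keeping these three ports in mind helps when reading packet captures or firewall logs on a node running ambient workloads.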
The eBPF Alternative
Istio has added an optional eBPF-based traffic redirection mode. When enabled, an eBPF program is compiled into the Istio CNI component and attached to traffic control (TC) ingress and egress hooks on the relevant network interfaces. The CNI watches pod events and attaches or detaches this eBPF program when pods are moved into or out of ambient mode.
The eBPF program operates in kernel space and can redirect packets directly to the Ztunnel, bypassing the overhead of iptables chain traversal and eliminating the need for GENEVE encapsulation. The Ztunnel performs a connection lookup in the eBPF program’s map to determine the correct redirection for each packet.
Advantages of the eBPF path:
- Kernel-level efficiency: Avoids context switches between kernel and user space.
- No GENEVE overhead: Packets are redirected in-kernel without the encapsulation step.
- Flexible programmability: eBPF programs can be updated without kernel module reloads and can incorporate additional packet context for customized routing logic.
- Transparent to the application: The pod has no visibility into the redirection occurring beneath it.
The eBPF mode is currently opt-in on top of the already opt-in ambient mode, given CNI compatibility requirements. The default iptables + GENEVE path is supported across all Kubernetes CNI plugins, including Cilium, Calico, OpenShift SDN, and Amazon VPC CNI.
Decoupling Layer 7: Waypoint Proxies
If Ztunnel handles Layer 4 security, what happens to advanced Layer 7 features like HTTP header-based routing, traffic splitting, circuit breaking, per-route RBAC, JWT validation, and detailed distributed tracing?
Ambient Mesh introduces Waypoint Proxies to handle these concerns. A Waypoint is a standard Envoy instance deployed independently—per namespace, per service, or per service account—not as a sidecar. Waypoints have their own SPIFFE identity (waypoint-sa) and are deployed using Kubernetes Gateway API resources, making them first-class network infrastructure rather than a patch on top of application deployments.
The architecture is strictly opt-in. Ztunnel automatically detects when a destination has a Waypoint proxy configured and forwards traffic through it via an HBONE tunnel before delivering it to the destination. L4 AuthorizationPolicy continues to be enforced at the Ztunnel layer; L7 AuthorizationPolicy is enforced at the Waypoint.
A practical deployment pattern: route 80% of services through Ztunnel alone (mTLS, SPIFFE identity, L4 policy), and deploy Waypoints only for the 20% of services that genuinely need HTTP header routing, canary traffic splitting, or fine-grained L7 RBAC. This selective deployment eliminates the “Layer 7 tax” on traffic that doesn’t need it.
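The L4-versus-L7 split can be turned into a quick inventory check. The sketch below is illustrative only: the feature labels are hypothetical names for the capabilities listed in this article, and the service catalogue is made-up example data, not measurements.

```python
# Hypothetical labels for capabilities the article assigns to each layer.
L4_FEATURES = frozenset({
    "mtls", "spiffe-identity", "l4-authorization", "l4-telemetry"})
L7_FEATURES = frozenset({
    "http-routing", "retries", "traffic-splitting", "circuit-breaking",
    "per-route-rbac", "jwt-validation", "rate-limiting",
    "distributed-tracing"})


def needs_waypoint(required_features) -> bool:
    """True if any required feature is handled at L7 rather than by ztunnel."""
    required = set(required_features)
    unknown = required - L4_FEATURES - L7_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return bool(required & L7_FEATURES)


def waypoint_share(services: dict) -> float:
    """Fraction of services (name -> required features) needing a waypoint."""
    if not services:
        return 0.0
    return sum(needs_waypoint(f) for f in services.values()) / len(services)


# Example catalogue (invented for illustration).
catalogue = {
    "orders": {"mtls", "l4-authorization"},
    "inventory": {"mtls"},
    "billing": {"mtls", "l4-telemetry"},
    "search": {"mtls", "spiffe-identity"},
    "checkout": {"mtls", "jwt-validation", "traffic-splitting"},
}
```

Running `waypoint_share(catalogue)` on an inventory like this gives a first estimate of how many waypoints a migration would actually need.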
Two behaviors to note. Waypoints enforce policies using the original source workload identity, not the waypoint’s own identity. As a current limitation, the EnvoyFilter API, widely used in sidecar mode for low-level Envoy customization, is not supported in ambient mode; extensions must use WebAssembly plugins instead.
Where Ambient Mesh Is Going: The 2025–2026 Roadmap
The Istio project published a clear roadmap for 2025–2026 with three primary themes.
Migration parity from sidecar to ambient. The project is investing in tooling to assess migration readiness, rollback-safe interoperability between sidecar and ambient namespaces within the same cluster, and comprehensive documentation. Closing the most significant feature gaps—particularly multi-cluster traffic management and extensibility—is the central focus.
Multi-cluster ambient mesh. Multi-cluster support shipped as alpha in Istio 1.27 (August 2025), introduced by contributors from Microsoft. It extends ambient’s modular architecture to deliver secure connectivity, discovery, and load balancing across clusters—a feature that has been one of the most-requested capabilities from enterprise ambient users. This lays the groundwork for active-active configurations across regions or cloud providers.
Gateway API and extensibility maturity. At KubeCon Europe 2026, the Istio project announced Ambient Multicluster Beta, Gateway API Inference Extension Beta, and experimental Agentgateway support—signaling the evolution of the service mesh beyond microservice networking toward a traffic management platform for AI inference workloads. The Sail Operator (released in 2025) provides a streamlined way to manage Istio deployments via the Kubernetes operator pattern.
Practical Deployment: Enabling Ambient Mesh
For teams evaluating or migrating to Ambient Mesh, the recommended approach is gradual, namespace-by-namespace adoption.
Install the ambient profile:
istioctl install --set profile=ambient
Enroll a namespace:
kubectl label namespace my-app istio.io/dataplane-mode=ambient
Deploy a Waypoint proxy for L7 features on a specific namespace:
istioctl waypoint apply -n my-app --enroll-namespace
Verify Ztunnel is processing connections (look for SPIFFE identities in access logs):
kubectl logs -n istio-system daemonset/ztunnel -f
You will see log lines including src.identity and dst.identity fields containing the SPIFFE URIs of the source and destination workloads—confirmation that per-workload identity is being preserved at the node level.
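To check those fields programmatically, the log output can be piped through a small parser. This is a sketch under assumptions: it treats each log line as space-separated `key=value` pairs with optional double quotes, and the sample line is invented for illustration, so the exact layout of your ztunnel version's logs may differ.

```python
import re

# key=value or key="quoted value"; keys may contain dots (src.identity).
_FIELD = re.compile(r'([\w.]+)=("[^"]*"|\S+)')


def parse_fields(line: str) -> dict:
    """Extract key=value pairs from one log line, stripping quotes."""
    return {key: value.strip('"') for key, value in _FIELD.findall(line)}


def identities(line: str):
    """Return (source identity, destination identity), or None if absent."""
    fields = parse_fields(line)
    return fields.get("src.identity"), fields.get("dst.identity")


# Invented sample line; real output may carry more or different fields.
SAMPLE = ('info access connection complete '
          'src.addr=10.244.1.12:51234 '
          'src.identity="spiffe://cluster.local/ns/my-app/sa/frontend" '
          'dst.addr=10.244.2.7:15008 '
          'dst.identity="spiffe://cluster.local/ns/my-app/sa/backend"')

src, dst = identities(SAMPLE)
```

A line where either identity comes back as `None` is worth a closer look, since it may indicate traffic that did not traverse the mTLS path.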
Roll back if needed (no pod restarts required):
kubectl label namespace my-app istio.io/dataplane-mode- --overwrite
Conclusion: The Sidecar is Dead, Long Live the Mesh
The service mesh is now a foundational requirement for operating securely in cloud-native environments—but the architectural debt of the sidecar model threatened to price many organizations out of adopting it. The compute overhead was real, the operational complexity was real, and the lifecycle coupling was real.
Istio Ambient Mesh, production-ready since Istio 1.24, resolves these problems at the architecture level. By separating L4 security (ztunnel, a Rust-based DaemonSet consuming ~0.06 vCPU and 12 MB per node at 1,000 RPS) from L7 traffic management (waypoints, opt-in per service), the project has delivered on the original promise of the service mesh: robust zero-trust security, deep observability, and granular control, all without the debilitating overhead.
The HBONE protocol, composing HTTP/2, HTTP CONNECT, and mTLS over port 15008, provides standards-based, multiplexed, identity-authenticated tunneling that is invisible to applications. The Istio CNI handles transparent traffic interception through iptables and GENEVE by default, with an eBPF-based fast path available for environments where kernel-level efficiency and reduced encapsulation overhead are priorities.
For platform engineering teams in 2026, the calculus is straightforward: ambient mode is the natural starting point for new Kubernetes service mesh deployments. Sidecars remain available and supported for workloads with specific technical requirements that ambient mode cannot yet meet, but for the vast majority of enterprise microservice traffic, the era of paying the sidecar tax is over.
Changelog
The following corrections and additions were made to the original draft, based on sourced web research:
Corrected — GA release framing: The original described Ambient Mesh as having reached “full general availability and robust maturity by 2026.” Corrected: Ambient Mesh GA was released in Istio 1.24, November 7, 2024, with ztunnel, waypoints, and all APIs marked Stable by the Istio TOC.
Corrected — Ztunnel memory benchmark: The original stated “less than 15 MB of memory.” The official Istio performance documentation reports 12 MB at 1,000 RPS for the ztunnel proxy; typical idle usage is 30–50 MB. The 15 MB figure was unattributed and inconsistent with official data.
Corrected — eBPF as default: The original implied eBPF redirection was the “industry standard” default mechanism for Ambient Mesh in 2026. Corrected: iptables + GENEVE is the default; eBPF-based redirection is an opt-in mode within the Istio CNI. The eBPF path eliminates the need for GENEVE encapsulation but has CNI compatibility requirements.
Corrected — HBONE protocol description: The original described HBONE as using “HTTP/2 CONNECT (or HTTP/3 in newer iterations).” There is no publicly documented HTTP/3 variant of HBONE in Istio. Corrected: HBONE composes HTTP CONNECT, HTTP/2, and mTLS—all three open standards together.
Corrected — Ztunnel origin: The original did not mention that ztunnel was initially implemented in Envoy before the Rust rewrite. The Envoy-to-Rust transition (announced February 2023) is architecturally significant and has been added for accuracy.
Added — SPIFFE identity details: Added specifics on how ztunnel carries per-workload SPIFFE identity (spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>) in mTLS client certificates and how Istiod validates the ztunnel’s right to act on behalf of each workload.
Added — HBONE port 15008 and NetworkPolicy implications: Added the operational implication that existing NetworkPolicy objects must allow port 15008 inbound for ambient-enrolled pods.
Added — Waypoint limitations: Added that the EnvoyFilter API is unsupported in ambient mode, and that L7 extensions must use WebAssembly plugins.
Added — 2025–2026 roadmap section: Added sourced coverage of multi-cluster alpha in Istio 1.27 (August 2025), the sidecar-to-ambient migration tooling investments in Istio 1.28–1.29, and the KubeCon Europe 2026 announcements.
Added — Latency benchmarks: Added official P90/P99 latency data from Istio 1.23 benchmarks (0.17 ms / 0.20 ms for two ztunnel hops at 1,000 RPS with mTLS).
Removed — Unsubstantiated “70-80% reduction” claim: The original cited “over 70-80%” resource reduction from benchmark data attributed to “2026 Istio releases.” Replaced with sourced figures: 99% CPU and memory reduction in L4-only ambient vs. sidecars (Solo.io benchmark data), with 85% CPU and 90% memory reduction when waypoints are added for ~20% of services.