Pulse x reMKTR

Ship 25% Faster: QUIC on AWS Network Load Balancer

Written by Jacob Heinz | Nov 17, 2025 9:01:40 PM

Your users don’t care how elegant your backend is. They care how fast the app feels. And right now, speed is a moat. Here’s the move: QUIC on AWS Network Load Balancer (NLB) just landed in passthrough mode, and AWS highlights 25–30% lower latency for mobile, real-time, and streaming workloads.

That’s not a rounding error. That’s the difference between a video starting instantly vs. buffering. A trade order feeling snappy vs. laggy. And a chat staying connected on subway Wi‑Fi. QUIC slashes handshakes, survives IP changes, and keeps sessions sticky via Connection IDs. All while you hold your TLS keys end to end.

If you’ve been waiting for a clean way to run modern UDP transport at scale, this is it. Do it without ripping out your stack. You keep your architecture, your certificates, your encryption. NLB forwards QUIC directly to your targets and handles stickiness the QUIC-native way.

Here’s how to wire it up, what to watch, and why it’s your fastest path to “feels instant” in production.

TLDR

  • QUIC on NLB (passthrough) cuts latency ~25–30% by minimizing handshakes and speeding up recovery.
  • You keep TLS end to end; NLB forwards QUIC/UDP directly to targets.
  • Sticky sessions use QUIC Connection IDs, resilient to IP/NAT changes.
  • Works great for mobile, real-time, and streaming workloads.
  • Enable NLB access logs and CloudWatch metrics to trace flows and tune capacity.

QUIC meets NLB

QUIC 101

QUIC runs over UDP and bakes TLS 1.3 into the transport. That gives you fewer round trips, especially on first connect. You get connection migration across IPs and built‑in stream multiplexing without head‑of‑line blocking. On flaky mobile networks, that’s money. Less handshake overhead plus resilience means lower tail latency.

In practical terms, you’ll see faster first frame for video. You’ll see fewer retries on chat. You’ll get smoother real-time updates when a device hops from cellular to Wi‑Fi. That’s the class of gain behind the 25–30% latency drop AWS highlights for QUIC passthrough on NLB.

Here’s what that looks like under the hood, without the academic jargon:

  • First connect: QUIC does TLS 1.3 during the transport handshake. You get to “secure and ready” in roughly 1 round trip (vs. 2 with TCP + TLS 1.3, or 3 with TCP + TLS 1.2).
  • Repeat visitors: With session resumption and 0‑RTT in TLS 1.3, clients can send data right away on reconnects. Use care for replay‑safe operations.
  • No head-of-line blocking across streams: Multiple independent streams share one connection. A lost packet stalls only the streams whose data it carried, not the whole party like TCP would.
  • Loss recovery built for modern networks: QUIC measures and adapts quickly. That improves tail latency where mobile jitter hurts the most.

Translation: no magic, just fewer round trips and smarter recovery.
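To make the round-trip math concrete, here is a small back-of-the-envelope sketch. It assumes an idealized network (no loss, fixed RTT), and the function name and the 80 ms RTT are illustrative, not measurements.

```python
# Round trips spent on connection setup before the first request can be sent.
SETUP_ROUND_TRIPS = {
    "tcp+tls1.2": 3,  # 1 for the TCP handshake + 2 for TLS 1.2
    "tcp+tls1.3": 2,  # 1 for the TCP handshake + 1 for TLS 1.3
    "quic": 1,        # transport and TLS 1.3 handshake combined
    "quic-0rtt": 0,   # resumed connection: the request rides in the first flight
}


def time_to_first_byte_ms(transport: str, rtt_ms: float, server_ms: float = 0.0) -> float:
    """Setup round trips, plus one round trip for the request and response."""
    return (SETUP_ROUND_TRIPS[transport] + 1) * rtt_ms + server_ms


if __name__ == "__main__":
    # An 80 ms RTT is a plausible mobile figure, chosen only for illustration.
    for name in SETUP_ROUND_TRIPS:
        print(f"{name:12s} {time_to_first_byte_ms(name, rtt_ms=80):6.0f} ms")
```

On a high-RTT mobile link, each saved round trip is a full RTT off the first byte, which is why the gain shows up most on first connect.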

Passthrough mode

With NLB in QUIC passthrough, the load balancer doesn’t terminate TLS or decrypt QUIC payloads. It forwards UDP:443 straight to your targets and only needs the Connection ID, which travels in the clear in the packet header. You manage certs at the service (NGINX with QUIC, Envoy with HTTP/3, or quic-go), so your cryptographic posture is unchanged. End-to-end encryption stays intact. You avoid dual certificate management.

This setup also means:

  • You control ALPN (e.g., h3) and cipher suites on the server, not the LB.
  • You can do mTLS at the application layer if you need it for service‑to‑service trust.
  • WAF-style L7 filtering won’t happen at NLB for UDP/QUIC. Plan security at the edge (e.g., Shield) and at the service.
  • Certificates renew the same way you do today—no extra certificate sync to the LB.

A real world pattern

  • Mobile API and chat on UDP/443 using QUIC
  • NLB listener: UDP:443 → target group (instance or IP targets)
  • Targets terminate TLS/QUIC and serve HTTP/3; anything behind them keeps its existing protocols
  • Health checks via TCP/HTTP on a designated port

This lets you ship QUIC fast without re-architecting everything behind the load balancer.

If you’re on containers (EKS/ECS), use IP target groups to register pods or tasks directly. That keeps traffic L4-clean and avoids extra hops through node proxies. For hybrid or on-prem extension, IP targets also let you point NLB at non-EC2 endpoints, as long as they’re reachable from your VPC.

Rolling out safely

  • Keep HTTP/2 as a fallback. Many enterprise networks still throttle or block UDP. HTTP/2 runs over TCP, so keep a TCP:443 path to the same hostname. Your server should advertise HTTP/3 but gracefully serve HTTP/2.
  • Announce HTTP/3 with Alt-Svc, then ramp to more users over days, not hours.
  • Start with one latency‑critical API or media path. Prove the win, then expand.
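As a minimal sketch of the Alt-Svc announcement, the helper below builds the header value defined in RFC 7838. The function names and the one-day max-age are assumptions for illustration; your server or framework will have its own way to attach response headers.

```python
def alt_svc_header(port: int = 443, max_age_s: int = 86400) -> str:
    """Advertise HTTP/3 on the given UDP port; clients may cache this for max_age_s seconds."""
    return f'h3=":{port}"; ma={max_age_s}'


def response_headers(enable_h3: bool) -> dict:
    """Headers to add to HTTP/1.1 or HTTP/2 responses served over TCP."""
    # "clear" tells clients to forget any cached alternative, which is handy on rollback.
    return {"Alt-Svc": alt_svc_header() if enable_h3 else "clear"}


if __name__ == "__main__":
    print(response_headers(enable_h3=True))
    print(response_headers(enable_h3=False))
```

Because clients cache the advertisement for the max-age you set, a shorter value during the early ramp makes it quicker to back out.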

Connection IDs

QUIC stickiness on NLB

Sticky sessions are a headache with mobile devices. You get NAT rebinding, IP flips, and intermittent coverage. QUIC solves this with Connection IDs (CIDs): identifiers that tie packets to a connection independently of the client’s IP or port. In NLB’s QUIC passthrough mode, traffic is forwarded to backend targets with session stickiness via these CIDs.

Translation: you get consistent target routing for a client session. You don’t rely on cookies or brittle 5‑tuple hashing. Your session data and in-memory state on the target remain useful. Users don’t “pinball” between hosts after a network blip.

Beats traditional stickiness

  • Survives IP changes: Device switches networks? CID keeps the session anchored.
  • Fewer reconnect storms: No need to renegotiate or replay state on every rebinding.
  • Cleaner horizontal scaling: You can scale targets without trashing session affinity.

Compared to classic cookie stickiness on HTTP/1.1 or TCP flow hashing, QUIC’s CID-based routing is purpose‑built for modern, mobile‑heavy traffic.
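Here is a toy comparison of the two routing styles. It is a sketch under loud assumptions: the target names are made up, and the CID layout (first byte names the target) is hypothetical. How NLB actually derives its mapping is not described in this article.

```python
import hashlib
import os

TARGETS = ["target-a", "target-b", "target-c"]  # hypothetical backend names


def _bucket(key: bytes, n: int) -> int:
    """Stable hash of a key into one of n buckets."""
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % n


def route_by_five_tuple(src_ip, src_port, dst_ip, dst_port, proto="udp") -> str:
    """Classic L4 flow hashing: any change in source IP or port changes the key."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    return TARGETS[_bucket(key, len(TARGETS))]


def issue_cid(target_index: int) -> bytes:
    """Hypothetical server-issued CID: first byte = target index, rest = random."""
    return bytes([target_index]) + os.urandom(7)


def route_by_cid(cid: bytes) -> str:
    """Route on the CID alone; the client's address plays no part."""
    return TARGETS[cid[0] % len(TARGETS)]


if __name__ == "__main__":
    cid = issue_cid(1)
    # Same connection, first on cellular, then on Wi-Fi with a new source address.
    print("5-tuple, cellular:", route_by_five_tuple("203.0.113.7", 40000, "198.51.100.1", 443))
    print("5-tuple, wifi:    ", route_by_five_tuple("192.0.2.55", 51515, "198.51.100.1", 443))
    print("CID, either path: ", route_by_cid(cid))
```

The 5-tuple result may or may not move when the source address changes, since it depends on the hash. The CID result cannot move, because the address is not part of the key.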

The subway swap

Imagine your streaming app while a user goes from 5G → station Wi‑Fi → back to cellular in 90 seconds. With QUIC/CIDs, the existing session can migrate without breaking. The NLB keeps traffic flowing to the same target. Your video buffer doesn’t reset. And your metrics don’t spike with reconnect errors. That’s exactly the failure mode QUIC was built to tame.

Practical tip: if you keep in‑memory state on a single target (e.g., chat presence), match session lifetime to your QUIC idle timeouts. For higher resilience, push session state (tokens, presence, carts) into a shared store. That way a target loss doesn’t drop users on reconnect.

From zero to QUIC

Build

You’ll set up an NLB listener on UDP:443. You’ll forward to an EC2-based QUIC server. You’ll terminate TLS at the service. You’ll verify sticky sessions using CIDs. Treat this as an AWS Network Load Balancer tutorial you can run in a dev account.

Prereqs and diagramming

  • Targets: EC2 instances or IP targets that support QUIC/HTTP/3 (e.g., NGINX with QUIC, Envoy with HTTP/3, or a quic-go demo server)
  • Security groups and NACLs allowing UDP:443 to targets
  • Health check port open (TCP/HTTP)
  • For diagrams, grab the official AWS Network Load Balancer icon from AWS Architecture Icons

Before you start, define what “good” looks like. Example SLOs:

  • p50/p95 handshake time
  • p95 time-to-first-byte (TTFB)
  • Session migration success rate when IP changes
  • Error rate on fallback to HTTP/2

Step by step

1) Create target group (UDP):

  • Protocol: UDP, Port: 443
  • Target type: instance or IP
  • Health check: TCP or HTTP on a dedicated port (e.g., 8080)

2) Register targets:

  • Point to instances running your QUIC-capable server. For a quick test, use NGINX with HTTP/3 enabled or a quic-go sample. Terminate TLS with your certs on the service.

3) Create NLB:

  • Scheme: Internet‑facing (for public traffic) or internal (for private clients)
  • Listener: UDP:443 → forward to your UDP target group
  • Consider cross-zone load balancing if you want even traffic spread across AZs

4) Test connectivity:

  • Use a QUIC/HTTP/3 client (e.g., curl with HTTP/3 support) hitting https://your-nlb-dns
  • Validate that the server negotiates HTTP/3 and serves content

5) Validate stickiness:

  • Simulate IP change (switch Wi‑Fi/cellular or use a network tool)
  • Observe the same session continues without reconnecting to a new target
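Steps 1 to 3 can be sketched as boto3 `elbv2` calls. This is a sketch, not a verified recipe: the resource names are made up, and the `Protocol` values simply mirror the UDP:443 setup described above. The ELB API may expose dedicated protocol values or extra target settings for QUIC passthrough, so check the current docs before running it.

```python
def target_group_params(vpc_id, name="quic-demo-tg", health_port=8080):
    """Step 1: UDP target group with a TCP health check on a dedicated port."""
    return {
        "Name": name,
        "Protocol": "UDP",  # assumption: follows the article's UDP:443 description
        "Port": 443,
        "VpcId": vpc_id,
        "TargetType": "instance",
        "HealthCheckProtocol": "TCP",
        "HealthCheckPort": str(health_port),
    }


def load_balancer_params(subnet_ids, name="quic-demo-nlb", internet_facing=True):
    """Step 3a: the NLB itself."""
    return {
        "Name": name,
        "Type": "network",
        "Scheme": "internet-facing" if internet_facing else "internal",
        "Subnets": list(subnet_ids),
    }


def listener_params(lb_arn, tg_arn):
    """Step 3b: UDP:443 listener forwarding to the target group."""
    return {
        "LoadBalancerArn": lb_arn,
        "Protocol": "UDP",
        "Port": 443,
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": tg_arn}],
    }


def provision(elbv2, vpc_id, subnet_ids, instance_ids):
    """Run steps 1-3 against an elbv2 client and return the NLB DNS name."""
    tg = elbv2.create_target_group(**target_group_params(vpc_id))
    tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]
    elbv2.register_targets(  # step 2
        TargetGroupArn=tg_arn,
        Targets=[{"Id": instance_id, "Port": 443} for instance_id in instance_ids],
    )
    lb = elbv2.create_load_balancer(**load_balancer_params(subnet_ids))["LoadBalancers"][0]
    elbv2.create_listener(**listener_params(lb["LoadBalancerArn"], tg_arn))
    return lb["DNSName"]


# Usage, in a dev account with boto3 installed and credentials configured:
#   import boto3
#   dns = provision(boto3.client("elbv2"), "vpc-...", ["subnet-..."], ["i-..."])
```

Keeping the parameters in plain functions makes them easy to review and diff before anything touches your account.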

Quick demo server

Spin up a small EC2 instance (e.g., t4g.small). Install NGINX with QUIC or run a quic-go example server on UDP:443. Register it with the UDP target group. Then hit it with a QUIC-capable curl (curl --http3). You’ll see fast handshake behavior and stable sessions, even when network conditions change.
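If you want the curl check in a script, a minimal sketch looks like this. It assumes a curl build with HTTP/3 support on your PATH; the helper names are made up, and `https://your-nlb-dns` is the same placeholder used above.

```python
import os
import shutil
import subprocess


def curl_http3_argv(url):
    """curl command that prefers HTTP/3 and prints only the negotiated HTTP version."""
    return ["curl", "--http3", "-sS", "-o", os.devnull, "-w", "%{http_version}", url]


def negotiated_http_version(url, timeout_s=10.0):
    """Return the HTTP version curl reports (e.g. "3"), or None if curl is missing or silent."""
    if shutil.which("curl") is None:
        return None
    result = subprocess.run(
        curl_http3_argv(url), capture_output=True, text=True, timeout=timeout_s
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(negotiated_http_version("https://your-nlb-dns"))
```

A result of "3" means HTTP/3 was negotiated. Anything lower means the client fell back, which is your cue to check UDP reachability and ALPN.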

Sanity checks

  • Confirm ALPN offers h3 and your cert chain is valid from common mobile roots.
  • Ensure return UDP traffic can flow. Security groups are stateful, so replies are allowed automatically; NACLs are stateless, so they need explicit rules in both directions.
  • Decide on idle timeouts and how long you’ll keep stickiness/mapping for inactive sessions.
  • Document fallback: if UDP fails, you must serve the same endpoint on HTTP/2 over TCP:443.

Observability and algorithm playbook

How NLB picks targets

By default, NLB is a Layer 4 load balancer using a flow-hash approach. Think 5‑tuple: source IP/port, destination IP/port, protocol. It maps connections to targets. With QUIC passthrough, NLB still processes UDP flows. It can maintain session stickiness via QUIC Connection IDs. That keeps a given QUIC session anchored to a single target. Cross‑zone load balancing can smooth distribution across AZs if enabled.

Key implications for you:

  • Fewer mid-session target switches (less state churn)
  • More predictable cache/session hits on the target
  • Better tail latency under mobile IP flaps

Also note: if you autoscale quickly, newly added targets join the hashing pool. Watch distribution and consider pre‑warming if you expect traffic spikes. Think a live event or a product drop.

What to log

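Turn on NLB access logs to Amazon S3 for per-flow visibility. NLB access logs have historically covered only TLS listeners, so confirm in the current docs what they record for QUIC/UDP flows before you depend on them. You also get CloudWatch metrics to watch capacity and health. Start with: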

  • ActiveFlowCount, NewFlowCount (plus per-protocol variants such as ActiveFlowCount_UDP): How many concurrent and new flows you’re handling
  • ProcessedBytes: Throughput at the LB
  • HealthyHostCount / UnHealthyHostCount: Target health snapshot
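A minimal sketch of pulling one of these metrics with boto3’s CloudWatch `get_metric_statistics`. The helper name and the load balancer dimension value are placeholders; the query shape is standard, but which metric variants are populated for QUIC flows is something to confirm in your own account.

```python
from datetime import datetime, timedelta, timezone


def flow_metric_query(lb_dimension, metric_name="ActiveFlowCount", minutes=60, now=None):
    """Keyword arguments for cloudwatch.get_metric_statistics over the last N minutes."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/NetworkELB",
        "MetricName": metric_name,
        # The dimension value is the ARN suffix, e.g. "net/<name>/<id>" (placeholder here).
        "Dimensions": [{"Name": "LoadBalancer", "Value": lb_dimension}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Average", "Maximum"],
    }


# Usage, with boto3 installed and credentials configured:
#   import boto3
#   cw = boto3.client("cloudwatch")
#   points = cw.get_metric_statistics(**flow_metric_query("net/quic-demo-nlb/..."))["Datapoints"]
```

Run the same query for NewFlowCount and ProcessedBytes to get a first capacity picture.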

Pair this with VPC Flow Logs for subnet-level insights and service-level QUIC logging (qlog) on your targets. qlog gives you per-connection events like handshakes, losses, and CID changes. It makes QUIC troubleshooting humane.

Bonus: Wireshark understands QUIC and can decrypt with keys if you export them in a test environment. For production, rely on qlog-structured events to avoid touching secrets.

Troubleshooting tips

  • Handshake fails? Check that UDP:443 is open from client → NLB → targets. Many firewalls silently drop UDP.
  • Health checks flap? Use TCP/HTTP checks on a stable port. Verify your target isn’t rate limiting checks.
  • Clients fall back to HTTP/2? Confirm your server advertises HTTP/3 and your certs/ALPN are set right.
  • Spiky tail latency? Enable cross‑zone LB and scale targets. Inspect qlog for loss/retx patterns.
  • Packet loss on one cell network? Pace sends less aggressively and validate your congestion control choice.
  • Random stalls? Watch for MTU/fragmentation issues. Tune max datagram size and enable Path MTU Discovery.
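For the MTU item, the arithmetic is simple enough to sketch. RFC 9000 requires a path to carry UDP payloads of at least 1200 bytes; the header sizes below assume no IPv4 options or IPv6 extension headers, and the function names are illustrative.

```python
IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, without extension headers
UDP_HEADER = 8
QUIC_MIN_DATAGRAM = 1200  # minimum UDP payload a QUIC path must support (RFC 9000)


def max_udp_payload(path_mtu, ipv6=False):
    """Largest UDP payload that fits in one unfragmented packet on this path."""
    return path_mtu - (IPV6_HEADER if ipv6 else IPV4_HEADER) - UDP_HEADER


def path_supports_quic(path_mtu, ipv6=False):
    """True if the path can carry QUIC's minimum datagram size."""
    return max_udp_payload(path_mtu, ipv6) >= QUIC_MIN_DATAGRAM


if __name__ == "__main__":
    for mtu in (1500, 1400, 1280):
        print(mtu, max_udp_payload(mtu), max_udp_payload(mtu, ipv6=True))
```

If tunnels or VPNs shrink the path MTU, cap your server’s max datagram size below the computed payload instead of letting packets fragment.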

Tracing a sticky session

Enable qlog on the server. Make a request, then switch networks mid-stream. You’ll observe the same connection continuing across the migration, typically on a fresh server-issued Connection ID for the new path. Correlate with NLB access logs to see consistent target mapping. That’s QUIC stickiness doing exactly what you wanted.

Pro tip: visualize the trace with qvis to see handshake, RTT, losses, and migration as a timeline. It’s night-and-day easier than parsing raw logs.
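Once you have joined server-side and load-balancer-side records, a check for “pinballing” sessions can be this small. The record shape, a (session_id, target) pair, is hypothetical; building it from your qlog and LB logs is left to your pipeline.

```python
from collections import defaultdict


def targets_per_session(records):
    """Map each session id to the set of targets that served it."""
    seen = defaultdict(set)
    for session_id, target in records:
        seen[session_id].add(target)
    return dict(seen)


def pinballing_sessions(records):
    """Sessions that touched more than one target, i.e. stickiness was lost."""
    return sorted(s for s, targets in targets_per_session(records).items() if len(targets) > 1)


if __name__ == "__main__":
    sample = [("s1", "target-a"), ("s1", "target-a"), ("s2", "target-a"), ("s2", "target-b")]
    print(pinballing_sessions(sample))
```

An empty result across a network-switch test is the evidence you want before ramping traffic.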

Performance tuning

You don’t need to be a kernel wizard. A few practical knobs make QUIC feel even snappier:

  • UDP buffers: Increase OS receive/send buffers on your targets to handle bursts. Do it carefully, with testing.
  • Congestion control: Many stacks default to CUBIC. Try BBR in a controlled canary if your stack supports it.
  • Stream design: Use separate QUIC streams for independent data. Keep metadata and video segments uncoupled.
  • Retry-sensitive endpoints: Avoid 0‑RTT for requests that mutate state unless you’ve built replay protections.
  • CPU pinning: Most QUIC stacks run in user space. Give busy listeners dedicated cores under consistent load.
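For the UDP buffer knob, here is a minimal sketch using the standard socket options. The 4 MiB request is an arbitrary example, and the OS may grant less than you ask for (on Linux the ceiling comes from net.core.rmem_max and net.core.wmem_max), so always read the value back.

```python
import socket


def udp_socket_with_buffers(requested_bytes=4 * 1024 * 1024):
    """Create a UDP socket, request larger buffers, and report what the OS granted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
    granted = {
        "rcvbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        "sndbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    }
    return sock, granted


if __name__ == "__main__":
    sock, granted = udp_socket_with_buffers()
    print(granted)
    sock.close()
```

Most QUIC servers expose their own setting for this, so treat the snippet as a way to see what your host allows, not as a replacement for server config.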

Measure changes with real user metrics (RUM), not just lab tests. Your users’ networks are the truth.

Security compliance guardrails

You’re not giving up security by choosing QUIC passthrough on NLB. You’re moving it to where it belongs—your service.

  • End-to-end TLS 1.3: You manage certs, rotate keys, and enforce modern cipher suites.
  • mTLS (optional): For high-trust APIs, issue client certs and authenticate at the app.
  • DDoS posture: Protect at the edge with AWS Shield. Rate limit or challenge at the service.
  • Least privilege networking: Tight security groups and NACLs for UDP:443. Keep health check ports scoped.
  • Logging retention: Access logs (S3), qlog on servers, and flow logs form your audit trail.

Remember 0‑RTT: it’s awesome for latency but replayable. Only allow it for idempotent reads or when you’ve built replay fences.
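A replay fence can start as a simple method check. This sketch assumes your server stack can tell you whether a request arrived in early data (the `arrived_in_early_data` flag is hypothetical) and answers unsafe requests with 425 Too Early, the status defined in RFC 8470.

```python
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})
TOO_EARLY = 425  # "Too Early" status from RFC 8470


def early_data_status(method, arrived_in_early_data):
    """Return None to process the request, or 425 to make the client retry after the handshake."""
    if arrived_in_early_data and method.upper() not in SAFE_METHODS:
        return TOO_EARLY
    return None


if __name__ == "__main__":
    print(early_data_status("GET", True))    # processed
    print(early_data_status("POST", True))   # rejected until the handshake completes
    print(early_data_status("POST", False))  # processed
```

A safe method is only a starting point. If any of your GET endpoints mutate state, fence those too.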

Rollout playbook

  • Dual‑stack the endpoint (HTTP/3 + HTTP/2). Let clients choose.
  • Announce h3 via Alt-Svc gradually. Ramp from 5% → 25% → 50% → 100%.
  • Use feature flags in mobile apps to toggle QUIC if you need a fast off‑switch.
  • Watch p95/p99 TTFB and error rates per ASN/ISP. Some networks are just weird with UDP.
  • Keep a rollback: remove Alt-Svc, keep NLB listener, and fall back to HTTP/2 while you debug.
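The percentage ramp can be sketched with deterministic bucketing, so a client that got QUIC at 5% keeps it at 25%. The salt, function names, and use of a client ID are assumptions; wire this to whatever feature-flag system you already run.

```python
import hashlib

RAMP_STEPS = (5, 25, 50, 100)  # percentages from the playbook above


def rollout_bucket(client_id, salt="h3-rollout"):
    """Stable bucket in [0, 100) for a client; the salt keeps rollouts independent."""
    digest = hashlib.sha256(f"{salt}:{client_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def quic_enabled(client_id, percent):
    """True if this client falls inside the current ramp percentage."""
    return rollout_bucket(client_id) < percent


if __name__ == "__main__":
    clients = [f"device-{i}" for i in range(1000)]
    for step in RAMP_STEPS:
        enabled = sum(quic_enabled(c, step) for c in clients)
        print(f"{step:3d}% ramp -> {enabled} of {len(clients)} clients")
```

Setting the percentage to 0 is your off-switch: every client drops back to HTTP/2 on its next decision.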

Cost and capacity

  • NLB pricing is based on load balancer hours and LCUs (which factor in new flows, active flows, and bytes processed). QUIC/UDP doesn’t change that model.
  • Access logs land in S3—set lifecycle rules so you don’t hoard logs forever.
  • Right-size targets: QUIC termination is CPU-heavy. Match instance types to traffic patterns.
  • Pre-warm before big launches if you know a spike is coming.

Compatibility checklist

  • Major browsers (Chrome, Safari, Firefox, Edge) support HTTP/3.
  • curl supports HTTP/3 for testing, but only in builds with HTTP/3 enabled (look for HTTP3 in the feature list from curl --version).
  • Many mobile SDKs and modern OS stacks include QUIC support out of the box.

If you’re running a custom client (e.g., a game or streaming app), embed a proven QUIC library. Turn on qlog from day one. Future you will be grateful.

Quick recap

  • QUIC on NLB in passthrough trims latency by 25–30% for mobile and streaming.
  • You keep TLS at the service; NLB forwards UDP/443 without termination.
  • QUIC Connection IDs give you real sticky sessions across IP changes.
  • NLB uses flow hashing under the hood; stickiness rides on CIDs.
  • Turn on access logs, CloudWatch metrics, and qlog to see what’s happening.
  • Keep HTTP/2 as a safety net; ramp QUIC with Alt-Svc and feature flags.

FAQ

Does NLB terminate QUIC or TLS?

No. In QUIC passthrough, NLB forwards UDP traffic to your targets. You terminate TLS/QUIC at the service, preserving end‑to‑end encryption and full control of certificates.

How do sticky sessions work on NLB?

QUIC uses Connection IDs that persist across IP/port changes. NLB leverages these IDs to keep a session routed to the same target, avoiding mid‑session hops common with pure 5‑tuple hashing.

How do health checks work for QUIC?

Use TCP or HTTP health checks on a dedicated port. This keeps health signaling reliable while your app speaks QUIC on UDP:443.

Does stickiness work with cross-AZ traffic?

Yes. Enable cross‑zone load balancing if you want even distribution across Availability Zones. QUIC stickiness continues to work. Just ensure targets in all AZs handle the same session state model, or externalize state.

Where do I get the AWS Network Load Balancer icon?

Grab it from the official AWS Architecture Icons set. Using the correct icon helps teammates and auditors parse your design quickly.

How do I debug client fallbacks?

Check ALPN settings and certificates on your QUIC server. Confirm UDP:443 path is open. Review qlog traces. Many corporate networks still hamstring UDP—plan an HTTP/2 fallback path for those environments.

Can I use AWS WAF with NLB?

AWS WAF integrates with ALB and certain services at L7, not with NLB UDP listeners. For NLB, combine security groups, NACLs, Shield, and application-layer defenses at your targets.

Do I need to change my mid-tier or databases?

No. QUIC changes the client ↔ edge transport. Your mid‑tier can keep using HTTP/2 or gRPC over TCP internally. Start at the edge where users feel it most.

Checklist to ship QUIC

  • Create a UDP target group on port 443; set TCP/HTTP health checks.
  • Deploy/enable QUIC (HTTP/3) on targets; install TLS certs there.
  • Create an NLB with a UDP:443 listener forwarding to the target group.
  • Open security group/NACL rules for UDP:443 end to end.
  • Enable NLB access logs to S3 and CloudWatch metrics.
  • Test with curl --http3 and simulate IP changes to confirm stickiness.
  • Advertise h3 with Alt-Svc, start with a small ramp, and watch p95.
  • Keep HTTP/2 fallback ready; document your rollback.

In about 30 minutes, you can have a dev stack serving QUIC. Real users follow as you ramp.

Your north star is simple: make your app feel instant. QUIC on NLB is a rare upgrade that gives you speed without complexity. No TLS termination changes, no rewrite of your mid-tier, and a clear story for mobile reality. Think IP churn, NAT, and sketchy Wi‑Fi. Start with one latency‑sensitive path—video first frame, chat presence, or real-time metrics. Move it to QUIC on UDP:443 behind NLB. Measure. If you see the 25–30% latency drop, scale it. If not, check logs, validate UDP reachability, and profile server loss/retransmits with qlog. Either way, you’ll learn fast—and likely ship faster.

References

  • RFC 9000: QUIC: A UDP‑Based Multiplexed and Secure Transport, IETF — https://www.rfc-editor.org/rfc/rfc9000
  • RFC 9001: Using TLS to Secure QUIC, IETF — https://www.rfc-editor.org/rfc/rfc9001
  • Network Load Balancers — User Guide (AWS) — https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html
  • Access logs for Network Load Balancers (AWS) — https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-access-logs.html
  • Network Load Balancer CloudWatch metrics (AWS) — https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-cloudwatch-metrics.html
  • Target groups for Network Load Balancers (AWS) — https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html
  • AWS Architecture Icons — https://aws.amazon.com/architecture/icons/
  • RFC 9114: HTTP/3, IETF — https://www.rfc-editor.org/rfc/rfc9114
  • RFC 9002: QUIC Loss Detection and Congestion Control, IETF — https://www.rfc-editor.org/rfc/rfc9002
  • Alt-Svc (Alternative Services), IETF RFC 7838 — https://www.rfc-editor.org/rfc/rfc7838
  • MDN Web Docs: HTTP/3 — https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview#http3
  • Can I use: HTTP/3 — https://caniuse.com/http3
  • curl HTTP/3 docs — https://curl.se/docs/http3.html
  • Envoy HTTP/3 (QUIC) configuration — https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http3
  • Wireshark QUIC protocol support — https://www.wireshark.org/docs/dfref/q/quic.html
  • AWS Shield Advanced — https://docs.aws.amazon.com/waf/latest/developerguide/ddos-aws-shield.html
  • Elastic Load Balancing Pricing — https://aws.amazon.com/elasticloadbalancing/pricing/