Spring Boot HTTP Clients: RestTemplate vs RestClient vs WebClient - Which One To Use - A Hands-On Analysis

A Full Engineering and SRE Analysis Using a FOREX Qualification Business Application Workflow

A business application team I know of recently debated which Spring Boot HTTP client to standardize on: RestTemplate, RestClient, or WebClient. The team manages more than a handful of Spring Boot-based microservices and has historically used RestTemplate to call downstream microservices.

To understand the differences in a practical, engineering-driven way, a complete reference implementation was developed based on a real business domain: a FOREX trade qualification workflow. In this workflow, every FX qualification request must synchronously call four downstream microservices in a strict request–response chain:

Customer -> Product -> Promo Code -> Market-Data

Validate Customer → Determine FX Product(s) → Validate Promo Codes → Use Market-Data For Interest Calculation
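The strict ordering above can be sketched in plain Java. This is a minimal illustration, not the reference implementation: `QualificationContext`, `QualificationChain`, and the step names are hypothetical, and each step stands in for one blocking downstream call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical request state passed along the chain; field names are illustrative. */
record QualificationContext(String customerId, List<String> completedSteps) {
    /** Returns a copy of this context with one more completed step recorded. */
    QualificationContext withStep(String step) {
        List<String> next = new ArrayList<>(completedSteps);
        next.add(step);
        return new QualificationContext(customerId, List.copyOf(next));
    }
}

/** Runs the downstream steps one after another on the calling thread. */
final class QualificationChain {
    private final List<UnaryOperator<QualificationContext>> steps;

    QualificationChain(List<UnaryOperator<QualificationContext>> steps) {
        this.steps = List.copyOf(steps);
    }

    QualificationContext qualify(String customerId) {
        QualificationContext ctx = new QualificationContext(customerId, List.of());
        for (UnaryOperator<QualificationContext> step : steps) {
            // Each step must finish before the next starts; an exception aborts the chain.
            ctx = step.apply(ctx);
        }
        return ctx;
    }
}
```

A chain built from four steps (customer, product, promo, market-data) returns a context listing them in that order. If any step throws, the later steps never run, which is the behavior a strict request-response chain implies.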

All APIs are synchronous. There are no asynchronous endpoints. This makes the workload a perfect real-world lab for evaluating client behaviors.

Because this team also carries SRE responsibilities, the reference implementation was designed around the Four Golden Signals (latency, traffic, errors, and saturation), plus:

  • Prometheus metrics

  • Grafana dashboards

  • Jaeger distributed tracing

  • JVM / thread / Tomcat metrics

  • Resource sizing (CPU, RAM, concurrency)

  • Kubernetes deployment constraints

  • OpenTelemetry instrumentation

The FX qualification microservice was evaluated assuming production-grade load: 10,000 request-response calls per second per microservice pod in Azure Kubernetes Service.

The following sections walk through everything the team learned.

Why This Use Case Was Chosen

To move beyond theory, the team created a full working implementation using:

  • Spring Boot 3.5.x

  • RestTemplate

  • RestClient

  • WebClient

  • Downstream mocks using WireMock

  • Prometheus metrics

  • OpenTelemetry tracing

  • Grafana dashboards

  • Jupyter analytics notebook

  • Detailed thread / CPU / RAM saturation analysis

  • Kubernetes sizing guidance

This was not a toy example.
This was a realistic synchronous microservice evaluating a real business workflow.

Why this matters:
Many engineering teams continue using RestTemplate because “it has always worked”, even though:

  • It is blocking

  • It ties up a Tomcat thread

  • It struggles under concurrency

  • It is no longer recommended for most new code

Meanwhile:

  • RestClient is the modern successor to RestTemplate

  • WebClient uses non-blocking IO via Netty

The goal was to see how these client strategies behave in a strict synchronous environment, not a reactive pipeline.

The Reference Architecture

For each incoming request:

Client → Tomcat Thread → Qualify API Controller → Client Strategy (RestTemplate vs RestClient vs WebClient) → Customer API → Product API → Promo API → Market-Data API

This workflow was implemented three separate ways:

1. RestTemplate Strategy

  • Fully blocking

  • Tomcat thread is occupied during every downstream call

2. RestClient Strategy

  • A more modern version of RestTemplate

  • Cleaner configuration and error handling

  • Still blocking

  • Tomcat thread remains in wait state during downstream I/O

3. WebClient Strategy

  • Controller stays synchronous and uses .block()

  • Tomcat thread is still logically blocked

  • But actual downstream IO happens on Netty event loops

  • Non-blocking networking underneath

  • Much better scalability and resource efficiency

This gave a complete apples-to-apples comparison.

How Each HTTP Client Behaves in a Request–Response Workflow

Comparison Table

| Behavior / Component | RestTemplate | RestClient | WebClient (using .block()) |
|---|---|---|---|
| I/O Model | Blocking | Blocking | Non-blocking I/O under the hood |
| Tomcat Thread | Fully blocked | Fully blocked | Logically blocked, but I/O handled by Netty event loops |
| Thread Usage | Scales poorly | Scales poorly | Very efficient; fewer threads needed |
| Latency Under Load | Degrades quickly | Slightly better | Much flatter latency curve |
| Connection Pool | Legacy | Modern | Highly optimized (Netty) |
| Timeout Control | Basic | Stronger | Most flexible and predictable |
| Saturation Behavior | Threads saturate quickly | Slight improvement | Handles high concurrency far better |
| 10k RPS Suitability | Poor | Moderate | Best choice |
| OTEL Integration | Basic | Good | Excellent |

Why Tomcat Thread Blocking Still Matters

In a synchronous Spring Boot controller:

  1. Tomcat assigns a worker thread

  2. The thread enters the controller

  3. The thread issues downstream HTTP calls

  4. The thread waits for responses

  5. The thread returns the final response

This means:

RestTemplate and RestClient
→ Each downstream call keeps the Tomcat thread blocked
→ High concurrency = many blocked threads
→ Large thread pools = high CPU + RAM + GC pressure

WebClient (even with .block())
→ Tomcat thread is blocked logically
→ But actual IO is performed by Netty worker threads
→ Tomcat threads are not wasted on socket waiting
→ Thread count stays flat
→ Higher concurrency before saturation
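The split between "the caller waits" and "something else does the I/O" can be imitated with the JDK alone. The sketch below is an analogy under stated assumptions, not Spring or Netty code: a single scheduler thread named `event-loop` plays the part of the Netty event loop, a fixed pool plays the Tomcat workers, and `join()` plays the part of `.block()`. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** JDK-only stand-in for "non-blocking client + blocking caller". */
final class EventLoopSketch {

    /** Simulated downstream call: one shared scheduler thread completes the response later. */
    static CompletableFuture<String> callDownstream(ScheduledExecutorService eventLoop,
                                                    String name, long latencyMs) {
        CompletableFuture<String> response = new CompletableFuture<>();
        eventLoop.schedule(
                () -> response.complete(name + ":" + Thread.currentThread().getName()),
                latencyMs, TimeUnit.MILLISECONDS);
        return response;
    }

    /** Runs `requests` concurrent callers; each one parks on join() until its response arrives. */
    static List<String> run(int requests, long latencyMs) {
        ScheduledExecutorService eventLoop =
                Executors.newSingleThreadScheduledExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService workers = Executors.newFixedThreadPool(requests); // stand-in for Tomcat workers
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                String name = "req-" + i;
                // The worker thread is held here for the whole call, as it is with .block().
                Callable<String> task = () -> callDownstream(eventLoop, name, latencyMs).join();
                pending.add(workers.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get(5, TimeUnit.SECONDS));
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("simulation failed", e);
        } finally {
            workers.shutdownNow();
            eventLoop.shutdownNow();
        }
    }
}
```

Every result is completed on the one `event-loop` thread regardless of how many callers are waiting. Each caller still occupies a worker thread until its response arrives, which is why the worker pool size stays a capacity limit in a synchronous controller.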

What SREs Want to See

The reference implementation measures:

  • End-to-end workflow latency

  • Step-by-step latency (customer, product, promo, market-data)

  • Thread consumption (Tomcat, Netty, JVM threads)

  • CPU usage

  • GC pauses

  • Memory usage

  • Prometheus counters, gauges, and summaries

  • Jaeger distributed traces

  • Saturation behavior under increasing concurrency

The microservice emits many useful Prometheus metrics from:

  • JVM

  • Tomcat

  • Executors

  • Custom FX workflow timings

  • Step-level timings

  • Traffic volume

  • Error counts

Some example metric families:

  • fxqual_step_duration_ms_seconds_*

  • fxqual_workflow_duration_ms_seconds_*

  • http_server_requests_seconds_*

  • jvm_memory_used_bytes

  • jvm_threads_live_threads

  • jvm_threads_peak_threads

  • executor_active_threads

  • process_cpu_usage

These reveal clear capacity and performance characteristics.
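Step-level timing of the kind listed above boils down to wrapping each downstream call and recording how often it ran and how long it took in total. The sketch below is a JDK-only stand-in and not the repository's code: `StepTimings` and its methods are hypothetical, and a real service would normally delegate this to its metrics library rather than hand-roll it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Minimal per-step timer that tracks a call count and a total duration for each step name. */
final class StepTimings {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    /** Runs one workflow step and records its duration, even when the step throws. */
    <T> T time(String step, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            counts.computeIfAbsent(step, k -> new LongAdder()).increment();
            totalNanos.computeIfAbsent(step, k -> new LongAdder()).add(System.nanoTime() - start);
        }
    }

    /** Number of times the step has been timed so far. */
    long count(String step) {
        LongAdder c = counts.get(step);
        return c == null ? 0 : c.sum();
    }

    /** Total time spent in the step, in seconds. */
    double totalSeconds(String step) {
        LongAdder n = totalNanos.get(step);
        return n == null ? 0.0 : n.sum() / 1_000_000_000.0;
    }
}
```

Dividing `totalSeconds` by `count` gives the mean latency of a step, which is the same calculation a dashboard performs on a sum series and a count series.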

What We Learned (Deep Dive With Prometheus, Grafana & Jaeger)

Every API invocation feeds:

  • OpenTelemetry traces

  • Prometheus metrics

  • Grafana dashboards

  • Thread and JVM insights

  • Workflow step timings

The system was examined end-to-end:

  • HTTP clients

  • Tomcat thread behavior

  • JVM garbage collection and memory

  • Downstream API latency

  • Saturation patterns as concurrency increased

Below are the learnings.

1. RestTemplate saturates fastest

  • Tomcat threads increase sharply

  • JVM thread count grows

  • CPU usage rises quickly

  • Latency increases nonlinearly

  • At high throughput (10k RPS), queueing and request timeouts appear
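The queueing effect can be reproduced with a toy model: a fixed pool of workers, a bounded backlog, and a downstream that does not answer. This is an analogy built on `java.util.concurrent.ThreadPoolExecutor`, not Tomcat's actual connector or executor, and the class name and parameters are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Toy model of a fixed worker pool with a bounded backlog in front of a slow downstream. */
final class SaturationSketch {

    /** Submits `requests` tasks that all wait on the downstream; returns how many were rejected. */
    static int rejectedCount(int workerThreads, int queueCapacity, int requests) {
        CountDownLatch downstreamReplies = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workerThreads, workerThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            downstreamReplies.await(); // worker is stuck waiting on the downstream
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // every worker is busy and the backlog is full
                }
            }
        } finally {
            downstreamReplies.countDown(); // let the workers finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

With 2 workers and a backlog of 1, the fourth concurrent request is turned away: two are running, one is queued, and there is nowhere to put the next one. In a real server the overflow shows up as connection queueing and timeouts rather than an exception, but the capacity arithmetic is the same.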

2. RestClient behaves slightly better

  • Cleaner API

  • More modern configuration

  • But still blocking

  • Still struggles at very high concurrency

3. WebClient performs best, even when used synchronously with .block()

The controller is synchronous and calls:

    webClient.post()
        .uri(...)
        .body(...)
        .retrieve()
        .bodyToMono(...)
        .block();

The benefits still hold because of Netty’s event-loop model:

  • Downstream I/O is non-blocking

  • Tomcat thread is blocked only logically

  • Actual socket operations handled by Netty

  • Very small number of Netty threads handle all I/O

  • Thread count remains stable

  • Latency remains flatter under pressure

  • CPU usage is lower

  • Throughput remains stable even as concurrency rises

This shows that WebClient offers clear advantages even when not using reactive chains.

Kubernetes Deployment & Resource Sizing for 10,000 RPS

A realistic starting point for the FX Qualification microservice:

  • CPU: 6–8 vCPUs per pod

  • RAM: 8 GB per pod

  • Tomcat thread pool: 200–300

  • WebClient: default Netty event loops

  • Network: downstream services in same VNet for minimal latency

  • Throughput expectation: significantly higher with WebClient

RestTemplate and RestClient require larger thread pools and result in higher CPU/GC pressure.
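A quick sanity check on these numbers uses Little's Law: requests in flight = arrival rate × time in system. The sketch below assumes steady state and that each in-flight request holds one Tomcat thread from start to finish, as it does in a synchronous controller. The latency figures are illustrative inputs, not measurements from the reference implementation, and `ThreadSizing` is a hypothetical helper.

```java
/** Back-of-the-envelope sizing with Little's Law: in-flight requests = rate x time in system. */
final class ThreadSizing {

    /** Threads held at steady state when every in-flight request pins one thread. */
    static long threadsNeeded(long requestsPerSecond, long workflowLatencyMs) {
        return (requestsPerSecond * workflowLatencyMs + 999) / 1000; // round up
    }

    /** Largest end-to-end latency (ms) that a pool of `threads` can absorb at the given rate. */
    static long maxLatencyMs(long threads, long requestsPerSecond) {
        return threads * 1000 / requestsPerSecond; // round down
    }
}
```

At 10,000 requests per second, a pool of 200 to 300 threads leaves a budget of 20 to 30 ms for the entire four-call chain. If the chain took 100 ms instead, the same rate would keep about 1,000 requests in flight, so the latency budget and the thread pool size have to be chosen together.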

Final Recommendation

For synchronous request-response microservices with high concurrency, such as the FX Qualification workflow:

WebClient is the best choice.

Even if:

  • Your controller is synchronous

  • You call .block()

  • Your workflow is 100% request–response

  • You are not using reactive programming

You still gain:

  • Better latency

  • Better thread efficiency

  • Better CPU usage

  • Better saturation behavior

  • Stronger OTEL integration

  • Better timeout/backpressure control

  • Higher sustainable throughput per pod

RestTemplate no longer meets the needs of modern distributed systems at scale.

Explore the Full Reference Implementation

The repo includes:

  • Complete working FX Qualification Engine

  • RestTemplate, RestClient, WebClient strategies

  • Downstream mocks

  • Prometheus + OTEL + Grafana

  • Analytics notebook

  • Performance dashboards

  • SRE-grade metrics + tracing

  • Kubernetes manifests

If your team is evaluating HTTP client strategies in Spring Boot, this reference implementation will save weeks of experimentation.