Technology Illumination
Spring Boot HTTP Clients: RestTemplate vs RestClient vs WebClient - Which One To Use - A Hands-On Analysis

A Full Engineering and SRE Analysis Using a FOREX Qualification Business Application Workflow

Aruna Kishore Veleti
December 07, 2025

A business application team I know recently debated which Spring Boot HTTP client strategy to adopt: RestTemplate, RestClient, or WebClient. The team manages more than a handful of Spring Boot-based microservices and has historically used RestTemplate to call downstream microservices.

To understand the differences in a practical, engineering-driven way, a complete reference implementation was developed based on a real business domain: a FOREX trade qualification workflow. In this workflow, every FX qualification request must synchronously call four downstream microservices in a strict request–response chain:

Customer -> Product -> Promo Code -> Market-Data

Validate Customer → Determine FX Product(s) → Validate Promo Codes → Use Market-Data for Interest Calculation

All APIs are synchronous. There are no asynchronous endpoints. This makes the workload a perfect real-world lab for evaluating client behaviors.

Because this team also carries SRE responsibilities, the reference implementation was designed around the Four Golden Signals (latency, traffic, errors, and saturation), plus:

Prometheus metrics
Grafana dashboards
Jaeger distributed tracing
JVM / thread / Tomcat metrics
Resource sizing (CPU, RAM, concurrency)
Kubernetes deployment constraints
OpenTelemetry instrumentation

The FX qualification microservice was evaluated assuming production-grade load: 10,000 request-response calls per second per microservice pod in Azure Kubernetes Service.

All source code is available here:
https://github.com/javakishore-veleti/Forex-Qual-Engine-Rest-Reactive-Clients

The following sections walk through everything the team learned.

Why This Use Case Was Chosen

To move beyond theory, the team created a full working implementation using:

Spring Boot 3.5.x
RestTemplate
RestClient
WebClient
Downstream mocks using WireMock
Prometheus metrics
OpenTelemetry tracing
Grafana dashboards
Jupyter analytics notebook
Detailed thread / CPU / RAM saturation analysis
Kubernetes sizing guidance

This was not a toy example.
This was a realistic synchronous microservice evaluating a real business workflow.

Why this matters:
Many engineering teams continue using RestTemplate because “it has always worked”—even though:

It is blocking
It ties up a Tomcat thread
It struggles under concurrency
It is no longer recommended for most new code

Meanwhile:

RestClient is the modern successor to RestTemplate
WebClient uses non-blocking IO via Netty

The goal was to see how these client strategies behave in a strict synchronous environment, not a reactive pipeline.

The Reference Architecture

For each incoming request:

Client → Tomcat Thread → Qualify API Controller → Client Strategy (RestTemplate vs RestClient vs WebClient) → Customer API → Product API → Promo API → Market-Data API

This workflow was implemented three separate ways:

1. RestTemplate Strategy

Fully blocking
Tomcat thread is occupied during every downstream call

2. RestClient Strategy

A more modern version of RestTemplate
Cleaner configuration and error handling
Still blocking
Tomcat thread remains in wait state during downstream I/O

3. WebClient Strategy

Controller stays synchronous and uses .block()
Tomcat thread is still logically blocked
But actual downstream IO happens on Netty event loops
Non-blocking networking underneath
Much better scalability and resource efficiency

This gave a complete apples-to-apples comparison.
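
One way to keep such a comparison apples-to-apples is to put all three clients behind a single interface, so the workflow code is identical and only the client changes. The following is a minimal JDK-only sketch, not the repository's code: `DownstreamClientStrategy`, `QualifyWorkflow`, and the stub return values are hypothetical names chosen for illustration, and the stub stands in for a real RestTemplate, RestClient, or WebClient call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstraction: one synchronous downstream call, whatever client sits behind it. */
interface DownstreamClientStrategy {
    String call(String service, String payload);
}

final class QualifyWorkflow {
    // The four downstream services, in the strict order the workflow requires.
    static final List<String> CHAIN = List.of("customer", "product", "promo", "market-data");

    private final DownstreamClientStrategy client;

    QualifyWorkflow(DownstreamClientStrategy client) {
        this.client = client;
    }

    /** Runs the chain on the calling thread; each step receives the previous step's output. */
    List<String> qualify(String request) {
        List<String> trail = new ArrayList<>();
        String payload = request;
        for (String service : CHAIN) {
            payload = client.call(service, payload); // returns only when the call completes
            trail.add(service + ":" + payload);
        }
        return trail;
    }

    /** Stub strategy (assumption): a real one would wrap RestTemplate, RestClient, or WebClient. */
    static DownstreamClientStrategy stub(String clientName) {
        return (service, payload) -> service + "-ok-via-" + clientName;
    }
}
```

Swapping `QualifyWorkflow.stub("web-client")` for `QualifyWorkflow.stub("rest-template")` changes only the strategy instance, so any latency or thread difference measured around `qualify` can be attributed to the client rather than to the workflow.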

How Each HTTP Client Behaves in a Request–Response Workflow

Comparison Table

Behavior / Component	RestTemplate	RestClient	WebClient (using `.block()`)
I/O Model	Blocking	Blocking	Non-blocking I/O under the hood
Tomcat Thread	Fully blocked	Fully blocked	Logically blocked but IO handled by Netty event loops
Thread Usage	Scales poorly	Scales poorly	Very efficient; fewer threads needed
Latency Under Load	Degrades quickly	Slightly better	Much flatter latency curve
Connection Pool	Legacy	Modern	Highly optimized (Netty)
Timeout Control	Basic	Stronger	Most flexible and predictable
Saturation Behavior	Threads saturate quickly	Slight improvement	Handles high concurrency far better
10k RPS Suitability	Poor	Moderate	Best choice
OTEL Integration	Basic	Good	Excellent
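
The Timeout Control row is easier to picture with code. Reactor offers operators such as `Mono.timeout(Duration)` and `block(Duration)` for bounding a wait; the sketch below shows the same idea using only the JDK's `CompletableFuture`, so it is an analogy rather than WebClient code. `BoundedWait` and its fallback value are hypothetical.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedWait {
    /**
     * Waits for an asynchronous result, but never longer than the given limit.
     * Returns the fallback if the limit is exceeded; other failures are rethrown.
     */
    static String awaitOrFallback(CompletableFuture<String> pending, Duration limit, String fallback) {
        try {
            return pending
                .orTimeout(limit.toMillis(), TimeUnit.MILLISECONDS) // fail the future after the limit
                .join();                                            // the calling thread parks here
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                return fallback;
            }
            throw e;
        }
    }
}
```

The calling thread is still occupied while it waits, but the wait has an explicit upper bound, which is what keeps a slow downstream service from holding request threads indefinitely.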

Why Tomcat Thread Blocking Still Matters

In a synchronous Spring Boot controller:

Tomcat assigns a worker thread
The thread enters the controller
The thread issues downstream HTTP calls
The thread waits for responses
The thread returns the final response

This means:

RestTemplate and RestClient
→ Each downstream call keeps the Tomcat thread blocked
→ High concurrency = many blocked threads
→ Large thread pools = high CPU + RAM + GC pressure

WebClient (even with .block())
→ Tomcat thread is blocked logically (it waits for the result)
→ But actual I/O is performed by Netty event-loop threads
→ The Tomcat thread waits on the result of the call, not on a blocking socket read
→ Thread count stays flat
→ Higher concurrency before saturation
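
The WebClient-with-`.block()` pattern can be imitated with JDK classes alone. In this hedged simulation (not a benchmark, and not Netty), a scheduled pool of `ioThreads` threads plays the role of the event loops, `join()` plays the role of `.block()`, and the 20 ms delay is an arbitrary stand-in for downstream latency. `BlockOnAsyncDemo` is a hypothetical name.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class BlockOnAsyncDemo {
    /**
     * Simulates `callers` request threads that each wait on a result completed by a
     * shared pool of `ioThreads` threads. Returns the names of the threads that
     * actually completed the simulated I/O.
     */
    static Set<String> run(int callers, int ioThreads) {
        ScheduledExecutorService io = Executors.newScheduledThreadPool(ioThreads);
        ExecutorService requestThreads = Executors.newFixedThreadPool(callers);
        Set<String> completers = ConcurrentHashMap.newKeySet();
        try {
            CompletableFuture<?>[] requests = new CompletableFuture<?>[callers];
            for (int i = 0; i < callers; i++) {
                requests[i] = CompletableFuture.runAsync(() -> {
                    CompletableFuture<String> response = new CompletableFuture<>();
                    // The simulated response arrives 20 ms later on an I/O thread, not on the caller.
                    io.schedule(() -> {
                        completers.add(Thread.currentThread().getName());
                        response.complete("200 OK");
                    }, 20, TimeUnit.MILLISECONDS);
                    response.join(); // analogous to .block(): the request thread parks here
                }, requestThreads);
            }
            CompletableFuture.allOf(requests).join(); // wait until every simulated request finishes
            return completers;
        } finally {
            io.shutdown();
            requestThreads.shutdown();
        }
    }
}
```

Calling `BlockOnAsyncDemo.run(50, 2)` parks 50 request threads while at most two threads ever complete the simulated I/O, which mirrors the split described above: request threads wait, and a small fixed pool does the network work.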

What SREs Want to See

The reference implementation measures:

End-to-end workflow latency
Step-by-step latency (customer, product, promo, market-data)
Thread consumption (Tomcat, Netty, JVM threads)
CPU usage
GC pauses
Memory usage
Prometheus counters, gauges, and summaries
Jaeger distributed traces
Saturation behavior under increasing concurrency

The microservice emits many useful Prometheus metrics from:

JVM
Tomcat
Executors
Custom FX workflow timings
Step-level timings
Traffic volume
Error counts

Some example metric families:

fxqual_step_duration_ms_seconds_*
fxqual_workflow_duration_ms_seconds_*
http_server_requests_seconds_*
jvm_memory_used_bytes
jvm_threads_live_threads
jvm_threads_peak_threads
executor_active_threads
process_cpu_usage

These reveal clear capacity and performance characteristics.
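
Step-level timings such as the `fxqual_step_duration` family come down to wrapping each downstream call with a timer. The article does not show how the repository records them, so the class below is only a JDK-based sketch with a hypothetical name (`StepTimings`); in a Spring Boot service a Micrometer `Timer` would normally do this job and export to Prometheus.

```java
import java.util.HashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.function.Supplier;

final class StepTimings {
    // Duration statistics in nanoseconds, keyed by step name (e.g. "customer", "promo").
    private final Map<String, LongSummaryStatistics> nanosByStep = new HashMap<>();

    /** Runs one workflow step and records its wall-clock duration, even if the step throws. */
    <T> T timed(String step, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            record(step, System.nanoTime() - start);
        }
    }

    private synchronized void record(String step, long nanos) {
        nanosByStep.computeIfAbsent(step, k -> new LongSummaryStatistics()).accept(nanos);
    }

    /** Number of recorded invocations for a step; 0 if the step was never timed. */
    synchronized long count(String step) {
        LongSummaryStatistics stats = nanosByStep.get(step);
        return stats == null ? 0 : stats.getCount();
    }

    /** Mean duration of a step in milliseconds; 0.0 if the step was never timed. */
    synchronized double meanMillis(String step) {
        LongSummaryStatistics stats = nanosByStep.get(step);
        return stats == null ? 0.0 : stats.getAverage() / 1_000_000.0;
    }
}
```

A call such as `timings.timed("customer", () -> client.call(...))` leaves the workflow code unchanged while producing per-step counts and averages that can be compared across the three client strategies.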

What We Learned (Deep Dive With Prometheus, Grafana & Jaeger)

Every API invocation produces:

OpenTelemetry traces
Prometheus metrics
Grafana dashboards
Thread and JVM insights
Workflow step timings

The system was examined end-to-end:

HTTP clients
Tomcat thread behavior
JVM garbage collection and memory
Downstream API latency
Saturation patterns as concurrency increased

Below are the learnings.

1. RestTemplate saturates fastest

Tomcat threads increase sharply
JVM thread count grows
CPU usage rises quickly
Latency increases nonlinearly
At high throughput (10k RPS), queueing and request timeouts appear

2. RestClient behaves slightly better

Cleaner API
More modern configuration
But still blocking
Still struggles at very high concurrency

3. WebClient performs best — even when used synchronously with `.block()`

Even though the controller is synchronous and uses:

webClient.post()
    .uri(...)
    .body(...)
    .retrieve()
    .bodyToMono(...)
    .block();

The benefits still hold because of Netty’s event-loop model:

Downstream I/O is non-blocking
Tomcat thread is blocked only logically
Actual socket operations handled by Netty
Very small number of Netty threads handle all I/O
Thread count remains stable
Latency remains flatter under pressure
CPU usage is lower
Throughput remains stable even as concurrency rises

These observations indicate that WebClient offers clear advantages even when reactive chains are not used.

Kubernetes Deployment & Resource Sizing for 10,000 RPS

A realistic starting point for the FX Qualification microservice:

CPU: 6–8 vCPUs per pod
RAM: 8 GB per pod
Tomcat thread pool: 200–300
WebClient: default Netty event loops
Network: downstream services in same VNet for minimal latency
Throughput expectation: significantly higher with WebClient

RestTemplate and RestClient require larger thread pools and result in higher CPU/GC pressure.
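
The thread-pool figure can be sanity-checked with Little's Law: average in-flight requests equal arrival rate multiplied by average time in the system. The sketch below applies it to a synchronous controller, where each in-flight request holds one Tomcat thread. `ThreadSizing` is a hypothetical helper, and the 40 ms latency used in the usage note is an illustrative assumption, not a measured result.

```java
final class ThreadSizing {
    /** Little's Law: average in-flight requests = arrival rate (req/s) x average time in system (s). */
    static long inFlightRequests(double requestsPerSecond, double avgLatencyMillis) {
        return (long) Math.ceil(requestsPerSecond * avgLatencyMillis / 1000.0);
    }

    /** Highest average latency (ms) that `threads` blocking request threads can sustain at the given rate. */
    static double maxLatencyMillis(int threads, double requestsPerSecond) {
        return threads * 1000.0 / requestsPerSecond;
    }
}
```

For example, `ThreadSizing.inFlightRequests(10_000, 40)` gives 400 concurrent requests, and `ThreadSizing.maxLatencyMillis(300, 10_000)` gives 30.0, meaning a 300-thread pool sustains 10,000 RPS only while the whole four-call chain averages 30 ms or less.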

Final Recommendation

For synchronous request-response microservices with high concurrency, such as the FX Qualification workflow:

WebClient is the best choice.

Even if:

Your controller is synchronous
You call .block()
Your workflow is 100% request–response
You are not using reactive programming

You still gain:

Better latency
Better thread efficiency
Better CPU usage
Better saturation behavior
Stronger OTEL integration
Better timeout/backpressure control
Higher sustainable throughput per pod

RestTemplate no longer meets the needs of modern distributed systems at scale.

Explore the Full Reference Implementation

GitHub Repository:
https://github.com/javakishore-veleti/Forex-Qual-Engine-Rest-Reactive-Clients

The repo includes:

Complete working FX Qualification Engine
RestTemplate, RestClient, WebClient strategies
Downstream mocks
Prometheus + OTEL + Grafana
Analytics notebook
Performance dashboards
SRE-grade metrics + tracing
Kubernetes manifests

If your team is evaluating HTTP client strategies in Spring Boot, this reference implementation can save weeks of experimentation.