Spring Boot HTTP Clients: RestTemplate vs RestClient vs WebClient - Which One To Use - A Hands-On Analysis
A Full Engineering and SRE Analysis Using a FOREX Qualification Business Application Workflow
A business application team I am aware of recently debated which Spring Boot HTTP client strategy to adopt: RestTemplate, RestClient, or WebClient. This team manages more than a handful of Spring Boot–based microservices and has historically used RestTemplate for calling downstream microservices.
To understand the differences in a practical, engineering-driven way, a complete reference implementation was developed based on a real business domain: a FOREX trade qualification workflow. In this workflow, every FX qualification request must synchronously call four downstream microservices in a strict request–response chain:
Customer → Product → Promo Code → Market-Data
Validate Customer → Determine FX Product(s) → Validate Promo Codes → Use Market-Data for Interest Calculation
All APIs are synchronous. There are no asynchronous endpoints. This makes the workload a perfect real-world lab for evaluating client behaviors.
Because this team also carries SRE responsibilities, the reference implementation was designed around the Four Golden Signals (latency, traffic, errors, and saturation), plus:
Prometheus metrics
Grafana dashboards
Jaeger distributed tracing
JVM / thread / Tomcat metrics
Resource sizing (CPU, RAM, concurrency)
Kubernetes deployment constraints
OpenTelemetry instrumentation
The FX qualification microservice was evaluated assuming production-grade load: 10,000 request-response calls per second per microservice pod in Azure Kubernetes Service.
All source code is available here:
https://github.com/javakishore-veleti/Forex-Qual-Engine-Rest-Reactive-Clients
The following sections walk through everything the team learned.
Why This Use Case Was Chosen
To move beyond theory, the team created a full working implementation using:
Spring Boot 3.5.x
RestTemplate
RestClient
WebClient
Downstream mocks using WireMock
Prometheus metrics
OpenTelemetry tracing
Grafana dashboards
Jupyter analytics notebook
Detailed thread / CPU / RAM saturation analysis
Kubernetes sizing guidance
This was not a toy example.
This was a realistic synchronous microservice evaluating a real business workflow.
Why this matters:
Many engineering teams continue using RestTemplate because “it has always worked”, even though:
It is blocking
It ties up a Tomcat thread
It struggles under concurrency
It is no longer recommended for most new code
Meanwhile:
RestClient is the modern successor to RestTemplate
WebClient uses non-blocking IO via Netty
The goal was to see how these client strategies behave in a strict synchronous environment, not a reactive pipeline.
The Reference Architecture
For each incoming request:
Client → Tomcat Thread → Qualify API Controller → Client Strategy (RestTemplate vs RestClient vs WebClient) → Customer API → Product API → Promo API → Market-Data API
This workflow was implemented three separate ways:
1. RestTemplate Strategy
Fully blocking
Tomcat thread is occupied during every downstream call
2. RestClient Strategy
A more modern version of RestTemplate
Cleaner configuration and error handling
Still blocking
Tomcat thread remains in wait state during downstream I/O
3. WebClient Strategy
Controller stays synchronous and uses .block()
Tomcat thread is still logically blocked
But actual downstream IO happens on Netty event loops
Non-blocking networking underneath
Much better scalability and resource efficiency
This gave a complete apples-to-apples comparison.
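To make the "same workflow, three strategies" idea concrete, here is a minimal JDK-only sketch of how such a comparison can be structured. The names `DownstreamCaller` and `QualificationWorkflow` are hypothetical and are not taken from the repository. Passing each step's output to the next step is also an assumption made only to keep the example small. In a real service, each strategy would be one implementation of the interface, backed by RestTemplate, RestClient, or WebClient with `.block()`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical strategy abstraction: one implementation per HTTP client. */
interface DownstreamCaller {
    /** Sends the payload to the named downstream step and waits for the reply. */
    String call(String step, String payload);
}

/** Hypothetical workflow that is identical for every client strategy. */
final class QualificationWorkflow {
    // Step order from the article: Customer -> Product -> Promo Code -> Market-Data.
    static final List<String> STEPS = List.of("customer", "product", "promo", "market-data");

    private final DownstreamCaller caller;

    QualificationWorkflow(DownstreamCaller caller) {
        this.caller = caller;
    }

    /** Runs the four steps strictly in order and returns each step's reply, keyed by step name. */
    Map<String, String> qualify(String request) {
        Map<String, String> replies = new LinkedHashMap<>();
        String payload = request;
        for (String step : STEPS) {
            // Assumption for illustration: each step consumes the previous step's reply.
            payload = caller.call(step, payload);
            replies.put(step, payload);
        }
        return replies;
    }
}
```

Because the workflow depends only on the interface, swapping the client changes nothing in the controller or the step order, which is what keeps the comparison apples-to-apples.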
How Each HTTP Client Behaves in a Request–Response Workflow
Comparison Table
| Behavior / Component | RestTemplate | RestClient | WebClient (using .block()) |
|---|---|---|---|
| I/O Model | Blocking | Blocking | Non-blocking I/O under the hood |
| Tomcat Thread | Fully blocked | Fully blocked | Logically blocked but IO handled by Netty event loops |
| Thread Usage | Scales poorly | Scales poorly | Very efficient; fewer threads needed |
| Latency Under Load | Degrades quickly | Slightly better | Much flatter latency curve |
| Connection Pool | Legacy | Modern | Highly optimized (Netty) |
| Timeout Control | Basic | Stronger | Most flexible and predictable |
| Saturation Behavior | Threads saturate quickly | Slight improvement | Handles high concurrency far better |
| 10k RPS Suitability | Poor | Moderate | Best choice |
| OTEL Integration | Basic | Good | Excellent |
Why Tomcat Thread Blocking Still Matters
In a synchronous Spring Boot controller:
Tomcat assigns a worker thread
The thread enters the controller
The thread issues downstream HTTP calls
The thread waits for responses
The thread returns the final response
This means:
RestTemplate and RestClient
→ Each downstream call keeps the Tomcat thread blocked
→ High concurrency = many blocked threads
→ Large thread pools = high CPU + RAM + GC pressure
WebClient (even with .block())
→ Tomcat thread is blocked logically
→ But actual IO is performed by Netty worker threads
→ Tomcat threads are not wasted on socket waiting
→ Thread count stays flat
→ Higher concurrency before saturation
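The split between "the request thread waits" and "a small number of I/O threads do the socket work" can be modelled with plain JDK classes. The sketch below is not Spring, Tomcat, or Netty code. A single-threaded scheduler stands in for an event loop, a fixed pool stands in for Tomcat workers, and waiting on a `CompletableFuture` stands in for `.block()`. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** JDK-only model of request threads waiting on I/O completed by one event-loop thread. */
final class BlockingOnAsyncIoDemo {

    /** Runs `callers` concurrent requests and returns the names of the threads that completed the simulated I/O. */
    static Set<String> run(int callers, long ioMillis) {
        ScheduledExecutorService ioLoop = Executors.newSingleThreadScheduledExecutor();
        ExecutorService requestThreads = Executors.newFixedThreadPool(callers);
        Set<String> ioThreadNames = ConcurrentHashMap.newKeySet();
        try {
            List<Callable<String>> requests = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                final int id = i;
                requests.add(() -> {
                    CompletableFuture<String> reply = new CompletableFuture<>();
                    // The "downstream call" completes later, on the single I/O thread.
                    ioLoop.schedule(() -> {
                        ioThreadNames.add(Thread.currentThread().getName());
                        reply.complete("reply-" + id);
                    }, ioMillis, TimeUnit.MILLISECONDS);
                    // The request thread waits here, much like a controller calling .block().
                    return reply.get(5, TimeUnit.SECONDS);
                });
            }
            for (Future<String> f : requestThreads.invokeAll(requests)) {
                f.get(); // surfaces any failure from a request
            }
            return ioThreadNames;
        } catch (Exception e) {
            throw new IllegalStateException("demo failed", e);
        } finally {
            requestThreads.shutdownNow();
            ioLoop.shutdownNow();
        }
    }
}
```

Calling `BlockingOnAsyncIoDemo.run(8, 50)` completes eight concurrent requests with a single I/O thread. Note what the model also shows: each in-flight request still holds one request thread while it waits, so the request-thread pool remains a sizing concern in a synchronous controller.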
What SREs Want to See
The reference implementation measures:
End-to-end workflow latency
Step-by-step latency (customer, product, promo, market-data)
Thread consumption (Tomcat, Netty, JVM threads)
CPU usage
GC pauses
Memory usage
Prometheus counters, gauges, and summaries
Jaeger distributed traces
Saturation behavior under increasing concurrency
The microservice emits many useful Prometheus metrics from:
JVM
Tomcat
Executors
Custom FX workflow timings
Step-level timings
Traffic volume
Error counts
Some example metric families:
fxqual_step_duration_ms_seconds_*
fxqual_workflow_duration_ms_seconds_*
http_server_requests_seconds_*
jvm_memory_used_bytes
jvm_threads_live_threads
jvm_threads_peak_threads
executor_active_threads
process_cpu_usage
These reveal clear capacity and performance characteristics.
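Step-level timings such as the `fxqual_step_duration_*` family come from wrapping each downstream call in a timer. The sketch below is a JDK-only stand-in that shows the wrapping pattern. It is not the repository's instrumentation and does not use a metrics library. The class name `StepTimings` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** JDK-only stand-in for per-step timing; a real service would publish these through its metrics library. */
final class StepTimings {
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Runs the step, records its duration even if it throws, and returns its result. */
    <T> T time(String step, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            // Record the total before the count so a reader never sees a count without a total.
            totalNanos.computeIfAbsent(step, k -> new LongAdder()).add(elapsed);
            counts.computeIfAbsent(step, k -> new LongAdder()).increment();
        }
    }

    /** Number of recorded executions for the step, or 0 if it never ran. */
    long count(String step) {
        LongAdder c = counts.get(step);
        return c == null ? 0L : c.sum();
    }

    /** Mean duration in milliseconds, or 0.0 if the step never ran. */
    double meanMillis(String step) {
        long n = count(step);
        LongAdder total = totalNanos.get(step);
        return (n == 0 || total == null) ? 0.0 : total.sum() / 1_000_000.0 / n;
    }
}
```

A call site would look like `timings.time("customer", () -> caller.call("customer", payload))`, repeated for each of the four steps, so that a slow step can be told apart from a slow workflow.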
What We Learned (Deep Dive With Prometheus, Grafana & Jaeger)
Every API invocation contributes to:
OpenTelemetry traces
Prometheus metrics
Grafana dashboards
Thread and JVM insights
Workflow step timings
The system was examined end-to-end:
HTTP clients
Tomcat thread behavior
JVM garbage collection and memory
Downstream API latency
Saturation patterns as concurrency increased
Below are the learnings.
1. RestTemplate saturates fastest
Tomcat threads increase sharply
JVM thread count grows
CPU usage rises quickly
Latency increases nonlinearly
At high throughput (10k RPS), queueing and request timeouts appear
2. RestClient behaves slightly better
Cleaner API
More modern configuration
But still blocking
Still struggles at very high concurrency
3. WebClient performs best — even when used synchronously with .block()
Even though the controller is synchronous and uses:
    webClient.post()
        .uri(...)
        .body(...)
        .retrieve()
        .bodyToMono(...)
        .block();
The benefits still hold because of Netty’s event-loop model:
Downstream I/O is non-blocking
Tomcat thread is blocked only logically
Actual socket operations handled by Netty
Very small number of Netty threads handle all I/O
Thread count remains stable
Latency remains flatter under pressure
CPU usage is lower
Throughput remains stable even as concurrency rises
This indicates that, for this workload, WebClient offers clear advantages even when not using reactive chains.
Kubernetes Deployment & Resource Sizing for 10,000 RPS
A realistic starting point for the FX Qualification microservice:
CPU: 6–8 vCPUs per pod
RAM: 8 GB per pod
Tomcat thread pool: 200–300
WebClient: default Netty event loops
Network: downstream services in same VNet for minimal latency
Throughput expectation: significantly higher with WebClient
RestTemplate and RestClient require larger thread pools and result in higher CPU/GC pressure.
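A quick way to sanity-check a thread pool size against a throughput target is Little's Law: average requests in flight equals arrival rate multiplied by average time in the system. The sketch below applies it to the numbers in this section. The 80 ms end-to-end latency used in the usage note (four sequential downstream calls at 20 ms each) is a hypothetical figure for illustration, not a measurement from the reference implementation. The class and method names are also hypothetical.

```java
/** Little's Law helpers for request-thread sizing; all inputs are caller-supplied assumptions. */
final class ThreadSizing {

    /** Average number of requests in flight, which is also the number of request threads held if each request holds one. */
    static long threadsInFlight(double requestsPerSecond, double latencyMillis) {
        return (long) Math.ceil(requestsPerSecond * latencyMillis / 1000.0);
    }

    /** Highest average end-to-end latency (ms) a pool of the given size can sustain at the given rate. */
    static double maxLatencyMillis(int poolSize, double requestsPerSecond) {
        return poolSize * 1000.0 / requestsPerSecond;
    }
}
```

With the assumed 80 ms latency, `ThreadSizing.threadsInFlight(10_000, 80)` gives 800 requests in flight at 10,000 RPS. Going the other way, `ThreadSizing.maxLatencyMillis(300, 10_000)` gives 30 ms, meaning a 300-thread pool sustains 10,000 RPS only while the average end-to-end latency stays at or below 30 ms. The calculation applies to any strategy in which the request thread is held for the whole call, so measured step latencies are the input that matters most when choosing the pool size.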
Final Recommendation
For synchronous request-response microservices with high concurrency, such as the FX Qualification workflow:
WebClient is the best choice.
Even if:
Your controller is synchronous
You call .block()
Your workflow is 100% request–response
You are not using reactive programming
You still gain:
Better latency
Better thread efficiency
Better CPU usage
Better saturation behavior
Stronger OTEL integration
Better timeout/backpressure control
Higher sustainable throughput per pod
RestTemplate no longer meets the needs of modern distributed systems at scale.
Explore the Full Reference Implementation
The repo includes:
Complete working FX Qualification Engine
RestTemplate, RestClient, WebClient strategies
Downstream mocks
Prometheus + OTEL + Grafana
Analytics notebook
Performance dashboards
SRE-grade metrics + tracing
Kubernetes manifests
If your team is evaluating HTTP client strategies in Spring Boot, this reference implementation will save weeks of experimentation.