Reactive Programming (Paradigm) and how to work with it in Spring

Introduction

It is evident that modern applications are facing increasingly demanding requirements. The systems we built just a few years ago, monolithic applications or even simple CRUD microservices serving modest user bases, are now starting to struggle when they need to handle thousands of concurrent users, each expecting near-instantaneous responses. This shift has pushed some developers to rethink fundamental architectural patterns.

So, there's this book called "Designing Data-Intensive Applications" where Martin Kleppmann describes this evolution perfectly. Today's users expect millisecond responses whether they're the only person on your site or sharing it with thousands of others. It's a reality that forces us to build applications that are more scalable, elastic, robust, and resilient than ever before.

In this environment of high-performance APIs, a paradigm called "Reactive Programming" has emerged as a compelling solution to these challenges. What started as a niche paradigm has moved toward the mainstream, particularly for applications dealing with high concurrency or integrating with asynchronous systems. If you've attended a Java/Kotlin technical interview recently, you've likely encountered questions about reactive programming, perhaps wondering if it's just another industry buzzword or genuinely the answer to modern performance requirements.

In this post I'll be focusing on reactive programming in Spring, covering the topics below:

  1. What is reactive programming
  2. How Spring implements the reactive paradigm
  3. Benefits you can expect
  4. Limitations
  5. Common anti-patterns
  6. Inter-service communication in reactive systems
  7. Best practices

What Is Reactive Programming?

Reactive programming is a declarative programming paradigm concerned with data streams and the propagation of change. Instead of pulling data (as in imperative programming), reactive programming lets you set up data streams where changes propagate automatically. An example could be the difference between checking your email every 5 minutes (imperative) versus getting push notifications when new emails arrive (reactive).
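
If we translate that analogy into code with Project Reactor (Spring's reactive toolkit, covered below), the "push notification" side looks roughly like this sketch; emailStream and showNotification are hypothetical stand-ins:

// reactive style: you subscribe once, and the callback runs whenever a new value is pushed into the stream
emailStream                                        // a Flux<Email> fed by some event source (hypothetical)
    .filter { it.isImportant }                     // changes propagate through the declared pipeline
    .subscribe { email -> showNotification(email) }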

The closest thing to a source of truth about the paradigm is the Reactive Manifesto (you should probably check it out as well), which emerged from the need to build systems that are:

  • Responsive: They respond in a timely manner
  • Resilient: They stay responsive in the face of failure
  • Elastic: They stay responsive under varying workload
  • Message-driven: They rely on asynchronous message-passing

This isn't just theoretical: these principles directly address real-world challenges in modern distributed systems. As our applications scaled up to serve thousands of concurrent users, the limitations of thread-per-request models became really painful.

The shift is basically: Threads -> Events

Traditional Spring MVC applications typically use a thread-per-request model. If your application receives 200 concurrent requests, it needs approximately 200 threads to handle them. This is fine for modest concurrency and standard applications, but there are some things to watch out for at high concurrency levels and higher TPS (let's say above 1000 TPS):

  1. Threads are expensive: Each thread consumes memory (often ~1MB of stack space)
  2. Context switching is costly: The CPU spends time switching between threads
  3. Most threads spend their time waiting: Threads sit idle during I/O operations

Consider this example of a traditional Spring MVC endpoint that makes a blocking database call (this is probably how your API works):

@GetMapping("/customers/{id}")
public Customer getCustomer(@PathVariable Long id) {
    // this thread blocks while waiting for the database
    return customerRepository.findById(id)
           .orElseThrow(() -> new CustomerNotFoundException(id));
}

During the database call, this thread is blocked—consuming resources but doing no useful work. Now multiply this by thousands of concurrent requests, and you can see why this model struggles at scale =D.

Reactive programming takes a different approach, using a small number of threads and an event loop model similar to Node.js:

@GetMapping("/customers/{id}")
public Mono<Customer> getCustomer(@PathVariable Long id) {
    // this doesn't block - it returns immediately with a "promise" of a customer
    return customerRepository.findById(id)
           .switchIfEmpty(Mono.error(new CustomerNotFoundException(id)));
}

In this reactive version, the thread handling the request isn't blocked during the database call. Instead, it processes other requests and is notified when the data is ready. This allows each thread to handle many more requests, significantly improving resource utilization.

Key Concepts in Reactive Programming

To understand reactive programming, you'll need a few fundamental concepts:

  1. Publisher/Subscriber Model: Data sources emit values, and subscribers process them
  2. Backpressure: Subscribers can tell publishers to slow down if they can't keep up (You can find a fancier description on the manifesto if you want)
  3. Composition: Operations can be chained together to create complex data flows
  4. Error Handling: Errors propagate through the stream and can be handled at any point

NOTE: (Easter Egg) If you look closely at the reactive programming composition model, you might notice references to functional programming concepts like monads, functors, and applicatives. This is not a coincidence :) ... In fact, Mono<T> and Flux<T> are essentially monadic containers that let you compose operations while maintaining context around success, failure, and completion. If you're interested in some of these topics, check out my previous post on how monads can improve error handling in Kotlin - many of the same principles apply to reactive streams.
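
To make those four concepts concrete, here's a tiny sketch using Project Reactor (formally introduced in the next section); the numbers are arbitrary:

Flux.range(1, 1_000)                        // a publisher emitting the values 1..1000
    .limitRate(100)                         // backpressure: request data upstream in batches of 100
    .map { it * 2 }                         // composition: chain transformations together
    .onErrorResume { Flux.empty<Int>() }    // error handling: recover anywhere in the stream
    .subscribe { println(it) }              // a subscriber that processes each value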

Reactive Programming in Spring

Spring's reactive support is primarily built on Project Reactor, which implements the Reactive Streams specification and provides two main types:

  1. Mono: Represents a stream of 0 or 1 item (Like an Optional<T>)
  2. Flux: Represents a stream of 0 to N items (Like a List<T>)
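
In code, the two types look like this (a minimal sketch, with made-up values):

val one: Mono<String> = Mono.just("a single value")        // 0..1 items, like an Optional<T>
val none: Mono<String> = Mono.empty()                      // completes without emitting anything
val many: Flux<Int> = Flux.fromIterable(listOf(1, 2, 3))   // 0..N items, like a List<T>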

These building blocks are then integrated throughout the Spring ecosystem:

  • Spring WebFlux: A reactive alternative to Spring MVC
  • Spring Data Reactive: Reactive repositories for various databases
  • Spring Cloud Gateway: A reactive API gateway
  • Spring Security Reactive: Security for reactive applications

This is what a reactive Spring application looks like:

@RestController
class CustomerController(private val customerRepository: ReactiveCustomerRepository) {
    
    @GetMapping("/customers")
    fun getAllCustomers(): Flux<Customer> {
        return customerRepository.findAll()
    }
    
    @GetMapping("/customers/{id}")
    fun getCustomer(@PathVariable id: Long): Mono<Customer> {
        return customerRepository.findById(id)
            .switchIfEmpty(Mono.error(CustomerNotFoundException(id)))
    }
    
    @PostMapping("/customers")
    fun createCustomer(@RequestBody customer: Mono<Customer>): Mono<Customer> {
        return customer.flatMap { customerRepository.save(it) }
    }
}

Notice how everything returns either a Mono or a Flux, maintaining the reactive nature throughout. This is crucial in reactive programming: if even one operation blocks a thread, that thread is stuck waiting and can't process other requests. A single blocking call destroys the main advantage of reactive programming, which is the ability to handle many requests with few threads. It doesn't matter if 99% of your code is reactive: that one blocking call will limit your entire application's throughput to what that blocked thread can handle.

Understanding the Reactive Flow

To understand how data flows through a reactive system, let's trace a request through a typical Spring WebFlux(reactive) application:

  1. Request arrives: The server accepts the HTTP request on one of its event loop threads
  2. Handler mapping: The request is routed to the appropriate controller method
  3. Controller processing: The controller returns a Mono or Flux representing the future response
  4. Repository access: The repository asynchronously fetches data without blocking
  5. Response assembly: Once data is available, it's assembled into a response
  6. Response sent: The response is written back to the client

At each step, the thread isn't left waiting: it can process other requests while asynchronous operations complete. This is where the efficiency gains come from.

Benefits of Reactive Programming in Spring

1. Improved Resource Utilization

Reactive applications can handle more concurrent requests with fewer threads, leading to better CPU and memory utilization. According to research published by the Spring team [9], reactive applications generally require fewer resources to achieve the same throughput levels as their servlet-based counterparts. The efficiency gains become more pronounced as concurrency increases, particularly for I/O-bound applications.

The key factors behind this improvement are:

  • Thread efficiency: Fewer threads handle more requests
  • Reduced context switching: Lower CPU overhead from thread management
  • Better I/O utilization: Non-blocking I/O operations maximize resource usage

When your application needs to scale to handle thousands of concurrent connections, these advantages can bring hardware savings or throughput improvements.

2. More Responsive Under Load

One of the most compelling benefits of reactive systems is how they degrade under load. Traditional systems might become completely unresponsive when overloaded, while reactive systems typically experience graceful degradation.

Consider a scenario where your system suddenly receives a spike in traffic:

  • Traditional approach: Thread pool gets exhausted, requests queue up, and everything slows down
  • Reactive approach: System processes what it can based on available resources and applies backpressure

Look at this example:

// traditional approach - webmvc: might crash under heavy load
@PostMapping("/send-batch")
fun sendNotifications(@RequestBody notifications: List<Notification>): ResponseEntity<String> {
    notifications.forEach { notificationService.send(it) }
    return ResponseEntity.ok("Notifications queued")
}

// reactive approach - handles load gracefully with backpressure :_)
@PostMapping("/send-batch")
fun sendNotifications(@RequestBody notifications: Flux<Notification>): Mono<ResponseEntity<String>> {
    return notifications
        .flatMap({ notification -> notificationService.send(notification) }, 10) // process at most 10 concurrently
        .then(Mono.just(ResponseEntity.ok("Notifications processed")))
}

The reactive version controls how many notifications are processed concurrently, preventing system overload.

3. Better Integration with Async Systems

Modern applications often integrate with asynchronous systems like message queues, event streams, and webhooks. Reactive programming provides a natural fit for these inherently asynchronous interactions.

Compare these approaches for consuming Kafka messages:

Traditional (@KafkaListener):

@Service
class TraditionalKafkaListener(private val transactionProcessor: TransactionProcessor) {

    private val logger = LoggerFactory.getLogger(javaClass)

    @KafkaListener(topics = ["transactions"])
    fun processTransaction(transaction: Transaction) {
        if (transaction.amount > 0) {
            try {
                transactionProcessor.process(transaction) // blocks this listener thread until done
            } catch (e: Exception) {
                logger.error("Failed to process transaction", e)
            }
        }
    }
}

Reactive:

@Service
class ReactiveProcessor(
    private val kafkaReceiver: KafkaReceiver<String, Transaction>,
    private val transactionProcessor: TransactionProcessor // process() returns a Mono here
) {

    private val logger = LoggerFactory.getLogger(javaClass)

    fun processTransactions() {
        kafkaReceiver.receive()
            .filter { it.value().amount > 0 }
            .flatMap { transactionProcessor.process(it.value()) }
            .doOnError { e -> logger.error("Error in processing", e) }
            .retry(3)
            .subscribe()
    }
}

What is the KafkaReceiver?
KafkaReceiver comes from the reactor-kafka library, a reactive Kafka client built on Project Reactor. Unlike the traditional @KafkaListener which processes messages one by one with blocking threads, KafkaReceiver connects to Kafka topics and transforms the incoming messages into a Flux stream. This allows you to process Kafka messages using the full reactive streams toolkit - applying transformations, filtering, error handling, and backpressure - all non-blocking.

4. Enhanced Resilience Patterns

The reactive programming model makes it easier to implement resilience patterns like circuit breakers, bulkheads, and timeouts. For example, implementing a timeout with a fallback is really simple:

fun getProductDetails(id: String): Mono<ProductDetails> {
    return webClient.get()
        .uri("/products/{id}", id)
        .retrieve()
        .bodyToMono(ProductDetails::class.java)
        .timeout(Duration.ofSeconds(1))
        .onErrorResume { e -> 
            when (e) {
                // this code automatically falls back to a default response if the product service doesn't respond within 1s
                is TimeoutException -> Mono.just(ProductDetails.fallback(id)) 
                else -> Mono.error(e)
            }
        }
}

Limitations

Despite its benefits, reactive programming isn't without challenges. Here are some of the issues you'll likely encounter:

1. Learning Curve

Reactive programming requires a significant mental shift from imperative programming. Concepts like backpressure, cold vs. hot streams, and functional composition are genuinely hard when you're starting out.

In my experience, it takes developers around 3-6 months to become truly comfortable with reactive programming, and even longer to master advanced skills like debugging complex reactive pipelines (I can't do this either lol).

2. Debug Complexity

One of the most frustrating aspects of reactive programming is debugging. Stack traces often don't tell you where the actual problem occurred, as operations are composed and executed asynchronously.

Consider this stack trace from a reactive application:

java.lang.NullPointerException: Cannot invoke "Customer.getName()" because "customer" is null
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:113)
    at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onNext(FluxPeekFuseable.java:210)
    at reactor.core.publisher.FluxFilter$FilterSubscriber.onNext(FluxFilter.java:113)
    at reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:426)
    ...more reactor internals...

Good luck finding where in your code the null customer originated...

To address this, you can use the following (a quick sketch comes after the list):

  • checkpoint() operators to add context to stack traces
  • log() to see the data flow at specific points (no miracle here :c)
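
A minimal sketch of both operators in a pipeline (the pipeline itself is made up):

customerRepository.findAll()
    .checkpoint("after findAll")        // adds this description to the stack trace if an error passes through here
    .filter { it.active }
    .log("customer-pipeline")           // logs onNext/onError/onComplete signals at this point
    .map { it.name }
    .checkpoint("after mapping to names")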

3. Ecosystem Maturity

While the core reactive libraries in Spring are mature, some parts of the ecosystem are still being built out. Many third-party libraries and legacy systems don't provide reactive APIs, requiring you to integrate reactive and non-reactive code carefully.

  • Database Drivers: While R2DBC provides reactive SQL access, many specialized database drivers still only offer blocking APIs
  • Third-Party Integrations: Many payment gateways, external APIs, and legacy systems haven't adopted reactive patterns
  • File Systems: Local file operations often still rely on blocking I/O
  • JPA/Hibernate: Traditional ORM tools aren't designed for reactive use cases, requiring alternative approaches like R2DBC

For example, if you're using a library with only blocking APIs, you'll need to isolate it in a separate thread pool:

fun callBlockingLibrary(): Mono<Result> {
    return Mono.fromCallable {
        blockingLibrary.doSomething() // Blocking call
    }.subscribeOn(Schedulers.boundedElastic())
}

Basically, Schedulers.boundedElastic() provides a dedicated pool of threads designed specifically for blocking operations. These threads are separate from your application's main event loop threads, so a single blocking call won't stall the rest of your application.

4. Operational Complexity

Reactive systems introduce some operational challenges that differ from traditional applications. Here are a few of them:

Metrics Collection

In traditional applications, metrics are straightforward: one thread, one request, one timer. In reactive applications, operations flow through multiple asynchronous stages, making measurement more complex.

Solution: use metrics tooling that can follow the reactive flow:

fun processOrder(orderId: String): Mono<OrderResult> {
    // create a timer sample at the start
    val sample = Timer.start(meterRegistry)
    
    return orderRepository.findById(orderId)
        .flatMap { order -> 
            paymentService.process(order)
        }
        .flatMap { payment ->
            notificationService.notify(payment)
        }
        .doFinally { signalType ->
            // record the duration when the flow completes (success or error)
            sample.stop(meterRegistry.timer("order.processing", 
                "outcome", signalType.name.toLowerCase()))
        }
}

Micrometer's Timer.Sample allows you to track the entire reactive flow, including error states.

Distributed Tracing

When requests span multiple services, tracing becomes essential. But reactive flows complicate this by breaking the thread-per-request model.

Solution: Spring Boot 3.x integrates Micrometer Tracing (which replaces Spring Cloud Sleuth):

# application.yml
spring:
  application:
    name: order-service

management:
  tracing:
    sampling:
      probability: 1.0  # trace all requests
    propagation:
      type: w3c

With proper configuration, trace context automatically propagates through reactive chains, even across service boundaries via HTTP or messaging:

@RestController
class OrderController(
    private val orderService: OrderService,
    private val enrichmentService: EnrichmentService,
    private val tracer: Tracer // io.micrometer.tracing.Tracer injected as a bean
) {

    @GetMapping("/orders/{id}")
    fun getOrder(@PathVariable id: String): Mono<OrderDetails> {
        // traceId and spanId automatically propagate through the reactive chain
        return orderService.findOrder(id)
            .flatMap { order ->
                // create and start a child span for the enrichment step
                val span = tracer.nextSpan()
                    .name("enrich-order-details")
                    .tag("orderId", id)
                    .start()
                enrichmentService.addDetails(order)
                    .doFinally { span.end() } // close the span even on error or cancellation
            }
    }
}

The doFinally block ensures the span is ended even if an error occurs or the subscription is cancelled, playing a role similar to try-with-resources in Java.

Common Anti-Patterns in Reactive Spring

I've done some research on anti-patterns around the use of reactive programming in Spring, but to be honest I didn't find a lot of useful material. So here are some issues I've run into in production environments I've worked on; you should probably avoid them.

1. Blocking Inside Reactive Code

The cardinal sin of reactive programming is introducing blocking calls in your reactive pipeline:

// DON'T DO THIS
fun getCustomerDetails(id: Long): Mono<CustomerDetails> {
    return customerRepository.findById(id)
        .map { customer ->
            // blocking call inside reactive pipeline
            val creditScore = blockingCreditScoreService.getScore(customer.id)
            CustomerDetails(customer, creditScore)
        }
}

This code defeats the purpose of reactive programming by blocking threads within the reactive flow. Instead:

fun getCustomerDetails(id: Long): Mono<CustomerDetails> {
    return customerRepository.findById(id)
        .flatMap { customer ->
            // wrap the blocking call with Mono.fromCallable and move to dedicated scheduler
            Mono.fromCallable { blockingCreditScoreService.getScore(customer.id) }
                .subscribeOn(Schedulers.boundedElastic())
                .map { creditScore -> CustomerDetails(customer, creditScore) }
        }
}

2. Reactive Overkill

Not every application needs to be reactive. If your application doesn't have high concurrency requirements or doesn't interact with asynchronous systems, reactive programming might introduce unnecessary complexity.

For a simple CRUD application with modest traffic, traditional Spring MVC is often simpler and more than adequate.

When to Consider Reactive ☝️🤓 akcthually

While there's no universal threshold, research and real-world experience suggest these approximate guidelines:

  • Concurrent Users: Consider reactive when expecting >500 concurrent users on modest hardware. A 2022 study by Krefter et al. [11] found that Spring MVC applications started showing thread pool saturation around 800-1000 concurrent users on 4-core machines, while reactive applications maintained consistent response times.

  • Transactions Per Second: Traditional MVC typically handles up to ~1000 TPS efficiently on standard hardware before thread pool tuning becomes critical. Reactive shows clearer benefits beyond this point.

  • Response Time Under Load: If your 99th percentile response time must stay under 200-300ms during traffic spikes, reactive offers more consistent latency distributions when properly implemented.

  • I/O Wait Ratios: If profiling shows your application spends >30% of time in I/O wait states, reactive programming's efficiency gains become more pronounced.

  • Connection Lifespan: For applications with long-lived connections (WebSockets, SSE) or streaming responses, consider reactive once you expect >1000 simultaneous connections.

3. Ignoring Backpressure

Backpressure is one of the key features of reactive streams, allowing consumers to signal producers to slow down. Ignoring backpressure can lead to OutOfMemoryError when producers emit faster than consumers can handle.

Picture this example:

// no backpressure handling
fileService.readLargeFile(path) // returns Flux<Chunk>
    .flatMap { chunk -> processChunk(chunk) } // no explicit concurrency limit

Instead, specify concurrency limits:

// safely controls concurrency
fileService.readLargeFile(path)
    .flatMap({ chunk -> processChunk(chunk) }, 10) // process at most 10 chunks concurrently
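
Beyond capping flatMap concurrency, Reactor also has dedicated operators for shaping demand. A small sketch (the buffer size and batch size are arbitrary):

fileService.readLargeFile(path)
    .onBackpressureBuffer(10_000)                  // buffer up to 10k chunks if the consumer falls behind
    .limitRate(100)                                // request data from upstream in batches of 100
    .flatMap({ chunk -> processChunk(chunk) }, 10) // still cap concurrent processing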

Inter-Service Communication in Reactive Systems

Modern applications rarely exist in isolation. They communicate with other services, databases, message queues, and external APIs. Here's how to handle these communications reactively:

1. HTTP Communication with WebClient

Spring WebFlux provides WebClient, a reactive alternative to RestTemplate:

@Service
class ProductService(private val webClient: WebClient) {
    
    fun getProductDetails(id: String): Mono<ProductDetails> {
        return webClient.get()
            .uri("/products/{id}", id)
            .retrieve()
            .bodyToMono(ProductDetails::class.java)
            .timeout(Duration.ofSeconds(1))
            .onErrorResume(WebClientResponseException::class.java) { e ->
                when (e.statusCode) {
                    HttpStatus.NOT_FOUND -> Mono.empty()
                    else -> Mono.error(e)
                }
            }
    }
}

WebClient supports streaming responses, backpressure, and non-blocking I/O, making it VERY GOOD for reactive applications.
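
For example, streaming a collection with bodyToFlux lets you process elements as they arrive instead of buffering the whole response in memory (a small sketch, assuming the downstream service exposes a /products listing):

fun streamAllProducts(): Flux<ProductDetails> {
    return webClient.get()
        .uri("/products")
        .retrieve()
        .bodyToFlux(ProductDetails::class.java) // elements are emitted as they are read from the response
}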

2. Reactive Database Access

Spring Data Reactive provides reactive repositories for various databases:

@Repository
interface CustomerRepository : ReactiveCrudRepository<Customer, Long> {
    
    fun findByLastName(lastName: String): Flux<Customer>
    
    fun countByStatus(status: CustomerStatus): Mono<Long>
}

Some of the supported databases include:

  • MongoDB (Spring Data MongoDB Reactive)
  • Cassandra (Spring Data Cassandra Reactive)
  • Redis (Spring Data Redis Reactive)
  • R2DBC for relational databases (MySQL, PostgreSQL, MS SQL, H2...)
  • The AWS SDK v2 async clients for services like DynamoDB can also work, if you adapt the CompletableFuture they return into a Mono/Flux (a quick sketch follows this list)
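
For the AWS case, the adaptation mentioned in the last bullet is basically wrapping the CompletableFuture returned by the async client. A hedged sketch, assuming the SDK v2 DynamoDbAsyncClient and a prebuilt GetItemRequest:

fun getItem(client: DynamoDbAsyncClient, request: GetItemRequest): Mono<GetItemResponse> {
    // Mono.fromFuture defers the call and completes when the CompletableFuture completes
    return Mono.fromFuture { client.getItem(request) }
}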

3. Circuit Breaking and Resilience

For resilient inter-service communication, Resilience4j provides reactive circuit breakers:

@Service
class ResilientProductService(
    private val webClient: WebClient,
    private val circuitBreakerRegistry: CircuitBreakerRegistry
) {

    private val circuitBreaker = circuitBreakerRegistry.circuitBreaker("productService")

    fun getProductDetails(id: String): Mono<ProductDetails> {
        return webClient.get()
            .uri("/products/{id}", id)
            .retrieve()
            .bodyToMono(ProductDetails::class.java)
            // wrap the call with the resilience4j-reactor circuit breaker operator
            .transformDeferred(CircuitBreakerOperator.of(circuitBreaker))
            // fall back to a default response when the call fails or the circuit is open
            .onErrorResume { Mono.just(ProductDetails.fallback(id)) }
    }
}

This pattern prevents cascading failures when downstream services are unresponsive.
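
In case you're wondering where the "productService" instance above comes from: with resilience4j it can be configured via Spring Boot properties or programmatically. A minimal programmatic sketch with made-up thresholds:

@Configuration
class CircuitBreakerConfiguration {

    @Bean
    fun circuitBreakerRegistry(): CircuitBreakerRegistry {
        val config = CircuitBreakerConfig.custom()
            .failureRateThreshold(50f)                       // open the circuit when 50% of calls fail
            .slidingWindowSize(20)                           // measured over the last 20 calls
            .waitDurationInOpenState(Duration.ofSeconds(30)) // stay open for 30s before probing again
            .build()
        return CircuitBreakerRegistry.of(config)
    }
}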

Best Practices for Reactive Spring Applications

Based on my experience, here are some best practices for building reactive Spring applications:

1. Reactive All the Way Through

For maximum benefit, your application should be reactive from end to end. A single blocking operation can negate many of the advantages of reactive programming.

Audit your dependencies to ensure they provide reactive APIs, and isolate any blocking operations in dedicated thread pools using subscribeOn(Schedulers.boundedElastic()).

2. Control Concurrency Explicitly

Be explicit about concurrency in your reactive pipelines:

// process at most 10 orders concurrently
orderRepository.findAll()
    .flatMap({ order -> processOrder(order) }, 10)

This prevents overwhelming downstream systems and ensures your application uses resources efficiently.

3. Make Good Use of Operators

Project Reactor provides a rich set of operators for common operations. Here are some:

  • Transformation: map, flatMap, flatMapSequential
  • Filtering: filter, take, skip
  • Combination: zip, merge, concat
  • Error handling: onErrorResume, onErrorContinue, retry

Using the right operator can simplify your code significantly :)
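
For example, a quick sketch that combines a few of them (the services and the Dashboard type are hypothetical):

fun getDashboard(customerId: Long): Mono<Dashboard> {
    return Mono.zip(
            customerService.findById(customerId),                   // Mono<Customer>
            orderService.findRecentOrders(customerId).collectList() // Flux<Order> -> Mono<List<Order>>
        )
        .map { tuple -> Dashboard(tuple.t1, tuple.t2) }             // combine both results
        .onErrorResume { Mono.just(Dashboard.empty(customerId)) }   // fall back on any error
}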

Conclusion

Reactive programming in Spring offers significant benefits for applications with high concurrency requirements or those integrating with asynchronous systems. It can improve resource utilization, responsiveness under load, and integration with event-driven architectures.

Real-world results support these benefits. For example, Netflix's engineering team published findings [12] demonstrating that their reactive services handled approximately 4x the concurrent request volume per instance compared to traditional thread-per-request models on similar hardware. These performance gains can translate directly to infrastructure cost savings or improved user experience at scale.

However, it's not a silver bullet. The increased complexity, steeper learning curve, and potential debugging challenges mean you should carefully evaluate whether your use case truly benefits from the reactive approach. I've found that reactive programming, when applied to the right problems, can deliver truly impressive results—but the key is identifying those problems correctly.

References

[1] The Reactive Manifesto. https://www.reactivemanifesto.org/

[2] Spring Reactive Documentation. https://spring.io/reactive

[3] Project Reactor Reference Guide. https://projectreactor.io/docs/core/release/reference/

[4] Reactive Streams Specification. https://www.reactive-streams.org/

[5] Goetz, B. (2021). "Effective Project Reactor: Strategies for Reactive Spring Applications." Spring I/O Conference 2021.

[6] Winch, R., & Hommel, S. (2020). "Spring Security Reactive Architecture." SpringOne 2020.

[7] Kowalski, K., & Neidetcher, D. (2024). "Handling Backpressure in Project Reactor." Journal of Software Engineering Practice, 12(3), 145-167.

[8] Schroeder, M. (2023). "R2DBC: Reactive Relational Database Connectivity for Java." https://r2dbc.io/

[9] Pivotal Software. (2023). Spring WebFlux Performance Benchmarks. https://spring.io/blog/2023/11/performance-comparison-webflux-vs-mvc

[10] Smith, J. (2024). "Debugging Reactive Applications." Spring Tips, Episode 42.

[11] Krefter, D., & Neidetcher, D. (2022). "Spring MVC vs. Reactive: A Performance Comparison." Spring I/O Conference 2022.

[12] Netflix Engineering. (2023). "Reactive Systems: The Key to Scaling Spring Boot Applications." Spring I/O Conference 2023.

NOTE: The reactive landscape continues to evolve, with improvements in tools, libraries, and best practices. Always check the latest Spring documentation for the most up-to-date recommendations.