Resilience Visualizer

An interactive browser-based simulator for microservices resilience patterns — circuit breakers, thread pool exhaustion, cascading failures, and more.

About the project

Resilience Visualizer is an interactive simulator for microservices resilience patterns that runs entirely in the browser — no installation, no backend, no infrastructure.

The idea came from a recurring problem: concepts like circuit breakers, thread pool exhaustion, and cascading failures are hard to explain with words or a static diagram. You can draw arrows and boxes, but that doesn’t show what happens over time: how one slow service can bring down an entire system, or how a well-configured circuit breaker isolates the failure and protects everything else.

The tool lets you draw any microservices architecture as a graph, fire real-time traffic through it, and watch exactly what breaks and why.

Screenshots

System running normally: nodes are green and requests flow through the BFF architecture with a load balancer.
Cascading failure: an overloaded backend exhausts the BFF's threads, and the BFF starts rejecting client requests.
Retry storm: a sudden traffic spike overwhelms the gateway, threads are exhausted, and the system enters an error spiral.
E-commerce microservice mesh: Gateway → User/Product/Inventory Service → separate databases. A realistic production scenario.
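
The cascading failure scenario can be roughed out with Little's law: the average number of busy threads is about the arrival rate multiplied by the time each request holds a thread. The sketch below assumes a blocking model where every in-flight request occupies one thread. The function names and the numbers in the usage note are hypothetical and are not taken from the simulator.

```typescript
/**
 * Average number of threads busy at once, by Little's law (L = λ × W).
 * Assumes each in-flight request holds exactly one thread while it waits.
 */
function busyThreads(requestsPerSecond: number, latencyMs: number): number {
  return (requestsPerSecond * latencyMs) / 1000;
}

/** True when steady-state demand exceeds the size of the thread pool. */
function isExhausted(
  requestsPerSecond: number,
  latencyMs: number,
  poolSize: number
): boolean {
  return busyThreads(requestsPerSecond, latencyMs) > poolSize;
}
```

For example, at 100 requests per second and 50 ms of backend latency, `busyThreads(100, 50)` is 5, well within a pool of 50 threads. If the backend slows to 2000 ms, the same traffic needs 200 threads, so `isExhausted(100, 2000, 50)` is true and the caller has to queue or reject requests.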

Key features

  • Circuit Breaker with three threshold modes: Count, Percentage, Both — full closed→open→half-open→closed lifecycle
  • Platform vs Virtual/Async threads toggle that contrasts thread starvation under blocking I/O with async models
  • Load Balancer with round-robin, random, and least-connections strategies
  • Kill / Recover buttons per node for chaos engineering — configuration is preserved on recovery
  • Dual error metrics: windowed Err%(W) for fast node coloring, cumulative Err%(∑) for long-term trends
  • 8 built-in scenarios: Cascading Failure, Circuit Breaker Demo, Retry Storm, Microservice Mesh, Timeout Tuning, and more
  • Real-time metrics per node: RPS, avg/p99 latency, error rate, thread pool and connection pool utilization
  • Speed control from 0.1× to 10×
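
The circuit breaker lifecycle from the list above can be sketched as a small state machine. This is a minimal illustration, not the simulator's actual engine: every name (`CircuitBreaker`, `BreakerConfig`, `windowSize`, `openDurationMs`) is hypothetical, and it assumes that the "Both" mode trips only when the count and the percentage thresholds are both reached.

```typescript
type State = "closed" | "open" | "half-open";
type ThresholdMode = "count" | "percentage" | "both";

// Hypothetical configuration shape, for illustration only.
interface BreakerConfig {
  mode: ThresholdMode;
  failureCount: number;   // failures in the window needed to trip
  failurePercent: number; // failure percentage (0-100) needed to trip
  windowSize: number;     // number of most recent calls considered
  openDurationMs: number; // time to stay open before allowing a probe
}

class CircuitBreaker {
  state: State = "closed";
  private cfg: BreakerConfig;
  private results: boolean[] = []; // true = success
  private openedAt = 0;

  constructor(cfg: BreakerConfig) {
    this.cfg = cfg;
  }

  /** Returns true if a call may proceed at time `now` (milliseconds). */
  allow(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cfg.openDurationMs) {
      this.state = "half-open"; // open -> half-open: let probe calls through
    }
    return this.state !== "open";
  }

  /** Records the outcome of a call that was allowed through. */
  record(success: boolean, now: number): void {
    if (this.state === "open") return; // calls are rejected while open
    if (this.state === "half-open") {
      // The first probe result decides: close again, or go back to open.
      if (success) {
        this.state = "closed";
        this.results = [];
      } else {
        this.trip(now);
      }
      return;
    }
    this.results.push(success);
    if (this.results.length > this.cfg.windowSize) this.results.shift();
    if (this.shouldTrip()) this.trip(now);
  }

  private shouldTrip(): boolean {
    const failures = this.results.filter((ok) => !ok).length;
    const percent = (failures / this.results.length) * 100;
    const byCount = failures >= this.cfg.failureCount;
    const byPercent = percent >= this.cfg.failurePercent;
    switch (this.cfg.mode) {
      case "count":
        return byCount;
      case "percentage":
        return byPercent;
      case "both":
        return byCount && byPercent; // assumption: both thresholds required
    }
  }

  private trip(now: number): void {
    this.state = "open"; // closed or half-open -> open
    this.openedAt = now;
    this.results = [];
  }
}
```

With `failureCount: 3` and `failurePercent: 50` in `"both"` mode, three consecutive failures open the breaker, calls are rejected until `openDurationMs` has passed, and a successful probe in the half-open state closes it again. A production breaker would usually also require a minimum number of calls before evaluating the percentage, which this sketch leaves out.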

Try it

resilience-visualization.vercel.app

Open it in your browser, pick a preset scenario, hit Play, then kill a node and watch what happens.

Read more

For a deep dive into the simulation engine, the bugs I found along the way, and how Claude Code helped build this — read the full article.