Push to Prod
concurrency-war-stories
Best Questions and Answers to the “Terrifying Netflix Concurrency Bug” Post
Why couldn't we roll back? Weren't we concerned about costs? Why is operating Netflix so complicated? Plus many others.
Nov 29, 2024
•
Matthew Hawthorne
Making an AB Test Allocator 20x Faster Using Non-blocking IO
Many moons ago, I inherited a cross-company AB test and had to figure out how to allocate it effectively.
Oct 30, 2024
•
Matthew Hawthorne
How Would You Design Amazon S3?
One of my favorite system design questions from a few years ago and a fascinating thing I learned while answering it.
Oct 1, 2024
•
Matthew Hawthorne
One Key Aspect of Optimizing Computational Throughput
When optimizing throughput, think carefully about how many CPUs you have when adding threads or processes.
Sep 18, 2024
•
Matthew Hawthorne
How We Built a Self-Healing System to Survive a Terrifying Concurrency Bug At Netflix
Our CPUs were dying, the bug was temporarily un-fixable, and we had no viable path forward. Here's how we managed to survive.
Aug 27, 2024
•
Matthew Hawthorne
When Things Are Slow, Look for Queues
When your system is slower than desired, queues are often heavily involved. Here's an overview of the most common situations.
Aug 13, 2024
•
Matthew Hawthorne
Metrics Are The Map, Not The Territory
We can never obtain a complete and perfect understanding of what our systems are doing. All we have are incomplete signals to inform our theories.
Jul 30, 2024
•
Matthew Hawthorne
Comprehension of Concurrency is a Lifelong Journey
Accept that your understanding is imperfect. There is comfort in letting go.
Jul 23, 2024
•
Matthew Hawthorne