When Things Are Slow, Look for Queues
When your system is slower than desired, queues are often heavily involved. Here's an overview of the most common situations.
NOTE: Check out the other posts from the Concurrency War Stories series here.
Hello friends. Today I’m discussing one of my favorite topics: queues. They’re everywhere, and if your system is slow, they are probably involved.
ANOTHER NOTE: My perspective here is based on blocking IO. For non-blocking IO, my understanding is that queuing still slows you down, but nothing “blocks”; instead, your callback just waits to execute. Two different roads to the same destination.
Activities and Resources
Below is a diagram of a hypothetical service1 making an HTTP request to service2, on an also-hypothetical node1 and node2, respectively:
The blue boxes represent activities, and the pink cylinders represent queues, or, more precisely, resources for which access is gated by a queue.
Every place where a blue box points to a pink cylinder is an opportunity for service1 to slow down tremendously.
One of the most memorable archetypes you meet in the software industry is the person who, immediately after noticing that the system has “become slow”, starts adding indexes to database tables.
Now, occasionally your situation could actually be improved by adding indexes to your database. But much more often, your problem is queues. Or, more precisely, that your system is using queueing to mitigate resource contention.
Queue Types
Here’s a bit of detail about the queues in this diagram:
JVM Thread Pools: If you’re building HTTP services in a JVM-based language, you’re very likely processing requests within a thread pool. The size is often set in a config file. If an incoming request is ready for processing but all threads are in use, you’ll block until one is available.
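To see the queueing behavior in miniature, here's a sketch with a plain java.util.concurrent fixed pool (your HTTP server's pool is fancier, but the idea is the same): when every thread is busy, new work silently waits in line.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RequestPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        // A fixed pool of 8 worker threads (the kind of number that usually lives in a config file).
        ExecutorService requestPool = Executors.newFixedThreadPool(8);

        // Submit 20 "requests". The first 8 run immediately; the rest sit in the pool's
        // internal queue until a thread frees up. That wait is invisible unless you measure it.
        for (int i = 0; i < 20; i++) {
            final int requestId = i;
            requestPool.submit(() -> {
                try {
                    Thread.sleep(500); // stand-in for real request handling
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("finished request " + requestId
                        + " on " + Thread.currentThread().getName());
            });
        }

        requestPool.shutdown();
        requestPool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```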
Process Queues: In the JVM, each thread maps to a native OS thread, so your running threads are still at the mercy of the OS scheduler's run queue, which may preempt them at any moment. Sometimes this makes sense, like when a thread is waiting on IO. Other times, you have too many threads fighting for too few CPUs, so you may want to consider running fewer threads per node and/or running more nodes (I’ll discuss this in more detail in the future.)
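One rough signal you can grab from inside the process: compare live threads to CPUs. It's only a sketch, and the count includes idle threads, but a huge ratio is worth a closer look.

```java
import java.lang.management.ManagementFactory;

public class CpuPressureCheck {
    public static void main(String[] args) {
        // How many CPUs does this node actually have?
        int cpus = Runtime.getRuntime().availableProcessors();

        // How many live threads exist in this JVM? This includes idle and waiting threads,
        // so it overstates contention, but a huge ratio is still a red flag.
        int liveThreads = ManagementFactory.getThreadMXBean().getThreadCount();

        System.out.printf("CPUs: %d, live threads: %d%n", cpus, liveThreads);
    }
}
```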
HTTP Connection Pools: If you’re sending HTTP requests from your service, you likely have a fixed-size connection pool. If you need a connection but they’re all in use, you’ll block until one is available. Having a too-small connection pool has bitten me approximately twenty-thousand times in my career.
Also, if you need a connection and they're all in use, but the pool has room to open more, you'll block until the new connection is established, which may take longer than expected if you're making an SSL connection to a server thousands of miles away.
“Longer than expected” is likely hundreds of milliseconds, which doesn’t sound like much, unless you’re required to service requests in a time much shorter than that.
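For illustration, here's roughly what those knobs look like with Apache HttpClient 4.x; the pool sizes and timeouts below are made-up numbers, not recommendations.

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PooledClientFactory {
    public static CloseableHttpClient build() {
        // Hypothetical sizes: tune for your own traffic.
        PoolingHttpClientConnectionManager pool = new PoolingHttpClientConnectionManager();
        pool.setMaxTotal(200);          // total connections across all hosts
        pool.setDefaultMaxPerRoute(50); // connections per destination host

        RequestConfig timeouts = RequestConfig.custom()
                .setConnectionRequestTimeout(100) // ms to wait for a free pooled connection
                .setConnectTimeout(500)           // ms to establish a new connection (TCP + SSL)
                .setSocketTimeout(2000)           // ms to wait for response data
                .build();

        return HttpClients.custom()
                .setConnectionManager(pool)
                .setDefaultRequestConfig(timeouts)
                .build();
    }
}
```

The connection-request timeout is the one people forget: it bounds how long a caller sits in the pool's queue before giving up, which turns an invisible stall into an error you can actually see.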
TCP Queues: Finally, once you have your HTTP connection and are ready to rock, you must deal with TCP queues. Specifically, there is a TCP send queue, a network interface queue, and a TCP receive queue. Or at least, that is what ChatGPT told me this morning. (My familiarity with this level of the network stack is closer to suspicion than knowledge.)
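For what it's worth, the buffers behind two of those queues are at least visible from Java. A quick sketch (the network interface queue lives below anything the JDK exposes):

```java
import java.io.IOException;
import java.net.Socket;

public class TcpBufferPeek {
    public static void main(String[] args) throws IOException {
        // Open a connection and peek at the kernel buffer sizes backing the
        // TCP send and receive queues for this one socket.
        try (Socket socket = new Socket("example.com", 443)) {
            System.out.println("TCP send buffer:    " + socket.getSendBufferSize() + " bytes");
            System.out.println("TCP receive buffer: " + socket.getReceiveBufferSize() + " bytes");
        }
    }
}
```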
Thread Dumps Always Help
I’ll talk more about how to detect and resolve queueing problems in a future post. Until then, if you’re using a JVM-based language and are not equipped with any sophisticated metrics or tooling, try this:
SSH onto a box
Generate a thread dump
Look for an unexpected number of threads blocked at suspicious stack locations (there's a code sketch after this list)
Within your HTTP client library, or your database client, for example
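jstack <pid> (or kill -3 <pid>) will get you the dump. And if you'd rather do the scanning from inside the process, here's a minimal ThreadMXBean sketch that prints every thread that is blocked or parked, along with where it's stuck:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumSet;
import java.util.Set;

public class StuckThreadReport {
    // Threads stuck on a lock show up as BLOCKED; threads parked waiting for a
    // pooled resource (connection pools, executors) usually show up as WAITING.
    private static final Set<Thread.State> SUSPICIOUS =
            EnumSet.of(Thread.State.BLOCKED, Thread.State.WAITING, Thread.State.TIMED_WAITING);

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            if (SUSPICIOUS.contains(info.getThreadState())) {
                String lock = info.getLockName() != null ? info.getLockName() : "(no lock)";
                System.out.printf("%s [%s] waiting on %s%n",
                        info.getThreadName(), info.getThreadState(), lock);
                for (StackTraceElement frame : info.getStackTrace()) {
                    System.out.println("    at " + frame);
                }
            }
        }
    }
}
```

In a healthy service, plenty of WAITING threads are normal (idle pool workers, for example); what you're hunting for is an unexpected pile-up at one stack location, like inside your HTTP client's connection pool.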
In my career, I’ve obtained much more value out of generating thread dumps than almost any other activity. Your situation may differ. Thanks for reading and have a wonderful day.