When Overwhelmed, Make a List of Questions
In my first big project at Netflix, I learned this disturbingly simple problem-solving technique.
Intro
Hello friends. Today’s post discusses parts of Lesson 1.3 from my book Push to Prod or Die Trying, where I describe how I approached a critical problem in an unfamiliar domain.
Disclaimer: Neurafilm is a fictional company used to dramatize my real-world experiences.
The Problem
Let’s go back about 15 years. I had been working at Neurafilm for a short time. I was on the API team, which you can think of as a large functional gateway between Neurafilm-friendly devices and a large collection of backend services.
Here’s a diagram:
One day, in our weekly team meeting, the director of our team announced that we’d be launching in Europe in a few months.
It was the first time we’d be serving requests from multiple AWS regions.
My manager asked me to figure out how to route HTTP traffic to the new region, or, more specifically, to implement a system to ensure that traffic always went to the right place for a specific device.
I had a very limited understanding of how anything worked. A colleague of mine came into my cube and said:
“I think you can do some geo stuff with DNS, or something.”
Then he walked away. I stared out the window for a while.
Questions and Answers
I felt overwhelmed. I didn’t have any better ideas, so I sat down, wrote out all of my questions, and gradually started answering them. Here are a few examples:
Q: Can we use DNS to route traffic?
A: Yes, but it won’t work correctly 100% of the time.
Q: Why won’t DNS route customers correctly all of the time?
A: Generally, DNS lookups include a resolver IP but not a client IP. So the DNS service can geolocate the resolver, but not the actual client behind the lookup.
(I may discuss this more in a future post.)
Q: What percentage of customers won’t be routed correctly by DNS?
A: Not sure, but the number I’ve heard casually mentioned is between 1% and 5%.
Q: What will happen when DNS doesn’t route a customer correctly?
A: They will have a broken experience. There is a very large “why” for this that involves content metadata management along with typical system design constraints for latency, memory, and resiliency.
(I may discuss this more in a future post.)
Q: How can we ensure that the product works for all customers?
A: We can either proxy or redirect them to the right place. If a customer can only be served from one region, but their requests can initially land in any region, you have only two options:
Modify your system to serve them from anywhere (not a practical option at the time).
Build a system to always route them to the right place, by proxying or redirecting.
Q: What’s the best choice, proxying or redirecting?
A: It depends: how much time do we have to solve the problem and what parts of our stack are most amenable to change?
(I will definitely discuss this more in a future post.)
(Also, I immediately apologized to myself for using the “it depends” architect-speak.)
Many more questions and answers followed, but you get the idea.
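To make the answers above a little more concrete, here is a minimal Python sketch of the situation they describe. It is an illustration under stated assumptions, not the system that was actually built: the IP prefix table, the region names, the idea of deriving a device's "home" region from its IP, and the function names (`region_for_ip`, `dns_landing_region`, `route`) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical geolocation table. The prefixes are reserved documentation
# ranges, and the region names are only examples.
REGION_BY_IP_PREFIX = {
    "203.0.113.": "us-east-1",
    "198.51.100.": "eu-west-1",
}
DEFAULT_REGION = "us-east-1"


def region_for_ip(ip: str) -> str:
    """Map an IP address to a region using the toy prefix table above."""
    for prefix, region in REGION_BY_IP_PREFIX.items():
        if ip.startswith(prefix):
            return region
    return DEFAULT_REGION


def dns_landing_region(resolver_ip: str) -> str:
    """DNS-based routing only sees the resolver's IP, not the client's."""
    return region_for_ip(resolver_ip)


@dataclass
class Decision:
    action: str         # "serve", "redirect", or "proxy"
    target_region: str  # region that should ultimately handle the request


def route(landing_region: str, home_region: str, strategy: str = "redirect") -> Decision:
    """Decide what to do with a request that landed in landing_region."""
    if strategy not in ("redirect", "proxy"):
        raise ValueError(f"unknown strategy: {strategy}")
    if landing_region == home_region:
        # DNS sent the request to the right place; serve it locally.
        return Decision("serve", home_region)
    # DNS got it wrong; send the request to the right place.
    return Decision(strategy, home_region)


# Assumed scenario: a European client whose DNS resolver geolocates to the US.
client_ip = "198.51.100.7"
resolver_ip = "203.0.113.53"
landing = dns_landing_region(resolver_ip)  # where DNS sends the request
home = region_for_ip(client_ip)            # assumed source of the "right" region
decision = route(landing, home)
print(landing, home, decision)
```

In this scenario DNS lands the request in the resolver's region while the client belongs in another, so `route` falls through to the redirect (or proxy) branch. Which of the two strategies is better is exactly the "it depends" question above.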
Isn’t This Obvious?
Fast forward to a few months ago. I received this feedback from a reviewer of Chapter 1:
“Do you really need all of these words just to say that it’s good practice to write down questions and answer them?”
Yes, I think I do. Here’s why:
This approach has been immensely valuable to me for years, even in non-software contexts (before meeting with an accountant or lawyer, for example).
When illustrating a point like this, details matter. A specific example of a problem where I used this approach is more valuable than saying, simply, “Try writing stuff down.”
Decision-making is hard. It’s easy to get overwhelmed by too much or too little information, too many choices, and people pressuring you to move more quickly than you are comfortable with.
I cannot overstate the value of sitting down, writing a bunch of questions, and starting to answer them.
Having one question answered feels exponentially better than having none. And before you know it, you’ve got a useful collection of information that you can convert into:
An opinion
A proposal (if necessary)
A decision
And look at that, you’re moving forward.
Next time you’re facing a complex problem, give this approach a try and let me know how it goes.