Hidden dependencies and the Fastly outage
Next time you have an unexpected outage, take note, document it, and consider coming up with a mitigation strategy.
In IT we talk a lot about dependencies. We depend on NPM modules in our code. Our microservices have dependencies on each other. We depend on certain operating systems and versions to run our microservices.
A lot of effort goes into managing these dependencies effectively and efficiently.
Even so, dependencies almost inevitably leak through the cracks, and we can end up with hidden or invisible dependencies.
Yesterday served as a big reminder of this for a huge part of the Internet when Fastly deployed a bug.
Immediately, many of us began to see panic from colleagues and online friends blaming Stack Overflow, or The Verge, or some other online property for being down, when in fact the problem was an unseen (to the user, at least) dependency on Fastly.
What’s more, Fastly’s own network status page was a victim of the outage*, making it impossible to even read about Fastly’s ongoing attempts to resolve the problem.
What hidden and invisible dependencies does your project rely on? Obviously, you don’t know. But next time you have an unexpected outage (perhaps caused by Fastly), take note, document it, and consider coming up with a mitigation strategy (an incident postmortem can be a great tool for this). Most of our hidden dependencies are not managed by a company as responsive as Fastly, and a failure in one of them could result in days or months of downtime, or even bankruptcy in extreme cases.
*At least it was unreachable for me, although I've seen others claim they could reach it.