It always gets worse before it gets better

Whenever you introduce a new quality control measure, things often seem to get worse before they get better.
One of the clients I’m working with recently added error monitoring to one of their services.
They’re now seeing hundreds of thousands of errors per day.
Of course, those errors were already happening. But now they’re visible.
This means someone will have to spend time going through these errors, silencing the ones that are just noise (ideally by fixing the bug that causes the error to be reported in the first place) and fixing the legitimate ones. If the team did nothing else until every error was fixed, it would probably take several weeks. That means several weeks of not producing new features.
In this case, they won’t be dedicating all of their effort to resolving these errors. But they will certainly spend some.
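A first triage pass could look something like the sketch below. It is only an illustration, not what this client actually runs: it assumes each error has been exported as a dictionary with hypothetical `type` and `location` fields, and that the team keeps a hand-maintained set of fingerprints it has already judged to be noise.

```python
from collections import Counter


def fingerprint(event):
    """Group key for an error: its type plus where it was raised.

    The `type` and `location` field names are assumptions for this sketch.
    """
    return (event["type"], event["location"])


def triage(events, known_noise=frozenset()):
    """Count errors per fingerprint and split them into noise and legitimate.

    Both lists are ordered from most to least frequent, so the loudest
    problems come first.
    """
    counts = Counter(fingerprint(e) for e in events)
    noise, legitimate = [], []
    for key, count in counts.most_common():
        if key in known_noise:
            noise.append((key, count))
        else:
            legitimate.append((key, count))
    return noise, legitimate
```

Grouping by fingerprint first matters because hundreds of thousands of raw errors usually collapse into a much shorter list of distinct problems, which is what someone actually has to work through.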
And some of the fixes will uncover and/or introduce new errors.
This is normal. Whenever you introduce a new quality control measure, things often seem to get worse (and sometimes they actually do get worse in the short term) before they get better.