The False Dichotomies of Automated Testing
August 24, 2017
This is the first in a series of posts about automated testing for software developers.
I’ve been fascinated by this thing called “programming” since I first learned I could enter BASIC programs into my family’s Commodore 64 when I was 8 years old.
I became a full-time software developer in 2006.
And I “got religion” about automated tests shortly after that.
But still not everyone is as “enlightened” as I am when it comes to writing automated tests. In fact, I’ve run into a large number of fellow software developers who are absolutely convinced, for one reason or another, that those like me are utterly naïve when it comes to the pragmatic reality surrounding automated tests.
This series intends to address the criticisms I’ve most often heard lodged against automated testing. But my goal isn’t simply to preach salvation to the lost testing nay-sayers. The truth is that most of these criticisms are based on some element of truth–truth that newly initiated testing converts often overlook in their zeal for their newfound religion.
So if you’re looking for debate fodder with which to clobber your test-nay-saying colleagues, you’re probably in for a bit of a rude awakening yourself.
Why Automated Testing Is Great
I do think automated testing is great, and for many reasons. I’ve written a little about this before. And if you read this series, you’ll come to understand some more of my reasons. But test evangelism isn’t my primary goal in these posts. The 17 Reasons why Unit Tests Will Save The World will have to wait for another day.
My goal with this series is to bring some semblance of reason to what far too often becomes a polarizing debate that usually boils down to ideological extremes.
Each post will highlight two opposing, extreme views–a false dichotomy–relating to automated testing, then try to find a reasonable middle ground based in reality, rather than in ideological extremes.
Unit vs. Integration vs. Acceptance
For the purpose of this series, I’m not especially interested in the various flavors of automated tests. For one, the specific definitions tend to get blurry, with different individuals and industry sub-segments using their own particular definitions. In addition, many of the definitions overlap with one another, and many discussions about testing types just turn into pointless debates over semantics.
For these posts, I simply want to focus on automated testing as opposed to human-driven testing. If a computer executes the test, from start to finish, and reports a pass/fail state, it is an automated test for my purposes (it may provide additional output as well, such as which subtests failed, but that’s a detail).
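To make that concrete, here’s a minimal sketch of what I mean, using Python purely as an illustration (the slugify function and its expected output are hypothetical, not taken from any real codebase):

```python
import sys

def slugify(title):
    """Hypothetical function under test -- any piece of code could stand in here."""
    return title.strip().lower().replace(" ", "-")

# The computer runs the check from start to finish and reports pass/fail
# via its output and exit code -- no human judgment required.
if slugify("  Automated Testing  ") == "automated-testing":
    print("PASS")
    sys.exit(0)
else:
    print("FAIL")
    sys.exit(1)  # a CI job or shell script can react to this exit status
```

Whether the check runs through a full-blown framework or a ten-line script doesn’t matter for this series; what matters is that no human has to drive it.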
False Dichotomy #1: Testing vs. Monitoring
I will start with what has become my favorite Automated Testing False Dichotomy. It’s my favorite because it’s perhaps the easiest to demonstrate as a false dichotomy.
This one I picked up during my year working at Booking.com. In my first month I attended an on-boarding session facilitated by a principal developer to learn the merits of Booking.com’s advanced monitoring system. And to be certain, Booking.com does have a pretty advanced monitoring system. Actually, they have at least three distinct monitoring systems (probably more), which monitor different aspects of the system (and in some cases, they redundantly monitor the same aspects of the system).
The result is that when there’s any significant problem with the production servers, people know almost immediately. If hotel bookings drop due to a server crash, everyone will know by glancing at the flat line in the sales graph displayed on dozens of wall-mounted monitors around the office.
After spending the allotted hour discussing server infrastructure, the trainer spent another hour and fifteen minutes (!!) singing the praises of Booking.com’s (rightfully impressive) monitoring, and denouncing any possible unspoken suggestion that perhaps Booking.com should have invested some of their effort into automated testing instead.
“If you had to choose…”
“If you had to choose between monitoring and testing, wouldn’t you rather have monitoring?” was the summary statement from the principal developer.
Now, I don’t actually know how I would answer that question. If I had to choose, would I choose monitoring or testing? I don’t know. But then, that’s why I’m writing this post: It’s a false dichotomy!
Booking.com’s entire pro-monitoring and anti-testing culture is founded on this blatant, obvious false dichotomy.
But to humor Booking.com, let’s imagine I did have to choose between monitoring and testing. Which would I choose? Okay, I lied a moment ago when I said I didn’t know how I would answer. I actually do know exactly how I would answer:
I would choose half of each.
If I have X developer hours to invest in building either a monitoring system or automated tests, I would spend X/2 of those hours building a monitoring system that monitors the most vital half of the services, and I’d spend X/2 of those hours building automated tests for the most vital half of the software components.
Then we would still notice if sales drop off significantly, thanks to monitoring the most important things. We’d also notice if a software change suddenly started converting Mexican pesos to Indian rupees incorrectly, for the 0.003% of customers who do this conversion, thanks to automated testing.
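As a sketch of the kind of test I have in mind (the conversion function, rate, and values here are entirely hypothetical, chosen just to illustrate the point):

```python
import unittest

# Hypothetical stand-in for whatever conversion code the real system would use.
MXN_TO_INR_RATE = 4.5  # assumed fixed rate, purely for illustration

def convert_mxn_to_inr(amount_mxn):
    """Convert Mexican pesos to Indian rupees at the illustrative rate."""
    return round(amount_mxn * MXN_TO_INR_RATE, 2)

class ConversionTest(unittest.TestCase):
    def test_mxn_to_inr(self):
        # A regression in this rarely exercised path would never flat-line a
        # sales graph, but this check catches it on every run.
        self.assertEqual(convert_mxn_to_inr(200), 900.00)

if __name__ == "__main__":
    unittest.main()
```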
But who wants half a monitoring system? Or half of an automated testing infrastructure?
The Ninety-Ninety Rule
Let’s consider a quote from Tom Cargill (so that we can take it out of context), also known as the “Ninety-Ninety rule”:
The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.
Of course we should not draw sweeping conclusions from an intentionally humorous quote. But I’m willing to bet that your experience matches mine: in the later stages of development, a lot more time is spent tweaking things and fixing small bugs than adding new functionality. So let’s extrapolate and simplify a bit, and call this the “Ninety-Fifty rule”:
The first 90 percent of the code accounts for the first 50 percent of the development time.
If we accept this rule, and we divide our X developer hours evenly across monitoring and testing, the end result would be a 90%-complete monitoring system and a 90%-complete testing system.
(Of course if this rule actually played out flawlessly in the real world, we’d find everyone always dividing resources infinitely, so that we always got 90% of two things instead of 100% of one thing. So this concept has its limitations.)
But even if we get only a 50%-functional monitoring system and a 50%-functional testing system, I think the end result is still better than completely neglecting testing or monitoring. To whatever extent the Ninety-Fifty rule actually applies to software development, there is potentially an additional net gain from splitting resources this way.
At the scale of Booking.com, it seems obvious to me that spending time on both fronts would make perfect sense. With 3+ complex, large-scale monitoring systems in place, what if we had just developed 1 or 2 of those systems, and spent the remaining time developing automated tests? It certainly would have been possible. Nobody can argue that the resources weren’t there.
But what if you’re a small startup, or a one-person team starting from scratch? You can’t build half a testing system and half a monitoring system right out of the gate. I’ll discuss this topic more in a future post.
Series Index
- False Dichotomy #1: Testing vs. Monitoring
- False Dichotomy #2: All vs None