Say Farewell to Forgotten Cleanups

December 6, 2019

How often do you find some code like this in your production code base?

form.submit(function(event) {
    console.log("Submit hit");

    var formData = {

It’s obvious case of stray debugging code that never got cleaned up during code review.

Or have you ever seen something like this, perhaps in your CI config?

script:
  # - npm install
  # - npm test
  - ./script/test.sh

This one is especially dangerous. It appears that someone commented out the core functionality of the automated test stage of CI! It probably happened while testing some change to ./script/test.sh, so that the CI stage would get to the code to be tested quicker. But what a dangerous mistake to not undo this before merging to master!

What these cases have in common is that they are forgotten cleanups. Code was added, commented out, or otherwise modified, with the intention of reverting it later, but the reversion was accidentally missed.

I have a simple technique I like to use in my CI pipelines to catch common, simple mistakes like this, without getting in the way of testing.

But first, a little CI background…

The Stages of CI

In a typical project, I will probably define the following stages:

  • Pre-test

    This runs my code linters and any static analysis tools I’m using. Most or all of these tools should also be run in my editor, but this is a final safety check. These should run very quickly, too (30 seconds or less).

    This is also the best place to run unit tests. Almost by definition, unit tests should not require the entire project to run, and they should also be very quick (a few minutes at most, even on a large project).

  • Build

    Here the project is built, and the final artifact(s) stored for later use.

  • Integration Tests

    If the project has any sort of integration tests that depend on the built product, this is the place to run them. API tests, security vulnerabilities, etc, can be run here.

  • Review Environment

    I like to start a review environment for each merge request. In this stage a complete, or nearly complete, copy of the app is deployed to a temporary domain, where it can be completely tested, or demed live.

  • UI Tests

    If you have automated UI tests, as Selenium, this is the place to run them, as they will depend on a running Review environment.

  • Deploy

    This stage would normally run instead of the Review stage, for builds on master.

If you’re using CI/CD, I would imagine your pipeline looks something like this. The particulars, of course, vary from project to project.

Maybe you don’t have dynamic review environments. Maybe you have a dedicated staging or test environment. That’s fine. Just substitute “Deploy To Staging” for “Review” in the list above, and the rest should still work the same.

With that background in mind, let’s get back to solving the problem of forgotten cleanups.

Detecting cruft

The first task at hand, if we wish to solve this problem with our CI pipeline, is to detect these forgotten cleanups. Here I’ll talk about just a few ways to do this, which should be enough to get your creative juices flowing.

A “safeword”

One technique I like to use, is to put a “Safeword” in my code at the same time I’m adding code I know I want to revert. My usual safeword of choice is NOMERGE, but use what you like. Detecting the safeword is a simple matter of grep:

if git -C "${dir}" grep -n NOMERGE > $tmp; then
    echo "NOMERGE found; merging not permitted"
    cat ${tmp}
    exit 1
fi

Of course this method does require discipline. Say I was adding the debug statement in my first example, I might add it like so:

form.submit(function(event) {
    console.log("Submit hit"); // NOMERGE

    var formData = {

This serves the secondary purpose of advertising to any colleagues who may be reviewing my code, that this line is temporary, and will be removed before merging. This may save a lot of otherwise unwanted review comments, reminding you to remove your debug code.

But this is only helpful if you remember to add the safeword to your code.

Magic commit messages

Another proactive approach is to use a magic git commit message. When such a message is found, merging can be blocked. In most of my projects, I do this for two keywords: “debug” and “wip”. If a merge request has a commit message beginning with either word, merging is blocked.

if ! git -C "${dir}" log ${ANCESTOR}..HEAD --exit-code -i --grep "^debug$" > $tmp; then
    echo "'debug' commit message found; merging not permitted"
    cat ${tmp}
    exit 1
fi

Custom checks

For those cases where it’s easy to forget, like commenting out a time-consuming function you don’t care about for a specific test case, a safeword isn’t as likely to help.

This is where it can be useful to write some custom checks. I like this approach for very specific files (like CI config, or a configuration file) that I feel is of especially high importance.

To avoid the accidental commenting-out of our test suite in the earlier example, perhaps we prohibit comments in our CI configuration.

if grep '^\s*#' .gitlab-ci.yml > $tmp; then
    echo "CI config commented out"
    cat ${tmp}
    exit 1
fi

Or maybe we want to allow comments if they match a specific pattern that isn’t likely to happen by accident, while prohibiting comments created by editor shortcuts.

  #-# This comment is allowed
  script:
    # But these comments are not
    # - apt-get update && apt-get install -y libespeak-dev
    # - go test -race ./...
    - ./script/test.sh

We can modify our test so:

if grep -v '^\s*#-#' .gitlab-ci.yml | grep '^\s*#' > $tmp; then
    echo "CI config commented out"
    cat ${tmp}
    exit 1
fi

Paranoid Linters

One last technique worth considering here is an ultra-paranoid linter. You may use the same linters used in the Pre-Test stage, but with a custom configuration or higher sensitivity. Whether this applies to you may also depend largely on the tools available for your language.

But let’s consider the first example again, of an extraneous console.log call in our JavaScript.

For a case like this, we may want to use the no-console rule provided by ESLint. If we know that all legitimate, production debug logs go through our logging library, this should be safe. You may also wish to prohibit the use of alert here.

If the language of your project doesn’t have tools that allow the prohibition of such commonly-abused debugging tools, as writing to standard output or standard error, you may resort to another grep. It may not be fool-proof, but even grepping your code for console.log would catch most casual offenders.

Putting the pieces in place

“That all looks nice, but if I add these strict rules to CI, how can I test in my review/staging environment?” I hear you possibly say.

It’s for this reason that I separate these extra paranoid checks into a separate CI stage, and I put them in the pipeline after the Review stage.

So my new stage list will look like this:

  • Pre-test
  • Build
  • Integration Tests
  • Review Environment
  • UI Tests
  • forgotten-cleanups
  • Deploy

By putting these new tests here, I’m still able to use console.log, and comment out large sections of code, even debug (most of) the CI pipeline itself, using whatever ugly techniques I like. But a failure to clean up my cruft will prevent a final merge into master.

In Action

Help yourself to the bash function I use in my projects. Save the file in your project as post_test.sh, for example, then source it from your CI stage, and execute it. For example, in my GitLab-CI configuration:

post-test:
  stage: post-test
script:
  - . post_test.sh
  - post_test
except:
  - master

To demonstrate, I’ve created a merge request with my safeword. The pipeline deploys the review environment, but then fails, before allowing a merge to master.

And when I look at the details of the pipeline failure, I see the expected failure reason:

Conclusion

I hope this technique will provide some value for you. I would love to hear from you below in the comments, about your experience with this, or any similar approaches, or if you have any questions.

Notes

A few notes on the script as I’ve written it:

  1. It defines the safeword, without referencing it directly, to avoid false-positives while scanning the script itself.

    SAFEWORD="NO"MERGE # construct the safeword without triggering the test
    
  2. It determines where this branch diverged from master, for the next tests.

    ANCESTOR=$(ancestor "${dir}")
    
  3. I then do three tests: Check for the safeword, then check for debug and wip in commit messages.

  4. If any of these conditions are met, the script exits with some descriptive output, and exit code 1.

If you wish to use a “paranoid linter”, you would add it to this function. I’ve left that out of my example, because this depends a lot on which language and tools you’re using.


comments powered by Disqus