The Dangers of Fatal Logging

I want to talk about fatal logging. It’s practically always a bad idea. Let me explain…

I was recently reviewing some code written in Go, where I saw this pattern in a constructor function:

func NewConnection(url string) *Client {
    conn, err := service.Connect(url)
    if err != nil {
        log.Fatal(err)
    }
    return &Client{conn: conn}
}

Whatever language you use, if your logging library has a Fatal option, I implore you never to use it, for one simple but profound reason:

It violates the Single-Responsibility Principle

Now, I don’t generally put the SOLID principles on a high pedestal. But this is one place where the SRP violation doesn’t just peek through the curtains, asking you politely to reconsider. It bursts through the wall like the Kool-Aid Man screaming for attention. Oh yeah!

In Go, calling log.Fatal in the standard library, and to my knowledge in any other logger implementation, does two things:

It logs an error message (with a priority of FATAL, in loggers that have levels)
It immediately exits the program (by calling os.Exit(1))

Do you see the SRP violation?

Logging is one concern. Affecting control flow of the entire program (by exiting) is something else entirely. This function, by design, has two unrelated responsibilities.

But does it matter? If you know that calling log.Fatal exits the program, surely you can make an informed decision, right?

If you’re writing some short, throw-away program, akin to a bash script, sure. Whatever. I often have no problem violating the SRP, or countless other rules of thumb and best practices, in such a case.

But if you’re writing anything like a real application, this SRP violation creates an insidious form of tight coupling in your program.

What if the caller of your function wants to try to recover from an error? Maybe you add an option for the user to provide a list of connection URLs, and you want to try all of them until one works.

for _, url := range config.URLs {
    conn = NewConnection(url)
}

With our current implementation, the first failure will cause the program to crash.

“Yes, but I can edit it!” You can. But SRP. You should have only one reason to change a piece of code. We now have two: (1) we want to change the way a connection works, and (2) we want to decouple control flow.

Or what if you add the option to reconnect to the service after a failure, once the app is already up and running? Do you want a failed connection attempt to exit the program? Well… maybe. But also maybe not. That should be up to the caller of the function, not the function itself.

Only exit your program from the top of the call stack

In Go parlance, this means: Only ever call os.Exit() from your main() function. This precludes calling log.Fatal from anywhere except possibly in main() itself.

If you follow this simple rule, your constructors will not surprise their callers by exiting the program.
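
One common way to follow this rule, sketched here under my own assumptions, is to keep main tiny and push the real work into a function that returns an error. The run function and the example:// URL are hypothetical names for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// run holds the program's real logic and reports failure by returning
// an error. It never exits the process itself.
func run(url string) error {
	if url == "" {
		return errors.New("no connection URL configured")
	}
	fmt.Println("connecting to", url)
	return nil
}

func main() {
	// main is the only place that decides to exit.
	if err := run("example://primary"); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Because run only returns errors, it can be called from tests, retried, or wrapped by another caller without any risk of taking the process down.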

This is doubly important in any language, like Go, where exiting the program precludes any cleanup. In Go, calling os.Exit means that deferred functions don’t get called, and there’s no opportunity for recovery. It’s final. Do not pass Go. Do not collect $200.

This means that in many applications, calling log.Fatal may actually not even log your error! What?

If you’re logging to a network service, one of the last things you must do before exiting your program is flush your log buffer. If you call os.Exit, that flushing never happens.

Won’t that be a lovely debugging session? You typo your database config. Now your app won’t start… and it doesn’t send you any logs. 🤦‍♂️

What’s the alternative?

In the Go example above, the obvious alternative is to return the error to the caller.

func NewConnection(url string) (*Client, error) {
    conn, err := service.Connect(url)
    if err != nil {
        return nil, err
    }
    return &Client{conn: conn}, nil
}
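
With an error-returning constructor, the earlier "try every URL" idea becomes straightforward. Here is a rough sketch; connectAny is a hypothetical helper, and the NewConnection body below is a stub that fails for down:// URLs so the example runs without a real service.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Client is a stand-in for the article's Client type.
type Client struct{ url string }

// NewConnection is a stub with the error-returning signature. It pretends
// that any URL starting with "down://" is unreachable.
func NewConnection(url string) (*Client, error) {
	if strings.HasPrefix(url, "down://") {
		return nil, fmt.Errorf("connect %s: unreachable", url)
	}
	return &Client{url: url}, nil
}

// connectAny tries each URL in order and returns the first client that
// connects. The caller, not the constructor, decides what failure means.
func connectAny(urls []string) (*Client, error) {
	var lastErr error
	for _, url := range urls {
		client, err := NewConnection(url)
		if err == nil {
			return client, nil
		}
		lastErr = err // remember the failure and try the next URL
	}
	if lastErr == nil {
		return nil, errors.New("no connection URLs configured")
	}
	return nil, fmt.Errorf("all connection attempts failed, last error: %w", lastErr)
}

func main() {
	client, err := connectAny([]string{"down://a", "up://b"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("connected to", client.url)
}
```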

Your language may use exceptions. That’s fine. Use whatever normal error-handling capability your language/tool provides.

In Go, there’s also the option to panic, if returning an error really doesn’t make sense. panic differs from os.Exit in three distinct ways:

It has a different semantic meaning. It means “something unrecoverable happened”, whereas os.Exit means “quit the program” without regard for why.
Deferred functions are still called, so cleanup can be done prior to program exit.
It’s recoverable: a deferred function higher up the call stack can call recover to stop the panic.
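
The last two points can be seen in a small sketch. mustPositive and safeCall are hypothetical helpers: the first panics on bad input, the second shows that a deferred function still runs during a panic and can use recover to turn it into an ordinary error.

```go
package main

import (
	"fmt"
)

// mustPositive panics when given a non-positive number.
func mustPositive(n int) int {
	if n <= 0 {
		panic(fmt.Sprintf("mustPositive: got %d", n))
	}
	return n
}

// safeCall runs fn and converts a panic into an error. cleaned reports
// that the deferred function ran, which it does with or without a panic.
func safeCall(fn func()) (cleaned bool, err error) {
	defer func() {
		cleaned = true // deferred functions still run while panicking
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	fn()
	return false, nil
}

func main() {
	cleaned, err := safeCall(func() { mustPositive(-1) })
	fmt.Println("cleanup ran:", cleaned)
	fmt.Println("error:", err)
}
```

Had mustPositive called os.Exit instead, neither the deferred function nor the recover would have had a chance to run.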


February 26, 2022
