I recently ran into a bug caused by the special way equality is defined for NaN values. NaN stands for not-a-number; it is a special floating-point value that represents the result of an undefined operation. This post shows how NaN can introduce hard-to-find bugs.

The problem

One of our external services started to return NaN values. After some investigation, it turned out that the following piece of code was responsible:

let missingToOption missing value =
    if value = missing then None
    else Some value

This seems perfectly valid. The function takes a value and returns None if it equals a designated missing value; otherwise it returns Some value. We can think of it as the inverse of Option.defaultValue.

But there is a catch when it is applied to NaN. When both missing and value are NaN, this function returns Some nan. (Here nan is an alias for System.Double.NaN, defined in the F# standard library.) The reason is that equality for NaN is defined so that nan = nan is false and nan <> nan is true. In fact, <> is the only comparison operator that returns true when one of its operands is NaN. For details, see the Wikipedia page on NaN.
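The behavior described above can be reproduced in a few lines. This is a minimal sketch: the function is the one shown earlier, while the binding names (equalToItself, differsFromItself, result, ordinary, detected) and the -1.0 sentinel are made up for illustration.

```fsharp
// The original function, repeated so the sketch is self-contained.
let missingToOption missing value =
    if value = missing then None
    else Some value

// nan is the F# alias for System.Double.NaN.
let equalToItself = (nan = nan)          // false
let differsFromItself = (nan <> nan)     // true

// Consequently a NaN sentinel is never recognized as "missing".
let result = missingToOption nan nan     // Some nan, not None

// An ordinary sentinel (here -1.0, chosen arbitrarily) behaves as expected.
let ordinary = missingToOption (-1.0) (-1.0)   // None

// Double.IsNaN is the reliable way to detect NaN.
let detected = System.Double.IsNaN nan   // true
```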

In our case, the NaN came from an external service whose configuration used NaN as the representation of a missing value. This is easy to miss because NaN wasn't explicitly mentioned anywhere to start with.

Once we know that NaN is the problem, the solution is easy:

open System

let missingToOption missing value =
    if value = missing then None
    // nan = nan is false, so the NaN sentinel needs an explicit check
    elif Double.IsNaN missing && Double.IsNaN value then None
    else Some value
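A quick check of the fixed function might look like the sketch below. The binding names and sample values are hypothetical. The last function, missingToOptionEq, is not from the original code; it is one possible alternative that assumes the .NET method Double.Equals is acceptable here, since that method (unlike the = operator) treats two NaN values as equal.

```fsharp
open System

// The fixed function, repeated so the sketch is self-contained.
let missingToOption missing value =
    if value = missing then None
    elif Double.IsNaN missing && Double.IsNaN value then None
    else Some value

let nanSentinelHit = missingToOption nan nan        // None
let nanSentinelMiss = missingToOption nan 1.5       // Some 1.5
let plainSentinelHit = missingToOption (-1.0) (-1.0) // None
let plainSentinelNan = missingToOption (-1.0) nan   // Some nan

// Hypothetical alternative: Double.Equals returns true for two NaNs,
// so a single comparison covers both the ordinary and the NaN case.
let missingToOptionEq (missing: float) (value: float) =
    if value.Equals(missing) then None
    else Some value

let viaEquals = missingToOptionEq nan nan           // None
```

Note that the alternative is restricted to float, whereas the original function without the NaN check worked for any type that supports equality.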

NaN origin

Why is there this strange NaN value anyway? NaN means not-a-number, and it's defined as the result of undefined mathematical operations, like dividing zero by zero, taking the square root of a negative number, and some others. (Dividing a nonzero number by zero gives an infinity rather than NaN.) One could argue that raising an exception, or signaling an invalid operation in some other way, would be better. That would be quite expensive, though: not only because raising an exception is expensive but, maybe more importantly, because it would mean adding a check to every affected operation. By introducing one special value, the division operator can remain fast, and we still get a signal that some calculation was invalid.

NaN behavior is defined in the IEEE 754 standard.

NaN was designed to be viral - almost every arithmetic operation that returns a float returns nan if one or both of its operands are nan. This way NaN "bubbles up" through almost every calculation. That's quite reasonable, but what about comparisons?

The main reason for nan <> nan being true is probably that NaN should be incomparable with every number, including itself, because such a comparison doesn't make sense. But it has the unfortunate effect of breaking the usual rules of comparison, for example, that exactly one of x < y and x >= y must be true.
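To make the origin and the "bubbling up" concrete, here is a small sketch. All binding names are made up for illustration; the operations themselves are standard F# float arithmetic.

```fsharp
open System

// Undefined operations produce NaN.
let zeroByZero = 0.0 / 0.0              // NaN
let negativeRoot = sqrt (-1.0)          // NaN
let infMinusInf = infinity - infinity   // NaN

// Dividing a nonzero number by zero gives an infinity, not NaN.
let oneByZero = 1.0 / 0.0               // positive infinity

// Once a NaN is present, it propagates through further arithmetic.
let propagated = (zeroByZero + 1.0) * 2.0 - 3.0   // NaN

// Because nan = nan is false, results are checked with Double.IsNaN.
let allNaN =
    [ zeroByZero; negativeRoot; infMinusInf; propagated ]
    |> List.forall Double.IsNaN         // true
```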
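The broken comparison rule is easy to observe. In this sketch, exactlyOne is a hypothetical helper that checks whether exactly one of x < y and x >= y holds for a pair of floats.

```fsharp
// True when exactly one of the two complementary comparisons holds.
let exactlyOne (x: float) (y: float) =
    (x < y) <> (x >= y)

let forNumbers = exactlyOne 1.0 2.0   // true: 1.0 < 2.0 holds, 1.0 >= 2.0 does not
let forNaN = exactlyOne nan 1.0       // false: neither comparison holds

// Both comparisons are false as soon as NaN is involved.
let nanLess = nan < 1.0               // false
let nanGreaterOrEqual = nan >= 1.0    // false
```

One practical consequence is that code relying on "not (x < y)" to mean "x >= y" silently takes the wrong branch when NaN shows up.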

Conclusion

Whatever the real reasons were, this behavior of NaN is present in probably every language. When working with floats, it's important to keep in mind that special values such as NaN or infinity can appear, and that they come with their own rules.

Problem with NaN equality


Jindřich Ivánek
