Code Review: Enddianness and Neddiness

This is not an article about byte ordering.

Nor is it a discussion of the correct way to eat your eggs.

Rather, it is an exposition on the inability of the human brain to grasp the implications of statistics.

I'm writing this on a Saturday evening. On Friday night, the last thing I did before leaving the office was to start a stress test running on one of my chip designs. The test exercises the chip's communications: I want to confirm that under no circumstances will the chip lock up or fail to respond correctly. I'm running it because previous designs on other silicon have exhibited failure modes in which the chip could stop responding to communications altogether.

On Monday morning, the first thing I will do on going into the office will be to see if the LED attached to the chip is still blinking, indicating that the test is still running and no failures have occurred. I've been running this same test every night and every weekend for about three weeks, and have seen no failures yet. Does this mean that the communications in this chip are bulletproof? No, it does not.
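To put a rough number on that "No", here is a minimal sketch of the standard zero-failure bound. It assumes every exchange is an independent trial with the same fixed failure probability, which is a modelling assumption and not something the test itself establishes. The function name `zero_failure_upper_bound` and the figure of one million messages are hypothetical, chosen only for illustration.

```python
import math


def zero_failure_upper_bound(trials: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on the per-trial failure probability
    after `trials` independent trials with zero observed failures.

    Solves (1 - p) ** trials = 1 - confidence for p.
    """
    if trials <= 0:
        raise ValueError("trials must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    # p = 1 - alpha ** (1 / trials); expm1 keeps precision when p is tiny.
    return -math.expm1(math.log(alpha) / trials)


if __name__ == "__main__":
    # Hypothetical figure: the number of messages exchanged so far.
    messages = 1_000_000
    bound = zero_failure_upper_bound(messages)
    print(f"95% upper bound on failure probability: {bound:.3e}")
    print(f"rule-of-three approximation (3 / n):    {3 / messages:.3e}")
```

More trials shrink the bound, but it never reaches zero. It also only holds if the failures behave like independent coin flips, which is exactly the assumption a rare, unmodelled event violates.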

The highly recommended book The Black Swan, by Nassim Nicholas Taleb, talks about the nature of randomness, and in particular what the author dubs Black Swans - events deemed to be of very low likelihood, that have a devastating impact, and that can be explained away after the fact by experts with a little judicious hindsight. (In this podcast, Taleb describes this as "a retrospective ability to weave a causal link", which is a rather wonderful turn of phrase.) These would be the selfsame experts who were completely blindsided by the events themselves. Events such as the two World Wars and the various economic disasters of the last century or so are all Black Swans.

Taleb, in both this book and the earlier Fooled by Randomness, pours withering (and often entertaining) scorn upon such experts. Economists, stock traders, MBAs, and indeed anyone having the temerity to make predictions about the future based on past trends, are all mercilessly ridiculed. (He makes a few honourable exceptions - Karl Popper, George Soros, and Benoit Mandelbrot have all earned his respect.)

In one chapter of The Black Swan Taleb discusses the nature of human fallibility. In it he talks about the medical acronym NED (No Evidence of Disease). Apparently this is written on a patient's records after some tests have been run, and no sign of any malignancy or unusual activity has been found.

What, Taleb points out, doctors will never write is END - Evidence of No Disease. That is, they will happily say that they did not see any sign of a problem, but they will never say that there is no problem.

As engineers, it behooves us to adopt the same approach. If you have been working as an engineer for any length of time, you will have been caught out by the apparent absence of any problems in just the same way that I have in the past. Running tests for a period of time and seeing no bugs does not mean that there are no bugs - it just means that we have not seen any. But even though I'm now on the lookout for feathery portents of doom, I'm still really hoping that that LED will be blinking on Monday morning.


Thursday, 1 July 2010
