Code Review: June 2009

Nigel Jones recently wrote a blog entry that very neatly summarises an article called 'Reciprocal Multiplication, a tutorial', by Douglas Jones.

The (Douglas) Jones article explains how to use reciprocal multiplication to perform division - for instance, dividing by 10 is the same as multiplying by 1/10. By doing the multiplication in fixed-point arithmetic, you can avoid the computational overhead of invoking your compiler's division routine. The result is smaller, faster code that has exactly the same accuracy as the compiler-supplied result. Anyone writing PC code may not appreciate this, but in the embedded world this sort of approach is vitally important in terms of both code size and run time.
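As a concrete illustration of the idea (a minimal sketch, not code taken from either Jones article): for 16-bit unsigned values, a divide by 10 can be replaced with a multiply by 52429 (0xCCCD, roughly 2^19/10) followed by a right shift of 19 bits. Python stands in here for the C you would actually ship, the function name is made up, and the constants are checked exhaustively rather than taken on trust.

```python
# Fixed-point reciprocal of 10: 52429 / 2**19 is very slightly more than 1/10.
MULTIPLIER = 52429   # 0xCCCD, i.e. ceil(2**19 / 10)
SHIFT = 19

def div10_u16(n):
    """Divide a 16-bit unsigned value by 10 using only a multiply and a shift."""
    return (n * MULTIPLIER) >> SHIFT

# Exhaustive check against ordinary integer division over the full 16-bit range.
mismatches = [n for n in range(1 << 16) if div10_u16(n) != n // 10]
print(len(mismatches))   # prints 0
```

In C the intermediate product needs 32 bits, so the equivalent expression would be along the lines of `(uint16_t)(((uint32_t)n * 0xCCCDu) >> 19)`.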

The original article is quite long and detailed, and despite having stumbled upon it a couple of times in the past, I'd never got round to actually reading it. And so (Nigel) Jones's article came as a very welcome abstract, boiling it down as he does to a couple of simple algorithms - if you want to divide by 'x', here's what to do.

At the end of his post, (Nigel) Jones said 'If someone has too much time on their hands and would care to write a program to compute the values for all integer divisors, then I'd be happy to post the results for everyone to use.' Well, I don't know about having too much time, but I do enjoy a lunchtime programming exercise, so I sent him a text file containing the coefficients to perform integer division for all 16-bit unsigned numbers from 3 to 32768. I also asked if he would mind if I posted the source code here, in case it was of use to anyone.

He responded with an excellent suggestion - how about generating a header file containing macros with the appropriate coefficients for all divisors? Anyone wanting to use these algorithms could then simply include the header file, and call the relevant macro to perform the required division.

I've followed this suggestion, so please follow the links to check out:

a header file containing the coefficients for unsigned 16-bit division
a header file containing the coefficients for unsigned 8-bit division
a command-line program generating the coefficients for unsigned 16-bit division [source][exe]
a command-line program generating the coefficients for unsigned 8-bit division [source][exe]

The programs were written in Borland Builder, but are in ANSI C, and so should run on any platform with appropriate tweaks to the uintX_t typedefs. They accept floating-point arguments, and so can generate the coefficients for division by π, sqrt(2), etc. They will also report the number of errors found during an exhaustive search of all divisors, and the maximum error found.
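The kind of search such a generator performs can be sketched as follows. This is not the author's program (which is in C and also accepts floating-point divisors); it is a minimal Python sketch, restricted to integer divisors, that assumes the simplest scheme: round the scaled reciprocal up, then try successively larger shifts until an exhaustive check over every 16-bit dividend passes. The function names are hypothetical.

```python
def count_errors(divisor, multiplier, shift, bits=16):
    """Exhaustively compare multiply-and-shift against true integer division.

    Returns (number of wrong results, largest absolute error).
    """
    errors = 0
    max_error = 0
    for n in range(1 << bits):
        diff = abs(((n * multiplier) >> shift) - n // divisor)
        if diff:
            errors += 1
            max_error = max(max_error, diff)
    return errors, max_error


def find_coefficients(divisor, bits=16, max_shift=32):
    """Find the smallest shift whose rounded-up reciprocal gives exact results.

    Returns (multiplier, shift), or None if nothing up to max_shift works.
    """
    for shift in range(max_shift + 1):
        multiplier = -(-(1 << shift) // divisor)  # ceil(2**shift / divisor)
        if count_errors(divisor, multiplier, shift, bits) == (0, 0):
            return multiplier, shift
    return None


for d in (3, 10):
    print(d, find_coefficients(d))
```

For divisors 3 and 10 this search settles on (43691, 17) and (52429, 19) respectively; both multipliers fit in 16 bits, so the product fits in a 32-bit intermediate.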

This is another entry in an intermittent series in which I arrogantly pontificate on what constitutes profound advice in the world of embedded software. Actually, I'm unnecessarily narrowing my horizons here; I feel quite sure I could be arrogant in a much larger sphere.

So, here we go again. This time I would urge neophytes to learn a scripting language.

Scripting languages may not be directly useful in terms of running on a little 8-bitter, but they are unbelievably useful in a range of other applications.

The advantage of scripting languages is that they're... well, scripting languages. They act like super batch files letting you leverage the power of other programs to get a job done. They tend to be interpreted rather than compiled, and so are (a) really fast to write, since they sidestep the edit-compile-link-run paradigm, and (b) relatively slow to run, since they've sidestepped the edit-compile-link-run paradigm. But with the power of modern desktops, performance really isn't a huge concern with the applications I'm interested in.

At a more detailed level, they also tend to support dynamic typing, which lets you blithely ignore whether the variable you've just created is a string, an integer, or whatever, until you start using it in a context that defines what it has to be. This is both a good and a bad thing, as what the interpreter thinks it is doesn't always tally with what you think it is.
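A small illustration of both sides of that coin, sketched in Python rather than Perl, with invented sample values. The same name can hold a string one moment and a number the next, which is convenient until a value read from a log file is compared as text when you meant it numerically.

```python
reading = "9"            # a field pulled from a text log starts life as a string
limit = "10"

# Compared as strings, the ordering is character by character: "9" sorts after "1".
as_text = reading > limit

# Converted explicitly, the comparison is the numeric one you probably meant.
as_numbers = int(reading) > int(limit)

print(as_text, as_numbers)   # prints: True False

reading = 9              # the same name now refers to an integer; no declaration needed
print(reading + 1)       # prints: 10
```

Perl draws the same line through its choice of operator, with `>` comparing numerically and `gt` comparing as strings, so the trap is there too; it just sits in a different place.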

Yes, you could create an executable that would run much faster, as it's not interpreted. But for sheer speed of get-in-there-and-hack-out-something-that-works, you can't beat a scripting language.

But before I go blindly hacking away into the undergrowth of what a language is or isn't, let's try to steer the conversation back on track.

The scripting applications I find myself revisiting time and again are (1) analyzing log files, and (2) generating code.

Dumping data to disk on a PC hooked up to an embedded system is part and parcel of the daily life of a firmware engineer. But when the system's been running for several days in an environmental chamber, or for months in a remote data acquisition application, these log files can be enormous.

The chances are you're only interested in a tiny fraction of the logged data, or you're looking for a particular event, or you want to change its format slightly, or... something else. Maybe you need to break it up into chunks for analysis. Maybe you only want to see every line in which the fourth CSV field is greater than 10. Maybe you want to reformat the data for input to another program. For whatever reason, you've got a bunch of data on a disk, and you want to pull the data needle out of the multi-megabyte haystack.

The script to do this will typically boil down to a few lines containing regular expressions. Pipe the data into the script, and out to a text file, and you're done.
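For the "fourth CSV field greater than 10" case, such a script might look something like the sketch below. It is in Python rather than Perl, the log layout and sample lines are invented for illustration, and it assumes plain comma-separated fields with no quoting.

```python
import re

# Capture the fourth comma-separated field when it looks like a plain decimal number.
FOURTH_FIELD = re.compile(r"^(?:[^,]*,){3}\s*(-?\d+(?:\.\d+)?)\s*(?:,|$)")

def filter_lines(lines, threshold=10.0):
    """Yield only the lines whose fourth CSV field is a number above threshold."""
    for line in lines:
        match = FOURTH_FIELD.match(line)
        if match and float(match.group(1)) > threshold:
            yield line

sample = [
    "time,channel,unit,value,status\n",   # header: fourth field is not numeric
    "00:00:01,1,degC,9.5,ok\n",
    "00:00:02,1,degC,12.25,ok\n",
    "00:00:03,2,degC,-40,fault\n",
    "00:00:04,2,degC,10.5\n",             # value is the last field on the line
]
kept = list(filter_lines(sample))
print("".join(kept), end="")
```

To use it as a filter in a pipe, the sample list would give way to `sys.stdout.writelines(filter_lines(sys.stdin))`, with the shell redirecting the log in and the results out to a text file.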

Scripting languages are also a natural for code generation. Write a script to generate big chunks of your firmware, which are then compiled as normal. Any type of code containing lots of variations on a theme is a natural for this approach - good candidates are communications handlers and state machines. Give the script a list of the UART commands to be handled, or the FSM states, and let it build all of the scaffolding code in a loop over the list contents. You follow along afterwards and fill in the blanks detailing, for instance, how to handle the parameters of a specific command.
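A minimal sketch of the scaffolding idea, again in Python rather than Perl. The command names, the handler signature, and the dispatch-table layout are all made up for illustration; a real project would match its own conventions.

```python
def generate_handlers(commands):
    """Return C source text: one stub handler per command, plus a dispatch table."""
    out = ["/* Scaffolding generated from the command list; fill in each handler body. */", ""]
    for name in commands:
        out += [
            f"static void handle_{name.lower()}(const char *params)",
            "{",
            f"    /* TODO: parse and act on the {name} parameters */",
            "    (void)params;",
            "}",
            "",
        ]
    out.append("static const struct { const char *name; void (*handler)(const char *); } dispatch[] = {")
    for name in commands:
        out.append(f'    {{ "{name}", handle_{name.lower()} }},')
    out.append("};")
    return "\n".join(out) + "\n"

source = generate_handlers(["RESET", "STATUS", "SETGAIN"])
print(source, end="")
```

Adding a command is then a one-word change to the list, and the stubs and table entries can never drift out of step with each other.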

You could of course make your script sophisticated enough to accept details like the command parameters, and then generate the handling code too. But there's always the danger of spending too much time developing the script instead of doing actual useful work - as always, your mileage will vary, you need to be pragmatic, and you need to make realistic trade-offs.

My weapon of choice as a scripting language is Perl. The reason for this is mainly historical, as I've been using it for years now. I can't even remember why I first picked it up, but would guess that I wanted to adapt something that almost-but-not-quite did what I wanted, and it happened to be in Perl.

My relationship with Perl is somewhat ambivalent. I find it an enormous sprawling mess of a language, both immensely powerful and willfully obtuse. In fairness I don't spend enough time with it to make it really sing, and tend to rely on the Camel, the Cookbook, and my collection of previous scripts to bash together what I need at any given time.

The Perl apostles trumpet its approach of There's More Than One Way To Do It (TMTOWTDI) as a strength, but I find it muddies the waters; ask three Perl coders to write some code, and they'll do it in at least six different ways.

However, there's no denying the power of the language. Most scripts boil down to relatively few lines of actual code, and are amazingly compact compared to the functionally equivalent C/C++ code.

There are of course scads of scripting languages; Ruby and Python seem to be very much à la mode. Just pick one, preferably one that somebody near your desk can help out with when you get stuck, and dive in.

Sunday, 21 June 2009

Division of Integers by Constants

Monday, 1 June 2009

Profound Advice: Learn a Scripting Language