Mine's a double
Ranting
By gpig (Thu May 27, 2010 at 10:40:11 PM EST)
Tags: single precision, double precision, MATLAB, whisky
MATLAB considered harmful


Today I discovered something about the programming language that I currently use for most code at work. I have to admit that MATLAB wouldn't be my first choice; but the boss has a lot of code already written in it. Also, if I use MATLAB it means he'll be able to easily maintain and use my code if I go to work somewhere else.

All of this is fair enough. As I've learnt more about MATLAB I've found some of its dark corners — weird little inconsistencies in the language where the developers had to make a decision about how the language would behave, and in hindsight they jumped the wrong way. Every language has these, though some have more than most, and some languages will break compatibility to clear them up.

Today, however, I found something that went far beyond such minor complaints.


>> a = single(1);
>> b = double(2);
>> c = a*b;
>> class(c)

ans =

single

A single precision floating-point number, multiplied by a double precision floating-point number, returns a single. This means that MATLAB is silently destroying information.
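To make the lost information visible, here is a minimal sketch using a value that cannot be represented exactly in either precision. The variable names are only illustrative; the point is that the product stops matching the double operand once it has passed through a single.

```matlab
% Sketch: the product of a single and a double is a single, so digits
% carried by the double operand are rounded away.
a = single(1);
b = 0.1;                 % numeric literal, class double
c = a * b;               % class single

disp(class(c))           % single
disp(double(c) == b)     % 0: the product no longer equals the double 0.1

% Converting the single operand first keeps the arithmetic in double.
d = double(a) * b;
disp(class(d))           % double
disp(d == b)             % 1
```

Converting the single operand with double() does not bring back digits it never had, but it does stop the double operand from being rounded down.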

To be fair, when I looked this up in the documentation, it was clearly documented. I just never thought to look, as I would never have expected behaviour like this in any language, and especially not from software which is intended for use in scientific computing.

Here's the worst of it, though:


>> a = double(3);
>> b = uint8(100);
>> c = a*b;
>> class(c)

ans =

uint8

>> c

c =

  255

Not only is uint8 taken as the type of the result, but the real result (300) is silently saturated to the maximum value which will fit in the type, 255. No overflow error, not even a warning.
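One way around this, sketched here rather than offered as the official remedy, is to convert the integer operand to double before doing any arithmetic with it. The names c_int and c_dbl are made up for the comparison.

```matlab
a = double(3);
b = uint8(100);

c_int = a * b;                   % class uint8, clamped to intmax('uint8')
c_dbl = a * double(b);           % class double, keeps the full result

disp(class(c_int))               % uint8
disp(c_int)                      % 255
disp(c_int == intmax('uint8'))   % 1
disp(class(c_dbl))               % double
disp(c_dbl)                      % 300
```

Comparing a result against intmax of its class is a cheap, if crude, way to spot values that may have hit the ceiling.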

The danger of this is that from even one value, single precision (or worse) will silently propagate through the code. The only way you'd ever know is if this produces a very unexpected or impossible result. (In my case, I got negative variances using the computational form. This formula subtracts two very large numbers, so it's sensitive to precision.)
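The sketch below illustrates the kind of calculation that bit me, on made-up data: a small spread around a large mean, stored as single. It compares the computational (one-pass) form in single, the same form after converting to double, and the two-pass form that subtracts the mean first. The data and variable names are hypothetical, and the single-precision figure will depend on the data; it can come out badly wrong, possibly negative.

```matlab
% Hypothetical data: small spread around a large mean, stored as single.
x = single(1e4 + (1:1000) / 1000);
n = numel(x);

% Computational (one-pass) form: subtracts two very large, nearly equal sums.
v_single = (sum(x.^2) - sum(x)^2 / n) / (n - 1);

% Same formula after converting the data to double first.
xd = double(x);
v_double = (sum(xd.^2) - sum(xd)^2 / n) / (n - 1);

% Two-pass form: subtract the mean before squaring.
v_twopass = sum((xd - mean(xd)).^2) / (n - 1);

fprintf('single, one-pass: %g\n', v_single);
fprintf('double, one-pass: %g\n', v_double);
fprintf('double, two-pass: %g\n', v_twopass);
```

The two-pass form avoids the large subtraction altogether, so it is the safer choice whenever the data fit in memory.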

Double precision is the default in many cases. For example, numeric literals in the code are interpreted as double. So there are cues giving the developer a false sense of security.
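A short sketch of why the default is no protection: the literal is a double, but it does not promote a single it is combined with. The guard at the end is just one possible defensive habit, not something MATLAB does for you, and `data` is a hypothetical input.

```matlab
data = single([1.5 2.5 3.5]);      % hypothetical input, e.g. loaded from a file

disp(class(2))                     % double: literals default to double
disp(class(2 * data))              % single: the literal does not promote data

% One possible guard at the top of a calculation: warn and convert.
if ~isa(data, 'double')
    warning('Input is of class %s; converting to double.', class(data));
    data = double(data);
end
disp(class(2 * data))              % double
```

Replacing the warning with error() would make the check fail loudly instead, which may be preferable for long-running jobs.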

On discovering this, I had a daydream about exposing this to the world. In this fantasy a health warning would have to be added to the box: "This product will produce single precision results on single and double input, which is considered harmful in the state of Massachusetts". I then imagined that I was the tech lead for a software team, and after a tense conversation with the boss, had gathered all my developers to tell them that I could no longer trust the language we were using, and would have to start plans to migrate away*.

As it is, in the situation I'm in, I'll just have to swallow my pride and keep using it. I'll need to review all the code that I manage, and maybe rerun some of our calculations, some of which take several weeks. So, I'm sitting at home, ranting away, and drinking a double from my glorious motherland**. Cheers!

gpig rant disclaimer: For all that this is annoying, I have running water and no civil war, and so count myself lucky.


* I would actually, really have done this if I was in any position to do so. It's about the only time I've remotely been close to wishing for authority of any sort in my work life.

** Oban 14yo, in case you were wondering.

Mine's a double | 17 comments (17 topical, 0 hidden) | Trackback
This sounds like a (dramatic pause) by johnny (4.00 / 3) #1 Thu May 27, 2010 at 11:02:21 PM EST
Floating Point Error!

Any relationship between my comment and my famous novella "Bees, or, The Floating Point Error" (wholly encapsulated in Cheap Complex Devices) is purely whatever.

Anyway, thanks for a trip down floating point memory lane.

Hope you eventually arrive at a more-or-less accurate resolution.

She has effectively checked out. She's an un-person of her own making. So it falls to me.--ad hoc (in the hole)

would have rated this comment 3.999999999999999999 by gpig (4.00 / 1) #2 Thu May 27, 2010 at 11:18:50 PM EST
if I only could.

Thanks, I'm sure it will work out, it's just some more lost time when I could be doing Science.
---
(,   ,') -- eep

[ Parent ]
short story by LilFlightTest (2.00 / 0) #15 Sat May 29, 2010 at 02:27:43 PM EST
my dad has a couple beehives now...and planted a whole field of clover just for them.

unrelated, did you know that basswood pollen makes honey taste vaguely minty?
---------
if de-virgination results in me being able to birth hammerhead sharks, SIGN ME UP!!! --misslake

[ Parent ]
The action is arguably correct by wiredog (4.00 / 1) #3 Fri May 28, 2010 at 08:17:45 AM EST
The lack of a warning is problematic.

Earth First!
(We can strip mine the rest later.)

My initial <incorrect> opinion as well. by wumpus (4.00 / 2) #11 Fri May 28, 2010 at 02:36:24 PM EST
On the surface, it looks like it is following the simple significant figures rules taught in high school. On the other hand, when doing anything remotely resembling real numerical science or DSP work (MATLAB is heavily used there as well), rounding is key.

On recent computers, it is possible to compute the FFT of an entire CD*. All those values are signed int16s. You can compute the FFT of anything shorter than about half a second using 32-bit floats. Go much beyond that and your signal steadily becomes noisier due to rounding noise. I'd take a pretty good guess that the reason double is used is because it takes more than a few seconds to run the calculation, not because anybody is pretending that they measured and are computing 7 digits of accuracy.

Wumpus

* it came a bit too late for prog rock.

[ Parent ]
I agree with wiredog by sasquatchan (4.00 / 1) #4 Fri May 28, 2010 at 09:20:11 AM EST
that you should get a warning.. But then again, so few people compile at /w4 and warnings as errors on visual studio .. well.. lazy programmers..

Ardmod by ana (4.00 / 1) #5 Fri May 28, 2010 at 10:01:38 AM EST
is the latest bottle of amber liquid I'm using to poison myself. Tasty, light, interesting.

These days I hardly write a program anymore; just scripts to tie other people's tools together and reformat the outputs of one to match what's required as input for another. Unfortunately, I do most of this in the tcshell, which makes for utterly inscrutable code. It does have the advantage that I can try it out, one tool at a time, from the command line, and end up with long pipelines of stuff strung together.

"And this ... is a piece of Synergy." --Kellnerin

er... by ana (2.00 / 0) #12 Fri May 28, 2010 at 09:42:15 PM EST
Ardbeg, the bottle informs me.

"And this ... is a piece of Synergy." --Kellnerin

[ Parent ]
Hmm by gpig (2.00 / 0) #13 Sat May 29, 2010 at 12:10:24 AM EST
I wondered, and in fact I almost looked it up ....

As a fan of the peaty whiskies, it's one of my favourites. If you like Ardbeg you should also try Caol Ila (if you haven't already).
---
(,   ,') -- eep

[ Parent ]
In Key West, by ana (4.00 / 1) #14 Sat May 29, 2010 at 08:22:13 AM EST
when we were there on vacation... was it just a year ago? Anyway, we found this little hole-in-the-wall bar (on the front of a B&B, wouldn't you know) called Rum Runners.

Now having happened across a wonderful bottle of Ron Zacapa (trying to duplicate the wee nip of the cordial at another B&B in Provincetown), we were aware that there's ho-hum-rum, yo-ho-ho-and-a-bottle-of rum, and then there's RUM.

The guy behind the counter there knew his stuff, and had a top shelf of the good stuff, which we worked through on more or less nightly visits over the time we were there.

I'd like to find a similar experience with Scotch someday.

"And this ... is a piece of Synergy." --Kellnerin

[ Parent ]
Our dudes and dudettes by technician (4.00 / 1) #6 Fri May 28, 2010 at 10:11:25 AM EST
here use Matlab. The only things I know about it: administering the licenses used to be a HUGE pain in the ass, and now it's a lot easier unless you have a disconnected system. Also, on a 32bit system, it will take days to process data files larger than 100GB.

yeah, well, if YOUR piece of shit by sasquatchan (4.00 / 1) #9 Fri May 28, 2010 at 02:19:55 PM EST
ISCSI drives wouldn't keep failing..

;)

[ Parent ]
We're on fibre channel now, by technician (4.00 / 1) #10 Fri May 28, 2010 at 02:26:59 PM EST
so things are roughly 1.2x faster.

[ Parent ]
error propagation by blarney (4.00 / 1) #7 Fri May 28, 2010 at 12:39:54 PM EST
I can see why they did it this way, though.  It's stupid from a programming point of view but it almost makes sense from a calculation point of view.  If I multiply a low-precision number and a high-precision number the result will inherit the lower precision.  (5 ± 1) × (8 ± 0.0001) = 40 ± 40 sqrt(0.2² + 0.0000125²) = 40 ± 8 pretty much.  So if you think of a single precision float, not as a number, but as a range of numbers, and you think of the double as a much tighter range of numbers, then it makes sense to 'demote' the result of a multiply.  Addition is a bit more problematic, but if it had to be done one way or the other, it would make sense to 'demote'.  If you want to consider floats as ranges in R that makes sense.

Makes no sense with ints, though.  The int '3' really isn't 3 ± 0.5 in R, but a single value in Z.  At least I would think of it this way. 

So the comp sci way of promoting types is probably better than the way MATLAB does it.  But I could be open to a system which 'demotes' double-single products to single, but 'promotes' int-double products to double.
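The arithmetic in this comment follows the usual first-order rule that, for a product of independent quantities, the relative uncertainties add in quadrature. A minimal sketch of that calculation for 5 ± 1 times 8 ± 0.0001, with illustrative variable names and assuming the two errors are independent:

```matlab
x = 5;  dx = 1;          % 5 with an uncertainty of 1
y = 8;  dy = 0.0001;     % 8 with an uncertainty of 0.0001

p   = x * y;                              % central value of the product
rel = sqrt((dx / x)^2 + (dy / y)^2);      % relative uncertainties in quadrature
dp  = p * rel;                            % absolute uncertainty of the product

fprintf('%g +/- %g\n', p, dp);            % prints "40 +/- 8"
```

The tighter operand contributes almost nothing to the result, which is the sense in which the product "inherits" the lower precision.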



Fair point by gpig (2.00 / 0) #8 Fri May 28, 2010 at 01:25:08 PM EST
.... but we're not just talking about one or two operations here. I found this because of a perfectly ordinary statistical calculation, summing over a whole load of numbers -- and the lack of precision in the sum basically screwed the calculation and gave me negative variance.
---
(,   ,') -- eep
[ Parent ]
our modelling group uses matlab. by garlic (2.00 / 0) #16 Sun May 30, 2010 at 03:34:55 AM EST
it's a huge pain to get it to operate in identical precision as the hardware.


not that passing NAN, or inf, or ERR would help by garlic (2.00 / 0) #17 Sun May 30, 2010 at 03:37:35 AM EST
I've seen the SW group passing the results of calculations into the hardware that have values of all 3, because they don't self-check to verify they're at least writing in an actual number.


[ Parent ]