# Rounding Bits over Rounding Modes

I was recently talking to Hans Boehm about floating point rounding modes. They are a mess, aren't they? It's a piece of global application state that very subtly changes the meaning of every floating-point operation in the whole program. It causes bugs in compilers, confusion for programmers, and all sorts of pain for library developers. So what would be better?

## Can we get rid of rounding modes?

Well, for one, we could just get rid of rounding modes entirely. I don't mean remove them from the hardware, of course we can't do that, but we can remove them from languages (like C++ is proposing) if we provide good alternatives. Almost every application spends almost all of its time in round-to-nearest. Moreover, changing rounding modes is extremely expensive (it flushes the CPU pipeline on many CPUs). However, we can't quite do this because there are a number of actual uses for rounding modes.

What are they? To the best of my knowledge, they are:

**Float parsing needs rounding modes.** It is potentially helpful to be
able to round to or away from zero when parsing floats. That's not how
the GLibC, LLVM, JDK, or Python work, though. So it's clearly not
necessary for that. And moreover, float parsing is something typically
provided by a language implementation, so we still don't need to
expose rounding modes to users.

**Interval arithmetic.** It is important to be able to at least round to
one of the infinities to implement interval arithmetic. (Typically,
only one infinity; to compute the lower interval bound you negate the
inputs going in and negate them again on the way out.) And some
libraries (like GAOL) do use hardware floating point for speed.
Moreover, interval arithmetic is kind of niche (though recently
standardized) so it wouldn't make sense for language implementations
to provide this.

**Randomized rounding.** Here you introduce extra noise to the low-order
bits to get a sense of your application's sensitivity to rounding
error. One example tool is Verificarlo. In theory, you can do this by
randomly changing change your hardware rounding mode as your
application runs. But that's not what Verificarlo does, since it would
be super slow! Instead, it can randomly flip the low bit of random
floating-point numbers, or randomly increment and decrement it. (There
are multiple options.) This is faster and fit for purpose.

The first and third ones don't really need hardware rounding modes.
For the second, interval arithmetic, hardware rounding modes do have
some benefits, but if you're interested in low-level speed-conscious
programming like this, I don't think it's crazy to say you need write
a little bit of assembly and pick your compiler flags very carefully.
There may be longer-tail uses beyond these, but I *should* be the expert
here and nothing strikes me as particularly significant.

## The rounding bit is useful

That said, imagine we were able to modify the hardware. What would be
better than rounding modes to expose to applications? I think it is a
*rounding bit*.

The rounding bit would be a new floating-point exception flag, and it would tell you which way you rounded, 1 if up and 0 if down. If there was no rounding, the rounding bit wouldn't matter, and you'd be able to tell by checking the inexact flag. With the rounding bit, and modern SSE registers that allow doing integer operations on floating-point values easily, it is pretty easy to emulate the other rounding modes:

- To round up, add
`inexact & !round`

to the round-nearest-even output - To round down, subtract
`inexact & round`

from the round-nearest-even output - To round to zero, something similar but also looking at the sign bit

And, importantly, the rounding bit doesn't involve any global state. It is a pure result of each floating-point operation, which makes the compiler's job much easier.

Moreover, right now, it's not easy to replicate the rounding bit. You could set the rounding mode to round up, like interval arithmetic libraries do, and then you definitely know it rounded up, but this loses accuracy: rounding to nearest even has 0.5 ulps of error per operation, while rounding up has 1.0 ulps. Of course, you could execute an operation twice with different rounding modes, but that is slow. So the rounding bit is a new capability that seems to supercede rounding modes but offers additional capabilities.

## The rounding bit is implementable

Moreover, I think this could be implemented efficiently in a modern chip.

A quick background—please correct me if you know hardware. When implementing a mathematical operation like addition, multiplication, division, or FMA, you have two inputs with (say) 53 significant figures, but when you add (or multiply, etc) them, you can end up with more than 53 significant figures. In fact, the number of significant figures can be quite large, so you don't represent them all exactly. Instead, for rounding, you only need two bits:

- The "guard bit" tells you if the next bit (which would be rounded off) is a one or a zero.
- The "sticky bit" tells you if any
*further*bits are ones

With these two bits, plus the sign and mantissa, at hand, you can round any which way you want:

- If the guard and sticky bit are 0, you have the exact right answer,
no rounding
^{1}[^{1}This is also how you set the inexact exception flag.] - If the sticky bit is 1, the guard bit tells you which endpoint you're closer to.
- If the sticky bit is 0 and the guard bit is 1, you are exactly midway between the endpoints

This connection was made for me by Jay P. Lim in his breakthrough RLIBMALL paper, but its source is much older.

The rounding bit would be a new exception flag, and it would tell you which way you rounded, 1 if up and 0 if down. If there was no rounding, the rounding bit wouldn't matter, and you'd be able to tell by checking the inexact flag. Since all the code runs in round-nearest-even, the rounding bit would be set by:

guard & (last | sticky)

So as you can see, this would not be an expensive operation.

The rounding bit could be treated as a exception flag, in MXCSR or the architectural equivalent. MXCSR has some reserved bits that could be used. And modern Intel chips make access to MXCSR reasonably fast (latency 5, throughput 0.5 according to Fog's tables, though it does have to hit memory), likely faster than changing rounding mode.

To make this work best, I would make the rounding bit work a bit differently from other exception flags. I wouldn't make it an actual exception; there's no point triggering exceptions only on round up anyway. And instead of the bit being "sticky" (an "or" of all previous operations) I would just set it per operation. This is a little different from the other exception flags, but those are required by IEEE 754, while this isn't. It also would be great if you could load the rounding and inexact bits into a register (perhaps a register destination version of LDMXCSR?) since you typically need to do some bitwise arithmetic on these values, but I guess that's not essential, especially on a chip with memory renaming.

## Will this happen

No, definitely not. But perhaps it could be implemented in software libraries? Perhaps MPFR?

## Footnotes:

^{1}

This is also how you set the inexact exception flag.