By Pavel Panchekha

Share under CC-BY-SA.

A Plan for Herbie Platforms

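This blog post is a design document, laying out how the platforms feature should work and how it should be organized. First, what is the platforms feature? I think it’s useful to separate out two parts:

  • Specifying a platform: users pick a platform, and Herbie loads it and runs using it
  • Creating a platform: users write, extend, or autotune platforms of their own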

Let’s consider each in turn, but first, some background on platforms.

What is a platform?

As discussed in our ASPLOS’25 paper, a platform is a list of floating-point operations plus a cost model for them. Each operation has:

  • A signature, with representations for each argument and for the output
  • A mathematical specification
  • A callable floating-point implementation
  • A translation to FPCore, which is sometimes slightly different from the specification

The platform also has some miscellaneous information:

  • A cost model, which is basically:
    • A numerical cost for literals/variables for each representation
    • A numerical cost for if statements, which are treated separately
    • Whether if statements use vector or serial style costs
    • A numerical cost for each operator
  • A set of identities, which are over the existing operators
  • A compiler, maybe? This part is still a bit unclear
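To make the cost model concrete, here is a minimal sketch in plain Racket of how the four pieces listed above could combine into the cost of an expression. The struct, its field names, and the expr-cost function are all hypothetical, not Herbie's actual API. The sketch also assumes one particular reading of "vector or serial style": vector style pays for both branches of an if, while serial style pays only for the more expensive one.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical cost model; every field name here is an assumption.
(struct cost-model (leaf-costs  ; hash: representation name -> cost of a literal/variable
                    if-cost     ; number: fixed cost of an `if`
                    if-style    ; 'vector or 'serial
                    op-costs)   ; hash: operator name -> cost
  #:transparent)

;; Cost of an expression whose leaves all have representation `repr`.
(define (expr-cost model expr repr)
  (let loop ([e expr])
    (match e
      [(list 'if c t f)
       (define branches
         (case (cost-model-if-style model)
           [(vector) (+ (loop t) (loop f))]      ; assumed: both branches are computed
           [(serial) (max (loop t) (loop f))]))  ; assumed: only the worse branch counts
       (+ (cost-model-if-cost model) (loop c) branches)]
      [(list op args ...)
       (apply + (hash-ref (cost-model-op-costs model) op) (map loop args))]
      [_ (hash-ref (cost-model-leaf-costs model) repr)])))

;; Made-up numbers, purely for illustration.
(define example-model
  (cost-model (hash 'binary64 1) 2 'serial (hash '+ 1 '* 1 '< 1 'sqrt 10)))
```

With this model, (expr-cost example-model '(if (< x 0) (sqrt x) (+ x 1)) 'binary64) adds the if cost, the condition, and the more expensive branch; switching if-style to 'vector would add both branches instead.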

On a more concrete level, a platform is a loadable Racket file.

  • Maybe it uses a Racket #lang herbie/platform or something like that
  • It also imports libraries that provide representations or floating-point implementations
  • It uses define-platform to define and export a platform object

Maybe those files should live somewhere centralized (.config/herbie?), or maybe they can sit loose anywhere on disk.

Specifying a Platform

First, users should be able to specify a platform and have Herbie load and run using that platform. The platform should affect:

  • The operators Herbie can invoke
  • The cost model Herbie uses
  • The output format for programs

On the command-line, I think the syntax should be something like:

racket -l herbie report --platform libm

I guess it should also be possible to name a file path:

racket -l herbie report --platform ~/foo.rkt

I think the first mode will search some standard directories, while the second mode will just load the file directly. Probably we can differentiate between the two cases by checking for a period in the name. Realistically, we'll want to check not just one standard directory but multiple; for example, .config/herbie but also a platforms directory in the Herbie distribution. If we check the user directory first, we'd allow users to override the standard platforms, which would be valuable for auto-tuning.
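The lookup rule described above might be sketched like this. The function name resolve-platform, its arguments, and the ".rkt" suffix on named platforms are assumptions for illustration. The file-existence check is passed in as a parameter so the logic can be exercised without touching the disk.

```racket
#lang racket/base
(require racket/string)

;; Resolve a --platform argument to a file path, or #f if nothing matches.
;; `search-dirs` is ordered: putting the user directory first lets it
;; shadow the platforms shipped with Herbie.
(define (resolve-platform arg search-dirs [exists? file-exists?])
  (cond
    [(string-contains? arg ".")  ; a period means "this is a file path"
     (string->path arg)]
    [else
     (for/first ([dir (in-list search-dirs)]
                 #:when (exists? (build-path dir (string-append arg ".rkt"))))
       (build-path dir (string-append arg ".rkt")))]))
```

Calling (resolve-platform "libm" (list user-dir dist-dir)) returns the user's copy if one exists and otherwise falls back to the distribution's. Note that a path like ~/foo.rkt relies on the shell to expand the tilde before Racket sees it.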

On the web and in Odyssey, I think we’ll want to present a dropdown list of platforms (and possibly also allow file upload). I’m not sure how this should work. There are two options:

  • Just show a list of file names, and don’t load or execute any of the platforms. This is safe and means platforms can do “whatever” and just work. But it means we can’t easily integrate features like disabled platforms (AVX on an ARM machine) or parameterizable platforms (include or don’t include a certain library). We can’t even show nice names for platforms.
  • Load each platform in turn, and query it for additional information like a pretty name, parameters, enabled/disabled operators, etc. This would be nice, but loading all platforms might be slow (so we’d need to cache this? And invalidate that cache?) and also we’d need to make sure that platforms don’t mutate any important state or interfere with one another.
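If we went with the second option, the caching question could be handled by remembering each file's modification time. The following is only a sketch: make-info-cache is a hypothetical helper, and the loader and modification-time lookup are passed in as arguments. In practice they might be a wrapper around dynamic-require and Racket's file-or-directory-modify-seconds, respectively. It does nothing about the state-mutation concern.

```racket
#lang racket/base

;; Returns a lookup function that caches platform metadata per path and
;; reloads it only when the file's modification time changes.
(define (make-info-cache load-info mtime)
  (define cache (make-hash))  ; path -> (cons mtime info)
  (lambda (path)
    (define now (mtime path))
    (define entry (hash-ref cache path #f))
    (cond
      [(and entry (= (car entry) now)) (cdr entry)]  ; cached entry is still fresh
      [else
       (define info (load-info path))                ; slow path: actually load the platform
       (hash-set! cache path (cons now info))
       info])))
```

The web server would build one such cache at startup and call it for each file when populating the dropdown, so only new or edited platforms pay the loading cost.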

Another more speculative option is to macro-expand the platform file and provide some data during macro-expansion time, without actually executing the platform. This could be cool, but we’d need to provide a nice library to do this; no one is going to want to learn Racket’s macro system in depth just to add a new Herbie platform.

One thing that’s a little unclear to me is how many platforms, exactly, we’ll end up with. I think the Racket platform is a good example. With just “Racket” as the target platform, we could imagine targeting just racket/base, or math/base as well, or math/flonum as well, and if we target math/flonum maybe we want the fl2 or fllog functions and maybe we don’t. You could imagine this being one Racket platform that’s parameterized (more user-friendly) or something like five or six different platforms. If it’s five or six, and knowing that we’ll want a similar split for lots of other languages, then we need to think about organizing platforms somehow (directories?).

My guess is that we can delay this for now. Parameterized platforms are cool, but we don’t have enough platforms yet that this is a problem, and it would require more syntax and more API surface, and we don’t really want any of that if it isn’t worth it for users.

Creating a Platform

I think we also want to make it easy for users to create platforms of their own. It should be possible to autotune an existing platform for the user’s machine, or to add new operators to an existing platform (for example, to add a project’s existing helper methods); it should also be possible to start from scratch. I think it would be neat to offer this as a command-line tool; users would have to be programmers to use this feature anyway.

Here are some APIs I’m imagining:

racket -l herbie platform new name

This would create a new name.rkt file in the standard directory, pre-fill it with something minimal, and perhaps open the file in an editor. The file might look like this:

#lang herbie/platform

It should also be easy to modify an existing platform to add operators or something, maybe by running:

racket -l herbie platform extend racket racket-bigfloat

This would create a new racket-bigfloat.rkt file in the standard directory and prefill it like this:

#lang herbie/platform
(include-platform racket)

When loaded, that platform would basically be an overlay onto the racket platform, and you could add, remove, or modify operators. The include-platform primitive would go through the same platform-loading path as specifying a platform on the command line, which would make sure to load any libraries that platform needs.
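The overlay semantics can be illustrated with immutable hash tables. This is a deliberately simplified sketch: a platform is reduced to a map from operator name to cost, and the overlay function and its keyword arguments are hypothetical, not something include-platform is known to provide. The costs are made up.

```racket
#lang racket/base

;; A platform is modeled here as an immutable hash: operator name -> cost.
;; Real platforms carry more per operator; this shows only the overlay logic.
(define (overlay base #:add [added (hash)] #:remove [removed '()])
  (define trimmed
    (for/fold ([p base]) ([op (in-list removed)])
      (hash-remove p op)))
  (for/fold ([p trimmed]) ([(op cost) (in-hash added)])
    (hash-set p op cost)))  ; adds new operators and overrides existing ones

(define racket-platform (hash '+ 1 '* 1 'sqrt 10))

;; Add bfsqrt, re-cost sqrt, and drop * from the base platform.
(define racket-bigfloat
  (overlay racket-platform #:add (hash 'bfsqrt 50 'sqrt 12) #:remove '(*)))
```

Because the hashes are immutable, racket-platform itself is untouched, which fits the goal of keeping platforms from interfering with one another.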

Finally, it would be great if you could autotune a platform, which I think you’d do by running:

racket -l herbie platform autotune racket racket-tuned

Ideally this would invoke the compiler defined by the racket platform to execute some simple programs, time them, and use that information to define a cost model. That cost model would go into a racket-tuned.rkt in the user’s standard directory, which would shadow / override the one that shipped with Herbie, and its contents would look like:

#lang herbie/platform
(include-platform racket)

(define-cost + 14.234)
(define-cost add1 12.234)

The details of the syntax could be changed but the idea is that users would be able to autotune platforms they use often to get more accurate results.
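As a rough illustration of the timing step, the sketch below times Racket thunks directly and emits define-cost forms as data. It skips the platform's compiler entirely, which a real autotuner would need to go through, and the helper names and the choice of nanoseconds per call as the cost unit are assumptions.

```racket
#lang racket/base

;; Average wall-clock time of one call to `thunk`, in nanoseconds.
;; Loop overhead is included, so small costs are only roughly comparable.
(define (time-thunk thunk [iterations 100000])
  (define start (current-inexact-milliseconds))
  (for ([i (in-range iterations)])
    (thunk))
  (define elapsed-ms (- (current-inexact-milliseconds) start))
  (* 1e6 (/ elapsed-ms iterations)))

;; Turn a list of (operator . thunk) pairs into define-cost forms.
(define (autotune-forms benchmarks)
  (for/list ([b (in-list benchmarks)])
    `(define-cost ,(car b) ,(time-thunk (cdr b)))))
```

For example, (autotune-forms (list (cons '+ (lambda () (+ 1.0 2.0))))) yields a one-element list holding a define-cost form for +, which could then be written into the tuned platform file. The measured numbers will vary from run to run and machine to machine.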

I also think it would aid development a lot to be able to see existing platforms, like this:

racket -l herbie platform show racket

That would output a table to the terminal, like this:

| Operator           | Spec               | CR | Cost |
|--------------------+--------------------+----+------|
| (sinf x)           | (sin x)            |    | 143. |
| (expm1 x)          | (- (exp x) 1)      |    |  87. |

The “CR” column would show if the implementation is the auto-generated correctly-rounded one or something else. Maybe there should be a signature column, or it should go in the operator column, dunno. The point is that this would make it easy to examine an existing platform and then see what you want to add to it.
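Printing such a table is mostly a matter of computing column widths. Here is a small sketch using racket/format; the platform-table function and the row layout (operator, spec, CR flag, cost) are assumptions based on the example table, not an existing Herbie function.

```racket
#lang racket/base
(require racket/format racket/string)

;; Format one row, padding each cell on the right to its column width.
(define (format-row cells widths)
  (string-append
   "| "
   (string-join (for/list ([c (in-list cells)] [w (in-list widths)])
                  (~a c #:min-width w))
                " | ")
   " |"))

;; rows: list of (operator spec cr cost) lists; returns a list of lines.
(define (platform-table rows)
  (define header '("Operator" "Spec" "CR" "Cost"))
  (define all (cons header (for/list ([r (in-list rows)]) (map ~a r))))
  (define widths
    (for/list ([i (in-range (length header))])
      (apply max (for/list ([r (in-list all)]) (string-length (list-ref r i))))))
  (define rule
    (string-append
     "|"
     (string-join (for/list ([w (in-list widths)]) (make-string (+ w 2) #\-)) "+")
     "|"))
  (append (list (format-row header widths) rule)
          (for/list ([r (in-list (cdr all))]) (format-row r widths))))
```

Running (for-each displayln (platform-table '(((sinf x) (sin x) "" 143.0) ((expm1 x) (- (exp x) 1) "" 87.0)))) prints a header, a rule, and one aligned line per operator.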

Development plan

I think the development plan has to go step by step so we can have frequent merges and not get dragged into the tar pit of complexity. Here’s how I imagine it:

  1. Remove as many global tables from within Herbie as possible. Everything platform-dependent should be loaded from the platform. Everything platform-independent should be immutable.
  2. Write a few built-in platforms to get used to the routine. At the very least, we need platforms for:
    • c-libm, which targets C code using the local libm library. This is also the default platform. The costs we can autotune on one of our machines.
    • herbie-2.0, which uses the same operators but overrides the cost model with the dumb "arith-1 library-100" cost model that Herbie 2.0 used.
    • herbie-1.0, which sets the cost of every operator to 0 and thus emulates the no-pareto mode.
    • racket, which is a fallback when the local libm is not available for some reason, and I guess in homage to our host language.
  3. Make each existing platform an independent file. Put all the files in one convenient place. Load only one of those files, matching the selected platform, at a time.
  4. Add lots more platforms, including at least the ones in the ASPLOS’25 paper.
  5. Iterate on the platform API until these platforms are as clean as possible.
  6. Add functionality to check a user-owned directory for platform files. This is OS-specific but hopefully Racket has a library for this.
  7. Build out the new and extend functionality.
  8. Build out the autotune functionality. This strikes me as very error-prone because we’ll need to actually run code on the user’s machine, and also probably not every platform can be auto-tuned. But even doing this just for c-libm would be valuable.
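For step 2, the herbie-2.0 and herbie-1.0 platforms only differ from c-libm in their costs, so they could be produced by re-costing an operator list. The sketch below reads "arith-1 library-100" as "arithmetic operators cost 1, library functions cost 100"; that reading, the list of operators counted as arithmetic, and all names here are assumptions.

```racket
#lang racket/base

;; Assumed split: which operators count as "arithmetic" is a guess.
(define arithmetic-ops '(+ - * / neg))

;; Herbie 2.0 style: cheap arithmetic, expensive library calls.
(define (herbie-2.0-cost op)
  (if (memq op arithmetic-ops) 1 100))

;; Herbie 1.0 style: everything is free, so cost never influences the output.
(define (herbie-1.0-cost op) 0)

;; Build an operator -> cost table for the same operators under a cost function.
(define (recost ops cost-fn)
  (for/hash ([op (in-list ops)])
    (values op (cost-fn op))))
```

For instance, (recost '(+ * sin exp) herbie-2.0-cost) keeps the operator set fixed and swaps only the cost table, which matches the "same operators, different cost model" description.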