We'd be Better Off with 9-bit Bytes
A number of 70s computing systems had nine-bit bytes, most prominently the PDP-10, but today1 [1 Apparently, it was the System/360 that really set the standard here.] all systems use 8-bit bytes and that now seems natural.2 [2 Though you still see RFCs use "octet", and the C standard has a CHAR_BIT macro to handle the possibility of a different-sized byte.] As a power of two, eight is definitely nicer. But I think a series of historical coincidences would actually go our way with 9-bit bytes.3 [3 And, as I say below, I don't think equivalently bad coincidences would break against us.]
IPv4: Everyone knows the story: IPv4 had 32-bit addresses, so about 4 billion total.4 [4 Less due to various reserved subnets.] That's not enough in a world with 8 billion humans, and that's led to NATs, more active network middleware, and the impossibly glacial pace of IPv6 roll-out. It's 2025 and GitHub—GitHub!—doesn't support IPv6. But in a world with 9-bit bytes IPv4 would have had 36-bit addresses, about 69 billion total. That would still be enough right now, and even with continuing growth in India and Africa it would probably be enough for about a decade more.5 [5 In our timeline, exhaustion hit in 2011, when demand was doubling every five years. 16× more addresses is four more doublings, which gets us to 2031, and probably a little later with growth slowing.] When exhaustion does set in, it would plausibly be at a time when there's not a lot of growth left in penetration, population, or devices, and mild market mechanisms instead of NATs would be the solution.
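For the curious, here's the back-of-envelope arithmetic in Python; the 2011 exhaustion date and the five-year doubling time come from the footnote above, and the doubling model itself is a crude extrapolation, nothing more.

```python
import math

addrs_32 = 2 ** 32           # four 8-bit bytes: ~4.3 billion addresses
addrs_36 = 2 ** 36           # four 9-bit bytes: ~68.7 billion addresses
extra = addrs_36 / addrs_32  # 16x more space

# If demand doubles every five years, 16x buys log2(16) = 4 more doublings.
years_bought = math.log2(extra) * 5
print(f"{addrs_36:,} addresses; exhaustion pushed from 2011 to ~{2011 + years_bought:.0f}")
# -> 68,719,476,736 addresses; exhaustion pushed from 2011 to ~2031
```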
UNIX time: In our timeline, 32-bit UNIX timestamps run out in 2038, so again all software has to painfully transition to larger, 64-bit structures. Equivalent 36-bit timestamps last until 3058, so no hurry. Negative timestamps would represent any time since 882, so could cover the founding of Kievan Rus', the death of Alfred the Great, the collapse of the Classic Maya,6 [6 The people stuck around, but they stopped building cool cities.] and the movement of Magyar tribes into the Carpathian basin.
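The 36-bit range is easy to check with Python's datetime (which uses the proleptic Gregorian calendar), assuming a signed, two's-complement timestamp like today's time_t.

```python
from datetime import datetime, timedelta, timezone

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Signed 32-bit vs signed 36-bit seconds since the epoch.
print(epoch + timedelta(seconds=2 ** 31 - 1))   # 2038-01-19, the familiar deadline
print(epoch + timedelta(seconds=2 ** 35 - 1))   # 3058-10-26, no hurry
print(epoch + timedelta(seconds=-(2 ** 35)))    # 0881-03-07, proleptic Gregorian
```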
Unicode: In our universe, there are 65 thousand 16-bit characters, which looked like maybe enough for all the world's languages, assuming you're really careful about which Chinese characters you let in.7 [7 Known as CJK unification, a real design flaw in Unicode that we're stuck with.] With 9-bit bytes we'd have 262 thousand 18-bit characters instead, which would totally be enough—there are only 155 thousand Unicode characters today, and that's with all the cat smileys and emojis we can dream of. UTF-9 would be thought of more as a compression format and largely sidelined by GZip.8 [8 Alternatively, we could lose a bit, be sparing with the cat smileys, and UTF-9 could be one/two byte like Shift-JIS. That would be pretty attractive.]
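Here's a sketch of what footnote 8's one/two-byte UTF-9 could look like; the exact bit layout is made up for illustration, with 9-bit bytes modeled as small Python ints. Like Shift-JIS it wouldn't be self-synchronizing, which is part of the trade-off.

```python
# Hypothetical one/two-byte UTF-9 (footnote 8); 9-bit bytes are ints in 0..511.
def utf9_encode(cp: int) -> list[int]:
    if cp < 0x100:                        # one byte: 0xxxxxxxx
        return [cp]
    if cp < 0x20000:                      # two bytes: 1xxxxxxxx xxxxxxxxx (17 payload bits)
        return [0x100 | (cp >> 9), cp & 0x1FF]
    raise ValueError("outside this hypothetical 17-bit space")

def utf9_decode(units: list[int]) -> list[int]:
    cps, i = [], 0
    while i < len(units):
        if units[i] & 0x100:              # lead byte of a two-byte pair
            cps.append(((units[i] & 0xFF) << 9) | units[i + 1])
            i += 2
        else:                             # single byte
            cps.append(units[i])
            i += 1
    return cps

text = "é漢"
units = [u for ch in text for u in utf9_encode(ord(ch))]
assert utf9_decode(units) == [ord(ch) for ch in text]   # round-trips: [233, 28450]
```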
Pointers: In 8-bit byte land, 32-bit operating systems impose a 2 GB cap on processes,9 [9 Because the kernel needs the top half of the memory space.] which turns out to be pretty restrictive. 36-bit operating systems would allow up to 32 GB per process, which even today would be a big machine; I'm writing this on a four-year-old MacBook Pro and it only has 16 GB of RAM. Server-class machines would still need to address more memory than that, but they're usually running specialized software or virtualizing; databases and hypervisors are already tricky code and segmentation wouldn't be the end of the world.10 [10 Basically, it'd be x32 as the standard.] Memory usage, even measured in bits, would be lower thanks to smaller pointers11 [11 So maybe 5% faster per x32 benchmarks?] though strings would be bigger.12 [12 So overall, maybe a wash?]
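It's a trivial calculation, but it makes the point, assuming the same kernel-takes-the-top-half split as above.

```python
GIB = 2 ** 30

def user_space(pointer_bits: int) -> float:
    # Kernel takes the top half of the address space, as in the post.
    return 2 ** (pointer_bits - 1) / GIB

print(user_space(32), "GiB per process")   # 2.0  -- the familiar cap
print(user_space(36), "GiB per process")   # 32.0 -- four 9-bit bytes per pointer
```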
There are more obscure wins too. 16-bit AS numbers ran out years ago; 18-bit numbers would still be enough. 18-bit ports and process IDs and user IDs would be a bit roomier. x86 and A64 instruction encodings would be a bit saner.13 [13 Thumb would work better?] Half-precision (18-bit) floats might be prominent earlier.14 [14 Today's manic 4- and 5-bit floats wouldn't work, and 3-bit floats seem impossible. Maybe 6-bit floats, 6-in-4, would be the consensus OCP float.] Extended ASCII would have room for Greek and would become a kind of NATO code page.15 [15 And UTF-9 would privilege most of Western Europe, not just the US.] Unix permissions would be one byte, so would lack sticky bits. Octal, not hex, would be standard.16 [16 It all comes from the PDP-10!] Probably there are other benefits too.17 [17 I measured ΔE for 18-bit color, which nicely splits 6/6/6. ChatGPT says the numbers I'm getting are imperceptible, but I don't really know, and losing an alpha channel would hurt.]
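The octal point is worth spelling out: an octal digit is three bits, so a 9-bit byte is exactly three octal digits, and a full rwxrwxrwx permission set fills the byte with nothing left over. A quick illustration:

```python
# One octal digit is three bits, so a 9-bit byte is exactly three octal digits,
# and rwxrwxrwx fills the whole byte -- no room left for setuid/setgid/sticky.
assert 0o777 == 2 ** 9 - 1                 # a full 9-bit byte

mode = 0o754                               # rwxr-xr--, one 9-bit byte
print(f"{mode:03o} = {mode:09b}")          # 754 = 111101100
```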
Would there be costs? No system has bit addressing, so it doesn't actually matter that a byte isn't a power of two.18 [18 No CPU would be dividing by nine or anything like that.] Page sizes and block sizes probably wouldn't change, and the kernel wouldn't be doing anything different from now.19 [19 Though kernels would need to support some kind of 54-bit segment selector plus pointer memory mapping.] Would something else exhaust in ugly ways because it looked like it might fit? A bunch of single-system stuff, probably; one-byte UID/GID might be tempting, or two-byte inode numbers, but these happened in our universe and didn't cause painful transitions.
The scariest I've come up with20 [20 ChatGPT o4 came up with it.] is TCP sequence numbers, where 18-bit sequence numbers might look appealing but would cause real problems for high-bandwidth connections. You'd need window scaling by the early 90s and a bump to 36-bit sequence numbers by the mid 90s, culminating in an IPv6-like TCPv2 effort. Or maybe instead of IPv6's "skinny upgrade" strategy TCPv2 would incorporate networking concerns of the era; maybe ECN would be on by default. But it's still not as bad as IPv6: ISPs would need to support TCPv2 to offer higher speeds, which was the main way ISPs competed. They'd make the investment. And since it all happens in the mid-90s, HTTP might end up requiring TCPv2. We wouldn't dual-stack.
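To put rough numbers on it, here are sequence-number wrap times at a few link speeds of the era; the real constraint also involves segment lifetimes and the PAWS check from RFC 1323, so treat this as an order-of-magnitude sketch.

```python
# Seconds until an n-bit TCP sequence number (which counts bytes) wraps.
# Dividing by 8 bits/byte; a 9-bit-byte world divides by 9, which barely changes the picture.
def wrap_seconds(seq_bits: int, link_bps: float) -> float:
    return 2 ** seq_bits / (link_bps / 8)

for label, bps in [("10 Mb/s Ethernet", 10e6), ("155 Mb/s OC-3", 155e6), ("1 Gb/s", 1e9)]:
    print(f"{label:18} 18-bit: {wrap_seconds(18, bps):8.3f} s   "
          f"32-bit: {wrap_seconds(32, bps):9.1f} s   36-bit: {wrap_seconds(36, bps):9.1f} s")
# 18-bit wraps in well under a second on even a 10 Mb/s LAN; 32-bit only gets
# into trouble around 1 Gb/s, which is when our timeline needed RFC 1323.
```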
Update: This post hit the HN front page. Mostly the comments were "like always" as we say,21 [21 As usual, reading the post and looking up the author's identity instead of guessing gets you a long way.] but I wanted to highlight JdeBP's wonderful comment sketching more of this 9-bit alternate history. Do read it. My gestalt impression is that this alternate world sounds pretty good! Fewer annoying limits, lame protocol extensions, US-specificity, and so on. So much of the early computing era was shaped by numerological limits.
Thank you to GPT 4o and o4 for discussions, research, and drafting.