- The real meat of the story is in the referenced blog post: https://blog.codingconfessions.com/p/how-unix-spell-ran-in-64kb-ram

  TL;DR - If you're short on time, here's the key engineering story (toy Python sketches of each step follow the list):

  - McIlroy's first innovation was a clever linguistics-based stemming algorithm that reduced the dictionary to just 25,000 words while improving accuracy.
  - For fast lookups, he initially used a Bloom filter, perhaps one of its first production uses. Interestingly, Dennis Ritchie provided the implementation. They tuned it to such a low false-positive rate that they could skip actual dictionary lookups.
  - When the dictionary grew to 30,000 words, the Bloom filter approach became impractical, leading to innovative hash-compression techniques.
  - They computed that 27-bit hash codes would keep the collision probability acceptably low, but the codes still needed compression.
  - McIlroy's solution was to store the differences between sorted hash codes, after discovering that these differences follow a geometric distribution.
  - Using Golomb coding, a compression scheme designed for geometric distributions, he achieved 13.60 bits per word, remarkably close to the theoretical minimum of 13.57 bits.
  - Finally, he partitioned the compressed data to speed up lookups, trading a small memory increase (final size ~14 bits per word) for significantly faster performance.
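To make the stemming idea concrete, here's a toy sketch. The suffix rules below are made up for illustration; McIlroy's actual affix analysis was far more elaborate. The point is just that only stems need to be stored, while derived forms are recognized by stripping affixes at lookup time:

```python
# Toy affix stripping (NOT McIlroy's actual rules): instead of storing
# "bake", "baked", "bakes", "baking" separately, store only the stem
# "bake" and derive the rest when checking a word.

SUFFIX_RULES = [
    ("ies", "y"),   # "cities" -> "city"
    ("ing", ""),    # "baking" -> "bak" (then also try re-adding "e")
    ("ed", ""),
    ("es", ""),
    ("s", ""),
]

def candidate_stems(word: str) -> list[str]:
    """Generate plausible stems by stripping common suffixes."""
    stems = [word]
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            base = word[: -len(suffix)] + replacement
            stems.extend([base, base + "e"])  # "bak" -> "bake"
    return stems

def is_spelled_ok(word: str, dictionary: set[str]) -> bool:
    return any(stem in dictionary for stem in candidate_stems(word.lower()))

dictionary = {"bake", "city", "run"}
print([is_spelled_ok(w, dictionary) for w in ["baked", "baking", "cities", "runz"]])
# [True, True, True, False]
```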
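A minimal Bloom filter sketch, assuming generic parameters. The sizes, the number of hash functions, and the MD5-based index derivation here are illustrative choices of mine, not what Ritchie's implementation did:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions set/check k bits in a bit array.

    Never gives false negatives; the false-positive rate is tuned by sizing
    m (bits) and k (hashes) against the expected number of words.
    """

    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, word: str):
        # Derive k bit positions from one digest (illustrative; the real
        # spell used purpose-built hashing, not MD5).
        digest = hashlib.md5(word.encode()).digest()
        for i in range(self.k):
            chunk = int.from_bytes(digest[2 * i: 2 * i + 4], "big")
            yield chunk % self.m

    def add(self, word: str):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, word: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(word))

bf = BloomFilter(m_bits=400_000, k_hashes=6)  # sized for ~25,000 words
for w in ["bake", "city", "run"]:
    bf.add(w)
print(bf.might_contain("bake"), bf.might_contain("xyzzy"))  # True False (probably)
```

The false-positive rate falls as you add bits and hash functions, which is the knob they turned until a hit in the filter was trustworthy enough to skip the real dictionary lookup.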
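The 27-bit reasoning, roughly as the post presents it (the back-of-envelope framing here is mine):

```python
# Hash each of n dictionary words into a 27-bit range. A misspelled word
# is wrongly accepted only if its hash collides with some real word's hash.
n = 30_000
space = 2 ** 27             # 134,217,728 possible hash values
false_positive = n / space  # probability a random wrong word collides
print(f"{false_positive:.6f} ~= 1 in {space // n}")
# 0.000224 ~= 1 in 4473

# But storing the codes raw blows the memory budget, hence compression:
raw_bytes = n * 27 / 8
print(f"raw storage: {raw_bytes / 1024:.0f} KiB")  # ~99 KiB, too big for 64 KiB
```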
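A quick simulation of why the differences look geometric: hash codes that are roughly uniform over [0, 2^27) have exponentially decaying gaps between consecutive sorted values, which is a geometric distribution once discretized:

```python
import random

# 30,000 codes drawn uniformly from [0, 2**27): the gaps between
# consecutive sorted codes are ~geometric with mean 2**27 / 30,000 ~= 4474.
random.seed(1)
n, space = 30_000, 2 ** 27
codes = sorted(random.sample(range(space), n))
gaps = [b - a for a, b in zip(codes, codes[1:])]

mean = sum(gaps) / len(gaps)
print(f"mean gap: {mean:.0f} (expected ~{space / n:.0f})")

# Geometric signature: P(gap > t) decays exponentially in t.
for t in (2_000, 4_000, 8_000, 16_000):
    frac = sum(g > t for g in gaps) / len(gaps)
    print(f"P(gap > {t:>6}) ~= {frac:.3f}")
```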
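And a sketch of Golomb coding applied to those gaps. The parameter choice below is the textbook near-optimal one for a geometric source; I'm not claiming it's exactly the parameter spell used, but the bits-per-word figure lands right around the numbers from the post:

```python
import math
import random

def golomb_encode(values, m):
    """Golomb-encode non-negative integers as a bit string:
    quotient in unary, remainder in truncated binary."""
    b = math.ceil(math.log2(m))
    cutoff = (1 << b) - m          # remainders below this use b-1 bits
    out = []
    for x in values:
        q, r = divmod(x, m)
        out.append("1" * q + "0")  # unary quotient
        if r < cutoff:
            out.append(format(r, f"0{b - 1}b"))
        else:
            out.append(format(r + cutoff, f"0{b}b"))
    return "".join(out)

random.seed(1)
n, space = 30_000, 2 ** 27
codes = sorted(random.sample(range(space), n))
gaps = [b - a for a, b in zip(codes, codes[1:])]

# Near-optimal Golomb parameter for geometric data with success prob p:
p = n / space
m = round(-1 / math.log2(1 - p))   # ~= ln(2) * mean gap ~= 3101
bits = golomb_encode(gaps, m)
print(f"m = {m}, {len(bits) / len(gaps):.2f} bits per word")  # ~13.6-13.7

# Entropy of the geometric gap distribution is the theoretical floor:
entropy = (-(1 - p) * math.log2(1 - p) - p * math.log2(p)) / p
print(f"entropy floor: {entropy:.2f} bits per word")  # ~13.57
```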
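Finally, a sketch of the partitioned layout; this is my reconstruction of the idea, not the actual on-disk format. An index of each partition's first code lets a lookup decode one small block instead of the whole compressed stream:

```python
import bisect
import random

BLOCK = 64  # codes per partition (illustrative)

def build(codes):
    codes = sorted(codes)
    index, blocks = [], []
    for i in range(0, len(codes), BLOCK):
        chunk = codes[i:i + BLOCK]
        index.append(chunk[0])  # real spell would also store a bit offset
        # Store gaps between consecutive codes; in spell these would be
        # the Golomb-coded bits rather than Python ints.
        blocks.append([b - a for a, b in zip(chunk, chunk[1:])])
    return index, blocks

def contains(index, blocks, code):
    i = bisect.bisect_right(index, code) - 1
    if i < 0:
        return False
    value = index[i]                 # partition's first code, from the index
    if value == code:
        return True
    for gap in blocks[i]:            # sequential decode of one partition only
        value += gap
        if value == code:
            return True
        if value > code:
            return False
    return False

random.seed(1)
codes = random.sample(range(2 ** 27), 30_000)
index, blocks = build(codes)
print(contains(index, blocks, codes[123]), contains(index, blocks, 42))
# True False
```

The index is the small memory overhead, which is presumably where the final ~14 bits per word (versus 13.60) comes from.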
- For anyone struggling: the Lemmy web interface added a colon to the URL for the blog post link. Here's a clickable version without the colon: https://blog.codingconfessions.com/p/how-unix-spell-ran-in-64kb-ram
- Thanks, and sorry about that! I've removed the colon from near my URL now, just in case.
 
- Thank you 
- The blog post is an incredible read. 
 
- Long article for one sentence of trivia and no info on the algo itself. The death of the internet is upon us.
- Doesn't even name the algorithm, and somehow spells LZMA wrong, despite having just written it out longhand.
- Well, it's PC Gamer.
- [edit] I still can't figure out if they're referencing LZW encoding… the L and Z being the same Lempel and Ziv as in LZMA, but with Welch having a different solution for the rest of the algorithm due to size constraints.
- Probably mostly AI written.
 
- I’d like to imagine they took the short trivia fact and applied the inverse of the compression algorithm to bloat it into something that satisfied the editor. 
 
- If it ain't broke, don't fix it.
- !lemmysilver
- Thank you for voting. You can vote again in 24 hours.
 
- Only 1 GiB of RAM? Moooom! 
- Shut up, Johnny, Voyager's still out there with way less.







