I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.
This one is about why it was a mistake to call 1024 bytes a kilobyte. It’s about a 20-minute read, so thank you very much in advance if you find the time to read it.
Feedback is very much welcome. Thank you.
Well, it’s because computer science has been around for 60+ years and computers are binary machines. It was natural for everything to be base 2. The most infuriating part is that drive manufacturers arbitrarily started calling 1000 bytes a kilobyte, 1000 kilobytes a megabyte, 1000 megabytes a gigabyte, and 1000 gigabytes a terabyte, when until then 1 TB was 1099511627776 bytes. They did this simply because it made their drives appear 10% bigger. So good ol’ shrinkflation. You could make drives 10% smaller and sell them for the same price.
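For anyone who wants to check the "10%" figure, here is a rough sketch in Python. The variable names are made up for illustration, and it only compares the two definitions of a terabyte mentioned in the comment above.

```python
# Two competing definitions of "1 TB" from the comment above.
BINARY_TB = 2**40     # 1,099,511,627,776 bytes (today called 1 TiB)
DECIMAL_TB = 10**12   # 1,000,000,000,000 bytes (SI terabyte)

# How much bigger the binary definition is than the decimal one.
bigger_pct = (BINARY_TB / DECIMAL_TB - 1) * 100

# How much smaller a decimal terabyte is than a binary one.
smaller_pct = (1 - DECIMAL_TB / BINARY_TB) * 100

print(f"binary TB is {bigger_pct:.2f}% bigger")     # 9.95%
print(f"decimal TB is {smaller_pct:.2f}% smaller")  # 9.05%
```

So "10%" is a fair round number at the tera scale, although the exact percentage depends on which direction you measure.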
If a hard drive has exactly 8’269’642’989’568 bytes what’s the benefit of using binary prefixes instead of decimal prefixes?
There is a reason to use binary prefixes for memory like caches, buffers and RAM. But we don’t count printer paper with binary prefixes just because the printer communication uses binary.
There is no(!) reason to label hard drive sizes with binary prefixes.
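To make the question concrete, here is a small sketch that prints the byte count from above with both kinds of prefixes. The helper `human` and the two suffix lists are hypothetical names for this example, not taken from the article.

```python
def human(n_bytes: int, base: int, suffixes: list[str]) -> str:
    """Divide n_bytes by base until it fits, then attach the matching suffix."""
    value = float(n_bytes)
    for suffix in suffixes:
        if value < base or suffix == suffixes[-1]:
            return f"{value:.2f} {suffix}"
        value /= base

SI = ["B", "kB", "MB", "GB", "TB"]        # powers of 1000
IEC = ["B", "KiB", "MiB", "GiB", "TiB"]   # powers of 1024

drive = 8_269_642_989_568
print(human(drive, 1000, SI))   # 8.27 TB
print(human(drive, 1024, IEC))  # 7.52 TiB
```

Neither result is a round number, so for this drive the binary prefix does not simplify anything.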
So here’s the thing. I don’t necessarily disagree with you. And if this had been done from the start it would never have been a problem. But it wasn’t, and THAT is what caused the confusion. You put a lot of thought and research into your post and I can very much respect that. It’s something you feel strongly about and you took the time to write about your beef with this. IEC changed the nomenclature in the late 90s. But the REASON they changed it was to avoid the confusion caused by the drive manufacturers (I bet you can guess who was in the committee that proposed the change).
But I can tell you as a professional IT person we never really expect any drive (solid state or otherwise) to be any specific size. RAID, file system overhead, block size fragmentation, etc all take a cut. It’s basically just bistromathics (that’s a Hitchhiker’s reference) and the overall size of any storage system is only vaguely related to actual drive size.
So I just want to basically apologize for being so flippant before. It’s important enough to you that you took the time to write this. It’s just that I’m getting rather cynical as I get older and just expect the enshittification of everything digital to continue ad infinitum.
It more accurately describes how much space you have and how you can expect to see it shown in your software when you actually install it somewhere.
Pretty obvious that you didn’t read the article. If you find the time I’d like to encourage you to read it. I hope it clears up some misconceptions and makes it clearer why even in those 60+ years it was always intellectually dishonest to call 1024 bytes a kilobyte.
You should at least read “(Un)lucky coincidence”
Ok so I did read the article. For one, I can’t take an article seriously that is using memes. Thing the second: yes, drive manufacturers are at fault, because I’ve been in IT a very very long time and I remember when HD manufacturers actually changed. And the reason was greed (shrinkflation). I mean, why change, why inject confusion where there wasn’t any before? Find the simplest, least complex reason and that is likely true (Occam’s razor). Or follow the money, that usually works too.
It was never intellectually dishonest to call it a kilobyte, it was convenient and was close enough. It’s what I would have done and it was obviously accepted by lots of really smart people back then so it stuck. If there was ever any confusion it’s by people who created the confusion by creating the alternative (see above).
If you wanna be upset you should be upset at the gibi, kibi, tebi nonsense that we have to deal with now because of said confusion (see above). I can tell you for a fact that no one in my professional IT career of over 30 years has ever used any of the **bi words.
You can be upset if you want but it is never really a problem for folks like me.
Hopefully this helps…
Pushing 30 years myself and I confirm literally not a single person I’ve worked with has ever used **bi… terms. Also, I recall the switch where drive manufacturers went from 1024 to 1000. I recall the poor attempt from shill writers in tech saying it better represents the number of bits, as the format parameters applied to a drive change the space available for files. I recall exactly zero people buying that excuse.
Old IT represent!! 😂
I just think that kilobyte should have been 1000 (in binary, so 8 in decimal) bytes and so on. Just keep everything relating to the binary storage in binary. That couldn’t ever become confusing, right?
kilobit = 1000 bits. Kilobyte = 1000 bytes.
How is anything about that intellectually dishonest??
The only ones being dishonest are the drive manufacturers, like the person above said. They sell storage drives by advertising them in the byte quantity but they’re actually in the bit quantity.
Calling 1024 a kilo is intellectually dishonest. Your conversion is perfectly fine.
I genuinely don’t understand your disdain for using base 2 on something that calculates in base 2. Do you know how counting works in binary? Every byte is made up of 8 bits, and goes from 0000 0000 to 1111 1111, or 0-255. When converted to larger scales, 1024 bytes is a clean mathematical derivation in base 2, 1000 is a fractional number. Your pedantry seems to hinge on the use of the prefix, right? I think 1024 is a better representation of kilo- in base 2, because a kilo- can be directly translated up to exabytes and down to nybbles while “1000” in base 2 is extremely difficult. The point of metric is specifically to facilitate easy measuring, right? So measuring in the units that the computer uses makes perfect sense. It’s like me saying that a kilogram should be measured in base 60, because that was the original number system.
TLDR: the problem isn’t using base 2 multipliers. The problem is doing so then saying it’s a base 10 number
In 1998 when the problem was solved it wasn’t a big deal, but now the difference between a gigabyte and a gibibyte is large enough to cause problems
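To illustrate how the gap grows with each prefix, here is a short sketch. The function name `gap_percent` and the prefix list are just for this example.

```python
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi", "exa/exbi"]

def gap_percent(power: int) -> float:
    """Percentage by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100

# Print the relative difference for each pair of prefixes.
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:10s} {gap_percent(power):5.2f}%")
```

The difference starts at 2.4% for kilo vs. kibi and is already about 7.4% for giga vs. gibi and about 10% for tera vs. tebi.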
Using kilo- in base 2 for something that calculates in base 2 simply makes sense to me. However, like I said to OP, ultimately this debate amounts to rage bait for nerds. All I ask is that I’m not pedantically corrected if the conversation isn’t directly related to kibi- vs kilo-
Did you read the post? The problem I have is redefining the kilo because of a mathematical fluke.
You certainly can write a mass in base 60 and kg, there is nothing wrong with that. But calling 3600 grams a “kilogram” because you think it’s convenient that 3600 (60^2) is “close to” 1000 would be wrong, and that’s exactly what’s happening with binary and 1024.
If you find the time you should read the post and if not at least the section “(Un)lucky coincidence”.
I started reading it, but the disdain towards measuring in base 2 turned me off. Ultimately though this is all nerd rage bait. I’m annoyed that kilobytes aren’t measured as 1024 anymore, but it’s also not a big deal because we still have standardized units in base 2. Those alternative units are also fun to say, which immediately removes any annoyance as soon as I say gibibyte. All I ask is that I’m not pedantically corrected if the discussion is about something else involving amounts of data.
I do think there is a problem with marketing, because even the most know-nothing users are primed to know that a kilobyte is measured differently from a kilogram, so people feel a little screwed when their drive reads 931GiB instead of 1TB.
Yeah I’m with you, I read most of it but I just don’t know where the disdain comes from. At most scales of infrastructure anymore you can use them interchangeably because the difference is immaterial in practical applications.
Like if I am going to provision 2TB I don’t really care if it’s 2000 or 2048GB, I’ll be resizing it when it gets to 1800 either way, and if I needed to actually store 2TB I would create a 3TB volume, storage is cheap and my time calculating the difference is not.
Wait until you learn about how different fields use different precision levels of pi.
I’m not sure if I’m too stupid, but how so?
when you format a 256GB drive and find out that you don’t actually have 256GB
Most of the time you have at least 256GB. It’s just that 256GB = 238.4GiB, and Windows reports GiB but calls them GB. You wouldn’t have that problem in macOS, which counts GB properly, or GNOME, which counts GiB and calls them GiB.
(This is ignoring the few MB that it takes to format a drive, but that’s also space on the disk and you’re the one choosing to partition and format the drive. If you dumped a file straight into the drive you’d get that back, but it would be kind of inconvenient)
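The same number of bytes can show up three different ways depending on the convention. A minimal sketch, with helper names `as_gb` and `as_gib` invented for this example:

```python
def as_gb(n_bytes: int) -> float:
    """Decimal gigabytes: 1 GB = 1000**3 bytes."""
    return n_bytes / 1000**3

def as_gib(n_bytes: int) -> float:
    """Binary gibibytes: 1 GiB = 1024**3 bytes."""
    return n_bytes / 1024**3

drive = 256 * 1000**3  # a drive sold as "256 GB"

print(f"decimal, labelled GB:  {as_gb(drive):.1f} GB")    # 256.0 GB
print(f"binary, labelled GiB:  {as_gib(drive):.1f} GiB")  # 238.4 GiB
print(f"binary, labelled GB:   {as_gib(drive):.1f} GB")   # 238.4 GB, the confusing case
```

Only the last line is misleading: the value is computed with 1024 but labelled with the decimal symbol.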
Here’s the summary for the wikipedia article you mentioned in your comment:
Both the British imperial measurement system and United States customary systems of measurement derive from earlier English unit systems used prior to 1824 that were the result of a combination of the local Anglo-Saxon units inherited from Germanic tribes and Roman units. Having this shared heritage, the two systems are quite similar, but there are differences. The US customary system is based on English systems of the 18th century, while the imperial system was defined in 1824, almost a half-century after American independence.
So why don’t they just label drives in Terabit instead of terabyte. The number would be even bigger. Why don’t Europeans also use Fahrenheit, with the bigger numbers the temperature for sure would instantly feel warmer 🤣
Jokes aside. Even if HDD manufacturers benefit from “the bigger numbers”, using the 1000 conversion is objectively the only correct answer here, because there is nothing intrinsically base 2 about hard drives. You should give the blog post a read 😉
there is nothing intrinsically base 2 about hard drives
did you miss the part where those devices store binary data?
Binary prefixes (the ones with 1024 conversions) are used to simplify numbers that are exact powers of two - for example RAM and similar types of memory. Hard drive sizes are never exact powers of two. The fact that disks store bits doesn’t have anything to do with the size of the disk.
sure, but one of the intrinsic properties of binary data is that it is in binary sized chunks. you won’t find a hard drive that stores 1000 bits of data per chunk.
The “chunk” is often 32,768 bits these days and it never matches the actual size of the drive.
A 120 GB drive might actually be closer to 180 GB when it’s brand new (if it’s a good drive - cheap ones might be more like 130 GB)… and will get smaller as the drive wears out with normal use. I once had a HDD go from 500 GB down to about 50 GB before I stopped using it - it was a work computer and only used for email so 50 GB was when it actually started running out of space.
HDD / SSD sellers are often accused of being stingy - but the reality is they’re selling a bigger drive than what you’re told you’re getting.
Look up the exact number of bytes and then explain to me what the benefits are of using 1024 conversions instead of 1000 for a hard drive?
SSDs are.
Not even SSDs are. Do you have an SSD? You should look up the exact drive size in bytes, it’s very likely not an exact power of two.
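Checking whether a size is an exact power of two takes one line. Here is a sketch, where the 16 GiB memory module is a hypothetical example and the drive size is the one quoted earlier in this thread:

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0

ram = 16 * 1024**3          # hypothetical 16 GiB memory module
drive = 8_269_642_989_568   # drive size quoted earlier in the thread

print(is_power_of_two(ram))    # True
print(is_power_of_two(drive))  # False
```

You can plug in the byte count your own operating system reports for your SSD to see which case it falls into.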
Thanks for this article. Unfortunately, you used the word “prefix” when you really meant “unit symbol”. So, “kilo” and “mega” are prefixes, kB and MB are unit symbols. You repeatedly called the latter “prefixes”.
Thank you for the feedback. I know that only the “first” part is the prefix and I tried to be careful to not use it wrong. I just checked all 53 instances of “prefix” and I don’t see a wrong one, but to be fair there are situations that could be misunderstood easily like here:
Today the only correct conversions are to either use SI prefixes (like 1 MB = 1000² bytes) or binary prefixes (1 MiB = 1024² bytes).
But with prefix I only meant the “M” and “Mi” part and they are both prefixes.
I’ll try to clarify that later so the difference is clear to all readers. Thank you.
Ok, I understand what you are trying to do, but that is not how I read it at the time. Prefix to me in this context means e.g., “kilo” in “kilobyte”, and not the “k” in “kB”. I am not sure it is helpful to split the unit symbol up like that.
But the first part is called prefix even in the standard itself. I wanted to make that distinction because it’s not important what the base unit is. By speaking about prefixes instead of the unit as a whole I wanted to make it clear that you can (at least in theory) use any base unit. So everything I said about KiB and kB is also true for Kib and kb and even for kK (kilokelvin) and KiK (kibikelvin) 🤣
Kilo = 1000
Byte = Byte
Kilobyte = 1000 bytes
Kibibyte = 1024 bytes
Byte = 8 Bits?
Yes, that is what a byte is.
This is why I only use nibbles. At least it’s not spelled funny. But, unfortunately, it sounds like dogfood… Kibinibbles.
A kilobyte (kB) is 1000 bytes, that’s what the prefix kilo means. A kibibyte (KiB) is 1024 bytes (the “bi” in the prefix means base 2 or binary). People often confuse them, but they’re similar enough for smaller units, 10^3 ~ 2^10.
Oh and at first, kilobyte was used for both amounts, which is why kibibytes were introduced to fix the confusion, which perhaps was a bit late anyway.
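The two prefix families can be written down as a small lookup table. This is only a sketch: the names `MULTIPLIERS` and `to_bytes` are hypothetical, and only byte units up to tera/tebi are covered.

```python
# SI prefixes are powers of 1000, IEC binary prefixes are powers of 1024.
MULTIPLIERS = {
    "k": 1000**1, "M": 1000**2, "G": 1000**3, "T": 1000**4,
    "Ki": 1024**1, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
}

def to_bytes(value: int, unit: str) -> int:
    """Convert e.g. (1, 'MiB') to bytes. Only byte units ending in 'B' are handled."""
    if not unit.endswith("B"):
        raise ValueError(f"expected a byte unit like 'kB' or 'KiB', got {unit!r}")
    prefix = unit[:-1]
    return value * MULTIPLIERS[prefix] if prefix else value

print(to_bytes(1, "kB"))   # 1000
print(to_bytes(1, "KiB"))  # 1024
print(to_bytes(1, "MB"))   # 1000000
print(to_bytes(1, "MiB"))  # 1048576
```

Keeping the prefix separate from the base unit means the same table would work for bits or any other unit.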
True and that’s what the article is about. You should check out the interactive diagram in the “(Un)lucky coincidence” section.
I know it’s already been explained but here is a visualization of why.
1 2 4 8 16 32 64 128 256 512 1024
Did you read the blog post? If you don’t find the time you should at least read “(Un)lucky coincidence” to see why it’s not (and never was) a bright idea to call 1024 “a kilo”.
No we didn’t read your click bait and have no interest in doing so.
You asked for feedback, so here is my feedback:
The article is okay. I read most of it, but not all of it, because it seemed overly worded for the sentiment. It could have been condensed quite a bit. I would argue the focus should be more on the fact that there should be a standard in technical documentation, OS’s, specification sheets, etc. That’s the part that impacts most people, and the reason they should care. But that kind of gets lost in all the text.
Your replies here come off as pretty condescending. You should anticipate most people not reading the article before commenting. Just pay them no attention, or reiterate what you already stated in the article. You shouldn’t just say “did you read the article” and then “it’s in this section of the article”. Just like how people comment on youtube before watching the video, people will comment on the topic without reading the article.
Maybe they didn’t realize it was an article, maybe they knew it was an article and chose not to read it, or maybe they read it and disagree with some of the things you said. It’s okay for people to disagree with something you said, even if you sincerely believe something you said isn’t a matter of opinion (even though it probably is). You can agree to disagree and move on with your life.
Thank you for taking the time to read it and your feedback.
Your replies here come off as pretty condescending.
That was definitely never my intention but a lot of people here said something similar. I should probably work on my English (I’m not a native speaker) to phrase things more carefully.
You shouldn’t just say “did you read the article” and then “it’s in this section of the article”
It never crossed my mind this could be interpreted in a negative way. I tried to gauge if someone read it and still disagreed or if someone didn’t read it and disagrees, because those situations are two different things, at least for me. The hint with the sections was also meant as a pointer because I know that most people won’t read the entire thing but maybe have 5 minutes on their hands to read the relevant section.
I feel bad for you OP, I get this a lot and I’m totally gonna go there because I feel your pain and your article was fantastic! I read almost every word ;p
This phenomenon stems from an aversion to high-confidence people who make highly logical arguments from low self-confidence people who basically make themselves feel unworthy/inadequate when justly critiqued/busted. It makes sense for them to feel that way too, I empathize. It’s hard to overcome the vapid rewarding and inflation in school. They should feel cheated and insolent at this whole situation.
I’ll be honest in front of the internet; people (in majority mind you, say 70-80% of Americans, I’m American) do not read every word of the article with full attention because of ever-present and prevalent distractions, attention deficit, and motivation. They skip sentences or even paragraphs of things they are expecting they already know, apply bias before the conclusion, do not suspend their own perspective to understand yours for only a brief time, and come from a skeptical position no matter if they agreed with it or not!
In general, people also want to feel they have some valid perspective “truth” (as it’s all relative to them…) of their own to add and they want to be validated and acknowledged for it, as in school.
Guess what though, Corporations, Schools, Market Analysis, Novelists, PR people, Video Game Makers, Communications Managers and Small and Medium Business already know this! They even take a much more, ehh, progressive? approach about it, let’s say. That is, to really not let them speak/feedback, at all. Nearly all comment sections are gone from websites, comment boxes are gone from retail shops, customer service is a bot, technical writers make videos now to go over what they just wrote, Newspapers write for 4th graders, etc., etc.
Nothing you said is even remotely condescending and nothing you said was out of order. Don’t defend yourself in these situations because it’s just encouragement for them to do it again. Don’t take it personally yourself, that is just the state of things.
Improvise, Adapt, Re-engineer, Re-deploy, Overcome, repeat until done.
TL;DR?
“I am smart.”… “Most people have an attention span the length of a yo mama joke.”… “Ramble ramble yada yada yada.”
Imagine getting your weak ass argument destroyed by ChatGPT
lol, I didn’t know you could share chatGPT responses
The mistake is thinking that a 1000 byte file takes up a 1000 bytes on any storage medium. The mistake is thinking that it even matters if a kB means 1000 or 1024 bytes. It only matters for some programmers, and to those 1024 is the number that matters.
Disregarding reality in favor of pedantics is the real mistake.
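The point about on-disk size can be sketched like this. It assumes a simplified file system that hands out space in fixed 4096-byte blocks; real file systems differ (other block sizes, inline storage of tiny files, compression, metadata), so treat the numbers as an illustration only.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Space occupied when storage is handed out in whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size

print(allocated_bytes(1000))  # 4096
print(allocated_bytes(1024))  # 4096
print(allocated_bytes(4097))  # 8192
```

Under this assumption a 1000-byte file and a 1024-byte file occupy exactly the same space.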
I dunno, it makes up a few gigabytes of lost storage on a terabyte hard drive.
Because a kilo is 1000. That’s why you have kibi, mebi, gibi binary prefixes for those times where 1024 (powers of 2) matters.
I know, that’s what the post is about 😉
It’s a scam by HDD makers to sell less storage for more money.
Did you read the blog post? It’s not a scam. HDD vendors might profit from “bigger numbers” but using the units they do is objectively the only sensible and correct option. It’s like saying that the weather report is in Fahrenheit because in Celsius the numbers would be lower and feel somehow colder 🤣
If it would be about bigger numbers why don’t HDD manufacturers just use Terabit instead of terabyte? The “bigger number” argument is not a good one.
Because it’s much easier to mistake a number for a somewhat close number than one that is orders of magnitude different…
I’ll try to read the article later but the reality is that HDD manufacturers could help customers disambiguate but that would hurt their bottom line so they don’t.
Videogame companies literally did use “megabit” when the truth was “128KiB”, because it sounded better. Actual computer companies were still listing binary power numbers, because buyers had more to invest and care about accuracy.
You say “sensible”, but it’s lying for profit.
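The "megabit" vs. "128KiB" arithmetic is easy to verify. This sketch assumes the marketing "megabit" meant 1024 × 1024 bits, which is the reading that matches the 128KiB figure above:

```python
BITS_PER_BYTE = 8

megabit_binary = 2**20  # "1 megabit" counted as 1024 * 1024 bits (assumption)
size_bytes = megabit_binary // BITS_PER_BYTE
size_kib = size_bytes // 1024

print(size_bytes)  # 131072
print(size_kib)    # 128
```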
WD needed to sell a drive with more advertised space than real space.
Unlike many comments here, I enjoyed reading the article, especially the parts in the “I don’t want to use gibibyte!” chapter, where you explain that this (the pedantry) is important in technical and formal situations (such as documentation). Seeing some of the comments here, I think it would have helped to focus on this aspect a bit more.
I also liked the extra part explaining the reasoning for using the Nokia E60.
I don’t quite agree with the recommendation to use base 10 SI units where neither KiB nor kB would result in nice numbers. I don’t see why base 10 should have an influence on computers, and I think it makes more sense to stick to a single unit, such as KiB.
The reasons I have this opinion are probably to do with:
- My computer has shown me values using KiB, GiB, etc for years - I think it’s a KDE default - so I’m already used to the concept of KiB being different from kB.
- I dislike the concept of base 10 in general. I like the idea of using base 16 universally (because computers. Base 12 is also valid in a less computer-dominant society). I therefore also think 1024 is a silly number to use, and we should measure memory in multiples of 2^8 or 2^16…
p.s, I agree with other commenters that your comments starting with “Pretty obvious that you didn’t read the article.” or similar are probably not helping your case… I understand that some comments here have been quite frustrating though.
I dislike the concept of base 10 in general.
You’re not human.
He’s got 8 fingers on each hand. 🤣
❤️ Thank you for taking the time to read it and thank you for your feedback, I really appreciate it.
i mean, you can’t get to 1000 by doubling twos, so, no?
Reality doesn’t care what you prefer my dude
I was taught 1024 in my tech school. So I won’t ever refer to it as 1000 instead of 1024. Not that it seems even remotely relevant though.