CS Crash Course: #4 Representing Numbers and Letters with Binary

- Each binary digit, 1 or 0, is called a "bit." 

- You might have heard of 8-bit computers, or 8-bit graphics or audio. These were computers that did most of their operations in chunks of 8 bits. But 256 different values isn't a lot to work with, so things like 8-bit games were limited to 256 different colors for their graphics. 

- And 8 bits is such a common size in computing that it has a special word: a byte. 8 bits = 1 byte
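A quick sketch in Python of how many values one byte can hold:

```python
# Eight bits can represent 2**8 = 256 different values (0 through 255).
num_values = 2 ** 8
print(num_values)            # 256

# The bit pattern 11111111 is the largest 8-bit value:
max_byte = int("11111111", 2)
print(max_byte)              # 255
```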

- You've heard of kilobytes, megabytes, gigabytes, and so on. Just as one kilogram is a thousand grams, 1 kilobyte is a thousand bytes, or really 8,000 bits. Mega is a million bytes (MB), and giga is a billion bytes (GB). Today you might even have a hard drive with 1 terabyte (TB) of storage -- that's 8 trillion ones and zeros!

- But that's not always true. In binary, a kilobyte can also mean two to the power of 10 bytes, or 1024. 1000 bytes is correct too when talking about kilobytes -- neither is the only valid definition (the 1024-byte unit is formally called a kibibyte, KiB). 
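The two coexisting definitions are easy to compare in Python:

```python
# Two coexisting definitions of a "kilobyte":
decimal_kb = 10 ** 3   # 1000 bytes (the SI-style definition)
binary_kb = 2 ** 10    # 1024 bytes (formally a "kibibyte", KiB)

print(decimal_kb, binary_kb)    # 1000 1024
print(binary_kb - decimal_kb)   # 24 -- the gap grows at mega/giga scale
```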

- You've probably also heard the term 32-bit or 64-bit computers -- you're almost certainly using one right now. What this means is that they operate in chunks of 32 or 64 bits. That's a lot of bits!

- The largest number you can represent with 32 bits is just under 4.3 billion: 4,294,967,295, which is thirty-two 1s in binary. This is why our Instagram photos are so smooth and pretty -- they are composed of millions of colors, because computers today use 32-bit color graphics.
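You can verify both claims -- the value and the thirty-two 1s -- directly:

```python
# Largest unsigned 32-bit integer: all 32 bits set to 1.
max_unsigned_32 = 2 ** 32 - 1
print(max_unsigned_32)                   # 4294967295, just under 4.3 billion
print(bin(max_unsigned_32))              # 0b11111111111111111111111111111111
print(bin(max_unsigned_32).count("1"))   # 32
```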

Not everything is a positive number, so how can we represent negative numbers in binary?

 - Most computers use the first bit for the sign -- 1 for negative, 0 for positive -- and use the remaining 31 bits for the number itself. That gives us a range of roughly plus or minus two billion: from -2,147,483,648 to 2,147,483,647. 
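A minimal check of that signed range in Python (in practice most machines store negatives in two's complement, where the top bit still signals the sign):

```python
import struct

# Range of a signed 32-bit integer:
lo = -(2 ** 31)      # -2147483648
hi = 2 ** 31 - 1     #  2147483647
print(lo, hi)

# In two's complement, -1 has all 32 bits set; the top bit marks "negative".
bits = struct.pack(">i", -1)   # big-endian signed 32-bit
print(bits.hex())              # ffffffff
```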

- While this is a pretty big range of numbers, it's not enough for many tasks. There are 7 billion people on Earth, and the US national debt is almost 20 trillion dollars, after all. This is why 64-bit numbers are useful. The largest value a 64-bit number can represent is around 9.2 quintillion: 9,223,372,036,854,775,807!
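The same arithmetic, one power of two higher:

```python
import sys

# Largest signed 64-bit integer: 2**63 - 1.
max_signed_64 = 2 ** 63 - 1
print(max_signed_64)   # 9223372036854775807, about 9.2 quintillion

# On a typical 64-bit build of CPython, sys.maxsize is this same value:
print(sys.maxsize == max_signed_64)
```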

How about floating point numbers? 

- The most common way to represent a floating point number is the IEEE 754 standard. 

- 123.4 can be written as 0.1234 * 10 to the power of 3. 

- There are two important numbers here: the .1234 is called the significand, and 3 is the exponent. In a 32-bit floating point number, the first bit is used for the sign of the number -- positive or negative. The next 8 bits are used to store the exponent, and the remaining 23 bits are used to store the significand. 
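A sketch of that 1 + 8 + 23 layout, using Python's `struct` to look at the raw bits of a 32-bit float. Note that IEEE 754 works in base 2, not base 10 like the 0.1234 * 10^3 example: the stored exponent is biased by 127, and the significand stores only the fraction after an implicit leading 1.

```python
import struct

# Reinterpret a 32-bit float's bytes as a 32-bit unsigned integer.
value = 123.4
(bits,) = struct.unpack(">I", struct.pack(">f", value))

sign = bits >> 31                # 1 bit: 0 = positive
exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
fraction = bits & 0x7FFFFF       # 23 bits of significand (fraction part)

# 123.4 is between 2**6 = 64 and 2**7 = 128, so the real exponent is 6.
print(sign, exponent - 127, hex(fraction))
```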

Finally, letters!

- The most straightforward approach might be to simply number the letters of the alphabet: A being 1, B being 2, C being 3, and so on. 

- Back in the 1600s, in fact, Francis Bacon, the famous English writer, used five-bit sequences to encode all 26 letters of the English alphabet to send secret messages. Five bits can store 32 possible values -- enough for the 26 letters, but not enough for punctuation, digits, and both upper and lower case letters. 
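A toy sketch of a five-bit letter code in the spirit of Bacon's scheme (his actual cipher wrote the five symbols as sequences of 'a' and 'b'; the `encode` helper here is purely illustrative):

```python
# Number the letters A=0 .. Z=25 and write each as five binary digits.
def encode(word):
    return " ".join(format(ord(ch) - ord("A"), "05b") for ch in word)

print(encode("HI"))   # 00111 01000  (H=7, I=8)
print(2 ** 5)         # 32 possible values -- enough for 26 letters, no more
```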

- Enter ASCII, the American Standard Code for Information Interchange. Invented in 1963, ASCII was a 7-bit code, enough to store 128 different values. In ASCII, a is 97, A is 65, : is 58, and ) is 41. 
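Python's built-in `ord` and `chr` map between characters and these codes:

```python
# ord() gives a character's ASCII (and Unicode) code; chr() goes back.
print(ord("a"), ord("A"), ord(":"), ord(")"))   # 97 65 58 41
print(chr(97), chr(65))                         # a A
```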

- ASCII became widely used, and critically, allowed different computers built by different companies to exchange data. This ability to universally exchange information is called "interoperability." 

- However, it had a major limitation: it was only designed for English. 

- Fortunately, computers moved to 8-bit codes, and 128 values grew to 256; national variants of extended ASCII used the extra codes for accented letters and other local characters. 

- But the rise of computing in Asia broke this scheme, because languages like Japanese and Chinese have thousands of characters. The Japanese were so familiar with this encoding problem that they had a special name for it: "mojibake," which means "scrambled text." 
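Mojibake is easy to reproduce: write bytes in one encoding and read them back in another.

```python
# Bytes written as UTF-8 but misread as a Western 8-bit code page.
text = "文字化け"                  # "mojibake" in Japanese
raw = text.encode("utf-8")         # the actual bytes stored or transmitted
garbled = raw.decode("latin-1")    # wrong decoder: every byte "succeeds"
print(garbled)                     # scrambled text
print(raw.decode("utf-8"))         # correct decoder round-trips cleanly
```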

- And so Unicode was born -- one format to rule them all. Devised in 1992 to finally do away with all of the different international schemes, the most common version of Unicode uses 16 bits, with space for over a million codes -- enough for every single character from every language ever used: more than 120,000 of them across over 100 scripts, plus space for mathematical symbols and even graphical characters like emoji. 
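Every character gets a Unicode code point, and encodings like UTF-8 turn that code point into bytes -- more bytes for higher code points:

```python
# Code point and UTF-8 byte length for characters from several scripts.
for ch in ["A", "é", "あ", "😀"]:
    print(ch, hex(ord(ch)), len(ch.encode("utf-8")), "byte(s) in UTF-8")
# "A" needs 1 byte, "é" 2, "あ" 3, and the emoji 4.
```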

- Most importantly, under the hood it all comes down to long sequences of bits. Text messages, this YouTube video, every webpage on the internet, and even your computer's operating system, are nothing but long sequences of 1s and 0s. 


