& i never knew this…

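The ASCII codes for lowercase letters are all exactly one bit different from their uppercase equivalents (e.g. ‘a’ 01100001 vs ‘A’ 01000001; the bit that differs is 0x20). Likewise, on the older “bit-paired” keyboard layouts, the ASCII codes for the digits were exactly one bit different from the punctuation characters they shared a key with (e.g. ‘1’ 00110001 vs ‘!’ 00100001; here the bit is 0x10). This historically made keyboards easy to build: the Shift key merely toggled one particular bit. Modern US layouts keep only some of these pairings (1/!, 3/#, 4/$, 5/%); others, such as 2/@, no longer differ by a single bit.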
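The one-bit relationship is easy to check for yourself. Below is a minimal Python sketch; the names CASE_BIT, DIGIT_BIT, toggle_bit and bits are my own, made up for illustration, and the two masks are simply read off the examples above.

```python
# Illustrative names only; the masks come from comparing the bit patterns above.
CASE_BIT = 0x20   # the bit that separates 'a' (0x61) from 'A' (0x41)
DIGIT_BIT = 0x10  # the bit that separates '1' (0x31) from '!' (0x21)


def toggle_bit(ch: str, mask: int) -> str:
    """Flip the bits selected by mask in a single character's code."""
    return chr(ord(ch) ^ mask)


def bits(ch: str) -> str:
    """Return the character's code as an 8-digit binary string."""
    return format(ord(ch), "08b")


if __name__ == "__main__":
    for ch, mask in (("a", CASE_BIT), ("A", CASE_BIT), ("1", DIGIT_BIT)):
        shifted = toggle_bit(ch, mask)
        print(ch, bits(ch), "<->", shifted, bits(shifted))
```

Because XOR flips the bit in either direction, the same call turns ‘a’ into ‘A’ and ‘A’ back into ‘a’, which is roughly what the Shift key did in hardware.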

[update]
In GTK+ applications, you can insert any Unicode character by holding down Ctrl and Shift, and entering the hexadecimal Unicode code point. For instance, Ctrl-Shift-203D inserts an interrobang (U+203D), one of the Gaim developers’ favorite punctuation marks. (Newer GTK+ versions expect you to press U first: Ctrl-Shift-U, then the hex digits.)

[update-2]
What is UTF-8
UTF-8 is an alternate Unicode encoding that attempts to reduce the number of bytes needed to represent a string. Using UTF-8, every ASCII character takes exactly one byte, and non-ASCII characters take two to four. This works in a way similar to how I described Asian multibyte encodings earlier.
For code points from 0 to 127, UTF-8 takes exactly one byte, leaving the first bit zero. If the first bit is a 1, that’s a sign that the character takes more than one byte, and the number of 1s at the start of the first byte tells you how many bytes the character takes in total. Two 1s means the character takes two bytes, three 1s means it takes three bytes, and four 1s indicates four bytes. These 1s are then followed by a 0, and the remaining bits of that byte hold the top bits of the code point. Every following byte of the character starts with 10 and carries six more bits of the code point.
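To make the bit layout concrete, here is a hand-rolled sketch of the scheme in Python. It is for illustration only (the function names are mine, and it does not reject the surrogate range the way a strict encoder would); in real code you would just call str.encode("utf-8").

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point by hand, following the leading-1s scheme."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point < 0x80:          # 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:         # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0x3F)])
    if code_point < 0x10000:       # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0x3F),
                      0b10000000 | (code_point & 0x3F)])
    if code_point <= 0x10FFFF:     # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return bytes([0b11110000 | (code_point >> 18),
                      0b10000000 | ((code_point >> 12) & 0x3F),
                      0b10000000 | ((code_point >> 6) & 0x3F),
                      0b10000000 | (code_point & 0x3F)])
    raise ValueError("code point out of Unicode range")


def sequence_length(first_byte: int) -> int:
    """Count the leading 1 bits of a first byte to get the total byte count."""
    if first_byte & 0x80 == 0:
        return 1                   # plain ASCII, single byte
    ones, mask = 0, 0x80
    while first_byte & mask:
        ones += 1
        mask >>= 1
    if ones == 1 or ones > 4:
        # A lone leading 1 marks a continuation byte, not a first byte.
        raise ValueError("not a valid first byte")
    return ones


if __name__ == "__main__":
    encoded = utf8_encode(0x203D)  # the interrobang
    print(" ".join(format(b, "08b") for b in encoded))
    print(sequence_length(encoded[0]))
```

For the interrobang, the first byte starts with three 1s, so the character occupies three bytes, and the result matches what Python’s built-in encoder produces.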
