In the beginning was ASCII, and ASCII was limited--128 characters weren't enough. So IBM, Microsoft, and other vendors extended it to 256--still not enough. True, you could now access "foreign-language" and other special characters by using "code pages" with different fonts in Microsoft Word. If you've clicked Insert > Symbol and then changed the font on the drop-down list in the Symbol dialog, you've seen how this works: the same character "position" (or number) often displays a different character in different fonts.
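If you'd like to see the "same position, different character" problem outside of Word, here is a minimal Python sketch. Python is only an illustration here (the article itself is about Word), and the three Windows code pages chosen (cp1252, cp1251, cp1253) are just convenient examples:

```python
# One byte value, read under three different Windows code pages.
# The code page names are Python's standard codec names.
raw = bytes([0xE9])  # character "position" 233


def decode_with(code_page: str) -> str:
    """Return the character that position 233 stands for in one code page."""
    return raw.decode(code_page)


for code_page in ("cp1252", "cp1251", "cp1253"):
    ch = decode_with(code_page)
    # Show the character and the Unicode number it corresponds to.
    print(f"{code_page}: {ch!r} is U+{ord(ch):04X}")
```

The same position comes out as a Western European, a Cyrillic, and a Greek letter, each with its own Unicode number.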
But what if you want to use special characters--*any* special characters--in the *same* font as your regular text? That's what Unicode is all about. As the Unicode Web site explains, "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language." How many characters? Potentially more than a million. So whether you're working with Greek or Gothic, Klingon or Korean, Unicode is for you.
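To make "a unique number for every character" concrete, here is a small Python sketch (again, just an illustration; nothing in Word depends on it). It looks up the number and official name of a few characters using Python's standard unicodedata module, and counts how many numbers Unicode has room for. The helper name describe is made up for this example:

```python
import sys
import unicodedata


def describe(ch: str) -> str:
    """Return the Unicode number (in hex) and official name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# A Latin, a Greek, a Gothic, and a Korean character, written as escapes
# so the example does not depend on how this file is saved.
for ch in ("A", "\u03a9", "\U00010330", "\ud55c"):
    print(describe(ch))

# Unicode numbers run from 0 through 10FFFF (hex).
TOTAL_CODE_POINTS = sys.maxunicode + 1
print(TOTAL_CODE_POINTS)
```

The total is the number of available positions, not the number of characters actually assigned so far.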
Unicode also includes special typographical characters, such as hair spaces, thin spaces, and zero-width spaces, which we made by hand in last week's newsletter. But now you don't have to make them; using Unicode, you can get the real thing.
Of course, there is a catch. Using Unicode requires three things:
1. An operating system that supports it.
2. A program (application) that supports it.
3. A Unicode font that includes the characters you need (not all of them will, although in theory they should).
There's a list of products that support Unicode here:
http://www.unicode.org/unicode/onlinedat/products.html
But I'll make it easy for you:
1. Common operating systems include Microsoft Windows 2000, NT, and XP, and Macintosh OS 9.2, X, 10.1, and X Server.
2. Versions of Microsoft Word include 97, 2000, and 2002 for Windows, and 98, 2001, and X for Macintosh. However, the Mac versions (and operating systems) may require a "Language Kit," which you can learn more about here:
http://www.hclrss.demon.co.uk/unicode/utilities_fonts.html#apple
3. Unicode fonts are rapidly becoming available. There's a great list here, and many of the fonts are free:
http://www.hclrss.demon.co.uk/unicode/fonts.html#general
Once you've installed a Unicode font, you can insert its special characters with the good old Insert > Symbol menu (be sure to select the Unicode font in the dropdown Font list).
You can also insert a character with the keyboard (in Word 2002 and higher) if you know its Unicode number. To do so, be sure a Unicode font is selected (Format > Font); then type the number into your document and press ALT + X. For example, let's say we need a zero-width space in Word 2002. The Unicode number for such a space is 200B. So all we have to do is type 200B into our document and press ALT + X. Presto!
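The number you type is hexadecimal (base 16), which is why it can contain letters like B. As a rough sketch of the same conversion outside Word, here it is in Python; the function names from_hex and to_hex are invented for this illustration:

```python
def from_hex(code: str) -> str:
    """Turn a hexadecimal Unicode number such as '200B' into the character."""
    return chr(int(code, 16))


def to_hex(ch: str) -> str:
    """Turn a single character back into its four-digit (or longer) hex number."""
    return f"{ord(ch):04X}"


zero_width_space = from_hex("200B")
print(repr(zero_width_space))    # shown as an escape, since the character is invisible
print(to_hex(zero_width_space))  # back to the number we started with
```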
You can learn more about using Unicode characters in Word here:
http://www.hclrss.demon.co.uk/unicode/utilities_editors.html#word97
For additional information on Word 2000 and 2002, scroll down past the Word 97 information (which is also relevant for the later versions).
If you need to look up the number of a Unicode character, you can do so here:
http://www.hclrss.demon.co.uk/unicode/search.html
If you just want to insert typographic spaces, here are the Unicode numbers you need:
Nonbreaking space: 00A0
En space: 2002
Em space: 2003
Three-per-em space: 2004
Four-per-em space: 2005
Six-per-em space: 2006
Figure space: 2007
Punctuation space: 2008
Thin space: 2009
Hair space: 200A
Zero-width space: 200B
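If you want to double-check the numbers above, one possible approach is to ask Python's standard unicodedata module for the official name of each one. This is only a sketch; the SPACES table simply repeats the list above, and official_name is a made-up helper:

```python
import unicodedata

# The labels and hex numbers from the list above.
SPACES = {
    "Nonbreaking space": "00A0",
    "En space": "2002",
    "Em space": "2003",
    "Three-per-em space": "2004",
    "Four-per-em space": "2005",
    "Six-per-em space": "2006",
    "Figure space": "2007",
    "Punctuation space": "2008",
    "Thin space": "2009",
    "Hair space": "200A",
    "Zero-width space": "200B",
}


def official_name(hex_code: str) -> str:
    """Return the official Unicode name for a hexadecimal number."""
    return unicodedata.name(chr(int(hex_code, 16)))


for label, hex_code in SPACES.items():
    print(f"{hex_code}  {label:<20} {official_name(hex_code)}")
```

The official names differ slightly from the everyday ones; for instance, the standard calls the nonbreaking space a "no-break space."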
And you'll find additional information on spaces here:
http://www.microsoft.com/typography/developers/fdsspec/spaces.htm
With Unicode, the world (or at least its scripts) is your oyster.
_________________________________________
RESOURCES
For a dazzling array of Unicode information, see Alan Wood's Unicode Resources site:
http://www.hclrss.demon.co.uk/unicode/index.html
Check out the official Unicode site here:
http://www.unicode.org
For online samples of interesting characters, see this page:
http://home.att.net/~jameskass/scriptlinks.htm
For a free Word add-in program to help you insert Unicode characters, go here:
http://hem.fyristorg.com/dahloe/uniqoder/
For information on artificial scripts, go here:
http://www.evertype.com/standards/csur/index.html
If you're a Tolkien fan, you might be interested in the Tengwar encoding proposal:
http://www.evertype.com/standards/csur/tengwar.html
and in Tolkien fonts (but not necessarily Unicode):
http://www.geocities.com/TimesSquare/4948/
http://babel.uoregon.edu/yamada/fonts/tolkien.html
and in the Resources for Tolkien Linguistics site:
http://www.elvish.org/resources.html
And if you're actually interested in Klingon, here's the scoop:
http://www.evertype.com/standards/csur/klingon.html