Information about Unicode Typefaces

Unicode typefaces (also known as UCS fonts or Unicode fonts) are typefaces that contain glyphs for a wide range of characters (letters, digits, symbols, ideograms, logograms, and so on) drawn from many of the world's languages and scripts, all mapped to the standard Universal Character Set (UCS). Most conventional computer fonts are specific to a particular language or legacy character set and contain only a small subset of the UCS characters. Unicode fonts instead attempt to include many thousands of glyphs, so that a single typeface can be used across multilingual documents.

The Unicode standard does not specify the typeface (a collection of graphical shapes called glyphs) itself. Instead, it defines each abstract character as a specific number (known as a code point) and defines how the shape must change depending on the context in which the character is used (e.g., combining characters, precomposed characters and letter-diacritic combinations). The choice of font, which governs how the abstract UCS characters are converted into bitmap or vector output that can be viewed on a screen or printed, is left to the user. If the chosen font does not contain a glyph for a code point used in the document, a question mark ("?"), a box, or some other substitute character is typically displayed.
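The distinction between code points and glyphs can be illustrated with a short Python sketch. It uses only the standard `unicodedata` module; the helper names `code_points` and `substitute_missing`, and the idea of modelling a font as a plain set of supported code points, are assumptions made for this illustration and not part of any font API.

```python
import unicodedata

# Two encodings of the same visible letter "é".
precomposed = "\u00e9"    # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # base letter followed by COMBINING ACUTE ACCENT


def code_points(text):
    """Return the code points of text in the usual U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def substitute_missing(text, supported, replacement="\ufffd"):
    """Replace every character that a (hypothetical) font cannot draw.

    `supported` is a set of integer code points standing in for the font.
    """
    return "".join(ch if ord(ch) in supported else replacement for ch in text)


print(code_points(precomposed))   # ['U+00E9']
print(code_points(decomposed))    # ['U+0065', 'U+0301']

# Normalization converts between the two forms without changing the text.
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True

# A font covering only a-z cannot show the accented letter.
ascii_only = set(range(ord("a"), ord("z") + 1))
print(code_points(substitute_missing("caf\u00e9", ascii_only)))
# ['U+0063', 'U+0061', 'U+0066', 'U+FFFD']
```

Real rendering engines usually try other installed fonts before showing a substitute character, so the replacement step here is a simplification.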

As of July 2006, no single "Unicode font" includes all the characters defined in the current revision of the ISO 10646 (Unicode) standard. Many fonts are continually updated to incorporate characters that were previously omitted or that were added in a newer version of the standard. Fonts may also be updated to correct errors in past versions.

The UCS has over 1.1 million code points, but only the first 65,536 (Plane 0, the Basic Multilingual Plane or BMP) had entered common use before 2000. (See the Mapping of Unicode characters article for more information on the other planes, including Plane 1: SMP, Plane 2: SIP, Plane 14: SSP, and Planes 15 and 16: reserved for private use, PUA.)
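The plane arithmetic can be sketched in a few lines of Python. Each plane holds 65,536 code points, so the plane number is simply the code point shifted right by 16 bits. The names `plane_of` and `describe` are chosen for this illustration, and the label dictionary covers only the planes named in the text.

```python
PLANE_SIZE = 0x10000                            # 65,536 code points per plane
PLANE_COUNT = 17                                # planes 0 through 16
MAX_CODE_POINT = PLANE_SIZE * PLANE_COUNT - 1   # 0x10FFFF

# Labels for the planes mentioned in the text only.
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 14: "SSP", 15: "PUA", 16: "PUA"}


def plane_of(code_point):
    """Return the plane number (0-16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a valid code point: {code_point:#x}")
    return code_point >> 16


def describe(ch):
    """Return a short description of the plane a character belongs to."""
    plane = plane_of(ord(ch))
    label = PLANE_NAMES.get(plane, "not named here")
    return f"U+{ord(ch):04X} is in plane {plane} ({label})"


print(PLANE_SIZE * PLANE_COUNT)   # 1114112, the "over 1.1 million" figure
print(describe("A"))              # U+0041 is in plane 0 (BMP)
print(describe("\U0001D11E"))     # U+1D11E is in plane 1 (SMP)
```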

The first Unicode font (with a very large character set, supporting many Unicode blocks) was Lucida Sans Unicode, developed by Charles Bigelow and Kris Holmes in March 1993 (it shipped with Windows NT 3.1). The second was the Unihan font, developed by Ross Paterson in 1993. The third was the Everson Mono Unicode font, developed by Michael Everson and released in 1995.

Issues

There are typographical ambiguities in Unicode: some of the unified Chinese characters are drawn differently in different regions. For example, the code point U+9AA8 (骨) is typographically different in simplified Chinese and traditional Chinese. This has implications for the idea that a single typeface can satisfy the needs of all locales.[3]

Application of Unicode typefaces

Despite these issues, Unicode is now the base character set for many new standards and protocols. It is built into the architecture of operating systems (Microsoft Windows, Apple Mac OS X, and many versions of Unix), programming languages (Ada, Perl, Python, Java, Common LISP, APL), libraries (IBM International Components for Unicode (ICU), along with the Pango, Graphite, Scribe, Uniscribe, and ATSUI rendering engines), font formats (TrueType and OpenType), and so on. Many other standards are also being upgraded for Unicode compliance.

Utility software

Utility software can be used to see exactly which characters are included in a font file.
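As a rough idea of what such a utility does internally, the following Python sketch reads the character-to-glyph mapping (`cmap`) table of a TrueType or OpenType file using only the standard `struct` module. It is a minimal sketch under several assumptions: the function names are invented for this example, only `cmap` subtable formats 4 and 12 are handled, font collections are not supported, and a format 12 group is assumed never to map a character to glyph 0.

```python
import struct


def read_cmap_subtables(font_bytes):
    """Yield (platform_id, encoding_id, subtable_bytes) for one sfnt font."""
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    for i in range(num_tables):
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i)
        if tag == b"cmap":
            cmap = font_bytes[offset:offset + length]
            break
    else:
        raise ValueError("no cmap table found")
    num_subtables = struct.unpack_from(">H", cmap, 2)[0]
    for i in range(num_subtables):
        platform_id, encoding_id, sub_offset = struct.unpack_from(
            ">HHI", cmap, 4 + 8 * i)
        yield platform_id, encoding_id, cmap[sub_offset:]


def covered_code_points(subtable):
    """Return the set of code points mapped by a format 4 or 12 subtable."""
    fmt = struct.unpack_from(">H", subtable, 0)[0]
    covered = set()
    if fmt == 12:
        num_groups = struct.unpack_from(">I", subtable, 12)[0]
        for g in range(num_groups):
            start, end, _gid = struct.unpack_from(">III", subtable, 16 + 12 * g)
            covered.update(range(start, end + 1))
    elif fmt == 4:
        seg_count = struct.unpack_from(">H", subtable, 6)[0] // 2
        ends_at = 14
        starts_at = ends_at + 2 * seg_count + 2      # skip the reserved pad
        deltas_at = starts_at + 2 * seg_count
        offsets_at = deltas_at + 2 * seg_count
        for s in range(seg_count):
            end = struct.unpack_from(">H", subtable, ends_at + 2 * s)[0]
            start = struct.unpack_from(">H", subtable, starts_at + 2 * s)[0]
            delta = struct.unpack_from(">H", subtable, deltas_at + 2 * s)[0]
            r_off = struct.unpack_from(">H", subtable, offsets_at + 2 * s)[0]
            for cp in range(start, end + 1):
                if r_off == 0:
                    gid = (cp + delta) & 0xFFFF
                else:
                    pos = offsets_at + 2 * s + r_off + 2 * (cp - start)
                    gid = struct.unpack_from(">H", subtable, pos)[0]
                    if gid:
                        gid = (gid + delta) & 0xFFFF
                if gid:                               # glyph 0 means "missing"
                    covered.add(cp)
    else:
        raise ValueError(f"unsupported cmap subtable format {fmt}")
    return covered


def font_coverage(font_bytes):
    """Union of the code points from every format 4 or 12 subtable."""
    covered = set()
    for _platform, _encoding, subtable in read_cmap_subtables(font_bytes):
        if struct.unpack_from(">H", subtable, 0)[0] in (4, 12):
            covered |= covered_code_points(subtable)
    return covered
```

With a font file on disk (the file name here is hypothetical), `font_coverage(open("SomeFont.ttf", "rb").read())` would return the set of code points the font maps to a glyph. Dedicated font tools handle many more cases than this sketch does.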

List of Unicode fonts

Of the many Unicode fonts available, the few listed below are the ones most commonly used around the world on mainstream computing platforms. More Unicode fonts can be found in the "Unicode fonts" section of the List of typefaces article.

List of Unicode Fonts
Font | Char(s) | Glyphs | Kernpairs[1] | Version | Font family | Font style | Font type | Serif style | License | Notes
Arial | 1,419 | 1,674 | 909 | 3.00 | Arial | Regular | OTF+TTO[2] | Normal Sans | Proprietary | Included with Microsoft Windows.
Arial Unicode MS | 38,917 | 50,377 | 0 | 1.00 | Arial | Regular | OTF+TTO[2] | Normal Sans | Proprietary | Included with Microsoft Office.
Bitstream Cyberbit | 32,910 | 29,934 | 935 | 2.0 beta | Bitstream Cyberbit | Roman | TTF | Cove | Freeware | For non-commercial use only.
Cardo | 2,879 | 2,882 | 216 | 0.098 (2004) | Cardo | Regular | TTF | Cove | Freeware | For non-commercial and non-profit uses only.
Caslon Roman | 3,684 | 3,686 | 0 | 001.000 16-12-2001 | Caslon | Roman | TTF | | BSD-like license |
Code2000 | 51,239 | 61,864 | 115 | 1.16 | Code2000 | Regular | TTF | Any | Shareware | Register after "reasonable" period (author's words).
Charis SIL | 1,958 | 3,084 | 0 | 4.002 | Charis SIL | Regular | TTF | Any | OFL |
Chryſanşi Unicode (Chrysanthi Unicode) | 4,818 | 4,383 | 0 | 3.1 | Chrysanthi | Regular | TTF | Cove | Freeware |
ClearlyU | - | 9,538 | 0 | 1.9 | - | - | - | - | Freeware |
DejaVu Sans | 5,223 | 5,427 | 2,558 | 2.18 | DejaVu | Book | OTF+TTO[2] | Normal Sans | Bitstream Vera license and public domain for additions |
Doulos SIL | 1,958 | 3,083 | 0 | 4.014 | Doulos SIL | Regular | TTF | Any | OFL |
Everson Mono Unicode | 4,893 | 4,899 | 0 | 3.2b4 | Everson Mono | Regular | TTF | Any | Shareware | Monospaced width.
FreeSerif | 3,914 | 5,257 | 0 | 1.52 | FreeSerif | Medium | TTF | Cove | GPL | Sans serif (FreeSans) and monospaced (FreeMono) variants.
Gentium Regular | 1,469 | 1,699 | 2,857 | 1.0.2 (2005) | Gentium | Regular | TTF | Any | OFL |
GNU Unifont | 33,580 | 33,583 | 0 | 001.000 | Unifont | Medium | Bitmap | Any | GPL |
Junicode | 2,235 | 2,256 | 0 | 0.6.12 | Junicode | Regular | TTF | Any | GPL |
Linux Libertine | 1,982 | 1,985 | 0 | 2.2.0 | Linux Libertine | Regular | OTF+TTO[2] | Any | GPL, OFL |
Lucida Grande | 2,245 | 2,826 | 0 | 5.0d8e1 (Revision 1.002) | Lucida Grande | Regular | OTF | Normal Sans | Proprietary | Included with Mac OS X. Any proportion.
Lucida Sans Unicode | 1,765 | 1,776 | 0 | 2.00 | Lucida Sans | Regular | OTF+TTO[2] | Normal Sans | Proprietary | Included with Microsoft Windows.
Microsoft Sans Serif | 2,301 | 2,257 | 0 | 1.41 | Microsoft Sans Serif | Regular | OTF+TTO[2] | Normal Sans | Proprietary | Included with Microsoft Windows.
New Gulim | 46,567 | 49,284 | 0 | 3.10 | New Gulim | Regular | TTF | Obtuse Cove | Proprietary | Included with Microsoft Office 2000. Any proportion.
Tahoma | 1,912 | 2,034 | 674 | 3.14 | Tahoma | Regular | OTF+TTO[2] | Normal Sans | Proprietary | Included with Microsoft Windows.
Times New Roman | 1,419 | 1,674 | 867 | 3.00 | Times New Roman | Regular | OTF+TTO[2] | Cove | Proprietary | Included with Microsoft Windows.
TITUS Cyberbit Basic | 9,341 | 10,044 | 0 | 3.0 (2000) (Revision 4.00) | TITUS Cyberbit | Regular | TTF | Cove | Freeware |
Y.OzFontN | 21,360 | 59,678 | 0 | 9.41 | Y.OzFontN | Regular | TTF | Any | Freeware | Sans-serif (for Japanese) and Monospace (for Latin).
Notes
[1] Kernpairs: OpenType fonts sometimes contain no pair-by-pair kerning table but instead a class-based kerning table, in which groups of similar characters are treated as one kern group. For example, V and W have nearly the same left and right geometry and can share a group. A value of "0" therefore does not mean that kerning is unsupported.
[2] OTF+TTO: OpenType font with TrueType outlines.
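The difference between a pair-by-pair kerning table and class-based kerning can be sketched in Python. All class names, class memberships, and adjustment values below are invented for illustration and are not taken from any real font. The point is that two class-level entries stand in for nine individual pairs.

```python
from itertools import product

# Hypothetical kern classes: letters with similar side geometry share a class.
LEFT_CLASSES = {"V": "diagonal", "W": "diagonal", "T": "tee"}
RIGHT_CLASSES = {"a": "round", "o": "round", "e": "round"}

# One adjustment (in font units, made up) per pair of classes.
CLASS_KERNING = {("diagonal", "round"): -60, ("tee", "round"): -80}


def kerning(left, right):
    """Return the kerning adjustment for a pair of letters, or 0 if none."""
    key = (LEFT_CLASSES.get(left), RIGHT_CLASSES.get(right))
    return CLASS_KERNING.get(key, 0)


def expanded_pairs():
    """List every individual letter pair that the class table covers."""
    return {
        (left, right): kerning(left, right)
        for left, right in product(LEFT_CLASSES, RIGHT_CLASSES)
        if kerning(left, right) != 0
    }


print(kerning("V", "a"), kerning("W", "o"))       # -60 -60
print(len(CLASS_KERNING), len(expanded_pairs()))  # 2 9
```

A tool that counts only entries in a pair-by-pair table would report zero kern pairs for a font built this way, even though nine letter pairs are adjusted.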

Comparison of fonts

The tables below list, for each Unicode block (or range), the number of characters included in the font versions listed above.

0000-077F

 N  = A number: that many characters in the range are included in the font.
= Some or most of the characters in that range are present in the font.
 X  = No characters in that range or Unicode block are included in the font.
  -  = Data not currently available.
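Per-block counts of this kind can be derived mechanically once the set of code points covered by a font is known. The Python sketch below assumes that set is already available as integers; the `block_counts` name and the sample coverage are assumptions for this example, and only four of the blocks from the first table are included.

```python
# (block name, first code point, last code point), taken from the first table.
BLOCKS = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Greek and Coptic", 0x0370, 0x03FF),
    ("Cyrillic", 0x0400, 0x04FF),
]


def block_counts(covered):
    """Map each block name to its character count, or "X" if none are covered."""
    rows = {}
    for name, first, last in BLOCKS:
        count = sum(1 for cp in covered if first <= cp <= last)
        rows[name] = count if count else "X"
    return rows


# Hypothetical coverage: printable ASCII plus the printable Latin-1 characters.
sample = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))
for name, value in block_counts(sample).items():
    print(f"{name}: {value}")
# Basic Latin: 95
# Latin-1 Supplement: 96
# Greek and Coptic: X
# Cyrillic: X
```

The counts of 95 and 96 correspond to the printable characters in those two blocks, which is why these values recur in the first two rows of the table.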


Unicode Fonts
Font 

Range 
Basic Latin (0000–007F)95951289596959595959595959595128979698959512895959595
Latin-1 Supplement (0080–00FF)9696128969696969696969696969612896969696969696969696
Latin Extended-A (0100–017F)128128128128128128128128128128128128128128128128128128128128128128128128128
Latin Extended-B (0180–024F)28148208521782081941881782081941831731781561791941831191797292818328
IPA Extensions (0250–02AF)189969694969694949696969694899494968994X219655
Spacing Modifier Letters (02B0–02FF)9578080638080636263808029565763488057910998016
Combining Diacritical Marks (0300–036F)57211211282112104828292104107728272106621066882X82510632
Greek and Coptic (0370–03FF)731051441241101271476110127141189582105801061069111273737312876
Cyrillic (0400–04FF)118226256223825520923824425520924624780230X1422441532469412211824766
Cyrillic Supplement (0500–052F)XXXX162016X16201616161XXX16X16XXX16X
Armenian (0530–058F)X85XX8586X858686X86XX85XXXXXXXX86X
Hebrew (0590–05FF)528247868386X608254X8244X82XX825152X525283X
Arabic (0600–06FF)2081946510X185X69201111X363X62XXXX208X206208185X
Syriac (0700–074F)XXXXX50XXXXXXXXXXXXXXXXX76X
Arabic Supplement (0750–077F)XXXXXXXXXXXXXXXXXXXXXXXXX


0780-139F


Thaana (0780–07BF)XXXXX50XX49XXX49XXXXXXXXXX50X
N'ko (07C0–07FF)XXXXXXXXX54XXXXXXXXXXXXXXX
Devanagari (0900–097F)X104XXX110X104103XXX93X104XXXXXXXX106X
Bengali (0980–09FF)X89XXX91X89XXXX91XXXXXXXXXXXX
Gurmukhi (0A00–0A7F)X75XXX77XXXXXX73XXXXXXXXXXXX
Gujarati (0A80–0AFF)X78XXX83X78XXXXXXXXXXXXXXXXX
Oriya (0B00–0B7F)X79XXX81XXXXXXXXXXXXXXXXXXX
Tamil (0B80–0BFF)X61XXX71XXXXXX49XXXXXXXXXXXX
Telugu (0C00–0C7F)X80XXX80XXXXXX42XXXXXXXXXXXX
Kannada (0C80–0CFF)X80XXX86XXXXXXXXXXXXXXXXXXX
Malayalam (0D00–0D7F)X78XXX78XXXXXX79XXXXXXXXXXXX
Sinhala (0D80–0DFF)XXXXXXXXXXXXXXXXXXXXXXXXX
Thai (0E00–0E7F)X8791X8687XX871XX87X87XX87X87X87X87X
Lao (0E80–0EFF)X65XXX65XX6565XXXX65XXXXXXXXXX
Tibetan (0F00–0FFF)X168XXXXX16855XXXXX34XXXXXXXXXX
Burmese (Myanmar) (1000–109F)XXXXX78XXXXXXXXXXXXXXXXXXX
Georgian (10A0–10FF)X78X1X81XX7883X80XX401XXXXXXX83X
Hangul Jamo (1100–11FF)X240XXX240XX240XXXXX67XXXXX250XXXX
Ethiopic (Ge'ez) (1200–137F)XXXXX356XX345XXX346X348XXXXXXXX364X
Ethiopic Supplement (1380–139F)XXXXX26XXXXXXXXXXXXXXXXXXX


13A0-1DBF


Cherokee (13A0–13FF)XXXX8585XX85XX85XXXXXXXXXXXXX
Unified Canadian Aboriginal Syllabics (1400–167F)XXXXX630XX630404X630XXXXXXXXXXXX
Ogham (1680–169F)XXXX2929XX29XX29XX29XXXXXXXX32X
Runic (16A0–16FF)XXX818181X8381XX81XX8181XXXXXXX81X
Tagalog (Baybayin) (1700–171F)XXXXXXXXXXXXXXXXXXXXXXXXX
Hanunoo (1720–173F)XXXXX2XXXXXXXXXXXXXXXXXXX
Buhid (1740–175F)XXXXX20XXXXXXXXXXXXXXXXXXX
Tagbanwa (1760–177F)XXXXXXXXXXXXXXXXXXXXXXXXX
Khmer (1780–17FF)XXXXX114XXXXXXXXXXXXXXXXXXX
Mongolian (1800–18AF)XXXXX155XXXXXXXXXXXXXXXXXXX
Limbu (1900–194F)XXXXX66XXXXXXXXXXXXXXXXXXX
Tai Le (1950–197F)XXXXXXXXXXXXXXXXXXXXXXXXX
Tai Lue (1980–19DF)XXXXXXXXXXXXXXXXXXXXXXXXX
Khmer Symbols (19E0–19FF)XXXXX32XXXXXXXXXXXXXXXXXXX
Buginese (1A00–1A1F)XXXXX30XXXXXXXXXXXXXXXXXXX
Phonetic Extensions (1D00–1D7F)XXX17X109128XX10512810714XX108X108XXXXX108X
Phonetic Extensions Supplement (1D80–1DBF)XXXXXX64XX38641075XXXXXXXXXXXX


1DC0-257F


Combining Diacritical Marks Supplement (1DC0–1DFF)XXX2X131XX61XXXXXXXXXXXXXX
Latin Extended Additional (1E00–1EFF)96246968824624624624624624624624624624624624624624682468246962478
Greek Extended (1F00–1FFF)X233X233233233X233233233X233233233233232232233X233X233X2364
General Punctuation (2000–206F)38631126569105735677104739750397772568767272527386891
Superscripts and Subscripts (2070–209F)12848928293431283434292528282936292816112929
Currency Symbols (20A0–20CF)61348616222215162222181314141318121651661218
Combining Diacritical Marks for Symbols (20D0–20FF)X1848X2027XX204X271X201X2XXXXXX27
Letterlike Symbols (2100–214F)657801359782595975274461573353257610661375
Number Forms (2150–218F)6486444950494949504949454484950494426462849
Arrows (2190–21FF)7911121410011219921001121911273X91X322091X13X721112
Mathematical Operators (2200–22FF)1724225624242256172422422451725621922421345182421443141780256
Miscellaneous Technical (2300–23FF)412325636572282415464220720X123211410X5X48209
Control Pictures (2400–243F)X3764X3939XX392X391X37XX137XXXXX4
Optical Character Recognition (2440–245F)X1132XX11XX10XX11XX11XXXXXXXXX11
Enclosed Alphanumerics (2460–24FF)X139160X73160XX13910X15910X139160731XX82XX112160
Box Drawing (2500–257F)401281281128128X128XXX128106X128XXX128X97X40112128


2580-2DDF


Block Elements (2580–259F)82232X2232X22X32X3219X22XXX21X8X81032
Geometric Shapes (25A0–25FF)15809688896295889629627X801241979X346155396
Miscellaneous Symbols (2600–26FF)1110625631108163X128X176X12587X106412X1412411133146
Dingbats (2700–27BF)X160X62174217411742160160X1611013XXXXXX174
Miscellaneous Mathematical Symbols-A (27C0–27EF)XXX9X352XX72XXXXXXXXXXXXX28
Supplemental Arrows-A (27F0–27FF)X0X2X16XXX16XXXXXXXXXXXXXX16
Braille Patterns (2800–28FF)X0XXX256XX256256XXXX256XXXXXXXXX256
Supplemental Arrows-B (2900–297F)X0X6X128XXX6X111XXXXXXXXXXXX128
Miscellaneous Mathematical Symbols-B (2980–29FF)X0X2X128XXX13X62XXXXXXXXXXXX128
Supplemental Mathematical Operators (2A00–2AFF)X0XXX256XXX72X21XXXXXXXXXXXX256
Miscellaneous Symbols and Arrows (2B00–2BFF)X0XXX31XXX31XXXXXXXXXXXXXX14
ReservedXXXXX17XXXXXXXXXXXXXXXXXXX
Glagolitic (2C00–2C5F)X0XXXXXXXXXXXXXXXXXXXXX94X
Latin Extended-C (2C60–2C7F)XXXXXXXXX17XXXXXX17XXXXXXXX
Coptic (2C80–2CFF)X0XXX114XXXXXXXXXXXXXXXXX114X
Georgian Supplement (2D00–2D2F)X0XXXXXXXXXXXXXXXXXXXXX38X
Tifinagh (2D30–2D7F)X0XXX55XXX55XXXXXXXXXXXXXXX
Ethiopic Extended (2D80–2DDF)X0XXX79XXXXXXXXXXXXXXXXXXX


2E00-4DBF


Supplemental Punctuation (2E00–2E7F)X0X24X26XXXXXXXXXXXXXXXXXXX
CJK Radicals Supplement (2E80–2EFF)X0XXX115XXXXXXXXXXXXXXXXXX128
Kangxi Radicals (2F00–2FDF)X0XXX214XXXXXXXXXXXXXXXXXX214
Ideographic Description Characters (2FF0–2FFF)X0XXX12XXXXXXXXXXXXXXXXX12
CJK Symbols and Punctuation (3000–303F)X576412964X40XXXX18X39XXXXX17XX3145
Hiragana (3040–309F)X9096X9090X90XXX9086X87XXXXX83XX9093
Katakana (30A0–30FF)X9496X9494X94XXX9492X90XXXXX86XX9496
Bopomofo (3100–312F)X4048XX40X37XXXXXX37XXXXXXXX37X
Hangul Compatibility Jamo (3130–318F)X9496XX94XX94XXXXX93XXXXX94XXX1
Kanbun (3190–319F)X1616XX16X14XXXXXXXXXXXXXXXX16
Bopomofo Extended (31A0–31BF)X032XX24XXXXXXXXXXXXXXXXXXX
CJK Strokes (31C0–31EF)X048XX16XXXXXXXXXXXXXXXXXXX
Katakana Phonetic Extensions (31F0–31FF)X016XXXXXXXXXXXXXXXXXXXXX16
Enclosed CJK Letters and Months (3200–32FF)X202256XX233XX58XXXXX69XXXXX58XXX174
CJK Compatibility (3300–33FF)X249256XX161X105XXXXXX100XXXXX80XXX85
CJK Unified Ideographs Extension A (3400–4DBF)XX2,350XX6,582X1XXXXXXXXXXXX6,582XXX178


4DC0-FE2F

