Information about Unicode Collation Algorithm
The Unicode collation algorithm (UCA) provides a standard way to put names, words or strings of text in sequence according to the needs of a particular situation.
When used with the default Unicode collation element table (DUCET), this collation method is similar to the European ordering rules for strings in most European languages. In particular, for strings in the Latin alphabet, the ordering is the same as normal sorting order in English and similar languages, since it first looks only at letters stripped of any modifications or diacritical marks.
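The idea of looking first at the bare letters and only then at the diacritical marks can be illustrated with a small sketch. This is not the UCA itself and it does not use DUCET weights; the function name `simplified_sort_key` and the two-level key are assumptions made purely for illustration, using only Python's standard `unicodedata` module.

```python
import unicodedata


def simplified_sort_key(text):
    """Two-level key: bare letters first, diacritics only as a tie-breaker.

    Illustration only: this is NOT the UCA and uses no DUCET weights.
    """
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", text)
    # Level 1: drop combining marks and ignore case.
    base = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    # Level 2: the decomposed string (marks included) breaks ties.
    return (base, decomposed)


words = ["résumé", "resume", "role", "rôle", "roles"]
sorted_words = sorted(words, key=simplified_sort_key)
print(sorted_words)
```

With this key, accented and unaccented spellings of the same word end up next to each other, whereas a plain code point sort would typically push the accented forms to the end of the list.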
Note - The method is complex; the full specification is given in Unicode Technical Standard #10 (UTS #10).
In addition to providing a default sorting order, UTS #10 also specifies how to tailor the sorting behaviour to be appropriate for a given locale.
An important open source implementation of the UCA, which also supports tailoring, is included with the International Components for Unicode (ICU). The effects of tailoring, along with a large number of language-specific tailorings, can be seen in the online ICU Locale Explorer.
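As a rough illustration of what tailoring means, the sketch below contrasts an untailored ordering with one adjusted for Swedish, where å, ä and ö are separate letters placed after z. It does not use ICU or the tailoring syntax of UTS #10; the helper `primary_key` and the table `SWEDISH_EXTRA_LETTERS` are hypothetical names introduced here, and only first-level (letter) differences are modelled.

```python
import unicodedata

# Hypothetical tailoring table: letters that Swedish treats as
# distinct letters sorted after "z", in this order.
SWEDISH_EXTRA_LETTERS = "åäö"


def primary_key(text, extra_letters=""):
    """Letter-level key with an optional, simplified locale tailoring."""
    extra_letters = unicodedata.normalize("NFC", extra_letters)
    key = []
    for ch in unicodedata.normalize("NFC", text.casefold()):
        if ch in extra_letters:
            # Tailored letter: ranks after every untailored character.
            key.append((1, extra_letters.index(ch)))
        else:
            # Untailored: compare by the base letter without its accent.
            base = unicodedata.normalize("NFD", ch)[0]
            key.append((0, ord(base)))
    return key


words = ["zebra", "ärlig", "apa", "öl", "orm"]
default_order = sorted(words, key=primary_key)
swedish_order = sorted(
    words, key=lambda w: primary_key(w, SWEDISH_EXTRA_LETTERS)
)
print(default_order)
print(swedish_order)
```

In the untailored ordering "ärlig" sorts among the words beginning with "a", while the Swedish-style ordering moves "ärlig" and "öl" after "zebra". A real application would normally rely on an implementation such as ICU rather than a hand-written table like this one.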
External links and references
- Unicode Collation Algorithm: Unicode Technical Standard #10
- International Components for Unicode (ICU)
- Mimer SQL Unicode Collation Charts
Tools
- ICU Locale Explorer: an online demonstration of the Unicode Collation Algorithm using International Components for Unicode
- msort: a sort program that provides an unusual level of flexibility in defining collations and extracting keys
The European ordering rules (EOR) provide a way to put strings of text in sequence. The ordering rules consist of 4 normative levels.
Level 1 sorts the letters.
A diacritical mark or diacritic, also called an accent, is a small sign added to a letter to alter pronunciation or to distinguish between similar words.
International Components for Unicode (ICU) is an open source project of mature C/C++ and Java libraries for Unicode support, software internationalization and software globalization. ICU is widely portable to many operating systems and environments.
Collation is the assembly of written information into a standard order. This is commonly called alphabetisation, though collation is not limited to ordering letters of the alphabet.
This article is copied from an article on Wikipedia.org, the free encyclopedia created and edited by an online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of Wikipedia articles provide accurate and timely information, please do not assume the accuracy of any particular article. This article is distributed under the terms of the GNU Free Documentation License.