ANNEX 7

ATLANTIC SUBSET
OF THE INTERNATIONAL STANDARD
ISO/IEC 10646 UNIVERSAL MULTIPLE-OCTET CODED CHARACTER SET

1994-09-15
corrected 1998-08-04
J. W. van Wingen

NOTE: The following text presents what could have been an Atlantic Standard, if such a thing existed. It therefore has no formal status at all. Nevertheless, it contains everything a serious user needs to know when he wants to use ISO/IEC 10646-1 for applications restricted to the Atlantic part of the world, without spending $400 on the complete thing.

Part 1: General structure and Latin script

INTRODUCTION

It is commonly understood that the whole of the repertoire of ISO/IEC 10646, Universal Multiple-octet Coded Character Set, is not a firm requirement for large groups of users of European languages on both sides of the Atlantic. To give guidance to manufacturers and users, so that they need not make their own selection, a subset is defined that specifies the coding of those characters identified as the total character repertoire needed for European languages. A larger subset than the minimum set specified here may be needed for special applications; such extensions are not prohibited. Some recommendations are given in an Annex for sets of characters needed with identified applications.

The text of this Atlantic Standard (AS) is based on that of ISO/IEC 10646-1:1993 where possible, after removal of everything that is not relevant to the Atlantic situation. It makes this AS a self-contained document which does not require from the reader, if he is interested only in characters used in European languages, any consultation of ISO/IEC 10646-1:1993. On the other hand, if the reader wants to understand the principles of multi-octet coding, and the way these are applied to any script, the study of ISO/IEC 10646-1:1993 is an absolute requirement, in particular where information on a transformation format (UTF-8), retransmission, octet value representation notations, character naming guidelines, is wanted (presented in its Annexes G, H, J, K, R). Thus this AS does not replace the ISO/IEC standard, not even where European scripts are in exclusive use. The text as presented just states what is needed with European languages to specify the coding of the characters contained in this AS, and to indicate requirements to conforming equipment or other character supporting product at procurement. Should, despite the great care taken in the preparation of this document, the text of this AS lead to an interpretation or to a conclusion different from that reached from reading ISO/IEC 10646-1:1993, then that from the latter will prevail.

The numbers of the original clauses of ISO/IEC 10646-1 are given between parentheses after the number in the heading of the corresponding clause of this Subset, to facilitate comparison. Hereafter this AS is referred to as "this Subset".

Only two-octet coding is used for characters in this Subset.

No levels of implementation are specified.

1 (1) SCOPE

This Atlantic Standard specifies a subset of ISO/IEC 10646, Universal Multiple-octet Coded Character Set, required for coding the character repertoire in modern use of the listed European languages, written with Latin, Greek or Cyrillic script.

Covered are:

Official languages using Latin script:
 
Albanian 
Croat 
Czech 
Danish 
Dutch 
English 
Estonian 
Finnish 
French 
German 
Hungarian 
Icelandic 
Irish 
Italian
Latvian
Lithuanian
Luxemburgish
Maltese
Norwegian
Polish
Portuguese
Romanian
Slovak
Slovenian
Spanish
Swedish
Turkish

Official languages using Greek script:

Greek

Official languages using Cyrillic script:

Bulgarian
Byelorussian
Macedonian
Russian
Serbian
Ukrainian

Regional languages using Latin script:

Basque (France, Spain)
Breton (France)
Catalan (France, Spain, Andorra)
Faroese (Denmark)
Frisian (Netherlands)
Gaelic (UK)
Galician (Spain)
Greenlandic (Denmark)
Rumantsch (Switzerland)
Sami (Norway, Sweden, Finland)
Sorbian (Germany)
Welsh (UK)

The coding of the repertoires for Afrikaans and Esperanto is included in some normative tables for compatibility with the repertoire of ISO/IEC 6937:1994.

Part 1 of this AS covers Latin script; other parts will specify the coding of Greek and Cyrillic script.

The coding method used is that of two-octet form (UCS-2), because all required characters fit in the Basic Multilingual Plane (BMP) that is specified in ISO/IEC 10646-1:1993. No difference exists between any coding in this Subset and the UCS-2 coding of the same character.

A number of subrepertoires of this Subset are indicated, each identified with a name, to enable the user to state his requirements in terms of options.

Information on the coding of some characters outside the repertoire specified in ISO/IEC 10367 for use in special applications required for restricted groups of European users is presented in a normative Annex.

2 (2) CONFORMANCE

2.1 Conformance of information interchange

A coded-character-data-element (CC-data-element) within coded information for interchange is in conformance with this Subset if all the coded representations of characters within that CC-data-element conform to the requirements of clause 6.

A claim of conformance shall identify whether the European Latin, the European Greek, the European Cyrillic or the European Special Character Repertoire, or any other subrepertoire of this Subset specified in this AS, or a combination of these, is adopted.
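
The condition of 2.1 can be illustrated with a minimal sketch. It assumes that the CC-data-element is available as a byte string in two-octet form with the R-octet first, and that the adopted repertoire is given as a set of code values. The name `conforms` and the small sample repertoire (a few entries taken from Table 1) are hypothetical and serve as illustration only; control functions are not considered.

```python
# Hypothetical sample repertoire: a few code values taken from Table 1.
SAMPLE_REPERTOIRE = {0x0041, 0x0061, 0x00E1, 0x0107, 0x1E83}


def conforms(cc_data_element: bytes, repertoire: set) -> bool:
    """Return True if every two-octet code in the element is in the repertoire."""
    if len(cc_data_element) % 2 != 0:
        return False  # an incomplete two-octet sequence cannot conform
    for i in range(0, len(cc_data_element), 2):
        r_octet = cc_data_element[i]      # most significant octet (row)
        c_octet = cc_data_element[i + 1]  # least significant octet (cell)
        if ((r_octet << 8) | c_octet) not in repertoire:
            return False
    return True
```

With this sample repertoire, the octets 00 41 01 07 would be accepted, whereas 04 10 (a Cyrillic letter outside the sample) would be rejected.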

2.2 Conformance of devices

A device is in conformance with this Subset if it conforms to the requirements of 2.2.1, and either or both of 2.2.2 and 2.2.3.

A claim of conformance shall identify the document which contains the description specified in 2.2.1, and shall identify whether the European Latin, the European Greek, the European Cyrillic or the European Special Character Repertoire, or any other subrepertoire of this Subset specified in this AS, or a combination of these, is adopted.

2.2.1 Device description

A device that conforms to this Subset shall be the subject of a description that identifies the means by which the user may supply characters to the device, or may recognize them when they are made available to him, as specified respectively in 2.2.2 and 2.2.3.

2.2.2 Originating devices

An originating device shall allow its user to supply any sequence of characters from the repertoire adopted, and shall be capable of transmitting their coded representations within a CC-data-element.

2.2.3 Receiving devices

A receiving device shall be capable of receiving and interpreting any coded representations of characters that are within a CC-data-element, and that conform to 2.1, and shall make the corresponding characters available to its user in such a way that the user can identify them from among those of the repertoire adopted, and can distinguish them from each other.

3 (3) NORMATIVE REFERENCES

The following standards contain provisions which, through reference in this text, constitute provisions of this Atlantic Standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this Atlantic Standard are encouraged to investigate the possibility of applying the most recent editions of the standards listed below. Members of IEC and ISO maintain registers of currently valid International Standards.

ISO/IEC 2022:1994 Information processing - 7-bit and 8-bit coded character sets - Code extension techniques.

ISO/IEC 6429:1993 Information processing - Control Functions.

ISO/IEC 10367:1991 Information processing - Standardized coded graphic character sets for use in 8-bit codes.

ISO/IEC 10646-1:1993 Information processing - Universal Multiple-octet Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane.

4 (4) DEFINITIONS

The numbers of definitions are those given in ISO/IEC 10646-1:1993.

Where necessitated by the scope of this Subset definitions have been changed to avoid referring to features not included.

4.4 coded-character-data-element (CC-data-element): An element of interchanged information that is specified to consist of a sequence of coded representations of characters, in accordance with one or more identified standards for coded character sets.

4.5 cell : The place within a row at which an individual character may be allocated.
4.6 character : A member of a set of elements used for the organization, control or representation of data.
4.8 coded character : A character together with its coded representation.
4.9 coded character set; code : A set of unambiguous rules that establishes a character set and the relationship between the characters of the set and their coded representation.
4.10 code table : A table showing the characters allocated to the octets in a code.
4.14 control function : An action that affects the recording, processing, transmission or interpretation of data, and that has a coded representation consisting of one or more octets.
4.17 device: A component of information processing equipment which can transmit, and/or receive, coded information within CC-data-elements.
4.18 graphic character : A character, other than a control function, that has a visual representation normally handwritten, printed or displayed.
4.19 graphic symbol : A visual representation of a graphic character or of a control function.
4.23 octet : An ordered sequence of eight bits considered as a unit.
4.24 plane : the coding space of this Subset; of 256 rows.
4.28 repertoire : A specified set of characters that are represented by means of one or more bit combinations of a coded character set.
4.29 row : A subdivision of a plane; of 256 cells.
4.30 script : A set of graphic characters used for the written form of one or more languages.
4.32 user: A person or other entity that invokes the services provided by a device.

5 (5) THE UNIVERSAL MULTIPLE-OCTET CODED CHARACTER SET GENERAL STRUCTURE

The general structure of the Universal Multiple-Octet Coded Character Set, of which this Subset is a proper subset, is described in this explanatory clause. The normative specification of the structure is given in later clauses (this is indicated by the use of the term "shall").

The value of any octet is expressed in hexadecimal notation from 00 to FF in ISO/IEC 10646.

The canonical form of UCS uses a four-dimensional coding space, regarded as a single entity, consisting of 256 * 256 planes. Each plane consists of 256 one-dimensional rows, each row consisting of 256 cells. A character is located and coded at a cell within this coding space or the cell is declared unused.

In the canonical form, four octets are used to represent each character. The first plane, having 00 00 as its first two octets, is called the Basic Multilingual Plane (BMP).

In addition to the canonical form, a two-octet BMP is specified. This BMP can be used as a two-octet coded character set identified as UCS-2.

Subsets of the coding space may be used to give a sub-repertoire of graphic characters. The Atlantic Subset specifies a selection of the coded characters of UCS-2, using two-octet coding only.
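
The relation between the two-octet form and the four-octet canonical form described above can be sketched as follows. The function names are hypothetical; the sketch only assumes what this clause states, namely that a BMP character in canonical form has 00 00 as its first two octets.

```python
def ucs2_to_canonical(r_octet: int, c_octet: int) -> bytes:
    """Prefix the two BMP octets 00 00 to obtain the four-octet form."""
    return bytes([0x00, 0x00, r_octet, c_octet])


def canonical_to_ucs2(octets: bytes) -> bytes:
    """Drop the leading 00 00; reject anything that is not in the BMP."""
    if len(octets) != 4 or octets[0] != 0x00 or octets[1] != 0x00:
        raise ValueError("not a four-octet BMP character")
    return octets[2:]
```

Since this Subset uses two-octet coding only, such a conversion would be needed only when exchanging data with a system that uses the canonical form.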

6 (6) CODING OF CHARACTERS

In the UCS-2, and thus in this Subset, each character shall be represented by a sequence of two octets. The most significant octet of this sequence shall be the row-octet. The least significant octet of this sequence shall be the cell-octet. Terming the octets for brevity as R-octet and C-octet, this sequence may be represented as

most-significant least-significant

R-octet C-octet

The value of any octet shall be represented by two hexadecimal digits, for example: 31 or FE. When a single character is to be identified in terms of the values of its row and cell, this shall be represented such as

0031 for DIGIT ONE
0041 for LATIN CAPITAL LETTER A

Within each octet the most significant bit shall be bit 8 and the least significant bit shall be bit 1. Accordingly, the weight allocated to each bit shall be

high order bits low order bits

bit: b8 b7 b6 b5 b4 b3 b2 b1
weight: 128 64 32 16 8 4 2 1

The sequence of the octets that represent a character, and the most significant and least significant ends of it, shall be maintained as shown above. When serialized as octets, a more significant octet shall precede less significant octets. When not serialized as octets, the order of octets may be specified by agreement between sender and recipient.
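
The rules of this clause can be illustrated with a short sketch. It is not part of the specification; the function names `encode`, `notation` and `octet_value` are hypothetical. The sketch splits a code value into R-octet and C-octet with the most significant octet first, writes it in the four-digit hexadecimal notation used above, and computes an octet value from the bit weights b8 to b1.

```python
def encode(code: int) -> bytes:
    """Return the R-octet followed by the C-octet for a two-octet code value."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("code value does not fit in two octets")
    return bytes([code >> 8, code & 0xFF])


def notation(octets: bytes) -> str:
    """Write each octet as two hexadecimal digits, e.g. 0041."""
    return "".join(f"{octet:02X}" for octet in octets)


def octet_value(bits: list) -> int:
    """Sum the weights 128, 64, ..., 1 of the bits given in the order b8 to b1."""
    weights = [128, 64, 32, 16, 8, 4, 2, 1]
    return sum(weight for weight, bit in zip(weights, bits) if bit)
```

For LATIN CAPITAL LETTER A, `notation(encode(0x0041))` gives 0041; the C-octet 41 has bits b7 and b1 set, so its value is 64 + 1 = 65.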

7 (7) SPECIAL FEATURES OF THIS SUBSET
 


8 (13) NATURE OF THIS SUBSET

ISO/IEC 10646 provides the specification of subsets of coded graphic characters for use in interchange, by originating devices and by receiving devices.

This Subset presents a "limited" subset in the sense defined in subclause 13.1 of ISO/IEC 10646-1:1993, in that it consists of a list of the graphic characters in the specified subset. It contains no reference to any of the collections that are listed in Annex A of that International Standard, such as LATIN-1 SUPPLEMENT, LATIN EXTENDED-A, LATIN EXTENDED-B or LATIN EXTENDED ADDITIONAL. Many of these collections contain characters not in the repertoire of this Subset.

9 (14) CODED REPRESENTATION FORM OF THIS SUBSET

This Subset provides only a single form, that of characters from the European repertoire with each character represented by two octets.

Within a CC-data-element conforming to the requirements of this Subset a character from the repertoire of this Subset shall be represented by two octets comprising the R-octet and the C-octet as specified in clause 6.

10 (15) IMPLEMENTATION LEVELS

This Subset does not specify implementation levels.

11 (16) USE OF CONTROL FUNCTIONS WITH THIS SUBSET

This Subset provides for use of control functions encoded according to ISO 2022, ISO/IEC 6429 or similarly structured standards for control functions, and standards derived from these. A set or subset of such control functions may be used in conjunction with this coded character set. These standards encode a control function as a sequence of one or more octets.

When a C0 control function of ISO/IEC 6429 is used with this coded character set, its coded representation as specified in ISO/IEC 6429 shall be padded to correspond with the number of octets adopted in this Subset. Thus, the least significant octet shall be the bit combination specified in ISO/IEC 6429, and the more significant octet shall consist of zeros only.

For example, the control function FORM FEED is represented by "000C" in this Subset.

For escape sequences, control sequences, and control strings (see ISO/IEC 6429) consisting of a coded control function consisting of a single bit combination, followed by additional bit combinations in the range 20 to 7F, each bit combination shall be padded by an octet with value 00.

For example, the escape sequence "ESC 02/00 04/00" is represented by "001B 0020 0040".

When using a C1 control function of ISO/IEC 6429 with this coded character set, it shall be coded as an ESC Fe sequence (see ISO/IEC 6429) padded as specified above.

For example, the control function PARTIAL LINE BACKWARD - PLU (08/12 in ISO/IEC 6429 representation) is represented by "001B 004C".
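
The three examples of this clause can be reproduced with a minimal sketch. The function names are hypothetical. The sketch assumes the usual ISO/IEC 6429 correspondence between an 8-bit C1 code (columns 08 and 09) and the final octet Fe of its two-octet escape sequence (columns 04 and 05), that is, Fe equals the C1 code minus hexadecimal 40.

```python
ESC = 0x1B


def pad(bit_combinations) -> str:
    """Pad each single-octet bit combination with a leading octet of value 00."""
    return " ".join(f"00{b:02X}" for b in bit_combinations)


def c0_control(code: int) -> str:
    """Represent a C0 control function (00 to 1F) in two-octet form."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError("not a C0 control function")
    return pad([code])


def c1_control(code: int) -> str:
    """Represent a C1 control function (80 to 9F) as a padded ESC Fe sequence."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control function")
    return pad([ESC, code - 0x40])
```

Here `c0_control(0x0C)` yields 000C for FORM FEED, `pad([0x1B, 0x20, 0x40])` yields 001B 0020 0040, and `c1_control(0x8C)` yields 001B 004C for the control function at 08/12.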

Code extension control functions for the ISO 2022 code extension techniques (such as designation escape sequence, single shift and locking shift) shall not be used with this coded character set.

12 (17) DECLARATION OF IDENTIFICATION OF FEATURES

12.1 Purpose and context of identification

CC-data-elements conforming to ISO/IEC 10646 are intended to form all or part of a composite unit of coded information that is interchanged between an originator and a recipient. The identification of ISO/IEC 10646, this Subset, or any subset of it, that has been adopted by the originator must also be available to the recipient. The route by which such identification is communicated to the recipient is outside the scope of ISO/IEC 10646 and this Subset.

However, some standards for interchange of coded information may permit, or require, that the coded representation of the identification applicable to the CC-data-element forms a part of the interchanged information. Such coded representations provide all or part of an identification data element, which may be included in information interchange in accordance with the relevant standard.

12.2 Specification of identification

The coded representation for the identification of this Subset, or of any of its subrepertoires, or of a control function set used with any of those, is specified in another Atlantic Standard (in preparation).

13 (18) STRUCTURE OF THE CODE TABLES AND LISTS

Clause 14 (25) sets out the detailed code tables and the lists of character names. It specifies the graphic characters, their coded representation, and the character name for each character.

The graphic symbols are to be regarded as typical visual representations of the characters. ISO/IEC 10646 does not attempt (nor does this Subset) to prescribe the exact shape of each character. The shape is affected by the design of the font employed, which is outside the scope of ISO/IEC 10646.

Graphic characters specified in ISO/IEC 10646 are uniquely identified by their names. This does not imply that the graphic symbols by which they are commonly imaged are always different. Examples of graphic characters with similar graphic symbols are LATIN CAPITAL LETTER A, GREEK CAPITAL LETTER ALPHA, and CYRILLIC CAPITAL LETTER A.
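
That characters are distinguished by name and code rather than by shape can be checked with a small sketch. It assumes that the character names held in Python's standard `unicodedata` module agree with the ISO/IEC 10646 names for these three characters; the function name `character_names` is hypothetical.

```python
import unicodedata


def character_names(codes):
    """Return (hexadecimal code, character name) pairs for the given code values."""
    return [(f"{code:04X}", unicodedata.name(chr(code))) for code in codes]


# Three characters that are commonly imaged by the same graphic symbol.
for code, name in character_names([0x0041, 0x0391, 0x0410]):
    print(code, name)
```

The three code values are different although the graphic symbols normally look alike, so a receiving device must rely on the coded representation, not on the shape.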

The meaning attributed to any character is not specified by ISO/IEC 10646; it may differ from country to country, or from one application to another.

14 (25) CODE TABLES AND LISTS OF CHARACTER NAMES

The coding of a character shall be as specified in the tables of this clause. The characters included in these tables constitute the Atlantic Subset of ISO/IEC 10646.

14.1 The characters and their coding required for the European Latin repertoire of letters and digits are specified in Table 1.

14.2 The characters and their coding required for the European Special Characters repertoire are specified in Table 2.

NOTE:
These repertoires taken together (the European Latin subrepertoire) contain that of ISO/IEC 6937:1994 as a proper subrepertoire, the remaining characters being those needed for Welsh. A claim stating that the ISO/IEC 6937:1994 repertoire is covered may be formulated as of covering the European Latin subrepertoire without Full Welsh, or by the Latin-Telematic subrepertoire (see Table 1).

14.3 The characters and their coding required for the European Greek repertoire of letters and special characters are specified in Part 2 of this AS.

14.4 The characters and their coding required for the European Cyrillic repertoire of letters and special characters are specified in Part 3 of this AS.

14.5 The characters and their coding required for the European repertoire of box drawing characters (as included in ISO/IEC 10367) are specified in Part 4 of this AS.

14.6 The relation between the Atlantic Subset and its subrepertoires may be illustrated by the following scheme:


TABLE 1

VERSION 2.1
1995-02-15, corrected 1998-08-04
J. W. van Wingen

COMPLETE REPERTOIRE OF LETTERS AND DIGITS REQUIRED FOR LATIN WRITTEN EUROPEAN LANGUAGES

Grouped by Short Identifier (SID)
Transformation to ASCII in first column,
SGML public entities in 2nd column
Indication in columns 63-72:
Table in ISO 10367 where the character is included:

0    (Table 1/2) Basic G0 Set (as of ISO 4873)
1    (Table 3/4) Latin Alphabet No. 1 (as of ISO 8859-1)
2    (Table 5/6) Latin Alphabet No. 2 (as of ISO 8859-2)
3    (Table 7/8) Latin Alphabet No. 3 (as of ISO 8859-3)
4    (Table 9/10) Latin Alphabet No. 4 (as of ISO 8859-4)
5    (Table 11/12) Latin Alphabet No. 5 (as of ISO 8859-9)
A    (Table 21/22) Supplementary Set for Latin Alphabets
C/X  (Table A.1/2) ISO 6937, Supplementary Set (C) or Repertoire only (X)
T    Used in Teletex (CCITT T.61)
V    Used in Videotex (CCITT T.101)

Code in ISO 10646 indicated as double bytes in hexadecimal notation.

Named subrepertoires:

BASIC LATIN : requires all characters of this table marked with 0.
LATIN-1 : requires all characters of this table marked with 1.
LATIN-TELEMATIC: requires all characters of this table.
Note: These subrepertoires also require characters from Table 2.
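
Membership of a named subrepertoire can be read off mechanically from a table row, as the following sketch suggests. It assumes that each row ends with the ten-character indication field followed by the hexadecimal code, as in the rows of this table; the function names are hypothetical.

```python
def parse_row(line: str):
    """Return (code value, indication field) taken from the end of a table row."""
    fields = line.split()
    return int(fields[-1], 16), fields[-2]


def in_basic_latin(indication: str) -> bool:
    """BASIC LATIN: the character is marked with 0."""
    return "0" in indication


def in_latin_1(indication: str) -> bool:
    """LATIN-1: the character is marked with 1."""
    return "1" in indication


row = "/a  &aacute  LA11  LATIN SMALL LETTER A WITH ACUTE  .12345.XTV  00E1"
code, indication = parse_row(row)
print(f"{code:04X}", in_basic_latin(indication), in_latin_1(indication))
```

Rows marked with an asterisk and an indication field consisting only of dots belong to neither of these two subrepertoires; they are required only for LATIN-TELEMATIC, which takes all characters of the table.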
 
Transformation to ASCII  SGML public entity  Short identifier (SID)  Character name  Tables in ISO 10367  ISO 10646 code (hexadecimal)
LA01  LATIN SMALL LETTER A  0.......TV  0061
LA02  LATIN CAPITAL LETTER A  0.......TV  0041
LB01  LATIN SMALL LETTER B  0.......TV  0062
LB02  LATIN CAPITAL LETTER B  0.......TV  0042
LC01  LATIN SMALL LETTER C  0.......TV  0063
LC02  LATIN CAPITAL LETTER C  0.......TV  0043
LD01  LATIN SMALL LETTER D  0.......TV  0064
LD02  LATIN CAPITAL LETTER D  0.......TV  0044
LE01  LATIN SMALL LETTER E  0.......TV  0065
LE02  LATIN CAPITAL LETTER E  0.......TV  0045
LF01  LATIN SMALL LETTER F  0.......TV  0066
LF02  LATIN CAPITAL LETTER F  0.......TV  0046
LG01  LATIN SMALL LETTER G  0.......TV  0067
LG02  LATIN CAPITAL LETTER G  0.......TV  0047
LH01  LATIN SMALL LETTER H  0.......TV  0068
LH02  LATIN CAPITAL LETTER H  0.......TV  0048
LI01  LATIN SMALL LETTER I  0.......TV  0069
LI02  LATIN CAPITAL LETTER I  0.......TV  0049
LJ01  LATIN SMALL LETTER J  0.......TV  006A
LJ02  LATIN CAPITAL LETTER J  0.......TV  004A
LK01  LATIN SMALL LETTER K  0.......TV  006B
LK02  LATIN CAPITAL LETTER K  0.......TV  004B
LL01  LATIN SMALL LETTER L  0.......TV  006C
LL02  LATIN CAPITAL LETTER L  0.......TV  004C
LM01  LATIN SMALL LETTER M  0.......TV  006D
LM02  LATIN CAPITAL LETTER M  0.......TV  004D
LN01  LATIN SMALL LETTER N  0.......TV  006E
LN02  LATIN CAPITAL LETTER N  0.......TV  004E
LO01  LATIN SMALL LETTER O  0.......TV  006F
LO02  LATIN CAPITAL LETTER O  0.......TV  004F
LP01  LATIN SMALL LETTER P  0.......TV  0070
LP02  LATIN CAPITAL LETTER P  0.......TV  0050
LQ01  LATIN SMALL LETTER Q  0.......TV  0071
LQ02  LATIN CAPITAL LETTER Q  0.......TV  0051
LR01  LATIN SMALL LETTER R  0.......TV  0072
LR02  LATIN CAPITAL LETTER R  0.......TV  0052
LS01  LATIN SMALL LETTER S  0.......TV  0073
LS02  LATIN CAPITAL LETTER S  0.......TV  0053
LT01  LATIN SMALL LETTER T  0.......TV  0074
LT02  LATIN CAPITAL LETTER T  0.......TV  0054
LU01  LATIN SMALL LETTER U  0.......TV  0075
LU02  LATIN CAPITAL LETTER U  0.......TV  0055
LV01  LATIN SMALL LETTER V  0.......TV  0076
LV02  LATIN CAPITAL LETTER V  0.......TV  0056
LW01  LATIN SMALL LETTER W  0.......TV  0077
LW02  LATIN CAPITAL LETTER W  0.......TV  0057
LX01  LATIN SMALL LETTER X  0.......TV  0078
LX02  LATIN CAPITAL LETTER X  0.......TV  0058
LY01  LATIN SMALL LETTER Y  0.......TV  0079
LY02  LATIN CAPITAL LETTER Y  0.......TV  0059
LZ01  LATIN SMALL LETTER Z  0.......TV  007A
LZ02  LATIN CAPITAL LETTER Z  0.......TV  005A
/a  &aacute  LA11  LATIN SMALL LETTER A WITH ACUTE  .12345.XTV  00E1
/A  &Aacute  LA12  LATIN CAPITAL LETTER A WITH ACUTE  .12345.XTV  00C1
/c  &cacute  LC11  LATIN SMALL LETTER C WITH ACUTE  ..2....XTV  0107
/C  &Cacute  LC12  LATIN CAPITAL LETTER C WITH ACUTE  ..2....XTV  0106
/e  &eacute  LE11  LATIN SMALL LETTER E WITH ACUTE  .12345.XTV  00E9
/E  &Eacute  LE12  LATIN CAPITAL LETTER E WITH ACUTE  .12345.XTV  00C9
/i  &iacute  LI11  LATIN SMALL LETTER I WITH ACUTE  .12345.XTV  00ED
/I  &Iacute  LI12  LATIN CAPITAL LETTER I WITH ACUTE  .12345.XTV  00CD
/l  &lacute  LL11  LATIN SMALL LETTER L WITH ACUTE  ..2....XTV  013A
/L  &Lacute  LL12  LATIN CAPITAL LETTER L WITH ACUTE  ..2....XTV  0139
/n  &nacute  LN11  LATIN SMALL LETTER N WITH ACUTE  ..2....XTV  0144
/N  &Nacute  LN12  LATIN CAPITAL LETTER N WITH ACUTE  ..2....XTV  0143
/o  &oacute  LO11  LATIN SMALL LETTER O WITH ACUTE  .123.5.XTV  00F3
/O  &Oacute  LO12  LATIN CAPITAL LETTER O WITH ACUTE  .123.5.XTV  00D3
/r  &racute  LR11  LATIN SMALL LETTER R WITH ACUTE  ..2....XTV  0155
/R  &Racute  LR12  LATIN CAPITAL LETTER R WITH ACUTE  ..2....XTV  0154
/s  &sacute  LS11  LATIN SMALL LETTER S WITH ACUTE  ..2....XTV  015B
/S  &Sacute  LS12  LATIN CAPITAL LETTER S WITH ACUTE  ..2....XTV  015A
/u  &uacute  LU11  LATIN SMALL LETTER U WITH ACUTE  .12345.XTV  00FA
/U  &Uacute  LU12  LATIN CAPITAL LETTER U WITH ACUTE  .12345.XTV  00DA
/w  &wacute  LW11  LATIN SMALL LETTER W WITH ACUTE  * ..........  1E83
/W  &Wacute  LW12  LATIN CAPITAL LETTER W WITH ACUTE  * ..........  1E82
/y  &yacute  LY11  LATIN SMALL LETTER Y WITH ACUTE  .12...AXTV  00FD
/Y  &Yacute  LY12  LATIN CAPITAL LETTER Y WITH ACUTE  .12...AXTV  00DD
/z  &zacute  LZ11  LATIN SMALL LETTER Z WITH ACUTE  ..2....XTV  017A
/Z  &Zacute  LZ12  LATIN CAPITAL LETTER Z WITH ACUTE  ..2....XTV  0179

\a  &agrave  LA13  LATIN SMALL LETTER A WITH GRAVE  .1.3.5.XTV  00E0
\A  &Agrave  LA14  LATIN CAPITAL LETTER A WITH GRAVE  .1.3.5.XTV  00C0
\e  &egrave  LE13  LATIN SMALL LETTER E WITH GRAVE  .1.3.5.XTV  00E8
\E  &Egrave  LE14  LATIN CAPITAL LETTER E WITH GRAVE  .1.3.5.XTV  00C8
\i  &igrave  LI13  LATIN SMALL LETTER I WITH GRAVE  .1.3.5.XTV  00EC
\I  &Igrave  LI14  LATIN CAPITAL LETTER I WITH GRAVE  .1.3.5.XTV  00CC
\o  &ograve  LO13  LATIN SMALL LETTER O WITH GRAVE  .1.3.5.XTV  00F2
\O  &Ograve  LO14  LATIN CAPITAL LETTER O WITH GRAVE  .1.3.5.XTV  00D2
\u  &ugrave  LU13  LATIN SMALL LETTER U WITH GRAVE  .1.3.5.XTV  00F9
\U  &Ugrave  LU14  LATIN CAPITAL LETTER U WITH GRAVE  .1.3.5.XTV  00D9
\w  &wgrave  LW13  LATIN SMALL LETTER W WITH GRAVE  * ..........  1E81
\W  &Wgrave  LW14  LATIN CAPITAL LETTER W WITH GRAVE  * ..........  1E80
\y  &ygrave  LY13  LATIN SMALL LETTER Y WITH GRAVE  * ..........  1EF3
\Y  &Ygrave  LY14  LATIN CAPITAL LETTER Y WITH GRAVE  * ..........  1EF2

>a  &acirc  LA15  LATIN SMALL LETTER A WITH CIRCUMFLEX  .12345.XTV  00E2
>A  &Acirc  LA16  LATIN CAPITAL LETTER A WITH CIRCUMFLEX  .12345.XTV  00C2
>c  &ccirc  LC15  LATIN SMALL LETTER C WITH CIRCUMFLEX  ...3..AXTV  0109
>C  &Ccirc  LC16  LATIN CAPITAL LETTER C WITH CIRCUMFLEX  ...3..AXTV  0108
>e  &ecirc  LE15  LATIN SMALL LETTER E WITH CIRCUMFLEX  .1.3.5.XTV  00EA
>E  &Ecirc  LE16  LATIN CAPITAL LETTER E WITH CIRCUMFLEX  .1.3.5.XTV  00CA
>g  &gcirc  LG15  LATIN SMALL LETTER G WITH CIRCUMFLEX  ...3..AXTV  011D
>G  &Gcirc  LG16  LATIN CAPITAL LETTER G WITH CIRCUMFLEX  ...3..AXTV  011C
>h  &hcirc  LH15  LATIN SMALL LETTER H WITH CIRCUMFLEX  ...3..AXTV  0125
>H  &Hcirc  LH16  LATIN CAPITAL LETTER H WITH CIRCUMFLEX  ...3..AXTV  0124
>i  &icirc  LI15  LATIN SMALL LETTER I WITH CIRCUMFLEX  .12345.XTV  00EE
>I  &Icirc  LI16  LATIN CAPITAL LETTER I WITH CIRCUMFLEX  .12345.XTV  00CE
>j  &jcirc  LJ15  LATIN SMALL LETTER J WITH CIRCUMFLEX  ...3..AXTV  0135
>J  &Jcirc  LJ16  LATIN CAPITAL LETTER J WITH CIRCUMFLEX  ...3..AXTV  0134
>o  &ocirc  LO15  LATIN SMALL LETTER O WITH CIRCUMFLEX  .12345.XTV  00F4
>O  &Ocirc  LO16  LATIN CAPITAL LETTER O WITH CIRCUMFLEX  .12345.XTV  00D4
>s  &scirc  LS15  LATIN SMALL LETTER S WITH CIRCUMFLEX  ...3..AXTV  015D
>S  &Scirc  LS16  LATIN CAPITAL LETTER S WITH CIRCUMFLEX  ...3..AXTV  015C
>u  &ucirc  LU15  LATIN SMALL LETTER U WITH CIRCUMFLEX  .1.345.XTV  00FB
>U  &Ucirc  LU16  LATIN CAPITAL LETTER U WITH CIRCUMFLEX  .1.345.XTV  00DB
>w  &wcirc  LW15  LATIN SMALL LETTER W WITH CIRCUMFLEX  ......AXTV  0175
>W  &Wcirc  LW16  LATIN CAPITAL LETTER W WITH CIRCUMFLEX  ......AXTV  0174
>y  &ycirc  LY15  LATIN SMALL LETTER Y WITH CIRCUMFLEX  ......AXTV  0177
>Y  &Ycirc  LY16  LATIN CAPITAL LETTER Y WITH CIRCUMFLEX  ......AXTV  0176

%a  &auml  LA17  LATIN SMALL LETTER A WITH DIAERESIS  .12345.XTV  00E4
%A  &Auml  LA18  LATIN CAPITAL LETTER A WITH DIAERESIS  .12345.XTV  00C4
%e  &euml  LE17  LATIN SMALL LETTER E WITH DIAERESIS  .12345.XTV  00EB
%E  &Euml  LE18  LATIN CAPITAL LETTER E WITH DIAERESIS  .12345.XTV  00CB
%i  &iuml  LI17  LATIN SMALL LETTER I WITH DIAERESIS  .1.3.5.XTV  00EF
%I  &Iuml  LI18  LATIN CAPITAL LETTER I WITH DIAERESIS  .1.3.5.XTV  00CF
%o  &ouml  LO17  LATIN SMALL LETTER O WITH DIAERESIS  .12345.XTV  00F6
%O  &Ouml  LO18  LATIN CAPITAL LETTER O WITH DIAERESIS  .12345.XTV  00D6
%u  &uuml  LU17  LATIN SMALL LETTER U WITH DIAERESIS  .12345.XTV  00FC
%U  &Uuml  LU18  LATIN CAPITAL LETTER U WITH DIAERESIS  .12345.XTV  00DC
%w  &wuml  LW17  LATIN SMALL LETTER W WITH DIAERESIS  * ..........  1E85
%W  &Wuml  LW18  LATIN CAPITAL LETTER W WITH DIAERESIS  * ..........  1E84
%y  &yuml  LY17  LATIN SMALL LETTER Y WITH DIAERESIS  .1...5.XTV  00FF
%Y  &Yuml  LY18  LATIN CAPITAL LETTER Y WITH DIAERESIS  ......AXTV  0178

~a  &atilde  LA19  LATIN SMALL LETTER A WITH TILDE  .1..45.XTV  00E3
~A  &Atilde  LA20  LATIN CAPITAL LETTER A WITH TILDE  .1..45.XTV  00C3
~n  &ntilde  LN19  LATIN SMALL LETTER N WITH TILDE  .1.3.5.XTV  00F1
~N  &Ntilde  LN20  LATIN CAPITAL LETTER N WITH TILDE  .1.3.5.XTV  00D1
~o  &otilde  LO19  LATIN SMALL LETTER O WITH TILDE  .1..45.XTV  00F5
~O  &Otilde  LO20  LATIN CAPITAL LETTER O WITH TILDE  .1..45.XTV  00D5

*c  &ccaron  LC21  LATIN SMALL LETTER C WITH CARON  ..2.4..XTV  010D
*C  &Ccaron  LC22  LATIN CAPITAL LETTER C WITH CARON  ..2.4..XTV  010C
*d  &dcaron  LD21  LATIN SMALL LETTER D WITH CARON  ..2....XTV  010F
*D  &Dcaron  LD22  LATIN CAPITAL LETTER D WITH CARON  ..2....XTV  010E
*e  &ecaron  LE21  LATIN SMALL LETTER E WITH CARON  ..2....XTV  011B
*E  &Ecaron  LE22  LATIN CAPITAL LETTER E WITH CARON  ..2....XTV  011A
*l  &lcaron  LL21  LATIN SMALL LETTER L WITH CARON  ..2....XTV  013E
*L  &Lcaron  LL22  LATIN CAPITAL LETTER L WITH CARON  ..2....XTV  013D
*n  &ncaron  LN21  LATIN SMALL LETTER N WITH CARON  ..2....XTV  0148
*N  &Ncaron  LN22  LATIN CAPITAL LETTER N WITH CARON  ..2....XTV  0147
*r  &rcaron  LR21  LATIN SMALL LETTER R WITH CARON  ..2....XTV  0159
*R  &Rcaron  LR22  LATIN CAPITAL LETTER R WITH CARON  ..2....XTV  0158
*s  &scaron  LS21  LATIN SMALL LETTER S WITH CARON  ..2.4..XTV  0161
*S  &Scaron  LS22  LATIN CAPITAL LETTER S WITH CARON  ..2.4..XTV  0160
*t  &tcaron  LT21  LATIN SMALL LETTER T WITH CARON  ..2....XTV  0165
*T  &Tcaron  LT22  LATIN CAPITAL LETTER T WITH CARON  ..2....XTV  0164
*z  &zcaron  LZ21  LATIN SMALL LETTER Z WITH CARON  ..2.4..XTV  017E
*Z  &Zcaron  LZ22  LATIN CAPITAL LETTER Z WITH CARON  ..2.4..XTV  017D

#a  &abreve  LA23  LATIN SMALL LETTER A WITH BREVE  ..2....XTV  0103
#A  &Abreve  LA24  LATIN CAPITAL LETTER A WITH BREVE  ..2....XTV  0102
#g  &gbreve  LG23  LATIN SMALL LETTER G WITH BREVE  ...3.5AXTV  011F
#G  &Gbreve  LG24  LATIN CAPITAL LETTER G WITH BREVE  ...3.5AXTV  011E
#u  &ubreve  LU23  LATIN SMALL LETTER U WITH BREVE  ...3..AXTV  016D
#U  &Ubreve  LU24  LATIN CAPITAL LETTER U WITH BREVE  ...3..AXTV  016C

+o  &odblac  LO25  LATIN SMALL LETTER O WITH DOUBLE ACUTE  ..2....XTV  0151
+O  &Odblac  LO26  LATIN CAPITAL LETTER O WITH DOUBLE ACUTE  ..2....XTV  0150
+u  &udblac  LU25  LATIN SMALL LETTER U WITH DOUBLE ACUTE  ..2....XTV  0171
+U  &Udblac  LU26  LATIN CAPITAL LETTER U WITH DOUBLE ACUTE  ..2....XTV  0170

@a  &aring  LA27  LATIN SMALL LETTER A WITH RING ABOVE  .1..45.XTV  00E5
@A  &Aring  LA28  LATIN CAPITAL LETTER A WITH RING ABOVE  .1..45.XTV  00C5
@u  &uring  LU27  LATIN SMALL LETTER U WITH RING ABOVE  ..2....XTV  016F
@U  &Uring  LU28  LATIN CAPITAL LETTER U WITH RING ABOVE  ..2....XTV  016E

@c  &cdot  LC29  LATIN SMALL LETTER C WITH DOT ABOVE  ...3..AXTV  010B
@C  &Cdot  LC30  LATIN CAPITAL LETTER C WITH DOT ABOVE  ...3..AXTV  010A
@e  &edot  LE29  LATIN SMALL LETTER E WITH DOT ABOVE  ....4.AXTV  0117
@E  &Edot  LE30  LATIN CAPITAL LETTER E WITH DOT ABOVE  ....4.AXTV  0116
@g  &gdot  LG29  LATIN SMALL LETTER G WITH DOT ABOVE  ...3..AXTV  0121
@G  &Gdot  LG30  LATIN CAPITAL LETTER G WITH DOT ABOVE  ...3..AXTV  0120
@I  &Idot  LI30  LATIN CAPITAL LETTER I WITH DOT ABOVE  ...3.5AXTV  0130
@i  &inodot  LI61  LATIN SMALL LETTER DOTLESS I