Character Mnemonics and Character Sets
RFC 1345 Character Mnemonics & Character Sets June 1992
character sets, including ASCII, national variants of the ISO 646
7-bit character set and various EBCDICs. In addition, the numeric
values of the coded representations of all these characters are the
same in all coded character sets compatible with ISO standards. All
of them except two, EXCLAMATION MARK and QUOTATION MARK, have the
same coded representation in all variants of EBCDIC. This minimal
set of characters is called the reference character set in this memo.
The mnemonics can be used in Internet standards for easy and
unambiguous reference, and they can also serve as a fallback
representation in various Internet specifications.
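The claim about identical coded representations can be spot-checked
with a short script. This is only a sketch under stated assumptions:
the reference set is approximated by the ISO 646 invariant graphic
characters (printable ASCII minus the twelve national-use positions),
and the codecs compared are merely a few that Python happens to ship,
not the full range of character sets the memo covers.

```python
# Assumption: stand-in for the reference set = printable ASCII minus the
# twelve ISO 646 national-use positions  # $ @ [ \ ] ^ ` { | } ~
NATIONAL_USE = set("#$@[\\]^`{|}~")
REFERENCE = [c for c in map(chr, range(0x21, 0x7F)) if c not in NATIONAL_USE]


def differing(codecs):
    """Return the characters whose encoded bytes differ between the codecs."""
    return {
        c for c in REFERENCE
        if len({c.encode(name) for name in codecs}) > 1
    }


# Sample codecs only; chosen because they are in Python's standard library.
iso_compatible = ["ascii", "latin_1", "iso8859_5", "iso8859_7"]
ebcdic_samples = ["cp037", "cp500"]

print("ISO-compatible differences:", sorted(differing(iso_compatible)))
print("EBCDIC differences:", sorted(differing(ebcdic_samples)))
```

If the statement above holds for these samples, the first set is empty
and any character reported for the EBCDIC pair is one of the two named
exceptions, EXCLAMATION MARK or QUOTATION MARK.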
The coded character sets covered include all parts of ISO 8859, ISO
6937-2 and all ISO 646 conforming coded character sets in the ISO
character set registry managed by ECMA according to ISO 2375. Almost
all graphic coded character sets in the ECMA registry (1) are
covered. The graphic coded character sets not included are registry
numbers 31, 38, 39, 53, 59, 68, 71, 72, 129 and 137. In addition,
many vendor-defined character sets are covered, including PC
codepages (4), (7), (8), many EBCDIC character sets (4), (5), (6) and
HP, DEC and Apple character sets (8), (9), (10), (13), (14). The
East-Asian 16-bit character sets from the ECMA registry are also
included in this memo.
2. CHARACTER MNEMONICS
2.1 General Syntax
The character mnemonics are taken from the ISO committee draft (CD)
of the POSIX.2 standard (3). They are classified into two groups:
1. A group with two-character mnemonics
- Primarily intended for alphabetic scripts like Latin, Greek,
Cyrillic, Hebrew and Arabic, and special characters.
2. A group with variable-length mnemonics
- Primarily intended for non-alphabetic scripts like Japanese and
Chinese, but also used for some accented letters and special
characters.
In the two-character mnemonics, all invariant graphic characters in
the ISO 646 character codes except "&" are used, i.e. the following
characters:
! " % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _
a b c d e f g h i j k l m n o p q r s t u v w x y z
The character "_" is not used as the first character.
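The syntax rule for two-character mnemonics can be expressed as a
small check. This is a minimal sketch, assuming the allowed alphabet
is the ISO 646 invariant graphic set (printable ASCII minus the twelve
national-use positions) without "&". The names ALLOWED and
is_two_char_mnemonic are illustrative and do not come from the memo.

```python
# Assumption: ISO 646 invariant graphic characters are printable ASCII
# minus the national-use positions  # $ @ [ \ ] ^ ` { | } ~
NATIONAL_USE = set("#$@[\\]^`{|}~")

# Allowed alphabet for two-character mnemonics: invariant set minus "&".
ALLOWED = {chr(c) for c in range(0x21, 0x7F)} - NATIONAL_USE - {"&"}


def is_two_char_mnemonic(s: str) -> bool:
    """Check the syntax only: exactly two allowed characters, not starting with "_"."""
    return len(s) == 2 and s[0] != "_" and all(c in ALLOWED for c in s)


print(is_two_char_mnemonic("a:"))  # well-formed
print(is_two_char_mnemonic("_a"))  # rejected: "_" in first position
print(is_two_char_mnemonic("&a"))  # rejected: "&" is excluded
```

The check says nothing about whether a well-formed pair is actually
assigned to a character; that requires the mnemonic tables themselves.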
In the variable-length mnemonics, the character "_" is not used as
the first character. If it occurs elsewhere in a name, it is written
twice.
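One reading of the rule for "_" in variable-length mnemonics is that
every "_" inside a name is written as "__". The sketch below follows
that reading. The function names and the sample name "left_arrow" are
hypothetical, and no framing of the mnemonic in running text is shown,
since that is not specified in this section.

```python
def escape_name(name: str) -> str:
    """Double every "_" in a variable-length mnemonic name."""
    if name.startswith("_"):
        # The first character of a mnemonic is never "_".
        raise ValueError("mnemonic name must not start with '_'")
    return name.replace("_", "__")


def unescape_name(escaped: str) -> str:
    """Undo the doubling; assumes the input came from escape_name."""
    return escaped.replace("__", "_")


escaped = escape_name("left_arrow")  # hypothetical name, for illustration
print(escaped)
print(unescape_name(escaped))
```

Because a well-formed escaped name only ever contains runs of "_" of
even length, collapsing each pair back to a single "_" recovers the
original name.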
Simonsen