ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages
RFC 1842 ASCII/Chinese Character Encoding August 1995
Table of Contents
1. Introduction................................................ 2
2. Description................................................. 3
3. Formal Syntax............................................... 4
4. MIME Considerations......................................... 5
5. Background Information...................................... 5
6. References.................................................. 6
7. Acknowledgements............................................ 6
8. Security Considerations..................................... 7
9. Authors' Addresses.......................................... 7
10. Appendix: List of Software Implementing HZ Representation... 9
1. Introduction
Chinese characters (like those of other East Asian languages) are
encoded with multiple bytes to guarantee sufficient coding space for
the large number of glyphs these languages contain. With the
proliferation of internetwork traffic around the world, it has become
necessary to define ways to facilitate the transfer of text in
multiple-byte character set languages (hereafter referred to as
Chinese text) over the Internet.
Two layers of concern need to be addressed by any mechanism whose
purpose is to transfer Chinese text over the Internet. The first is
the application layer, in which the applications concerned should be
able to recognize the encoding of the text and/or discern the
different character sets that might be mixed in the text, and handle
them accordingly. The second layer is the actual transport of Chinese
text from point A to point B over the Internet. Because the
prevailing mail transport protocol used over the Internet, the Simple
Mail Transfer Protocol (SMTP), was originally designed for the ASCII
character set only, many Internet mail agents are not 8-bit clean and
therefore pose challenges for any attempt to implement a mechanism
for the transport of Chinese text over the Internet.
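The transport problem can be illustrated with a minimal sketch. It
assumes the Chinese text is carried in the common 8-bit form of GB2312
(every byte has its high bit set) and models a mail agent that is not
8-bit clean as a function that clears the high bit of each byte. The
helper name strip_high_bit and the sample string are hypothetical and
chosen only for illustration; real agents may corrupt 8-bit data in
other ways.

```python
# Illustrative sketch only: a simplified model of a 7-bit mail path.
# "strip_high_bit" is a hypothetical helper, not part of any mail agent.

text = "\u4e2d\u6587"  # the two Chinese characters for "Chinese language"

# 8-bit GB2312 form: two bytes per character, each with the high bit set.
eight_bit = text.encode("gb2312")


def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a non-8-bit-clean agent might."""
    return bytes(b & 0x7F for b in data)


seven_bit = strip_high_bit(eight_bit)

# After the high bits are lost, the bytes are plain ASCII and the
# receiver can no longer tell that they once encoded Chinese text.
recovered = seven_bit.decode("gb2312")

print(eight_bit.hex())    # bytes before transport
print(seven_bit)          # bytes after the simulated 7-bit hop
print(recovered == text)  # False: the original text is not recovered
```

Once the high bits are gone, nothing in the byte stream marks where
Chinese text began or ended, which is why a representation built only
from printable ASCII characters is attractive for such paths.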
Here we describe a mechanism for the transmission of Chinese text
over IP networks. The mechanism has been implemented by various
software packages dealing with multi-language support and has been
tested on USENET newsgroups and other types of Internet forums over
the last two years. The test results show that the HZ representation
can pass through almost all existing mail delivery agents without
being corrupted. The HZ representation currently handles the
GB2312-80 Chinese character set only. Further expansion to other
Chinese encoding systems and to other East Asian languages is under
consideration.
Wei, et al Informational