ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages




RFC 1842            ASCII/Chinese Character Encoding         August 1995


Table of Contents

   1.     Introduction................................................ 2
   2.     Description................................................. 3
   3.     Formal Syntax............................................... 4
   4.     MIME Considerations......................................... 5
   5.     Background Information...................................... 5
   6.     References.................................................. 6
   7.     Acknowledgements............................................ 6
   8.     Security Considerations..................................... 7
   9.     Authors' Addresses.......................................... 7
   10.    Appendix: List of Software Implementing HZ Representation... 9

1. Introduction

   Chinese characters (and those of other East Asian languages) are
   encoded with multiple bytes to guarantee sufficient coding space for
   the large number of glyphs these languages contain. With the
   proliferation of internetwork traffic around the world, it has become
   necessary to define ways to facilitate the transfer of text in
   multiple-byte character set languages (hereafter referred to as
   Chinese text) over the Internet.

   Any mechanism whose purpose is to transfer Chinese text over the
   Internet must address two layers of concern. The first is the
   application layer, in which the applications concerned should be
   able to recognize the encoding of the text and/or discern the
   different character sets that might be mixed in the text, and handle
   it accordingly. The second layer is the actual transport of Chinese
   text from point A to point B over the Internet. Because the
   prevailing mail transport protocol on the Internet, the Simple Mail
   Transfer Protocol (SMTP), was originally designed for the ASCII
   character set only, many Internet mail agents are not 8-bit clean.
   This poses a challenge for any attempt to implement a mechanism for
   transporting Chinese text over the Internet.
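
   The transport problem can be illustrated with a short sketch. It
   assumes the common 8-bit form of GB2312 (as produced by Python's
   "gb2312" codec) and models a mail agent that is not 8-bit clean as
   one that clears the high bit of every byte. The helper name
   strip_high_bit is hypothetical, and real agents may damage 8-bit
   data in other ways.

```python
# Illustrative sketch only: models a 7-bit mail path by clearing bit 7.
# strip_high_bit is a hypothetical helper, not part of any mail software.

def strip_high_bit(data: bytes) -> bytes:
    """Return a copy of data with the high bit of every byte cleared."""
    return bytes(b & 0x7F for b in data)


text = "中文"                      # two Chinese characters
raw = text.encode("gb2312")        # 8-bit GB2312: every byte has bit 7 set
damaged = strip_high_bit(raw)      # what a 7-bit channel would deliver

print(raw.hex())                   # d6d0cec4
print(damaged)                     # b'VPND'
```

   After the high bit is lost, the receiver sees ordinary ASCII letters
   and cannot tell them apart from genuine ASCII text, so the Chinese
   characters are unrecoverable without additional markers.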

   Here we describe a mechanism for the transmission of Chinese text
   over IP networks. The mechanism has been implemented by various
   software packages that provide multi-language support, and has been
   tested on USENET newsgroups and other types of Internet forums over
   the last two years. The test results show that the HZ representation
   can pass through almost all existing mail delivery agents without
   being corrupted. The HZ representation currently handles the
   GB2312-80 Chinese character set only. Expansion to other Chinese
   encoding systems and to other East Asian languages is under
   consideration.
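
   As a rough check of the claim that HZ text survives 7-bit transport,
   the sketch below uses the "hz" codec shipped with Python's standard
   library (an assumption about the reader's environment, not one of
   the implementations tested by the authors). It confirms that the
   encoded form contains only printable ASCII and decodes back to the
   original text.

```python
# Illustrative sketch only: relies on Python's built-in "hz" codec.

text = "中文 mail"                 # mixed GB2312 and ASCII text
hz = text.encode("hz")             # HZ representation as bytes

# Every byte is printable ASCII, so a 7-bit mail path leaves it intact.
is_printable_ascii = all(0x20 <= b <= 0x7E for b in hz)
round_trip = hz.decode("hz")

print(is_printable_ascii)          # True
print(round_trip == text)          # True
```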






Wei, et al                   Informational