Network Working Group                                       D. Goldsmith
Request for Comments: 2152                          Apple Computer, Inc.
Obsoletes: RFC 1642                                             M. Davis
Category: Informational                                   Taligent, Inc.
                                                                May 1997
UTF-7
A Mail-Safe Transformation Format of Unicode
Status of this Memo
This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.
Abstract
The Unicode Standard, version 2.0, and ISO/IEC 10646-1:1993(E) (as
amended) jointly define a character set (hereafter referred to as
Unicode) which encompasses most of the world's writing systems.
However, Internet mail (STD 11, RFC 822) currently supports only
7-bit US ASCII as a character set. MIME (RFC 2045 through 2049) extends
Internet mail to support different media types and character sets,
and thus could support Unicode in mail messages. MIME neither defines
Unicode as a permitted character set nor specifies how it would be
encoded, although it does provide for the registration of additional
character sets over time.
This document describes a transformation format of Unicode that
contains only 7-bit ASCII octets and is intended to be readable by
humans in the limiting case that the document consists of characters
from the US-ASCII repertoire. It also specifies how this
transformation format is used in the context of MIME and RFC 1641,
"Using Unicode with MIME".
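The two properties claimed above (7-bit-only output, and readability
when the text is pure US-ASCII) can be checked with a short sketch. It
assumes Python's built-in "utf-7" codec as a stand-in implementation;
the helper name is_seven_bit and the sample strings are hypothetical
and are not part of this memo.

```python
# Illustrative only: relies on Python's built-in "utf-7" codec, which is
# an assumption of this sketch and not something this memo specifies.

def is_seven_bit(data: bytes) -> bool:
    """Return True if every octet has its high bit clear (value < 128)."""
    return all(octet < 0x80 for octet in data)

ascii_text = "Hello, World"
mixed_text = "Hi Mom \u263A"  # ends with U+263A WHITE SMILING FACE

ascii_utf7 = ascii_text.encode("utf-7")
mixed_utf7 = mixed_text.encode("utf-7")
mixed_utf8 = mixed_text.encode("utf-8")

print(ascii_utf7)                                # US-ASCII text as UTF-7
print(is_seven_bit(mixed_utf7))                  # UTF-7 octets are 7-bit
print(is_seven_bit(mixed_utf8))                  # UTF-8 uses octets >= 128
print(mixed_utf7.decode("utf-7") == mixed_text)  # encoding is reversible
```

With this codec, the US-ASCII sample should come back byte-for-byte
unchanged, while the sample containing U+263A stays within 7-bit
octets under UTF-7 but not under UTF-8.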
Motivation
Although other transformation formats of Unicode exist and could
conceivably be used in this context (most notably UTF-8, also known
as UTF-2 or UTF-FSS), they suffer the disadvantage that they use
octets in the range decimal 128 through 255 to encode Unicode
characters outside the US-ASCII range. Thus, in the context of mail,
those octets must themselves be encoded. This requires putting text
through two successive encoding processes, and leads to a significant
expansion of characters outside the US-ASCII range, putting
non-English speakers at a disadvantage. For example, using UTF-8 together