MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text
RFC 1522 MIME Part Two September 1993
consistency with STD 11, RFC 822. However, implementors are
warned that the character set name must be spelled "US-ASCII" in
MIME message and body part headers.
2. Syntax of encoded-words
An "encoded-word" is defined by the following ABNF grammar. The
notation of RFC 822 is used, with the exception that white space
characters MAY NOT appear between components of an encoded-word.
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
charset = token ; see section 3
encoding = token ; see section 4
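token = 1*<Any CHAR except SPACE, CTLs, and especials>
especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
            <"> / "/" / "[" / "]" / "?" / "." / "="
encoded-text = 1*<Any printable ASCII character other than "?"
                  or SPACE>
               ; (but see "Use of encoded-words in message
               ; headers", section 5)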
Both "encoding" and "charset" names are case-independent. Thus the
charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the
encoding named "Q" may be spelled either "Q" or "q".
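The grammar and the case-independence rule can be illustrated with a short recognizer. The following is a minimal sketch in Python, not a reference implementation. The names ESPECIALS, EncodedWord, and parse_encoded_word are hypothetical, and folding charset names to lower case and encoding names to upper case is an assumption made only so that equivalent spellings compare equal. The backslash is not treated as an especial here, since it does not appear in the especials list as printed.

```python
import re
from typing import NamedTuple, Optional

# especials as listed in the grammar; the double quote is included.
# Assumption of this sketch: backslash is NOT treated as an especial.
ESPECIALS = '()<>@,;:"/[]?.='

# token characters: printable ASCII 0x21-0x7E (no SPACE, no CTLs) minus especials.
_TOKEN_CHARS = "".join(
    chr(c) for c in range(0x21, 0x7F) if chr(c) not in ESPECIALS
)
_TOKEN = "[" + re.escape(_TOKEN_CHARS) + "]+"

# encoded-text: one or more printable ASCII characters other than "?" or SPACE.
_ENCODED_TEXT = r"[\x21-\x3e\x40-\x7e]+"

# No white space is permitted between the components.
_ENCODED_WORD = re.compile(
    r"=\?(" + _TOKEN + r")\?(" + _TOKEN + r")\?(" + _ENCODED_TEXT + r")\?="
)


class EncodedWord(NamedTuple):
    charset: str       # folded to lower case (illustrative choice)
    encoding: str      # folded to upper case (illustrative choice)
    encoded_text: str  # left undecoded


def parse_encoded_word(word: str) -> Optional[EncodedWord]:
    """Return the three components of an encoded-word, or None if the
    string does not match the grammar."""
    m = _ENCODED_WORD.fullmatch(word)
    if m is None:
        return None
    charset, encoding, text = m.groups()
    return EncodedWord(charset.lower(), encoding.upper(), text)
```

With this sketch, "=?ISO-8859-1?q?Andr=E9?=" and "=?iso-8859-1?Q?Andr=E9?=" yield the same result, while a string containing a SPACE between components is rejected.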
An encoded-word may not be more than 75 characters long, including
charset, encoding, encoded-text, and delimiters. If it is desirable
to encode more text than will fit in an encoded-word of 75
characters, multiple encoded-words (separated by CRLF SPACE) may be
used.
While there is no limit to the length of a multiple-line header
field, each line of a header field that contains one or more
encoded-words is limited to 76 characters.
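The two length limits can be sketched as follows. This is an illustration under stated assumptions, not a definitive encoder: it uses the "B" encoding (base64), which is defined outside this section, it assumes a stateless charset in which each character can be encoded independently, and the names encode_words and fold_header are hypothetical.

```python
import base64
from typing import List

MAX_WORD = 75  # one encoded-word, including charset, encoding, and delimiters
MAX_LINE = 76  # one header line that contains an encoded-word


def _wrap(chunk: bytes, charset: str) -> str:
    # Assumption: the "B" encoding is base64 of the charset-encoded bytes.
    return "=?%s?B?%s?=" % (charset, base64.b64encode(chunk).decode("ascii"))


def encode_words(text: str, charset: str = "ISO-8859-1") -> List[str]:
    """Split text into encoded-words of at most 75 characters each."""
    overhead = len("=?") + len(charset) + len("?B?") + len("?=")
    # base64 emits 4 characters per 3 input bytes.
    max_bytes = (MAX_WORD - overhead) // 4 * 3
    words: List[str] = []
    chunk = b""
    for ch in text:
        raw = ch.encode(charset)  # whole characters only, never split bytes
        if len(raw) > max_bytes:
            raise ValueError("charset name leaves no room for encoded-text")
        if len(chunk) + len(raw) > max_bytes:
            words.append(_wrap(chunk, charset))
            chunk = b""
        chunk += raw
    if chunk:
        words.append(_wrap(chunk, charset))
    return words


def fold_header(name: str, words: List[str]) -> str:
    """Join encoded-words with CRLF SPACE, keeping each line within 76."""
    if not words:
        return name + ":"
    first = name + ": " + words[0]
    if len(first) <= MAX_LINE:
        lines = [first] + [" " + w for w in words[1:]]
    else:
        # The first word does not fit after the field name; fold before it.
        lines = [name + ":"] + [" " + w for w in words]
    return "\r\n".join(lines)
```

A continuation line consists of one SPACE followed by an encoded-word of at most 75 characters, so it never exceeds 76 characters.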
The length restrictions are included not only to ease
interoperability through internetwork mail gateways, but also to
impose a limit on the amount of lookahead a header parser must employ
(while looking for a final ?= delimiter) before it can decide whether
a token is an encoded-word or something else.
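The lookahead bound can be sketched as a scanner that never examines more than 75 characters past a candidate "=?". This is a minimal illustration, assuming a simplified shape check that is deliberately looser than the full grammar; the name encoded_word_at is hypothetical.

```python
import re
from typing import Optional

MAX_WORD = 75  # upper bound on the length of an encoded-word

# Simplified shape: three non-empty fields free of "?" and white space.
# This is looser than the grammar for token and encoded-text.
_SHAPE = re.compile(r"=\?[^?\s]+\?[^?\s]+\?[^?\s]+\?=")


def encoded_word_at(header: str, pos: int) -> Optional[str]:
    """Return the encoded-word that starts at pos, or None.

    Only the next 75 characters are examined, so the decision is made
    with bounded lookahead regardless of the header length.
    """
    window = header[pos:pos + MAX_WORD]
    m = _SHAPE.match(window)
    return m.group(0) if m else None
```

If the final "?=" does not occur within the 75-character window, the candidate is treated as ordinary text. Note that a search for the first "?=" is not sufficient, because with the "Q" encoding the encoded-text itself may begin with "=", as in "?Q?=E9".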
The characters which may appear in encoded-text are further
restricted by the rules in section 5.
Moore