RFC 2032 RTP Payload Format for H.261 Video October 1996
2. Purpose of this document
The ITU-T recommendation H.261 [6] specifies the encodings used by
ITU-T compliant video-conference codecs. Although these encodings
were originally specified for fixed data rate ISDN circuits,
experiments [3],[8] have shown that they can also be used over
packet-switched networks such as the Internet.
The purpose of this memo is to specify the RTP payload format for
encapsulating H.261 video streams in RTP [1].
3. Structure of the packet stream
3.1. Overview of the ITU-T recommendation H.261
The H.261 coding is organized as a hierarchy of groupings. The video
stream is composed of a sequence of images, or frames, which are
themselves organized as a set of Groups of Blocks (GOB). Note that
H.261 "pictures" are referred to as "frames" in this document. Each GOB
holds a set of 3 lines of 11 macro blocks (MB). Each MB carries
information on a group of 16x16 pixels: luminance information is
specified for 4 blocks of 8x8 pixels, while chrominance information
is given by two "red" and "blue" color difference components at a
resolution of only 8x8 pixels. These components and the codes
representing their sampled values are as defined in the ITU-R
Recommendation 601 [7].
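The sizes given above can be checked with a little arithmetic. The
following is a minimal sketch, not part of the payload format. The
constant and function names are illustrative only, and the two frame
sizes in the usage note (CIF at 352x288 and QCIF at 176x144 luminance
pixels) are taken from H.261 rather than from the text above.

```python
# Geometry of the H.261 hierarchy as described in the text.
# All names here are hypothetical and chosen for illustration.
MB_SIZE = 16        # a macroblock covers 16x16 luminance pixels
BLOCK_SIZE = 8      # each block covers 8x8 samples
GOB_MB_COLS = 11    # macroblocks per line of a GOB
GOB_MB_ROWS = 3     # lines of macroblocks per GOB

GOB_WIDTH = GOB_MB_COLS * MB_SIZE     # 176 luminance pixels
GOB_HEIGHT = GOB_MB_ROWS * MB_SIZE    # 48 luminance pixels


def macroblocks_per_gob():
    """Number of macroblocks in one GOB (3 lines of 11)."""
    return GOB_MB_COLS * GOB_MB_ROWS


def blocks_per_macroblock():
    """Four 8x8 luminance blocks plus one 8x8 block for each of
    the two color difference components."""
    luminance = (MB_SIZE // BLOCK_SIZE) ** 2
    chrominance = 2
    return luminance + chrominance


def gobs_per_frame(width, height):
    """Number of GOBs needed to tile a frame of the given
    luminance size. The size must be a whole number of GOBs."""
    if width % GOB_WIDTH or height % GOB_HEIGHT:
        raise ValueError("frame size is not a whole number of GOBs")
    return (width // GOB_WIDTH) * (height // GOB_HEIGHT)
```

Under these assumptions, gobs_per_frame(352, 288) yields 12 for a CIF
frame and gobs_per_frame(176, 144) yields 3 for a QCIF frame.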
This grouping is used to specify information at each level of the
hierarchy:
- At the frame level, one specifies information such as the
delay from the previous frame, the image format, and
various indicators.
- At the GOB level, one specifies the GOB number and the
default quantizer that will be used for the MBs.
- At the MB level, one specifies which blocks are present
and which did not change, and optionally a quantizer and
motion vectors.
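One way to picture the three levels is as nested records. The sketch
below is illustrative only: the class and field names are hypothetical
and do not correspond to H.261 syntax elements or to any field of the
RTP payload header. It also assumes, following our reading of H.261,
that a quantizer given at the MB level stays in effect for the
following MBs of the same GOB until it is replaced.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Macroblock:
    # One flag per block: 4 luminance blocks, then 2 chrominance blocks.
    coded_blocks: List[bool]
    # Optional quantizer carried by this MB (None if absent).
    quantizer: Optional[int] = None
    # Optional motion vector as (horizontal, vertical).
    motion_vector: Optional[Tuple[int, int]] = None


@dataclass
class GroupOfBlocks:
    number: int        # GOB number
    quantizer: int     # default quantizer for the MBs of this GOB
    macroblocks: List[Macroblock] = field(default_factory=list)


@dataclass
class Frame:
    temporal_reference: int   # expresses the delay from the previous frame
    image_format: str         # for example "CIF" or "QCIF"
    gobs: List[GroupOfBlocks] = field(default_factory=list)


def quantizers_in_effect(gob: GroupOfBlocks) -> List[int]:
    """Return the quantizer that applies to each MB of the GOB.

    Assumption: the GOB default applies until an MB carries its own
    quantizer, which then applies to that MB and the following ones.
    """
    current = gob.quantizer
    result = []
    for mb in gob.macroblocks:
        if mb.quantizer is not None:
            current = mb.quantizer
        result.append(current)
    return result


# Usage: a GOB with default quantizer 8 whose second MB switches to 12.
example_gob = GroupOfBlocks(
    number=1,
    quantizer=8,
    macroblocks=[
        Macroblock(coded_blocks=[True] * 6),
        Macroblock(coded_blocks=[True] * 6, quantizer=12),
        Macroblock(coded_blocks=[False] * 6),
    ],
)
example_quantizers = quantizers_in_effect(example_gob)
```

A real decoder would carry many more fields at each level; the point
here is only where each piece of information lives in the hierarchy.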
Blocks which have changed are encoded by computing the discrete
cosine transform (DCT) of their sample values; the resulting
coefficients are then quantized and Huffman encoded (Variable Length
Codes).
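The transform and quantization steps can be sketched for a single 8x8
block as follows. This is a simplified illustration, not the H.261
algorithm: it uses a textbook orthonormal DCT-II and a plain uniform
rounding quantizer with an arbitrary step size, whereas H.261 defines
its own quantizer and reconstruction rules. The function names are
hypothetical, and the Huffman (variable length) coding step is left
out.

```python
import math

N = 8  # blocks are 8x8 samples


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an 8x8 block,
    computed directly from the definition (slow but simple)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coefficients, step):
    """Uniform quantizer: divide by the step and round to the
    nearest integer. This is a simplification of H.261."""
    return [[round(c / step) for c in row] for row in coefficients]


# Usage: a flat block has energy only in the first (DC) coefficient.
flat_block = [[100] * N for _ in range(N)]
levels = quantize(dct_2d(flat_block), step=16)
```

For the flat block above, all 63 remaining coefficients quantize to
zero, which illustrates why smooth image areas compress well once the
quantized levels are variable length coded.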
The H.261 Huffman encoding includes a special "GOB start" pattern,
composed of 15 zeroes followed by a single 1, that cannot be imitated
by any other code words. This pattern is included at the beginning of
Turletti & Huitema Standards Track