RFC 2032 RTP Payload Format for H.261 Video October 1996
each GOB header (and also at the beginning of each frame header) to
mark the separation between two GOBs, and is in fact used as an
indicator that the current GOB is terminated. The encoding also
includes a stuffing pattern, composed of seven zeroes followed by
four ones; that stuffing pattern can only be entered between the
encoding of MBs, or just before the GOB separator.
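As an illustration of the bit patterns just described, the following sketch locates GOB start codes in a bit stream represented as a string of "0"/"1" characters. The string representation, the function name, and the scanning approach are illustrative assumptions, not part of the payload format; the 16-bit GOB start code and the 11-bit stuffing pattern are taken from H.261.

```python
# Illustrative sketch (not part of the payload format): locate H.261
# GOB start codes in a bit stream held as a "0"/"1" string.
GOB_START = "0000000000000001"  # 16-bit GOB start code (fifteen 0s, one 1)
STUFFING = "00000001111"        # stuffing: seven zeroes, then four ones

def find_gob_boundaries(bits: str) -> list:
    """Return the bit offsets at which a GOB start code begins."""
    positions = []
    i = bits.find(GOB_START)
    while i != -1:
        positions.append(i)
        i = bits.find(GOB_START, i + len(GOB_START))
    return positions
```

Because the stuffing pattern contains at most seven consecutive zeroes, it can never be mistaken for the fifteen-zero prefix of the start code, so stuffing entered just before the GOB separator does not confuse the scan.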
3.2. Considerations for packetization
H.261 codecs designed for operation over ISDN circuits produce a bit
stream composed of several levels of encoding specified by H.261 and
companion recommendations. The bits resulting from the Huffman
encoding are arranged in 512-bit frames, containing 2 bits of
synchronization, 492 bits of data and 18 bits of error correcting
code. The 512-bit frames are then interlaced with an audio stream
and transmitted over px64 kbps circuits according to specification
H.221 [5].
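A quick arithmetic check of the framing structure described above shows how little of each 512-bit frame carries video data overhead. This is an illustrative back-of-envelope computation, not something a codec would execute:

```python
# Back-of-envelope check of the 512-bit H.261/H.221 framing layout:
# 2 synchronization bits + 492 data bits + 18 error-correction bits.
SYNC_BITS, DATA_BITS, ECC_BITS = 2, 492, 18
FRAME_BITS = SYNC_BITS + DATA_BITS + ECC_BITS   # 512 bits per frame
overhead = (SYNC_BITS + ECC_BITS) / FRAME_BITS  # 20/512 = 3.90625 %
```

The roughly 4% of framing and error-correction overhead is exactly what the Internet payload format drops, since RTP/UDP packetization and other recovery mechanisms take over those roles.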
When transmitting over the Internet, we will directly consider the
output of the Huffman encoding. All the bits produced by the Huffman
encoding stage will be included in the packet. We will not carry the
512-bit frames, as protection against bit errors can be obtained by
other means. Similarly, we will not attempt to multiplex audio and
video signals in the same packets, as UDP and RTP provide a much more
efficient way to achieve multiplexing.
Directly transmitting the result of the Huffman encoding over an
unreliable stream of UDP datagrams would, however, have poor error
resistance characteristics. The hierarchical structure of the H.261
bit stream means that one needs to receive the information
present in the frame header to decode the GOBs, as well as the
information present in the GOB header to decode the MBs. Without
precautions, this would mean that one has to receive all the packets
that carry an image in order to properly decode its components.
If each image could be carried in a single packet, this requirement
would not create a problem. However, a video image or even one GOB by
itself can sometimes be too large to fit in a single packet.
Therefore, the MB is taken as the unit of fragmentation. Packets
must start and end on an MB boundary, i.e., an MB cannot be split
across multiple packets. Multiple MBs may be carried in a single
packet when they fit within the maximum packet size allowed. This
practice is recommended to reduce the packet send rate and the
per-packet overhead.
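The fragmentation rule above can be sketched as a greedy packetizer that packs whole MBs into packets and never splits one across a packet boundary. The function name, the MB sizes, and the payload limit are hypothetical; only the never-split rule comes from this specification:

```python
# Hypothetical packetizer sketch: pack whole macroblocks (MBs) into
# packets, never splitting an MB across packets. mb_sizes holds the
# encoded size of each MB in bytes; max_payload is the packet limit.
def packetize(mb_sizes, max_payload):
    packets, current, used = [], [], 0
    for size in mb_sizes:
        if size > max_payload:
            # A single MB larger than the payload limit cannot be sent
            # under the never-split rule.
            raise ValueError("a single MB exceeds the maximum payload")
        if used + size > max_payload:
            packets.append(current)      # close the packet; start a new one
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        packets.append(current)
    return packets
```

For example, three 400-byte MBs with a 1000-byte payload limit yield two packets: the first carries two MBs, the second carries one.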
To allow each packet to be processed independently for efficient
resynchronization in the presence of packet losses, some state
information from the frame header and GOB header is carried with each
Turletti & Huitema Standards Track