RTP Payload Format for JPEG-compressed Video
RFC 2035 RTP Payload Format for JPEG Video October 1996
in one or more passes. Each pass (called a frame in the JPEG
standard) is further broken down into one or more scans. Within each
scan, there are one to four components, which represent the three
components of a color signal (e.g., "red, green, and blue", or a
luminance signal and two chrominance signals). These components can
be encoded as separate scans or interleaved into a single scan.
Each frame and scan is preceded by a header containing optional
definitions for compression parameters like quantization tables and
Huffman coding tables. The headers and optional parameters are
identified with "markers" and comprise a marker segment; each scan
appears as an entropy-coded bit stream within two marker segments.
Markers are aligned to byte boundaries and (in general) cannot appear
in the entropy-coded segment, allowing scan boundaries to be
determined without parsing the bit stream.
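
As an illustration of why scan boundaries can be found without decoding
the bit stream, the following is a minimal sketch (not part of this
specification) that searches an entropy-coded segment for its
terminating marker. It assumes the usual JPEG conventions that a 0xFF
data byte is followed by a stuffed 0x00 and that the restart markers
0xFFD0 through 0xFFD7 may occur inside a scan. The function name
find_scan_end is hypothetical.

```python
def find_scan_end(data: bytes, start: int) -> int:
    """Return the offset of the marker that terminates the entropy-coded
    segment beginning at `start`, or len(data) if no marker is found.

    Assumptions (labelled, not taken from this document): 0xFF00 is a
    stuffed data byte, and 0xFFD0..0xFFD7 are restart markers that stay
    inside the scan.
    """
    i = start
    while i + 1 < len(data):
        if data[i] != 0xFF:
            i += 1
            continue
        nxt = data[i + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            # Stuffed data byte or restart marker: still inside the scan.
            i += 2
        elif nxt == 0xFF:
            # Fill byte; the marker code follows a later 0xFF.
            i += 1
        else:
            # Any other code after 0xFF is a marker that ends the scan.
            return i
    return len(data)
```

Only byte comparisons are needed; no Huffman decoding takes place. A
caller would typically pass the offset of the first byte after the scan
header as `start`.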
Compressed data is represented in one of three formats: the
interchange format, the abbreviated format, or the table-
specification format. The interchange format contains definitions
for all the tables used by the entropy-coded segments, while
the abbreviated format might omit some, assuming they were defined
out-of-band or by a "previous" image.
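
The difference between the formats can be observed by walking the
marker segments that precede the first scan. The sketch below is an
illustration under stated assumptions, not a conformance test: it
reports whether any quantization table (DQT, 0xFFDB) or Huffman table
(DHT, 0xFFC4) segments appear before the first start-of-scan marker
(SOS, 0xFFDA). The marker codes and the rule that a segment's 16-bit
length counts the length field itself come from the JPEG standard; the
function name tables_before_first_scan is hypothetical.

```python
import struct

DQT, DHT, SOS, EOI = 0xDB, 0xC4, 0xDA, 0xD9


def tables_before_first_scan(data: bytes) -> dict:
    """Report which table-definition segments precede the first scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    found = {"DQT": False, "DHT": False}
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError(f"expected a marker at offset {i}")
        code = data[i + 1]
        if code in (SOS, EOI):
            break  # first scan (or end of a tables-only stream) reached
        # Big-endian segment length; it includes its own two bytes.
        (length,) = struct.unpack(">H", data[i + 2:i + 4])
        if code == DQT:
            found["DQT"] = True
        elif code == DHT:
            found["DHT"] = True
        i += 2 + length  # skip the marker plus the whole segment
    return found
```

A stream for which both entries are False must obtain its tables
elsewhere, as the abbreviated format permits. The converse does not
hold: finding some tables does not show that every table used by the
scans is present.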
The JPEG standard does not define the meaning or format of the
components that comprise the image. Attributes like the color space
and pixel aspect ratio must be specified out-of-band with respect to
the JPEG bit stream. The JPEG File Interchange Format (JFIF) [4] is
a de facto standard that provides this extra information using an
application marker segment (APP0). Note that a JFIF file is simply a
JPEG interchange format image along with the APP0 segment. In the
case of video, additional parameters must be defined out-of-band
(e.g., frame rate, interlaced vs. non-interlaced, etc.).
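
For illustration, the fixed part of the JFIF APP0 segment can be read
as sketched below. The field layout (identifier "JFIF" plus a zero
byte, version, units, X and Y density, thumbnail dimensions) is taken
from the JFIF specification [4] rather than from this document, and the
sketch assumes that APP0 immediately follows the SOI marker. The
function name parse_jfif_app0 is hypothetical.

```python
import struct


def parse_jfif_app0(data: bytes) -> dict:
    """Decode the fixed fields of a JFIF APP0 segment that follows SOI."""
    if len(data) < 20 or data[:4] != b"\xff\xd8\xff\xe0":
        raise ValueError("stream does not start with SOI followed by APP0")
    # All fields are big-endian: length, identifier, version (major,
    # minor), units, X density, Y density, thumbnail width and height.
    (length, ident, major, minor, units,
     xdens, ydens, xthumb, ythumb) = struct.unpack(">H5sBBBHHBB", data[4:20])
    if ident != b"JFIF\x00":
        raise ValueError("APP0 segment is not a JFIF header")
    return {
        "version": (major, minor),
        # units: 0 = aspect ratio only, 1 = dots/inch, 2 = dots/cm
        "units": units,
        "density": (xdens, ydens),
        "thumbnail": (xthumb, ythumb),
    }
```

When units is 0, the two density values give only the pixel aspect
ratio, which is the kind of out-of-band attribute the JPEG bit stream
itself does not carry.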
While the JPEG standard provides a rich set of algorithms for
flexible compression, cost-effective hardware implementations of the
full standard have not appeared. Instead, most hardware JPEG video
codecs implement only a subset of the sequential DCT mode of
operation. Typically, marker segments are interpreted in software
(which "re-programs" the hardware) and the hardware is presented with
a single, interleaved entropy-coded scan represented in the YUV color
space.
2. JPEG Over RTP
To maximize interoperability among hardware-based codecs, we assume
the sequential DCT operating mode [1,Annex F] and restrict the set of
predefined RTP/JPEG "type codes" (defined below) to single-scan,
interleaved images. While this is more restrictive than even
Berc, et. al. Standards Track