

RTP Payload Format for JPEG-compressed Video






RFC 2035           RTP Payload Format for JPEG Video        October 1996


   in one or more passes.  Each pass (called a frame in the JPEG
   standard) is further broken down into one or more scans.  Within each
   scan, there are one to four components, which represent the three
   components of a color signal (e.g., "red, green, and blue", or a
   luminance signal and two chrominance signals).  These components can
   be encoded as separate scans or interleaved into a single scan.

   Each frame and scan is preceded by a header containing optional
   definitions for compression parameters like quantization tables and
   Huffman coding tables.  The headers and optional parameters are
   identified with "markers" and comprise a marker segment; each scan
   appears as an entropy-coded bit stream between two marker segments.
   Markers are aligned to byte boundaries and (in general) cannot appear
   in the entropy-coded segment, allowing scan boundaries to be
   determined without parsing the bit stream.
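
   The marker structure described above can be illustrated with a
   minimal sketch that locates marker boundaries without decoding any
   entropy-coded bits.  The function names (walk_markers,
   skip_entropy_coded) are hypothetical and not part of this
   specification.  The sketch assumes the usual JPEG conventions: a
   marker is 0xFF followed by a non-zero code, a 0xFF data byte inside a
   scan is followed by a stuffed 0x00, and restart markers (RST0-RST7)
   are the exception that may appear inside the entropy-coded segment.

```python
import struct

# Markers that stand alone (no length field follows): TEM, SOI, EOI, RST0-RST7.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))
SOS, EOI = 0xDA, 0xD9


def skip_entropy_coded(data, pos, found):
    """Advance past entropy-coded bytes to the next non-restart marker."""
    while pos + 1 < len(data):
        if data[pos] == 0xFF:
            nxt = data[pos + 1]
            if 0xD0 <= nxt <= 0xD7:
                # Restart marker inside the scan: record it and keep going.
                found.append((pos, nxt))
                pos += 2
                continue
            if nxt != 0x00 and nxt != 0xFF:
                # 0xFF00 is a stuffed data byte and 0xFFFF is fill;
                # anything else ends the entropy-coded segment.
                return pos
        pos += 1
    return len(data)


def walk_markers(data):
    """Return a list of (offset, marker_code) pairs for a JPEG byte stream."""
    found = []
    pos = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        code = data[pos + 1]
        if code == 0xFF:
            # Fill byte preceding a marker; skip it.
            pos += 1
            continue
        found.append((pos, code))
        pos += 2
        if code == EOI:
            break
        if code in STANDALONE:
            continue
        # The segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack_from(">H", data, pos)
        pos += length
        if code == SOS:
            pos = skip_entropy_coded(data, pos, found)
    return found
```

   Marker segments are skipped using their length fields, so byte
   patterns that resemble markers inside table data are not misread;
   only the entropy-coded segment is searched byte by byte.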

   Compressed data is represented in one of three formats: the
   interchange format, the abbreviated format, or the table-
   specification format.  The interchange format contains definitions
   for all the tables used by the entropy-coded segments, while the
   abbreviated format might omit some, assuming they were defined
   out-of-band or by a "previous" image.
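
   As a rough illustration of the distinction between the three
   formats, the sketch below guesses the format from the set of marker
   codes present in a stream.  The function name guess_format is
   hypothetical, and the check is only a heuristic: it assumes a
   DCT-based, Huffman-coded stream and tests for the presence of table
   segments, not whether every table referenced by a scan is actually
   defined.

```python
# Marker codes (the byte following 0xFF) from the JPEG standard.
DQT, DHT, SOS = 0xDB, 0xC4, 0xDA
# SOF0-SOF15, leaving out DHT (0xC4), JPG (0xC8) and DAC (0xCC),
# which share the 0xC0-0xCF range but are not frame headers.
SOF = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def guess_format(marker_codes):
    """Heuristically name the JPEG format implied by the markers present."""
    codes = set(marker_codes)
    has_image = bool(codes & SOF) and SOS in codes
    has_tables = DQT in codes or DHT in codes
    if not has_image:
        # Tables with no frame or scan: a table-specification stream.
        return "table-specification" if has_tables else "unknown"
    if DQT in codes and DHT in codes:
        # Best guess only: both table types are present at least once.
        return "interchange"
    # An image with some table type missing must rely on earlier or
    # out-of-band definitions.
    return "abbreviated"
```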

   The JPEG standard does not define the meaning or format of the
   components that comprise the image.  Attributes like the color space
   and pixel aspect ratio must be specified out-of-band with respect to
   the JPEG bit stream.  The JPEG File Interchange Format (JFIF) [4] is
   a de facto standard that provides this extra information using an
   application marker segment (APP0).  Note that a JFIF file is simply a
   JPEG interchange format image along with the APP0 segment.  In the
   case of video, additional parameters must be defined out-of-band
   (e.g., frame rate and interlaced vs. non-interlaced scanning).
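
   To make the role of the APP0 segment concrete, the following sketch
   reads the fixed part of a JFIF header.  The names JfifHeader and
   parse_jfif_app0 are hypothetical.  The field layout is taken from
   the JFIF specification [4], not from this document: a 5-byte
   "JFIF\0" identifier, a two-byte version, a units byte, and two
   16-bit density values.  When units is 0, the densities give only the
   pixel aspect ratio.

```python
import struct
from typing import NamedTuple, Tuple


class JfifHeader(NamedTuple):
    version: Tuple[int, int]  # (major, minor)
    units: int                # 0 = aspect ratio only, 1 = dots/inch, 2 = dots/cm
    x_density: int
    y_density: int


def parse_jfif_app0(data: bytes) -> JfifHeader:
    """Parse the APP0 segment that directly follows SOI in a JFIF file."""
    if data[:4] != b"\xff\xd8\xff\xe0":
        raise ValueError("stream does not start with SOI followed by APP0")
    # Big-endian: length, identifier, version major/minor, units, densities.
    length, ident, major, minor, units, x_density, y_density = (
        struct.unpack_from(">H5sBBBHH", data, 4)
    )
    if ident != b"JFIF\x00":
        raise ValueError("APP0 segment does not carry a JFIF identifier")
    if length < 16:
        raise ValueError("APP0 segment is too short for a JFIF header")
    return JfifHeader((major, minor), units, x_density, y_density)
```

   The thumbnail fields that follow the densities are ignored here, and
   video parameters such as frame rate are not carried in this segment.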

   While the JPEG standard provides a rich set of algorithms for
   flexible compression, cost-effective hardware implementations of the
   full standard have not appeared.  Instead, most hardware JPEG video
   codecs implement only a subset of the sequential DCT mode of
   operation.  Typically, marker segments are interpreted in software
   (which "re-programs" the hardware) and the hardware is presented with
   a single, interleaved entropy-coded scan represented in the YUV color
   space.

2.  JPEG Over RTP

   To maximize interoperability among hardware-based codecs, we assume
   the sequential DCT operating mode [1,Annex F] and restrict the set of
   predefined RTP/JPEG "type codes" (defined below) to single-scan,
   interleaved images.  While this is more restrictive than even



Berc, et. al.               Standards Track