NWG/RFC  744                                   JS5 8-Jan-78 21:59  42857
MARS - A Message Archiving & Retrieval Service



II.       Using MARS
          ----------

A.  Message Indexing
    ----------------

For each message, a vector of parsed tokens is created.  The tokens
are grouped by the message-field in which they occurred, and the
Datacomputer uses them as "indexes", i.e., values of inverted fields.

The Filer indexes these tokens essentially without analysis, with the
following exceptions:

   --  Each distinguishable section of the message is indexed
       separately; each header line is a separate inversion domain, as
       is the body of the message.

   --  The header lines which contain ARPANET addresses are analyzed in
       order to index separately on mailbox and host.

   --  The date-field is parsed and converted to the standard Tenex
       internal date/time format, which is better adapted for
       less-than/greater-than comparisons, as in retrievals which
       specify a date range.

   --  One-character words in both the subject-field and the
       message-text field are arbitrarily discarded.

   --  Two-character words in the message-text field are arbitrarily
       discarded.

   --  Hyphenated phrases, i.e., words bound together by hyphens, are
       retained intact.

   --  All message formats which conform to RFC 733 standards are
       accommodated.  The minimum requirements are:  a date-field, a
       from-field, and a blank line between the message-header and
       message-body.
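
The rules above can be illustrated with a minimal sketch.  It is not
the Filer's actual code: every name in it (index_message, tokens,
date_key, ADDRESS_FIELDS) is hypothetical, the date layout
"8 Jan 1978 21:59" is an assumed simplification of RFC 733 dates, a
plain minute count stands in for the Tenex internal date/time format,
folding tokens to lower case is an assumption, and continued header
lines are not handled.

```python
import re
from datetime import datetime

# Assumption: which header lines carry ARPANET addresses.
ADDRESS_FIELDS = {"from", "to", "cc"}

# A word, or several words bound together by hyphens (kept intact).
WORD = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# "mailbox@host" or "mailbox at host".
ADDRESS = re.compile(r"([^\s@,]+)\s*(?:@|\sat\s)\s*([^\s@,]+)")


def tokens(text, min_len):
    """Return the words of text, discarding those shorter than min_len."""
    return [w for w in WORD.findall(text.lower()) if len(w) >= min_len]


def date_key(value):
    """Convert an assumed 'D Mon YYYY HH:MM' date to a sortable integer.

    This is a stand-in for the Tenex internal format; it only preserves
    less-than/greater-than ordering.
    """
    dt = datetime.strptime(value, "%d %b %Y %H:%M")
    return dt.toordinal() * 1440 + dt.hour * 60 + dt.minute


def index_message(raw):
    """Build a mapping from inversion domain to index values."""
    header, blank, body = raw.partition("\n\n")
    if not blank:
        raise ValueError("no blank line between header and body")

    fields = {}
    for line in header.splitlines():
        name, colon, value = line.partition(":")
        if colon:
            fields[name.strip().lower()] = value.strip()
    for required in ("date", "from"):
        if required not in fields:
            raise ValueError("missing %s-field" % required)

    index = {}
    for name, value in fields.items():
        if name == "date":
            index["date"] = [date_key(value)]
        elif name in ADDRESS_FIELDS:
            # Index separately on mailbox and host.
            pairs = ADDRESS.findall(value)
            index[name + ".mailbox"] = [m.lower() for m, _ in pairs]
            index[name + ".host"] = [h.lower() for _, h in pairs]
        else:
            # One-character words are dropped from the subject only.
            index[name] = tokens(value, 2 if name == "subject" else 1)
    # One- and two-character words are dropped from the message text.
    index["text"] = tokens(body, 3)
    return index
```

Each header line and the message body end up as separate keys, so a
retrieval could match a word in the subject without matching the same
word in the text.  Because date_key returns an integer, a date-range
retrieval reduces to two integer comparisons.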