MARS - a Message Archiving and Retrieval Service
NWG/RFC 744 JS5 8-Jan-78 21:59 42857
II. Using MARS
----------
A. Message Indexing
----------------
For each message, a vector of parsed tokens is created. The parsed
tokens are grouped by the message-field in which they occurred, so
that the Datacomputer can use them as "indexes", i.e., values of
inverted fields.
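As a rough illustration of what "values of inverted fields" means, the
following Python sketch groups tokens by message-field and builds an
inverted mapping from (field, token) to message identifiers. The name
build_index, the sample messages, and the whitespace tokenizer are all
hypothetical; nothing here reflects the Filer's actual parser or the
Datacomputer's storage format.

```python
from collections import defaultdict

def build_index(messages):
    """Map (field, token) -> set of ids of messages with that token in that field.

    `messages` is a dict: message id -> dict of field name -> raw text.
    Splitting on whitespace is an assumption, not the Filer's actual parser.
    """
    index = defaultdict(set)
    for msg_id, fields in messages.items():
        for field, text in fields.items():
            for token in text.lower().split():
                # Each field is its own inversion domain, so the field name
                # is part of the key.
                index[(field, token)].add(msg_id)
    return index

# Hypothetical sample data for illustration only.
messages = {
    1: {"subject": "Archive retrieval", "body": "testing the archive"},
    2: {"subject": "Retrieval speed", "body": "archive is slow"},
}
index = build_index(messages)
```

Because the field name is part of the key, a lookup such as
index[("subject", "retrieval")] finds only messages whose subject
contains the word, independent of what appears in the body.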
The Filer indexes the tokens essentially without analysis, except for
the following:
-- Each distinguishable section of the message is indexed
separately; each header line is a separate inversion domain, as
is the body of the message.
-- The header lines which contain ARPANET addresses are analyzed in
order to index separately on mailbox and host.
-- The date-field is parsed and converted to the standard Tenex
internal date/time format, which is better adapted for
less-than/greater-than comparisons, as in retrievals which
specify a date range.
-- One-character words in both the subject-field and the
message-text field are arbitrarily discarded.
-- Two-character words in the message-text field are arbitrarily
discarded.
-- Hyphenated phrases, i.e., words bound together by hyphens, are
retained intact.
-- All message formats which conform to RFC 733 standards are
accommodated. The minimum requirements are: a date-field, a
from-field, and a blank line between the message-header and
message-body.
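
The rules above can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the Filer's implementation:
every function name is hypothetical, the token pattern and the date
input format are guesses, and a sortable integer stands in for the
Tenex internal date/time format.

```python
import re
from datetime import datetime

def index_words(text, min_len):
    """Return tokens of at least `min_len` characters, keeping hyphenated phrases intact."""
    # A token is a run of letters/digits, optionally joined by hyphens (assumed pattern).
    tokens = re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)
    return [t.lower() for t in tokens if len(t) >= min_len]

def subject_tokens(text):
    # One-character words are discarded in the subject-field.
    return index_words(text, min_len=2)

def body_tokens(text):
    # One- and two-character words are discarded in the message-text field.
    return index_words(text, min_len=3)

def split_address(address):
    """Split 'mailbox@host' so that mailbox and host can be indexed separately."""
    mailbox, _, host = address.strip().partition("@")
    return mailbox.lower(), host.lower()

def date_key(date_text, fmt="%d %b %Y %H:%M"):
    """Convert a date string to an integer that sorts chronologically."""
    # The input format is an assumption; RFC 733 permits several date syntaxes.
    return int(datetime.strptime(date_text, fmt).strftime("%Y%m%d%H%M"))

def meets_minimum(raw_message):
    """Check for a date-field, a from-field, and a blank line before the body."""
    header, sep, _body = raw_message.partition("\n\n")
    names = {line.split(":", 1)[0].strip().lower()
             for line in header.splitlines() if ":" in line}
    return bool(sep) and {"date", "from"} <= names
```

Converting dates to a single ordered value is what makes a date-range
retrieval a pair of less-than/greater-than comparisons, for example
date_key("8 Jan 1978 21:59") < date_key("9 Jan 1978 08:00").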