The Architecture of the Common Indexing Protocol (CIP)
RFC 2651 The CIP Architecture August 1999
The indexing part of Whois++ is integrated with the data access
protocol. The goal in designing CIP is to extract the indexing
portion of Whois++, while abstracting the index objects to apply more
broadly to information retrieval. In addition, another kind of
technology reuse has been undertaken by converting the ad-hoc data
representations used by Whois++ into structures based on the MIME
specification for structured Internet mail.
Whois++ used a version number field in centroid objects to facilitate
future growth. The initial version was "1". Version 1 of CIP (then
embedded in Whois++, and not referred to separately as CIP) had
support for only ISO-8859-1 characters, and for only the centroid
index object type.
Version 2 of the Whois++ centroid was used in the Digger software by
Bunyip Information Systems to notify recipients that the centroid
carried extra character set information. Digger's centroids can carry
UTF-8 encoded 16-bit Unicode characters, or ISO-8859-1 characters,
determined by a field in the headers.
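As an illustration only, the character set rules described above for
versions 1 and 2 might be sketched as follows. The function name and
the "charset" parameter are hypothetical; this document does not name
the header field that Digger used, so the parameter merely stands in
for whatever value that field carried.

```python
def decode_centroid_text(version: int, raw: bytes,
                         charset: str = "ISO-8859-1") -> str:
    """Decode centroid text under the version 1 and 2 rules (sketch)."""
    if version == 1:
        # Version 1 supported ISO-8859-1 only; no charset field applies.
        return raw.decode("iso-8859-1")
    if version == 2:
        # Version 2 (Digger) allowed UTF-8 or ISO-8859-1, chosen by a
        # header field, represented here by the hypothetical `charset`.
        if charset.upper() not in ("UTF-8", "ISO-8859-1"):
            raise ValueError(f"unsupported charset: {charset}")
        return raw.decode(charset)
    raise ValueError(f"not a version 1 or 2 centroid: {version}")
```

Version 3 objects are deliberately rejected here, since their
representation is not covered by the text above.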
This specification is for CIP version 3. Version 3 is a major
overhaul to the protocol. However, by use of a short negotiation
sequence, CIP version 3 servers can interoperate with earlier servers
in an index-passing mesh.
For unclear terms the reader is referred to the glossary in Appendix
A.
1.2 CIP's place in the Information Retrieval world
CIP facilitates query routing. CIP is a protocol used between servers
in a network to pass hints which make data access by clients at a
later date more efficient. Query routing is the act of redirecting
and replicating queries through a distributed database system towards
the servers holding the actual results via reference to indexing
information.
CIP is a "backend" protocol -- it is implemented in and "spoken" only
among network servers. These same servers must also speak some kind
of data access protocol to communicate with clients. During query
resolution in the native protocol implementation, the server will
refer to the indexing information collected by the CIP implementation
for guidance on how to route the query.
Data access protocols used with CIP must have some provision for
control information in the form of a referral. The syntax and
semantics of these referrals are outside the scope of this
specification.
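The query routing idea described in this section can be sketched
roughly as follows. This is a minimal illustration under stated
assumptions, not part of CIP: the IndexEntry structure, the
route_query function, and the host names are all hypothetical. The
token set stands in for whatever index object a server has received,
and the returned host names stand in for referrals, whose real syntax
belongs to the data access protocol.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Hypothetical summary of one peer server's holdings."""
    server: str                          # host a referral would point to
    tokens: set[str] = field(default_factory=set)


def route_query(terms: list[str], index: list[IndexEntry]) -> list[str]:
    """Return the servers whose index summary contains every query term."""
    wanted = {t.lower() for t in terms}
    return [entry.server for entry in index if wanted <= entry.tokens]


# Index information previously collected from two peers (made-up data).
index = [
    IndexEntry("whois.example.org", {"allen", "mealling", "cip"}),
    IndexEntry("dir.example.net", {"smith", "cip"}),
]

print(route_query(["CIP", "Allen"], index))   # ['whois.example.org']
```

A server with no matching entry returns an empty list, meaning that
the collected index offers no hint about where to send the query.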
Allen & Mealling Standards Track