Building Directories from DNS: Experiences from WWWSeeker
RFC 2517 Building Directories from DNS February 1999
guessable name.
There are two major problems here. As the number of assigned names
increases, it becomes more difficult to get an easily guessable name.
Also, the TLD must be guessed as well as the name. While many users
just guess ".COM" as the "default" TLD today, there are many two-
letter country code top-level domains in current use as well as other
gTLDs (.NET, .ORG, and possibly .EDU) with the prospect of additional
gTLDs in the future. As the number of TLDs in general use increases,
guessing gets more difficult.
Between July 1996 and our shutdown in March 1998, the InterNIC
Directory and Database Services project maintained the Netfind search
engine [1] and the associated database that maps organization
information to domain names. This database thus acted as the type of
Internet directory that associates company names with domain names.
We also built WWWSeeker, a system that used the Netfind database to
find web sites associated with a given organization. The experience
gained from maintaining and growing this database provides valuable
insight into the issues of providing a directory service. We present
it here to allow future implementors to avoid some of the blind
alleys that we have already explored.
2. Directory Population
2.1 What to do?
There are two issues in populating a directory: finding all the
domain names (building the skeleton) and associating those domains
with entities (adding the meat). These two issues are discussed
below.
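
The two steps can be pictured with a minimal Python sketch. It is an
illustration only, not the Netfind or WWWSeeker implementation: the
names Directory, add_domain, associate, find, and load_bulk are
hypothetical, and the input format (one "domain<TAB>organization"
pair per line) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy directory: domain name -> organization text (or None)."""
    entries: dict = field(default_factory=dict)

    def add_domain(self, domain):
        # "Building the skeleton": record that the domain exists.
        self.entries.setdefault(domain.lower(), None)

    def associate(self, domain, organization):
        # "Adding the meat": attach entity information to a domain.
        self.entries[domain.lower()] = organization

    def find(self, keyword):
        # Return the domains whose organization text mentions the keyword.
        keyword = keyword.lower()
        return sorted(d for d, org in self.entries.items()
                      if org and keyword in org.lower())


def load_bulk(directory, lines):
    # Assumed format: "domain<TAB>organization", organization optional.
    # Blank lines and lines starting with "#" are skipped.
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        domain, _, org = line.partition("\t")
        directory.add_domain(domain)
        if org:
            directory.associate(domain, org.strip())
```

A domain listed without organization text stays in the skeleton with
no entity attached, so it can be filled in later from another source.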
2.2 Building the skeleton
In "building the skeleton", it is popular to suggest using a variant
of a "tree walk" to determine the domains that need to be added to
the directory. Our experience is that this is neither a reasonable
nor an efficient proposal for maintaining such a directory. Except
for some infrequent and long-standing DNS surveys [5], DNS "tree
walks" tend to be discouraged by the Internet community, especially
given that the frequency of DNS changes would require a new tree walk
monthly (if not more often). Instead, our experience has shown that
data on allocated DNS domains can usually be retrieved in bulk
fashion with FTP, HTTP, or Gopher (we have used each of these for
particular TLDs). This has the added advantage of both "building the
skeleton" and "adding the meat" at the same time. Our favorite
method for finding a server that has allocated DNS domain information
is to start with the list maintained at
Moats & Huber Informational