"Search engine" and "Web directory" are two
different search services available to the Web community, although
they are often confused. Search engines have indices that
are built up by robots or crawlers, whereas Web directories build up
their indices through human editors. Many search engines and
directories contain both a computer-generated index and a
human-generated index, and are referred to as hybrids.
Google, Inktomi, AltaVista, AlltheWeb and the
like are all search engines. These search engines run
programs known as robots, crawlers or spiders that have the
following functions: (1) to locate Web pages, (2) to read the contents
of the Web pages and (3) to report their findings back to the search
engine's indices or databases. Many search engines update their index
on either a bi-monthly or a monthly basis. When Web searchers use a
search engine to locate Web sites that are relevant to a keyword
search, they are searching the search engine's index. A search engine
with a larger and more up-to-date index therefore gives a better
representation of the information available on the Web.
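
As a rough illustration of the three robot functions described above, the following Python sketch crawls a tiny in-memory stand-in for the Web and builds a keyword index from it. The TOY_WEB data, the example URLs and the function names crawl and search are assumptions made for this illustration; real crawlers fetch pages over the network and are far more elaborate.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("search engines index pages",
                   ["a.example/about", "b.example/"]),
    "a.example/about": ("about web directories", ["a.example/"]),
    "b.example/": ("human editors review sites", []),
}

def crawl(seed, web):
    """Breadth-first crawl from `seed`; returns an inverted index
    mapping each word to the set of URLs whose text contains it."""
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # link points outside the toy Web
        text, links = web[url]              # (2) read the page contents
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)  # (3) report to the index
        for link in links:                  # (1) locate further pages
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

def search(index, keyword):
    """A keyword search consults only the index, never the pages."""
    return sorted(index.get(keyword.lower(), set()))

index = crawl("a.example/", TOY_WEB)
print(search(index, "directories"))
```

Note that search never touches TOY_WEB itself: a page that changed after the crawl would still be reported as it was when the robot last read it, which is why the freshness of the index matters.
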
Yahoo!, Open Directory Project (dmoz.org), Gipsy and the like are all
Web directories. These directories use human editors to
review sites that are submitted for inclusion in the directory.
Directories, unlike search engines, use a hierarchical tree structure
to organize their database. Another common distinction is that a
directory tends to list Web sites (root directory of a site or
homepage) whereas a search engine will list Web pages (individual
pages of a Web site). Because sites are added manually, a directory
covers far fewer sites than a crawler can reach, so directories often
have to supplement their search results with those of a search engine
partner to increase the relevance of the results they return.
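
The hierarchical tree structure of a directory can be sketched in Python as nested categories, each of which may hold site listings. The category names, the "_sites" key, the example URL and the browse function are all hypothetical and chosen only for this illustration; they do not describe how any particular directory stores its data.

```python
# Hypothetical directory: nested categories, with site listings
# (homepages, not individual pages) stored under the "_sites" key.
DIRECTORY = {
    "Computers": {
        "Internet": {
            "Searching": {
                "_sites": [
                    {"url": "http://www.example.com/",
                     "title": "Example Search Guide"},
                ],
            },
        },
    },
}

def browse(tree, path):
    """Walk down the category path and return
    (subcategory names, sites listed directly in that category)."""
    node = tree
    for category in path:
        node = node[category]
    subcategories = sorted(name for name in node if name != "_sites")
    return subcategories, node.get("_sites", [])

subcategories, sites = browse(DIRECTORY, ["Computers", "Internet"])
print(subcategories)
```

Each step down the path narrows the topic, which is how a visitor is expected to reach a listing in a directory.
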
Search engines all have their own confidential
algorithms that determine which Web pages are to be shown first. The
algorithms assign weights to certain components or variables that they
find within a page.
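
Since real ranking algorithms are confidential, the following Python sketch only illustrates the general idea of weighting page components. The component names (title, headings, body), the weight values and the sample pages are invented for this example and do not reflect any actual search engine.

```python
# Hypothetical weights; real engines keep theirs confidential.
WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}

def score(page, keyword):
    """Sum, over the page components, of weight * keyword occurrences."""
    keyword = keyword.lower()
    total = 0.0
    for component, weight in WEIGHTS.items():
        words = page.get(component, "").lower().split()
        total += weight * words.count(keyword)
    return total

def rank(pages, keyword):
    """Return the pages ordered from highest to lowest score."""
    return sorted(pages, key=lambda page: score(page, keyword), reverse=True)

PAGES = [
    {"url": "a.example/", "title": "Home", "headings": "Welcome",
     "body": "directory directory"},
    {"url": "b.example/", "title": "Web directory", "headings": "",
     "body": "listing"},
]

for page in rank(PAGES, "directory"):
    print(page["url"], score(page, "directory"))
```

Under these made-up weights, a single occurrence of the keyword in the title (score 3.0) outranks two occurrences in the body (score 2.0), which shows how the choice of weights decides which page is shown first.
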
Web directories should be browsed through their
hierarchical structure and not searched. Human editors assign titles and
descriptions that might not appear in the source code of the page, so a
keyword taken from the page itself may not match its listing.
Also, directories often charge a submission fee for adding a site.
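
A small Python sketch can show why a keyword search inside a directory may miss a site: the search sees only the editor-written title and description, not the page itself. The listing, the page source and both function names are hypothetical examples.

```python
# Hypothetical directory listing written by a human editor.
listing = {
    "url": "http://www.example.com/",
    "title": "Example Hiking Club",
    "description": "Regional club organizing weekend trail walks.",
}

# Hypothetical source code of the listed page.
page_source = ("<html><title>EHC</title>"
               "<body>Join our mountaineering trips</body></html>")

def directory_match(entry, keyword):
    """A directory search sees only the editor's title and description."""
    text = (entry["title"] + " " + entry["description"]).lower()
    return keyword.lower() in text

def page_match(source, keyword):
    """A search engine's index is built from the page source itself."""
    return keyword.lower() in source.lower()

for word in ("hiking", "mountaineering"):
    print(word, directory_match(listing, word), page_match(page_source, word))
```

Here "hiking" matches only the directory listing and "mountaineering" matches only the page, so the same query can succeed in one service and fail in the other.
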