SolveYourProblem
Search Engine Optimization (SEO) Series
How Do Search Engine Spiders Work?
There are hundreds of search engines available
today, but some are far more complex than others. This article
will give you an overview of how some of the most popular
ones work.
Let’s start with a smaller engine: InfoSeek. It indexes only
about 200 words of your web page, so it’s important to make
sure that you have meta tags on your site, and that the most
important things are listed first. The information you put
in your meta tags will be used to display a description of
your site, and most meta tags can contain about 200 characters
of text. The keywords meta tag, however, can have up to 1,000
characters.
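
As a rough illustration of the limits just mentioned, here is a minimal
Python sketch that reads the description and keywords meta tags from a
page and reports whether each fits. The names MetaTagCollector and
check_meta_lengths and the sample page are hypothetical, and the 200 and
1,000 character figures are simply the ones quoted above, not an
official specification from any search engine.

```python
from html.parser import HTMLParser

# Limits quoted in the article; treat them as rough guidance, not a spec.
DESCRIPTION_LIMIT = 200
KEYWORDS_LIMIT = 1000


class MetaTagCollector(HTMLParser):
    """Collects the content of <meta name="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        if name:
            self.meta[name] = attr_map.get("content") or ""


def check_meta_lengths(html):
    """Return {tag name: (content length, True if within the limit)}."""
    collector = MetaTagCollector()
    collector.feed(html)
    limits = {"description": DESCRIPTION_LIMIT, "keywords": KEYWORDS_LIMIT}
    report = {}
    for name, limit in limits.items():
        content = collector.meta.get(name, "")
        report[name] = (len(content), len(content) <= limit)
    return report


# Hypothetical sample page, used only to exercise the checker.
sample_page = """<html><head>
<meta name="description" content="Hand-made oak furniture, built to order.">
<meta name="keywords" content="oak furniture, hand-made tables, custom chairs">
</head><body></body></html>"""

report = check_meta_lengths(sample_page)
for tag_name, (length, ok) in report.items():
    print(tag_name, length, "ok" if ok else "too long")
```

A missing tag is reported with a length of zero, which is a cue to add
one rather than a sign that the page is fine.
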
These simple rules are important to keep in mind for all
search engines. The more important a piece of information is,
the closer it should be to the beginning of your meta tags,
or even the beginning of your site’s content. Many search
engines ignore meta tags entirely, so it is important
that you have the same information in your body that you
have in your meta tags (although you obviously cannot simply
paste in long lists of keywords, as this would be detrimental
to your site’s content).
The AltaVista search engine will send Scooter, its spider,
to check out your entire site. Scooter can take as long as
three months to spider and fully index your site, whereas the
average spider takes only 6-8 weeks. Scooter will normally spider
somewhere between two and ten pages from your site each week.
This means that the longer your web site lasts, the
better it will be indexed, which is an example of how search
engines apply a Darwinian, survival-of-the-fittest principle
to the sites they index.
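
The crawl rate quoted above lends itself to a quick back-of-the-envelope
estimate. The sketch below assumes the spider keeps a steady pace of
between two and ten pages a week and never revisits a page; the function
names weeks_to_index and indexing_window are hypothetical, and real
spiders are unlikely to be this regular.

```python
import math


def weeks_to_index(total_pages, pages_per_week):
    """Weeks needed to visit every page once at a constant weekly rate."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if pages_per_week <= 0:
        raise ValueError("pages_per_week must be positive")
    return math.ceil(total_pages / pages_per_week)


def indexing_window(total_pages, slow_rate=2, fast_rate=10):
    """Return (best case, worst case) in weeks for the quoted crawl rates."""
    best = weeks_to_index(total_pages, fast_rate)
    worst = weeks_to_index(total_pages, slow_rate)
    return best, worst


best, worst = indexing_window(60)
print(f"60 pages: between {best} and {worst} weeks")
```

For a 60-page site this works out to 6 weeks at ten pages a week and 30
weeks at two pages a week, which shows why a larger site benefits from
staying online for a long time.
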
Excite
used to be a search powerhouse, but has now been dropped
as the provider of AOL and Netscape search, so it’s
less important than it once was. The algorithm it uses to
determine keyword relevance is very complicated: it indexes
your pages and then attempts to summarize them by selecting
only the most relevant sentences. Expect to have your pages
reviewed roughly once every two weeks. Keep in mind, though,
that meta tags have no bearing on rankings in Excite,
even though it will use your description tag
as long as the words are relevant to your pages’ content.
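
Excite’s actual algorithm is not described here, so the following is
only a toy illustration of the general idea of summarizing a page by
keeping its most relevant sentences. It assumes that "relevant" means
"uses the words that occur most often on the page"; the stop-word list,
the scoring rule, and the function names are all assumptions made for
this sketch.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "are", "for", "on", "with", "our", "we", "it"}


def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Lower-case words with the stop words removed."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [word for word in words if word not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    frequencies = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Average page-wide frequency of the words in this sentence.
        return sum(frequencies[w] for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]


page_text = ("Oak tables are built by hand. "
             "We ship oak tables worldwide. "
             "Contact us today.")
print(summarize(page_text, max_sentences=1))
```

The practical point for a site owner is the same whatever the real
scoring rule is: sentences that state your main topic plainly are the
ones most likely to survive into a summary.
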
Let’s move on to Lycos. Lycos
has fully integrated the Open Directory Project (ODP)
into their mainstream results pages,
and they also use search results from AllTheWeb. Lycos also
runs click-throughs to their sister site HotBot. Lycos is
one of the harder search engines to understand, as their
submission pages say one thing but then they index your site
in a completely different way. As a general rule of thumb,
your site will be indexed in Lycos in due time as long as
you get indexed in ODP and AllTheWeb.
Even
though WebCrawler is owned by Excite, it still has its
own search engine and indexer. If you happen to be listed
with WebCrawler, you should try to stay listed with them,
as it isn’t the easiest search engine to get listed with.
Its hit-and-miss standards, combined with its sporadic indexing
methods, make the submission process tough, although not
impossible.
The biggest player is, of course, Google, which uses a page
ranking system as the central basis of its index. It was
once nearly impossible to manipulate this page ranking system
to drive up your rankings, but people quickly figured out
that the more links they could generate to their site on
the rest of the net, the better Google ranked them. Google
is not thought to be using context-sensitive rankings. Context-sensitive
information is used at Yahoo, Looksmart and the ODP, however,
and Google regularly spiders those sites when it re-indexes
its own database.
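
To see why generating more inbound links pushes a page up, it helps to
look at a simplified link-based ranking. The sketch below follows the
commonly published textbook form of PageRank (each page shares its score
among the pages it links to, with a damping factor); it is not Google’s
production system, and the function name link_rank, the damping value of
0.85, and the four-page example graph are assumptions for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated score redistribution.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b and d all link to c; c links to a.
example_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = link_rank(example_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this example page c, with three inbound links, ends up with the
highest score, and page a benefits from being linked to by c. Pages b
and d, which nobody links to, stay at the baseline.
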
MSN
is another important search engine. The holy trinity
of search engines at the moment is Google, Yahoo!, and MSN.
These three search engines combine to provide you with the
vast majority of the traffic that you will receive from search
engines. MSN will generally be the first search engine to
index your site and it will almost certainly list the most
pages the fastest.
Although no one can tell you exactly when you will be indexed
on any search engine, it’s best to check back at least weekly.
Whatever you do, though, don’t re-submit your site more often
than every two months or so; submitting more frequently than
that might keep you from being indexed at all.