04-19-2004, 07:53 PM   #1
jerrywin5
Indexing duplicate descriptions and keywords causing false search results

I am working on a search engine for a niche market, so I am indexing multiple Web sites. Unfortunately, some Web developers use the same description and keywords on every page of their site. As a result, the search engine returns false positives: a page can match the search criteria through its description and/or keywords without containing any content relevant to the search. The only exception is the site's default page, where the description and keywords legitimately describe the site as a whole rather than a single page.

When indexing a site, I would like the spider to compare the description and keywords on the default page against those on a second page. If they are the same, the spider should not index the description or keywords on any page other than the default page.
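To make the idea concrete, here is a rough sketch in Python using only the standard library. It is not tied to any particular spider: the names `MetaExtractor`, `extract_meta`, and `fields_to_index` are made up for illustration, and it assumes the spider can hand over the raw HTML of the default page and of the page being indexed. It also goes slightly further than the single second-page check described above by comparing each page against the default page, field by field, which is an assumption about what would work best rather than a requirement.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the description and keywords meta tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {"description": "", "keywords": ""}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        name = (attr.get("name") or "").strip().lower()
        if name in self.meta:
            self.meta[name] = attr.get("content") or ""


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def extract_meta(html):
    """Return the normalized description and keywords found in the page."""
    parser = MetaExtractor()
    parser.feed(html)
    return {field: normalize(value) for field, value in parser.meta.items()}


def fields_to_index(default_html, page_html):
    """Return only the meta fields of page_html that differ from the default page.

    Fields that are empty or identical to the default page are dropped,
    so they are never added to the index for this page.
    """
    site_meta = extract_meta(default_html)
    page_meta = extract_meta(page_html)
    return {
        field: value
        for field, value in page_meta.items()
        if value and value != site_meta[field]
    }
```

The default page itself would still be indexed with its full description and keywords. For every other page, the spider would index only what `fields_to_index` returns, plus the page body as usual.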

In addition to greatly increasing relevancy, this would also decrease the size of the database somewhat and allow the search engine to return results a bit faster.

This issue has caused me to stop all indexing, because the spider is retrieving so much useless content from the many Web sites that are set up this way.

Any advice would be greatly appreciated.