PhpDig.net

Old 04-19-2004, 07:53 PM   #1
jerrywin5
Orange Mole
 
Join Date: Mar 2004
Posts: 48
Indexing duplicate descriptions and keywords causing false search results

I am working on a search engine for a niche market. Therefore, I am indexing multiple Web sites. Unfortunately, some Web developers use the same description and keywords on every page in their site. This causes the search engine to return false results as pages may meet search criteria via the description and/or keywords but do not contain any content relevant to the search. The only exception to this would be the default page for the site as the description and keywords indicate the content for the site rather than for the page.

When indexing a site, I would like to have the spider compare the description and the keywords on the default page against a description and keywords on a second page and if they are the same, not index the description or the keywords in pages other than the default page.
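The comparison described above could be sketched roughly as follows. This is a standalone Python illustration, not PhpDig code: `MetaExtractor`, `normalize`, and `duplicated_fields` are hypothetical names, and the sketch assumes the spider already has the HTML of the default page and of one other page from the same site.

```python
from html.parser import HTMLParser

FIELDS = ("description", "keywords")


class MetaExtractor(HTMLParser):
    """Collects the content of <meta name="description"> and <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.meta = {field: "" for field in FIELDS}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in self.meta:
            self.meta[name] = attr.get("content") or ""


def extract_meta(html):
    parser = MetaExtractor()
    parser.feed(html)
    return parser.meta


def normalize(text):
    # Ignore case and whitespace differences so trivial edits do not hide a duplicate.
    return " ".join(text.lower().split())


def duplicated_fields(default_html, second_html):
    """Return the meta fields the second page copies from the default page."""
    first = extract_meta(default_html)
    second = extract_meta(second_html)
    return [
        field
        for field in FIELDS
        if normalize(first[field])
        and normalize(first[field]) == normalize(second[field])
    ]
```

Any field this returns would be a candidate for skipping on pages other than the default page. Comparing against a single second page is the simplest form of the check; sampling a few more pages would make it less likely to misjudge a site.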

In addition to greatly increasing relevancy, this would also decrease the size of the database somewhat and allow the search engine to return results a bit faster.

This issue has caused me to stop all indexing as the spider is retrieving so much useless content due to the poor design of so many Web sites.

Any advice would be greatly appreciated.
Old 04-20-2004, 04:25 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. I'm not sure comparing meta description and keyword tags across pages within a site would be an efficacious process. Rather, it might be better to just exclude such tag information as shown in this thread.
Old 04-23-2004, 11:31 AM   #3
jerrywin5
Orange Mole
 
Join Date: Mar 2004
Posts: 48
I am indexing multiple Web sites that other Web developers have created. Where a developer has provided a description and/or keywords on each page that relate to that page, I want to include the keywords and description in the index. Only when a site's keywords or descriptions are not related to the current page do I want to leave them out of the index.

Sometimes the keywords and description are of value. Sometimes they are not. I would like to implement logic to evaluate when to index this information and when not to. Thus, the value of the indexed information will be greatly improved.
Old 05-04-2004, 08:27 AM   #4
jerrywin5
Orange Mole
 
Join Date: Mar 2004
Posts: 48
Once the spider makes an initial check, it could set a field in the sites table and then include or exclude the description and/or keywords each time it indexes the site, based on the value of that field. It could also recheck the setting after every x indexes of the site. Ignoring the description and keywords of all sites means not taking advantage of their value. Not ignoring the description and/or keywords when they are duplicated throughout the site means users will get bogus search results. So, there needs to be some way to handle this issue.
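A minimal sketch of that per-site flag, written in Python with an in-memory SQLite table standing in for the sites table. The column names `skip_meta` and `runs_since_check`, the constant `RECHECK_EVERY`, and the `detect_duplicates` callback are all assumptions made for illustration; none of them come from the PhpDig schema.

```python
import sqlite3

RECHECK_EVERY = 5  # hypothetical value for "x": re-test the site after this many runs


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE sites ("
        " site_id INTEGER PRIMARY KEY,"
        " skip_meta INTEGER,"  # NULL = never checked, 0 = index meta, 1 = skip meta
        " runs_since_check INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def should_skip_meta(db, site_id, detect_duplicates):
    """Return True when description/keywords should be ignored for this run.

    detect_duplicates is a callable that returns True if the site reuses
    its meta tags across pages; it is only invoked when a check is due.
    """
    skip, runs = db.execute(
        "SELECT skip_meta, runs_since_check FROM sites WHERE site_id = ?",
        (site_id,),
    ).fetchone()
    if skip is None or runs >= RECHECK_EVERY:
        # First visit, or the stored decision is stale: test the site again.
        skip, runs = int(detect_duplicates()), 0
    db.execute(
        "UPDATE sites SET skip_meta = ?, runs_since_check = ? WHERE site_id = ?",
        (skip, runs + 1, site_id),
    )
    return bool(skip)
```

With `RECHECK_EVERY = 5`, the check runs on the first indexing of a site and again on the sixth, eleventh, and so on; the runs in between reuse the stored decision.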