Spider indexes cgi pages but not their links!?
Hi!
When I run the spider on a site, www.domain.com, that hosts several pages of the form www.domain.com/cgi-bin/whatever... or cgi-bin.domain.com/whatever..., I can't find those links in the database or in the search results. Yet when I check the most common keywords, cgi-bin.domain.com comes up in first place. What's the deal? How do I make it add the cgi-*.* links to the database for that particular domain? Also, is there any difference between indexing http://www.domain.com and http://domain.com? Will I get duplicate pages in the db? Thanks! |
The links domain.com, www.domain.com, and sub.domain.com are considered different. Try setting PHPDIG_IN_DOMAIN to true in the config file.
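As a rough sketch of that change, assuming the setting is a define() constant in PhpDig's config file (the file location and the default value are assumptions and may differ between versions):

```php
<?php
// Assumed location: PhpDig's config file (commonly includes/config.php).
// Let the spider follow links to other hosts in the same domain,
// e.g. from www.domain.com to cgi-bin.domain.com.
define('PHPDIG_IN_DOMAIN', true);
```

The site most likely needs to be spidered again after the change before links on the other hosts show up in the database.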
|
But then why do the search results not include any cgi.domain.com pages, even though cgi.domain.com is indexed as a keyword? How do I remove those false keywords and rerun the spider so that it picks up cgi. as a URL and not as a keyword?
|
Stored keywords and indexed links are two different things. If the text cgi.domain.com appears in a page, it is stored as a keyword, regardless of whether the link cgi.domain.com is actually indexed. If you are using PhpDig v.1.8.7, and don't want the text cgi.domain.com stored as a keyword, then edit BANNED in the config file.
|