Three things I can think of...
1) The links may not match the regex for links. Search for ([a-z]{3,5}://) in the robot_functions.php file to find the two regexes used for links.
2) Some of the pages you are trying to crawl are encoded in windows-1251, but the search results appear to be using iso-8859-1 instead.
3) Some of the pages use a large number of HTML entities instead of an encoding.
PhpDig currently supports windows-1251 for Cyrillic. |
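To check points 1 and 2 quickly, something like the following rough sketch might help. Both function names are hypothetical helpers and are not part of PhpDig. The first tests only the scheme fragment quoted above, not the full link regexes in robot_functions.php, and it assumes a case-sensitive match at the start of the URL. The second just reads whatever charset the page declares in its meta tag.

```php
<?php
// Hypothetical helper: tests only the quoted fragment ([a-z]{3,5}://),
// anchored at the start of the URL. The real regexes in
// robot_functions.php are longer and may behave differently.
function has_matching_scheme($url)
{
    return preg_match('~^[a-z]{3,5}://~', $url) === 1;
}

// Hypothetical helper: returns the charset a page declares in a
// <meta> tag (lowercased), or null when none is declared.
function declared_charset($html)
{
    if (preg_match('~<meta[^>]+charset\s*=\s*["\']?([a-z0-9_-]+)~i', $html, $m)) {
        return strtolower($m[1]);
    }
    return null;
}
```

Running declared_charset() over a few of the pages that fail should show whether they really declare windows-1251 or something else.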
I also think the problem is related to the page encoding.
What can I do to make the pages spiderable? |
Use links that match the regex, encode the pages in windows-1251, and set define('PHPDIG_ENCODING','windows-1251'); in the config file.
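As a minimal sketch of the last two steps: the define() line is the one given above, while to_windows_1251() is a hypothetical helper (not a PhpDig function) that assumes PHP's iconv extension is available and that you know the page's current charset.

```php
<?php
// In the PhpDig config file, as suggested above.
define('PHPDIG_ENCODING', 'windows-1251');

// Hypothetical helper: re-encodes text from a known source charset
// to windows-1251. iconv() returns false if the text contains
// characters that cannot be converted.
function to_windows_1251($text, $from_charset)
{
    return iconv($from_charset, 'windows-1251', $text);
}
```

For example, to_windows_1251($page, 'UTF-8') would re-encode a page that was saved as UTF-8. Remember to update the charset declared in the page's meta tag to match.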