claudiomet
08-31-2004, 12:34 PM
When spidering via the web interface I had no problems: the spider found links and indexed the pages. But when I run it via SSH:
--------------------------------------------------------------------------
HTTP/1.1 404 Not Found
- http://www.chile-empresas.cl/robots.txt
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for explanation.
404s are either dead links or something looked like a link to PhpDig so PhpDig tried to crawl it.
--------------------------------------------------------------------------
This is part of my spider.log. I have a cronlist.txt file containing a list of URLs, but this message appears for every site and 0 pages get indexed. My spidering settings are:
Search Depth: 5
Links per: 10
Days After: 7
Reindex depth: 5
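To see which of the URLs in cronlist.txt are actually failing, one way is to scan spider.log for 404 entries. A minimal sketch (the log format is assumed from the excerpt above: a status line followed by a "- <url>" line):

```python
import re

def urls_with_404(log_text):
    """Return URLs that the spider logged with a 404 status.

    Assumes the spider.log layout shown above: a line containing
    '404 Not Found' immediately followed by a line of the form
    '- <url>'.
    """
    urls = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if "404 Not Found" in line and i + 1 < len(lines):
            # The URL sits on the next line, prefixed with "- "
            m = re.match(r"-\s*(\S+)", lines[i + 1].strip())
            if m:
                urls.append(m.group(1))
    return urls

sample = """HTTP/1.1 404 Not Found
- http://www.chile-empresas.cl/robots.txt
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for explanation.
"""
print(urls_with_404(sample))  # -> ['http://www.chile-empresas.cl/robots.txt']
```

Running this over the full log would show whether the 404s are limited to robots.txt (which is harmless by itself) or whether the site pages themselves are unreachable from the SSH host.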