PhpDig.net - View Single Post

Andreas_Wien · 04-23-2004, 02:44 AM

After a quick setup and easy integration, I am having difficulty getting PhpDig to spider the site http://444.docs4you.at correctly.

1.) The path portalnode/ is excluded both in the database and in robots.txt. Nevertheless, it is found somehow and spidered over and over again.

2.) On the other hand, links on the page are generally not followed. The behavior differs from run to run; in the worst case only 2 pages are spidered and indexed, and PhpDig then hangs while spidering portalnode/.
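
For point 1, one way to rule out the robots.txt file itself is to test the rule outside PhpDig. The sketch below uses Python's standard urllib.robotparser. The robots.txt content shown is an assumption (the real file is not quoted in this post), and the page names index.html are only illustrative.

```python
from urllib.robotparser import RobotFileParser

# ASSUMED content; the site's real robots.txt is not quoted in this post.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /portalnode/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URLs below the host named in this post.
    for url in ("http://444.docs4you.at/portalnode/index.html",
                "http://444.docs4you.at/index.html"):
        print(url, is_allowed(ASSUMED_ROBOTS_TXT, url))
```

If the excluded path comes back as allowed here, the rule in robots.txt is probably malformed (for example a missing leading slash); if it comes back as disallowed, the problem is more likely on the spider's side.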

Maybe my understanding of the exclude/include tags is wrong. Can I assume that the parser reads the page top down, switching indexing off and on again each time it encounters a line with a phpdig tag?
And, regardless of the include/exclude tags, should each and every link on the page still be spidered?
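
To make the question concrete, the model I have in mind can be sketched as follows. This is not PhpDig's code, only an illustration of my assumption: the page is read line by line, a line consisting of the comment <!-- phpdigExclude --> switches indexing off, <!-- phpdigInclude --> switches it back on, and links are collected from every line either way. The function name scan and the simple href regex are made up for this example.

```python
import re

# Marker comments as I understand them; each is assumed to sit on its own line.
EXCLUDE = "<!-- phpdigExclude -->"
INCLUDE = "<!-- phpdigInclude -->"

# Deliberately simple link pattern, good enough for the illustration only.
LINK_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def scan(html):
    """Return (indexed_lines, links) under the assumed top-down toggle model."""
    indexing = True
    indexed_lines = []
    links = []
    for line in html.splitlines():
        stripped = line.strip()
        if stripped == EXCLUDE:
            indexing = False      # stop indexing from here on
            continue
        if stripped == INCLUDE:
            indexing = True       # resume indexing
            continue
        # Assumption under question: links are followed even where indexing is off.
        links.extend(LINK_RE.findall(line))
        if indexing:
            indexed_lines.append(stripped)
    return indexed_lines, links


if __name__ == "__main__":
    page = "\n".join([
        "<p>intro</p>",
        EXCLUDE,
        '<a href="menu.html">menu</a>',
        INCLUDE,
        '<a href="doc.html">doc</a>',
    ])
    print(scan(page))
```

Under this model the menu line would not be indexed, but menu.html would still be queued for spidering. Whether PhpDig really behaves like this is exactly what I am asking.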

Any help would be greatly appreciated!

Best Regards, Andreas