PhpDig.net


Maarten Wijnen 02-15-2005 12:47 PM

spider ignores links
 
Hi,

I'm new to PhpDig. The installation was quite easy, though it was not clear to me that cookies have to be enabled for administration. It was also fairly easy to create a nice template, but at this point I'm stuck, so I've come for help.

When I tried to index my site for real, the spider silently ignored a lot of pages. The description of 1.8.8-rc1 states that the spider does not index directory listings and database content, and by looking at some of the PHP code I found that it does extensive checking on other things as well. I did some debugging and found that the spider indeed rejects most of my URLs, even though I have added a space, both parentheses and other characters to $allowed_link_chars.

In my case, though, I really need to index all content. Without that ability PhpDig would be of no use to me. Is there a setting I can use, or some code I can add or comment out, to make the spider more greedy? I've had a look at the spider's code, but it's complex, with many nested statements, so I'm reluctant to change it. I figured some of you could perhaps answer my question off the top of your head.

Maarten Wijnen

Charter 02-17-2005 05:13 AM

Set "search depth" to a large number, set "links per" to zero, and choose the "no" option. Then set LIMIT_TO_DIRECTORY to false and PHPDIG_IN_DOMAIN to true.
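For reference, the last two settings could look roughly like the fragment below. This is a minimal sketch, assuming both constants are set with define() in PhpDig's include config file, as they are in a typical install; the file location and the existing values may differ in your copy. The comments are my reading of the two names, not the official documentation.

```php
<?php
// Assumption: these lines already exist in PhpDig's config file with other
// values. Edit them in place; adding a second define() for the same name
// will not override the first.

// false: do not restrict indexing to the starting (sub)directory
define('LIMIT_TO_DIRECTORY', false);

// true: allow the spider to follow links to other hosts in the same domain
define('PHPDIG_IN_DOMAIN', true);
```

After changing these values, the site has to be spidered again for them to take effect.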

piashaw 03-17-2005 02:23 PM

Well Maarten, I installed PhpDig today and found the same problem as you. It does index dynamic content, but what I noticed is that it indexes ALL the information only when the query is ON the page with the variables hardcoded. If you insert the variables dynamically, it doesn't seem to pass them. I have about 4 pages to which variables are sent, as opposed to being hardcoded into the query. I told PhpDig to index each of these pages manually and all is well.

However my setup is that ALL the variables are in my menus and the same menu appears on EVERY page, including the dynamically created catalogue pages in my case. Maybe that's why my workaround works.

Peter

