PhpDig.net

02-15-2005, 12:47 PM   #1
Maarten Wijnen
Green Mole
 
Join Date: Feb 2005
Posts: 1
Spider ignores links

Hi,

I'm new to phpdig. The installation was quite easy, though it was not quite clear to me that cookies have to be enabled for administration. It was fairly easy to create a nice template, but at this point I'm stuck, so I come for help.

When I tried to index my whole site, the spider silently ignored a lot of pages. The description of 1.8.8-rc1 states that the spider does not index directory listings and database content, and from reading the PHP code I found that it runs extensive checks on other things as well. I did some debugging and confirmed that the spider rejects most of my URLs, even though I have added a space, both parentheses, and other characters to $allowed_link_chars.

In my case, I really need to index all content; without that ability PhpDig would be of no use to me. Is there a setting I can use, or some code I can add or comment out, to make the spider more greedy? I've had a look at the spider's code, but it's complex, with many nested statements, so I'm reluctant to change it. I figured some of you could perhaps answer my question off the top of your heads.

Maarten Wijnen
02-17-2005, 05:13 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Set "search depth" to a large number, set "links per" to zero, choose the "no" option, set LIMIT_TO_DIRECTORY to false, set PHPDIG_IN_DOMAIN to true.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
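
For readers following reply #2: the two constants it names would typically be changed in PhpDig's configuration file. The fragment below is only a sketch under assumptions. The constant names LIMIT_TO_DIRECTORY and PHPDIG_IN_DOMAIN come from the reply; the file location (includes/config.php) and the comments describing each setting's effect are assumptions, so check them against your own copy. "Search depth", "links per" and the "no" option are not shown here because the reply presents them as choices made when starting the spider, not as constants.

```php
<?php
// Sketch only: edit the existing define() lines in your PhpDig config
// (assumed to be includes/config.php) rather than adding new ones,
// since a constant cannot be defined twice.

// Assumed effect: do not confine the spider to the starting directory.
define('LIMIT_TO_DIRECTORY', false);

// Assumed effect: let the spider follow links to other hosts in the same domain.
define('PHPDIG_IN_DOMAIN', true);
```

After changing these values, re-run the spider from the admin page with a large search depth and "links per" set to zero, as reply #2 suggests.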
03-17-2005, 02:23 PM   #3
piashaw
Green Mole
 
Join Date: Mar 2005
Posts: 1
Well Maarten, I installed PhpDig today and found the same problem as you. It does index dynamic content, but what I noticed is that it indexes ALL information only when the query is ON the page with the variables hardcoded. If you insert the variables dynamically, it doesn't seem to pass them along. I have about 4 pages to which variables are sent, as opposed to being hardcoded into the query. I told PhpDig to index each of these pages manually and all is well.

However my setup is that ALL the variables are in my menus and the same menu appears on EVERY page, including the dynamically created catalogue pages in my case. Maybe that's why my workaround works.

Peter