Old 09-09-2003, 06:53 AM   #1
Green Mole
Join Date: Sep 2003
Posts: 1
phpDig ignores robots.txt

Hi, everyone,
While searching for a suitable alternative to the PostNuke search engine (which can't be used for a multisite setup), I stumbled across yours.
So far it works nicely, but there are a few things I can't resolve:

I've told it to index the site and put a robots.txt in the html directory. But PhpDig keeps ignoring it... even with

User-agent: PhpDig
Disallow: /

in the file, it continues to spider into the subdirectories...
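
One way to rule out a syntax problem in the file itself is to run the same two lines through an independent robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`, not PhpDig's own parser, so it only shows how the rules read under the usual exclusion conventions. The host `www.example.com`, the path `/modules/`, and the agent name `SomeOtherBot` are placeholders, not values from this thread.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: PhpDig
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; substitute a page from the site being indexed.
    url = "http://www.example.com/modules/"
    for agent in ("PhpDig", "SomeOtherBot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

If this parser reports the PhpDig agent as blocked for every path, the rules themselves are fine, and the more likely cause is that the spider never reads the file (wrong location or wrong URL).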

Is there any other way to exclude single directories? The update form page says "Warning ! Erase is permanent", but it isn't. It would be neat if I could simply erase all unwanted pages there, but when I reindex the rest, it spiders the just-erased pages again. Adding the exclude tag to a single file didn't work either; that page gets indexed again.

Maybe this is due to the PostNuke CMS, which is a modular system. I wanted to limit access to some of the modules, because otherwise it would index everything, so I need to restrict access to the directories.

Another problem is that each spidering run damages the PostNuke MySQL files... I need to reinstall all of the site's tables. This is weird; maybe it is due to the server configuration (Apache 2.0) and not to PhpDig.

Any ideas how to control this tool?

Thanks for your input!


Last edited by Dragonfly; 09-09-2003 at 07:09 AM.
Old 09-12-2003, 06:54 AM   #2
Head Mole
Charter
Join Date: May 2003
Posts: 2,539
Hi. The "Warning ! Erase is permanent" message is shown because there is no lock, i.e., $locked = 0. If you have access to the raw log files, do they show the correct URL being requested for robots.txt? Otherwise, robot_functions.php contains a function called phpdigReadRobotsTxt; in that function, you might try echoing $site.'robots.txt' to see whether it is correct. I'm not sure why the PostNuke tables are damaged. I didn't see any conflicting tables, even when the PhpDig prefix is set to nuke. What kind of damage is done to the PostNuke tables?
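
To illustrate why echoing that URL is worth doing, here is a rough Python sketch (not PhpDig code) of two ways a robots.txt URL can be derived from a site value. The first mirrors the plain concatenation in the expression $site.'robots.txt'; what $site actually contains inside PhpDig is not shown in this thread, so the sample inputs are assumptions. The second always points at the host root, which is where crawlers conventionally look for robots.txt. Both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_by_concat(site: str) -> str:
    """Plain string concatenation, analogous to $site.'robots.txt'."""
    return site + "robots.txt"


def robots_url_at_root(site: str) -> str:
    """Build the robots.txt URL at the root of the site's host."""
    parts = urlsplit(site)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Assumed example values; the real value of $site may differ.
    samples = (
        "http://www.example.com/",
        "http://www.example.com",
        "http://www.example.com/html/",
    )
    for site in samples:
        print(site)
        print("  concat:", robots_url_by_concat(site))
        print("  root:  ", robots_url_at_root(site))
```

If the echoed URL is missing a slash or points into a subdirectory, the file in the html directory would never be fetched, which would match the behaviour described in the first post.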
All times are GMT -8.