02-21-2004, 11:20 AM   #13
mlerch@mac.com
Green Mole
 
Join Date: Feb 2004
Location: North Las Vegas, Nevada
Posts: 18
Hi Charter,

So I did some more detailed looking into the problem. Here is what I found.

When spidering the URL that doesn't work (it stalls), I have traced the problem to:

In robot_functions.php

1. function phpdigDetectDir

This function parses the URL into the variable $test, then goes through an if { ... } else { ... } statement. In my case it takes the else path, apparently because $test['query'] is set.
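To illustrate the branch described above, here is a minimal sketch of how PHP's built-in parse_url() reports a query string. It is not the phpDig code itself. The URLs are modelled on the ones in this post, and the query string "example=1" is a made-up placeholder, since the real one is truncated.

```php
<?php
// Sketch only: shows when the 'query' key is set after parse_url().
// 'example=1' is a hypothetical placeholder, not the real query string.
$plain     = parse_url('http://www.mydomain.com/index.php');
$withQuery = parse_url('http://www.mydomain.com/index.php?example=1');

var_dump(isset($plain['query']));     // bool(false): no query part in the URL
var_dump(isset($withQuery['query'])); // bool(true): this would take the "query is set" branch
echo $withQuery['path'], "\n";        // /index.php
```

So a URL entered with the redirect string already appended would always have 'query' set, which matches the path I see being taken.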

Since it takes the else path, the very first line of that branch in robot_functions.php tries to define the following variable:

$status = phpdigTestUrl($link['url'].$link['path'].$link['file'],'date',$cookies);

This is where it seems to stall, so I checked into this function.

2. function phpdigTestUrl

It runs all the way through the "while" routine and ends up where:
$status = "NOFILE";

At the very end of that function, $mode does not seem to be 'date', so it is supposed to:

return $status;

I guess that is where it hangs.


Here are some details about the URL/website that I am trying to spider:

http://www.mydomain.com/index.php

index.php has a piece of script at the very beginning that checks whether a variable string is appended to index.php and whether it is formatted correctly.

If the script finds a formatting problem, or finds no variable string at the end of .../index.php, it grabs the correct string and redirects to a URL like this:

http://www.mydomain.com/index.php?na...,1,1,1,1,1,0,0

Essentially, if you type in the URL http://www.mydomain.com or http://www.mydomain.com/index.php, it will redirect you to:

http://www.mydomain.com/index.php?na...,1,1,1,1,1,0,0
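The redirect behaviour described above can be sketched roughly like this. It is a hypothetical stand-in, not the actual script at the top of index.php: the function name redirectTarget, the $isValid callback, and the default query "example=1" are all assumptions made for illustration, because the real format rules and query string are not shown here.

```php
<?php
// Hypothetical stand-in for the check at the top of index.php.
// Returns the URL to redirect to, or null when the request can be served as-is.
// $isValid is an assumed callback that decides whether a query string is well formed.
function redirectTarget(string $requestUri, string $defaultQuery, callable $isValid): ?string
{
    $query = parse_url($requestUri, PHP_URL_QUERY); // null when there is no query string
    if (is_string($query) && $isValid($query)) {
        return null; // well-formed request, no redirect needed
    }
    // Fall back to "/" if the path is missing or the URI could not be parsed.
    $path = parse_url($requestUri, PHP_URL_PATH) ?: '/';
    return $path . '?' . $defaultQuery;
}

// Placeholder validity rule: only the made-up default query counts as valid.
$isValid = function (string $q): bool {
    return $q === 'example=1';
};

echo redirectTarget('/index.php', 'example=1', $isValid), "\n"; // /index.php?example=1
var_dump(redirectTarget('/index.php?example=1', $isValid === null ? '' : 'example=1', $isValid)); // NULL
```

The point is that any request without a valid query string, including the spider's first request for /index.php, gets a redirect instead of a page.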

Do you think that this is causing the problem? Please advise.

I also tried entering the URL into the PhpDig interface exactly as the redirect would produce it, but it still hangs with a NOFILE status.

One more thing: why is $path always /robots.txt?
I guess I don't understand it well enough.

Thank you very much,

Mr. L

Last edited by mlerch@mac.com; 02-21-2004 at 11:38 AM.