not digging certain links


BulForce
08-31-2004, 06:00 AM
I tried this on a file that contains 1000 links to external domains, but the digger indexes only the page with the 1000 links and the first of the external links. I tried all combinations of dig depth and links per, but the result is the same.

BulForce
08-31-2004, 06:25 AM
:bang: :bang: :bang:
I think the problem comes from the specific form of the links.
Here is a short example of the links:

http://somewhere.com/path1/path1/file1.php?someid,1,1,1
http://somewhere.com/path2/path2/file2.php?someid,2,2,2
http://somewhere.com/path3/path3/file3.php?someid,3,3,3
http://somewhere.com/path4/path4/file4.php?someid,4,4,4
http://somewhere.com/path5/path5/file5.php?someid,5,5,5

The spider looks at them, then indexes http://somewhere.com/path2/path2/file2.php?someid, but after that it starts looking at http://somewhere.com/, and I think it treats all the other links as pointing to the same location.

Charter, I would be very happy if you could help me :)
Thanks in advance
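
To make the suspicion above concrete, here is a minimal sketch of how five different links can collapse into one if they are compared by host only. This is an illustration of the guess, not PhpDig's actual logic; `count_distinct` is a hypothetical helper written for this example, and only the standard `parse_url` function is used.

```php
<?php
// Hypothetical helper (not part of PhpDig): counts how many distinct keys
// a list of URLs produces under a given key function.
function count_distinct(array $urls, callable $keyOf): int
{
    $keys = array();
    foreach ($urls as $url) {
        $keys[$keyOf($url)] = true;
    }
    return count($keys);
}

$links = array(
    "http://somewhere.com/path1/path1/file1.php?someid,1,1,1",
    "http://somewhere.com/path2/path2/file2.php?someid,2,2,2",
    "http://somewhere.com/path3/path3/file3.php?someid,3,3,3",
    "http://somewhere.com/path4/path4/file4.php?someid,4,4,4",
    "http://somewhere.com/path5/path5/file5.php?someid,5,5,5",
);

// Comparing by host only: every link looks like the same location.
$byHost = count_distinct($links, function ($url) {
    return parse_url($url, PHP_URL_HOST);
});

// Comparing by the full URL: every link is different.
$byFullUrl = count_distinct($links, function ($url) {
    return $url;
});

echo "distinct by host: $byHost\n";
echo "distinct by full URL: $byFullUrl\n";
```

If the spider really behaved like the host-only comparison, only one of these links would ever look new to it, which would match what I am seeing.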

BulForce
09-10-2004, 06:27 AM
Has this thread been forgotten by support?

Charter
09-11-2004, 07:15 AM
What version of PhpDig are you using? What do you get with the following code?

<?php
$url = "http://somewhere.com/path1/path1/file1.php?someid,1,1,1";
print_r(parse_url($url));
?>

BulForce
09-15-2004, 04:02 PM
Digger version is "PhpDig v.1.8.3"

And the code output is:

Array ( [scheme] => http [host] => somewhere.com [path] => /path1/path1/file1.php [query] => someid,1,1,1 )

I hope that this will help you.

BulForce
09-19-2004, 06:48 AM
Does anyone have any ideas about a solution to this problem?

BulForce
09-30-2004, 07:15 AM
Charter? Any ideas about resolving this problem?

Charter
09-30-2004, 10:03 AM
Maybe a redirect issue? I cannot say for sure.

BulForce
10-03-2004, 05:27 AM
Can you point me to a solution?

BulForce
10-11-2004, 09:13 AM
Charter, I found that I made a mistake when I gave this example.
Actually the links are the same and only the ID is different. The host and path are the same, for example:

http://www.somewhere.com/index.php?id=1,2,0
http://www.somewhere.com/index.php?id=1,3,0
http://www.somewhere.com/index.php?id=1,4,0 and so on.

But these are totally different pages, so is there any way to make the spider index them?

Thank you
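
For reference, a small sketch of what distinguishes these links, using only the standard `parse_url` and `parse_str` functions. `split_url` is a hypothetical helper made up for this example, and nothing here shows how PhpDig itself compares links; it only confirms that the host and path are identical while the `id` value differs.

```php
<?php
// Hypothetical helper (not a PhpDig function): separates the page address
// (host + path) from the parsed query parameters of a URL.
function split_url(string $url): array
{
    $parts = parse_url($url);
    $params = array();
    parse_str($parts['query'] ?? '', $params);
    return array(
        'page'   => $parts['host'] . $parts['path'],
        'params' => $params,
    );
}

$links = array(
    "http://www.somewhere.com/index.php?id=1,2,0",
    "http://www.somewhere.com/index.php?id=1,3,0",
    "http://www.somewhere.com/index.php?id=1,4,0",
);

$pages = array();
$ids = array();
foreach ($links as $link) {
    $split = split_url($link);
    $pages[] = $split['page'];
    $ids[] = $split['params']['id'];
}

// Every link shares one host + path, but each carries its own id value.
echo count(array_unique($pages)), " distinct page address(es)\n";
echo count(array_unique($ids)), " distinct id value(s): ", implode(" | ", $ids), "\n";
```

So any comparison that ignores the query string would see one page here, while a comparison that keeps it would see three.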