PhpDig.net

Old 03-26-2004, 11:32 PM   #1
marb
Green Mole
 
Join Date: Mar 2004
Posts: 19
unable to parse url

Hi,
I'm spidering a page and get the error notice below. What can I do about it?
I use the loop option and have had no trouble spidering other pages before.
The spider indexes a page, and the message shows up when another URL is found, not for the page being spidered at that moment.



+ + + + + + + + + + + + + + + + + +
63:http://www.wetcanvas.com/MediaKit/
(time : 00:54:20)

Warning: parse_url(http://www.heritageglass.com?amp;zon...itageglass.com) [function.parse-url]: Unable to parse url in /opt/guide/www.artrefer.com/HTML/web/s3/admin/robot_functions.php on line 372
+ + + + + + + +
64:http://www.wetcanvas.com/web/
(time : 00:54:51)
+ +
65:http://www.wetcanvas.com/colormixer/
(time : 00:55:21)
+ + + +

Marten
Old 03-27-2004, 02:06 AM   #2
Charter
Head Mole
 
 
Join Date: May 2003
Posts: 2,539
Hi. There is a 1.8.0 fix in this post that should be applied.

However, even with the fix, I'm not sure parse_url will handle a URL in the query string. See below.
PHP Code:
<?php
$url = "http://www.heritageglass.com?amp;zoneid=0&source=&dest=http://www.heritageglass.com";
print_r(parse_url($url)); // without fix and with url
echo "\n<br>\n";
$url = "http://www.heritageglass.com?zoneid=0&source=&dest=http://www.heritageglass.com";
print_r(parse_url($url)); // with fix and with url
echo "\n<br>\n";
$url = "http://www.heritageglass.com?zoneid=0&source=&dest=";
print_r(parse_url($url)); // with fix and without url
?>
The output is as follows:

Array
(
[scheme] => http
[host] => www.heritageglass.com?amp;zoneid=0&source=&dest=http
[path] => //www.heritageglass.com
)

Array
(
[scheme] => http
[host] => www.heritageglass.com?zoneid=0&source=&dest=http
[path] => //www.heritageglass.com
)

Array
(
[scheme] => http
[host] => www.heritageglass.com
[query] => zoneid=0&source=&dest=
)


Untested, but in robot_functions.php you might try the following code:
PHP Code:
$newurl = parse_url($newpath);

// add this chunk of code here
// if parse_url() left the "?" and query text inside the host, move it into the query
if ((isset($newurl["host"])) && (eregi("[?]",$newurl["host"]))) {
  if (!isset($newurl["path"])) { $newurl["path"] = ""; }
  if (!isset($newurl["query"])) { $newurl["query"] = ""; }
  $newurl["query"] = substr(strstr($newurl["host"],"?"),1).$newurl["path"].$newurl["query"];
  unset($newurl["path"]);
  $newurl["host"] = substr($newurl["host"],0,strpos($newurl["host"],"?"));
}

//search if relocation is absolute or relative
Remember to remove any "word" wrapping in the above code.
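
For anyone on a newer PHP, here is a minimal standalone sketch of the same repair, untested against PhpDig itself. The function name repair_host_with_query is hypothetical and not part of PhpDig. It uses strpos() instead of eregi() because the ereg functions were removed in PHP 7, and it assumes PHP 7 or later for the ?? operator. Like the chunk above, it only moves text around: the ":" that parse_url dropped between "http" and "//" is not restored.

```php
<?php
// Hypothetical helper (not part of PhpDig): repairs a parse_url() result
// whose "host" component has swallowed the "?" and the query string.
function repair_host_with_query(array $parts): array
{
    // Nothing to repair if there is no host at all.
    if (!isset($parts['host'])) {
        return $parts;
    }

    // Look for a "?" inside the host; a well-formed host never contains one.
    $pos = strpos($parts['host'], '?');
    if ($pos === false) {
        return $parts;
    }

    $path  = $parts['path']  ?? '';
    $query = $parts['query'] ?? '';

    // Text after "?" in the host, plus the misplaced path, is really query text.
    $parts['query'] = substr($parts['host'], $pos + 1) . $path . $query;

    // Keep only the part of the host before the "?".
    $parts['host'] = substr($parts['host'], 0, $pos);

    // The "path" was part of the query all along, so drop it.
    unset($parts['path']);

    return $parts;
}
```

In robot_functions.php this would presumably be called right after the parse_url line, for example $newurl = repair_host_with_query($newurl); but only if $newurl is an array, since parse_url can return false on failure.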
All times are GMT -8.