Thread: accent in links
08-28-2006, 02:24 AM   #1
pepevilluela
accent in links

PHPDig is not indexing links that contain accented characters. I'm using Apache on Windows XP (XAMPP from apachefriends) and I've set up PHPDig 1.8.8 in Spanish (es).

Example link: http://localhost/Informatica/Documen...nto/index.html

Pay attention to the word código and the accent.

Microsoft Internet Explorer retrieves this page, but PHPDig does not: the server answers 403 Forbidden.

I have seen that Internet Explorer encodes ó (oacute) as %C3%B3, whereas the request PHPDig writes with PHP's fputs sends only %B3.
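To make the difference between the two encodings concrete, here is a minimal sketch that percent-encodes the letter ó from its raw bytes. It only uses PHP's built-in rawurlencode; the variable names are illustrative and not taken from PHPDig.

```php
<?php
// The same letter "ó" written as raw bytes in two encodings.
$latin1 = "\xF3";      // ISO-8859-1: a single byte
$utf8   = "\xC3\xB3";  // UTF-8: two bytes

echo rawurlencode($latin1), "\n"; // prints %F3
echo rawurlencode($utf8), "\n";   // prints %C3%B3
```

%C3%B3 is the UTF-8 form that the browser sends. A lone %B3 matches only the second byte of that sequence, so it may be worth checking which bytes the spider really holds for the link before it writes the request.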

I've tried changing the code that builds the HTTP request, like this:

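// keep the original path so it can be restored after the request is built
$pathant=$path;
// split off the query string so that only the path part is re-encoded
$separados=explode("?",$path,2);
// convert the path to UTF-8, percent-encode it, then restore ":" and "/"
$separados[0]=str_replace("%3A",":",str_replace("%2F","/",urlencode(utf8_encode($separados[0]))));
$path=implode("?",$separados);
// build the complete HEAD request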
$request =
"HEAD $path $http_scheme/1.1".END_OF_LINE_MARKER
."Host: $host$sport".END_OF_LINE_MARKER
.$cookiesSendString
.$auth_string
."Accept: */*".END_OF_LINE_MARKER
."Accept-Charset: ".PHPDIG_ENCODING.END_OF_LINE_MARKER
."Accept-Encoding: identity".END_OF_LINE_MARKER
."Connection: close".END_OF_LINE_MARKER
."User-Agent: PhpDig/".PHPDIG_VERSION." (+http://www.phpdig.net/robot.php)".END_OF_LINE_MARKER.END_OF_LINE_MARKER;
$path=$pathant;


With this change I no longer get the 403 Forbidden error, but the spider then stops with "No links in temporary table".
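The path re-encoding tried above can also be isolated in a small function, which makes it easier to test apart from the spider. This is only a sketch under assumptions: the function name encode_path_utf8 and the sample path are hypothetical and not part of PHPDig, the input is assumed to be ISO-8859-1 and not yet percent-encoded, and the iconv extension is assumed to be available. It uses rawurlencode per path segment instead of urlencode, so a space becomes %20 rather than "+".

```php
<?php
// Hypothetical helper, not part of PHPDig: percent-encode the path part of a
// request target as UTF-8, leaving "/" separators and the query string alone.
// Assumes ISO-8859-1 input that is not already percent-encoded.
function encode_path_utf8(string $path): string
{
    // Split off the query string (if any) so only the path is touched.
    $parts = explode('?', $path, 2);

    $segments = explode('/', $parts[0]);
    foreach ($segments as $i => $segment) {
        // ISO-8859-1 -> UTF-8, then percent-encode each segment on its own.
        $segments[$i] = rawurlencode(iconv('ISO-8859-1', 'UTF-8', $segment));
    }
    $parts[0] = implode('/', $segments);

    return implode('?', $parts);
}

// "c\xF3digo" is "código" in ISO-8859-1; the path itself is made up.
echo encode_path_utf8("/docs/c\xF3digo/index.html?x=1"), "\n";
// prints /docs/c%C3%B3digo/index.html?x=1
```

This does not explain the "No links in temporary table" message; it only shows the encoding step on its own.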

Last edited by pepevilluela; 08-28-2006 at 02:28 AM.