Problems with HTML comments <!-- -->
uruloki
10-23-2003, 10:52 PM
When I index pages with HTML comments like
<!-- #begintemplate="algo" -->
the spider replaces it with
< begintemplate algo >
and this is a problem because I have comments containing paths to the connection entries for my databases.
The regular expression in the robot functions that matches this kind of statement is the following:
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\1\2',$text);
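For illustration, here is a minimal sketch of what that expression does to two kinds of input. It uses preg_replace with an equivalent pattern as a stand-in, on the assumption that the ereg functions are not available (they were removed in PHP 7). The function name drop_bang and the sample strings are made up for this example and are not part of PhpDig.

```php
<?php
// Stand-in for eregi_replace('(<)!([^-])','\1\2',$text) using PCRE.
// Assumption: ereg is unavailable, so preg_replace is used instead.
function drop_bang(string $text): string
{
    return preg_replace('/(<)!([^-])/i', '$1$2', $text);
}

// "<!" followed by anything other than "-" loses its "!".
echo drop_bang('<!DOCTYPE html>'), "\n";                 // <DOCTYPE html>

// "<!-" does not match, so an intact comment opener is left alone.
echo drop_bang('<!-- #begintemplate="algo" -->'), "\n";  // unchanged
```

So the expression only touches a comment if the "--" after "<!" has already been altered by an earlier step.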
Sorry for my English, and thanks for any suggestions.
BYE
<!-- #begintemplate="algo" -->
Rolandks
10-24-2003, 12:47 AM
It is a problem with PHP > 4.3.2. The following should work as a possible solution. See this thread: (http://www.phpdig.net/showthread.php?s=&threadid=140)
Change ONLY this in robot_functions.php Line 160:
//replace any group of blank characters by an unique space
$text = ereg_replace("[[:blank:]]+"," ",strip_tags($text));
to
//replace any group of blank characters by
$text = preg_replace('/<.*>/U', '', $text);
It works with PHP 4.3.2 and PhpDig 1.6.2.
NO HTML comments are indexed!
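A small sketch of how that replacement behaves, for anyone who wants to try it outside PhpDig. The function name strip_angle_blocks and the sample strings are invented for the example. Only the pattern comes from the line above.

```php
<?php
// Same pattern as the suggested line: the U modifier makes ".*" ungreedy,
// so each match ends at the first ">" after a "<".
function strip_angle_blocks(string $text): string
{
    return preg_replace('/<.*>/U', '', $text);
}

$sample = 'Hello <!-- #begintemplate="algo" --> world <b>bold</b>';
var_dump(strip_angle_blocks($sample));

// Without the "s" modifier "." does not match a newline, so a comment
// that spans several lines is not removed by this pattern.
$multiline = "a <!-- x\ny --> b";
var_dump(strip_angle_blocks($multiline) === $multiline);
```

In the first sample the comment and the tags are dropped and the text between them is kept. A comment that itself contains a ">" would be cut at that character, so this is a rough filter rather than a real HTML parser.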
-Roland-
uruloki
10-24-2003, 04:40 AM
The real problem I have is that when I index an internet domain the comments appear, but when I index the intranet domain it works well (no comments). We have PHP 4.3.2. I changed the order of the eregi_replace calls in robot_functions.php.
BEFORE:
//replace blank characters by spaces
$text = eregi_replace("--|[{}();\"]+|</[a-z0-9]+>|[\r\n\t]+",' ',$text);
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\1\2',$text);
AFTER:
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\1\2',$text);
//replace blank characters by spaces
$text = eregi_replace("--|[{}();\"]+|</[a-z0-9]+>|[\r\n\t]+",' ',$text);
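To see why the order matters, here is a rough sketch that runs both orders on the comment from the first post. The two functions are PCRE stand-ins for the eregi_replace calls above (assuming ereg is not available, as in PHP 7 and later). Their names are made up for this example.

```php
<?php
// PCRE stand-in for the "replace blank characters by spaces" call.
function blank_step(string $text): string
{
    return preg_replace('/--|[{}();"]+|<\/[a-z0-9]+>|[\r\n\t]+/i', ' ', $text);
}

// PCRE stand-in for the "<!SOMETHING tags" call.
function bang_step(string $text): string
{
    return preg_replace('/(<)!([^-])/i', '$1$2', $text);
}

$comment = '<!-- #begintemplate="algo" -->';

// BEFORE order: "--" is blanked first, so "<!" is now followed by a space
// and the second call removes the "!".
$before = bang_step(blank_step($comment));

// AFTER order: "<!-" does not match the second pattern, so the "!" is
// still in place once the blanks have been replaced.
$after = blank_step(bang_step($comment));

var_dump($before); // starts with "< "
var_dump($after);  // starts with "<!"
```

With the old order the comment turns into something that starts with "< ", which matches the "< begintemplate algo >" seen in the index. With the new order it still starts with "<!", so presumably the later tag stripping can still recognise it as markup.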
I tested the change and it seems to work fine. I will reindex everything today and post another comment with the results.
Again, sorry for my English... :D