Problems with HTML comments <!-- -->
When I index pages that contain HTML comments such as
<!-- #begintemplate="algo" -->
the spider replaces them with < begintemplate algo >. This is a problem because I have comments containing paths to the connection entries for my databases. The regular expression in the robot functions that matches this kind of statement is:
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\1\2',$text);
Sorry for my English, and thanks for any suggestions. Bye.
It is a problem with PHP > 4.3.2. The following should work as a possible solution. See this thread here:
Change ONLY this in robot_functions.php, line 160:
Code:
//replace any group of blank characters by an unique space
Code:
//replace any group of blank characters by
With this change, no HTML comments are indexed!
-Roland-
What about that
The real problem I have is that when I index an internet domain the comments appear, but when I index the intranet domain it works well (no comments). We have PHP 4.3.2. I changed the order of the eregi_replace calls in robot_functions.php.
BEFORE:
//replace blank characters by spaces
$text = eregi_replace("--|[{}();\"]+|</[a-z0-9]+>|[\r\n\t]+",' ',$text);
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\1\2',$text);
AFTER:
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\1\2',$text);
//replace blank characters by spaces
$text = eregi_replace("--|[{}();\"]+|</[a-z0-9]+>|[\r\n\t]+",' ',$text);
I tested the change and it seems to work fine. I will reindex everything today and if results... I will post another comment. Again, sorry for my English... :D
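For anyone who wants to see why the order of the two replacements matters, here is a minimal sketch that can be run outside PhpDig. It is not the code from robot_functions.php: the helper names blank_chars and strip_bang are made up for illustration, and preg_replace with the i flag is used as a stand-in for eregi_replace, which no longer exists in PHP 7 and later. The two patterns are the ones quoted above.

```php
<?php
// Stand-ins for the two eregi_replace() calls quoted in the thread.
// Assumption: preg_replace() with the "i" flag behaves the same for these patterns.

// Hypothetical helper: replaces "--", runs of {}();" characters, closing tags
// and runs of CR/LF/TAB with a single space.
function blank_chars(string $text): string
{
    return preg_replace('~--|[{}();"]+|</[a-z0-9]+>|[\r\n\t]+~i', ' ', $text);
}

// Hypothetical helper: drops the "!" from "<!X" unless X is a "-".
function strip_bang(string $text): string
{
    return preg_replace('~(<)!([^-])~i', '$1$2', $text);
}

$comment = '<!-- #begintemplate="algo" -->';

// BEFORE order: blank characters first, then the "<!" rule.
$before = strip_bang(blank_chars($comment));

// AFTER order: the "<!" rule first, then blank characters.
$after = blank_chars(strip_bang($comment));

var_dump($before);
var_dump($after);
```

In the BEFORE order the "--" is turned into a space first, so the "<!" rule no longer sees a "-" after the "!" and removes it, which gives the < begintemplate algo > form reported in the first post. In the AFTER order the "<!" rule leaves "<!--" alone, so the result still starts with "<!".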