Hi. What version of PHP do you have? Try running the following. What are the results when viewing the HTML source?
PHP Code:
<?
$text = "<!-- test -->";
$text2 = phpdigCleanHtml($text);
function phpdigCleanHtml($text) {
//htmlentities
//global $spec;
//replace blank characters by spaces
$text = ereg_replace("[\r\n\t]+"," ",$text);
echo $text . "A<br>\n";
//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
$title = $regs[1];
}
else {
$title = "";
}
//delete content of head, script, and style tags
$text = eregi_replace("<head[^<>]*>.*</head>"," ",$text);
echo $text . "B<br>\n";
$text = eregi_replace("<script[^>]*>.*</script>"," ",$text);
echo $text . "C<br>\n";
$text = eregi_replace("<style[^>]*>.*</style>"," ",$text);
echo $text . "D<br>\n";
// clean tags
$text = eregi_replace("(</?[a-z0-9 ]+>)",'\\1 ',$text);
echo $text . "E<br>\n";
//tries to replace htmlentities by ascii equivalent
/*
foreach ($spec as $entity => $char) {
$text = eregi_replace ($entity."[;]?",$char,$text);
$title = eregi_replace ($entity."[;]?",$char,$title);
}
*/
$text = ereg_replace('&#([0-9]+);',chr('\\1').' ',$text);
echo $text . "F<br>\n";
//replace blank characters by spaces
$text = eregi_replace("--|[{}();\"]+|</[a-z0-9]+>|[\r\n\t]+",' ',$text);
echo $text . "G<br>\n";
//f..k <!SOMETHING tags !!
$text = eregi_replace('(<)!([^-])','\\1\\2',$text);
echo $text . "H<br>\n";
//replace any group of blank characters by an unique space
$text = ereg_replace("[[:blank:]]+"," ",strip_tags($text));
echo $text . "I<br>\n";
//$retour['content'] = $text;
//$retour['title'] = $title;
return $text;
}
echo $text2."J<br>";
?>
I get the following when I view the HTML source:
Code:
<!-- test -->A<br>
<!-- test -->B<br>
<!-- test -->C<br>
<!-- test -->D<br>
<!-- test -->E<br>
<!-- test -->F<br>
<! test >G<br>
< test >H<br>
I<br>
J<br>
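If you are not sure which PHP version the server is running, here is a minimal sketch for checking it from the same place you ran the test. The helper name reportPhpVersion is made up for illustration; phpversion() is the built-in function doing the work.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of PhpDig).
// phpversion() is a PHP built-in that returns the interpreter version as a string.
function reportPhpVersion() {
    return "PHP version: " . phpversion();
}

// Print the version so it shows up when viewing the HTML source.
echo reportPhpVersion() . "<br>\n";
```

Post the version line along with the A to J output so the two can be compared.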