|
10-13-2003, 03:39 PM | #16 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi Rolandks. The bug is that strip_tags is more lenient than before, meaning that certain things that used to be stripped no longer are. With preg_replace('/<.*>/sU', '', $text); and eregi_replace("<[^>]*>","",$text); everything between the < and > should be stripped. My personal preference is to use eregi_replace("<[^>]*>","",$text); over preg_replace('/<.*>/sU', '', $text); but either way I don't want to keep using strip_tags($text); because of the problems encountered.
Hi manute. What version of PhpDig are you running? In robot_functions.php, the phpdigCleanHtml function in version 1.6.2 is as follows: PHP Code:
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
10-14-2003, 05:29 AM | #17 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
hmm okay, then it's a bug. but what can i now do?
can't anyone just tell me how to get rid of "<!----------", "---------->" and everything in between? |
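One way to approach the question above is to remove comments with a dedicated pattern before removing the remaining tags. The following is only a sketch under a few assumptions: the helper name stripCommentsAndTags is made up for illustration and is not part of PhpDig, it is written for PHP 7 or later, and it uses preg_replace for both steps because the ereg family was removed in PHP 7.0. A generic "<[^>]*>" pattern stops at the first ">", so a comment that itself contains ">" would only be partly removed; matching from "<!--" to "-->" avoids that.

```php
<?php
// Hypothetical helper, not part of PhpDig: remove HTML comments first,
// then remove any remaining tags.
function stripCommentsAndTags(string $text): string
{
    // The "s" modifier lets "." match newlines, so multi-line comments are
    // caught; ".*?" is lazy, so each match ends at the nearest "-->".
    $text = preg_replace('/<!--.*?-->/s', '', $text) ?? $text;

    // Same idea as the "<[^>]*>" pattern quoted earlier in the thread.
    return preg_replace('/<[^>]*>/', '', $text) ?? $text;
}

// A comment in the "<!----------" style, spanning lines and containing ">".
$html = "Price <!----------\nold: a > b\n----------> <b>10</b> EUR";
echo stripCommentsAndTags($html), "\n";
```

Where in phpdigCleanHtml such a call would belong depends on the PhpDig version, so treat the placement as something to test on a small site first.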
10-14-2003, 02:59 PM | #18 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
What version of PhpDig are you running? |
10-14-2003, 03:59 PM | #19 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
i'm running "PhpDig 1.6.x" as it seems. that's written in the index.php.
i'm gonna try to understand the code you posted up there, but now i'm gonna go to bed... :-) cu! |
10-15-2003, 05:54 AM | #20 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
okay charter, i changed the robot_functions the way you posted it up there. now i'm reindexing. hope it's gonna work now. :-/
|
10-15-2003, 06:05 AM | #21 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
yes, that finally seems to work now. thank you charter!
|
10-16-2003, 05:03 AM | #22 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
"§/$%&"$ i'm really going crazy with that ****. it still doesn't work.
i tried putting $text = eregi_replace("<[^>]*>","",$text); and $text = preg_replace('/<.*>/sU', '', $text); into the cleanhtml-function, but it still doesn't work. haven't you tried it yourself - does that work with your sites? |
10-16-2003, 05:53 AM | #23 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
now i got an idea. :-)
in the cleanhtml-function there also happens the html-entities-replacing. so after that, of course there is no "<" to be replaced any more, it's "&lt;" then. i've tried to put it at the beginning of the function, before the html-entity-replacing. unfortunately spidering my site always takes a couple of hours, so i can't say if it works right now. but i'll do later. |
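The ordering problem described in this post can be reproduced in a few lines. This is a sketch, not PhpDig's code: htmlspecialchars() stands in for whatever entity replacement phpdigCleanHtml actually does (the thread does not show it), and both function names are invented for the comparison. It assumes PHP 7 or later.

```php
<?php
// Hypothetical stand-ins for the two steps. htmlspecialchars() is only an
// assumption; PhpDig's real entity handling is not shown in the thread.

// Entities first: "<" has already become "&lt;", so the tag pattern
// finds nothing left to remove.
function encodeThenStrip(string $text): string
{
    $text = htmlspecialchars($text);
    return preg_replace('/<[^>]*>/', '', $text) ?? $text;
}

// Tags first: the pattern still sees the raw "<" and ">" characters.
function stripThenEncode(string $text): string
{
    $text = preg_replace('/<[^>]*>/', '', $text) ?? $text;
    return htmlspecialchars($text);
}

$sample = 'Hello <b>world</b>';
echo encodeThenStrip($sample), "\n"; // tags survive as entities
echo stripThenEncode($sample), "\n"; // tags are gone
```

That matches the idea in the post: the stripping line has to run before the entity replacement, otherwise it has nothing to match.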
10-16-2003, 05:54 AM | #24 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
hey, this thread more and more looks like a discussion between me and myself... ;-)
is anyone else still reading here at all? :-) |
10-16-2003, 09:39 AM | #25 | |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Quote:
The following should work as a possible solution. Change ONLY this in robot_functions.php, line 160:
Code:
//replace any group of blank characters by an unique space
$text = ereg_replace("[[:blank:]]+"," ",strip_tags($text));
to:
Code:
//replace any group of blank characters by
$text = preg_replace('/<.*>/U', '', $text);
-Roland- |
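The pattern in this post uses only the U modifier, while the one quoted earlier in the thread uses sU. The difference matters for tags or comments that span several lines, because without s the "." does not match a newline. A small sketch of the comparison follows; the two function names are hypothetical and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical wrappers around the two patterns quoted in the thread.

// "U" only: ungreedy, but "." stops at a newline.
function stripTagsU(string $text): string
{
    return preg_replace('/<.*>/U', '', $text) ?? $text;
}

// "sU": ungreedy, and "." also matches newlines.
function stripTagsSU(string $text): string
{
    return preg_replace('/<.*>/sU', '', $text) ?? $text;
}

$multiLine = "keep <!-- one\ntwo --> this";

var_dump(stripTagsU($multiLine) === $multiLine); // comment is left in place
var_dump(stripTagsSU($multiLine));               // comment is removed
```

If the indexed pages contain multi-line comments, the sU variant is probably the safer of the two, at least in this small test.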
|
10-16-2003, 06:17 PM | #26 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
yes! now it finally works. thanks rolandks and charter! :-)
|
10-17-2003, 03:33 PM | #27 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Great, glad it's now working. |
10-17-2003, 03:59 PM | #28 |
Orange Mole
Join Date: Oct 2003
Location: hamburg, germany
Posts: 52
|
yeah, me too! :-)
|
01-19-2004, 05:25 PM | #29 |
Green Mole
Join Date: Nov 2003
Posts: 7
|
Works for me also. I just noticed those ugly comments in my search result snippets today. I'm not sure when my host updated PHP, but as far as I'm concerned it's very naughty of them to change the behavior of a standard function so radically.
|