Quote:
Originally posted by Charter
Hi. There is a ¤¢¤ combo in the $string variable where ¢¤ is not replaced with a space. Did you mean something else?
|
Errrr... I'm not sure I understand.
The script you submitted uses a regular expression to avoid replacing ¢¤ when the byte before it is ¤, right?
What I meant is this: when the character before ¢¤ really is a multi-byte character ending in ¤, ¢¤ is not replaced either. But I think there is little chance of that happening.
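To make the case concrete, here is a minimal sketch in Python (not the submitted PHP script). It assumes that ¢¤ stands for the byte pair A2 A4 and ¤ for the byte A4, and that the check is a negative lookbehind; the pattern and the name replace_target are made up for illustration.

```python
import re

# Assumption: "¢¤" is the byte pair A2 A4 and "¤" is the byte A4 (Latin-1 view).
# Hypothetical pattern: replace A2 A4 unless the byte just before it is A4.
TARGET_RE = re.compile(rb"(?<!\xa4)\xa2\xa4")

def replace_target(data: bytes) -> bytes:
    """Replace every unguarded A2 A4 pair with a single space."""
    return TARGET_RE.sub(b" ", data)

# Character B0 A1, then the target, then character B0 A1: replaced as intended.
aligned = b"\xb0\xa1\xa2\xa4\xb0\xa1"

# Characters A4 A2 and A4 B0: the A2 A4 in the middle straddles two characters,
# so skipping it is correct.
misaligned = b"\xa4\xa2\xa4\xb0"

# Character B0 A4, then a real target: the lookbehind sees A4 and skips it too.
false_negative = b"\xb0\xa4\xa2\xa4"

for sample in (aligned, misaligned, false_negative):
    print(sample, "->", replace_target(sample))
```

The third sample is the case I was describing: the target is genuine, but it is left alone because the previous character happens to end in ¤.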
Quote:
Originally posted by Charter
>> The script extracts the longest matching word from the page text and indexes it.
With the multi-byte dictionary, is it that only the longest matching word from a page gets indexed?
|
No, of course not. It extracts all the words, comparing the page content against the longest dictionary words first. For example, in English it wouldn't extract "nation" from "internationalization" if "internationalization" is in the dictionary.
But the dictionary must be as complete as possible to do a good job.
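Roughly, the idea can be sketched like this in Python. This is only an illustration under my own assumptions, not the actual script: the function name extract_words is made up, and it assumes matched text is blanked out so that shorter words contained in it cannot match afterwards.

```python
def extract_words(text, dictionary):
    """Return dictionary words found in text, trying the longest words first.

    Each matched word is replaced by a space in the working copy, so a
    shorter word contained in an already matched longer word is not extracted.
    """
    found = []
    remaining = text
    # Longest first; ties are broken alphabetically to keep the order stable.
    for word in sorted(dictionary, key=lambda w: (-len(w), w)):
        if not word:
            continue
        count = remaining.count(word)
        if count:
            found.extend([word] * count)
            remaining = remaining.replace(word, " ")
    return found


page = "the nation and internationalization"
print(extract_words(page, ["nation", "internationalization"]))
print(extract_words(page, ["nation"]))
```

With both words in the dictionary, "nation" is extracted once (the standalone occurrence). With only "nation" in the dictionary it is extracted twice, once from inside "internationalization", which is why an incomplete dictionary gives worse results.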
Can it be integrated into phpdig?