keywords duplicates and unwanted keywords


jerrywin5
03-27-2005, 01:21 AM
I am using version 1.8.7.

My keywords table has entries that end with characters such as:
,.-:;# and 's

I would like to ensure that these values are stripped from words when indexing.

I checked the source against the instructions in this thread (http://www.phpdig.net/forum/showthread.php?t=845) and found that the new version has this code implemented.

My keywords table also includes entries that end with:
.doc .mid .shtml .html .gif .jpg .pnp

I would like to ensure that these entries are no longer indexed.

I would appreciate any help.

Charter
04-06-2005, 01:23 PM
That, fortunately or unfortunately, is how PhpDig 1.8.7 handles words followed by punctuation marks. PhpDig 1.8.8 RC1 avoids the issue.
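
For anyone staying on 1.8.7, the cleanup asked about in the first post can be sketched independently of PhpDig. The following is a minimal illustration in Python, not PhpDig's own code and not the fix shipped in 1.8.8 RC1. The names `TRAILING_JUNK`, `UNWANTED_EXTENSIONS`, `clean_keyword` and `clean_keywords` are hypothetical, and the character and extension lists are simply copied from the examples above. It assumes keywords are compared case-insensitively.

```python
# Hypothetical settings, copied from the examples in the first post.
TRAILING_JUNK = ",.-:;#"
UNWANTED_EXTENSIONS = (".doc", ".mid", ".shtml", ".html", ".gif", ".jpg", ".pnp")


def clean_keyword(word):
    """Return the cleaned keyword, or None if it should not be indexed."""
    # Assumption: keywords are stored lowercase.
    word = word.lower().rstrip(TRAILING_JUNK)
    # Skip entries that look like file names (e.g. "index.html").
    if word.endswith(UNWANTED_EXTENSIONS):
        return None
    # Drop a trailing possessive 's, then any punctuation it exposed.
    if word.endswith("'s"):
        word = word[:-2].rstrip(TRAILING_JUNK)
    # An empty result (e.g. the input was just "#") is not a keyword.
    return word or None


def clean_keywords(words):
    """Clean a list of words and remove duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for raw in words:
        cleaned = clean_keyword(raw)
        if cleaned is not None and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


if __name__ == "__main__":
    sample = ["search,", "Search", "engine's", "index.html", "photo.jpg.", "php-", "#"]
    print(clean_keywords(sample))
```

The same logic could be run as a one-off pass over an existing keywords table, but how that maps onto PhpDig's schema is not covered in this thread, so treat it only as a starting point.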

jerrywin5
04-06-2005, 01:40 PM
Hi Charter,

Due to the new requirements, I cannot use PhpDig 1.8.8 RC1. Is it possible to use some of the code from 1.8.8 RC1 in 1.8.7 to resolve the problem?

Charter
04-06-2005, 02:17 PM
Sorry, I don't have a list of code changes from 1.8.7 to 1.8.8 RC1, and 1.8.8 RC1 uses mb_* for multi-byte processing. :(

jerrywin5
04-06-2005, 02:54 PM
Charter,

What is multi-byte processing and what will it do?

Charter
04-06-2005, 03:20 PM
PhpDig v.1.8.8 RC1 converts text to UTF-8 and stores it as UTF-8 in MySQL. Basically, some languages use more than one byte per letter/character, and different sites use different encodings, so the mb_* functions try to deal with these things. Check out http://www.php.net/manual/en/ref.mbstring.php for an intro to multibyte functions.
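
To make the "more than one byte per character" point concrete, here is a small sketch in Python rather than PHP, so it does not show the mb_* calls PhpDig actually makes. The helper name `to_utf8` is hypothetical, and the source encoding is assumed to be known in advance, which is not always true for real pages.

```python
def to_utf8(raw_bytes, source_encoding):
    """Re-encode bytes from a known source encoding into UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    word = "café"
    latin1_bytes = word.encode("iso-8859-1")        # one byte per character
    utf8_bytes = to_utf8(latin1_bytes, "iso-8859-1")

    # Character count and byte count differ once text is stored as UTF-8.
    print(len(word), len(latin1_bytes), len(utf8_bytes))

    japanese = "日本語"
    print(len(japanese), len(japanese.encode("utf-8")))
```

In PHP the same distinction shows up as the difference between `strlen()`, which counts bytes, and `mb_strlen()`, which counts characters in a given encoding.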