PhpDig does not index numbers


redlock
09-20-2003, 07:49 AM
PhpDig does not index numbers.

Following a tip I found in the forum, I made this change: in the file phpdig/libs/phpdig_functions.php, in the function called 'phpdigEpureText', I replaced the line:

$text = ereg_replace('[[:blank:]][0-9]+[[:blank:]]',' ',ereg_replace('[^[:alnum:]ðþ._&ß%/-]+',' ',$text));

with the following:

$text = ereg_replace('[^[:alnum:]ðþ._&ß]+',' ',$text);

but it does not work.

Numbers like 2002, 477 or 2006 are not found, although they appear as plain text in the HTML. I have already re-indexed the files that contain the numbers, with no result.

Please help.

--
Sorry for my English, I'm German.

Charter
09-20-2003, 01:54 PM
Hello. The PhpDig 1.6.2 version of the function phpdigEpureText is as follows:

function phpdigEpureText($text, $min_word_length=2, $encoding=PHPDIG_ENCODING) {
    global $phpdig_words_chars;

    $text = phpdigStripAccents(strtolower ($text));
    //no-latin upper to lowercase - now islandic
    switch (PHPDIG_ENCODING) {
        case 'iso-8859-1':
            $text = strtr( $text,'ÐÞ','ðþ');
            break;
    }
    $text = ereg_replace('[[:blank:]][0-9]+[[:blank:]]', ' ', ereg_replace('[^'.$phpdig_words_chars[$encoding].'._&%/-]+', ' ', $text));
    $text = ereg_replace('[[:blank:]][^ ]{1,'.$min_word_length.'}[[:blank:]]', ' ', ' '.$text.' ');
    $text = ereg_replace('\.+[[:blank:]]|\.+$|\.{2,}', ' ', $text);
    return trim(ereg_replace("[[:blank:]]+"," ",$text));
}

This function lets you search on numbers (try it here (http://www.phpdig.net/demo/search.php?query_string=200)), and in config.php you can set the word length below which words are ignored.

I hope my German is readable. :)
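
To make the number-handling line of phpdigEpureText easier to follow, here is a minimal sketch of what its outer pattern does when taken in isolation. It uses preg_replace rather than ereg_replace (the ereg functions were removed in PHP 7), and the helper name sketchStripStandaloneNumbers is hypothetical, not part of PhpDig. It only illustrates the regular expression; whether this is what causes the missing numbers reported in this thread is not confirmed.

```php
<?php
// Hypothetical helper, not part of PhpDig: applies only the outer pattern
// from the phpdigEpureText line quoted above, translated to PCRE syntax.
function sketchStripStandaloneNumbers(string $text): string
{
    // A run of digits with a blank (space or tab) on both sides
    // is replaced by a single space.
    return preg_replace('/[[:blank:]][0-9]+[[:blank:]]/', ' ', $text);
}

// "2002/2003" is joined by "/", so neither number has a blank on both
// sides and the pattern leaves the text alone.
echo sketchStripStandaloneNumbers(' in 2002/2003 '), "\n";

// Each match consumes its trailing blank, so the number that follows
// immediately has no leading blank left to match and is kept.
echo sketchStripStandaloneNumbers(' 481 477 1995 '), "\n";
```

In this sketch the second call keeps only 477: the matches for 481 and 1995 each use up the blank that 477 would need on either side.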

redlock
09-21-2003, 11:57 PM
Thanks, but it does not work correctly!

For example:
Some HTML pages contain numbers like 2002/2003 or 481, 477, BR 476. The search finds only the number 2002 but not the number 2003, although it is in the same file! I don't understand this.
I have also re-indexed the files, with no result.

Charter
09-26-2003, 05:52 PM
Hi. I've been able to replicate the problem with phrases and will work on it for a future release.

Charter
10-05-2003, 02:15 PM
Just curious... Are you using "words begin," "exact words," or "any words part" when performing the search on numbers?

For example, if you have 2002/2003 and search on 200 with "words begin" then you should get 2002/2003, and if you search with "any words part" then you should also get 2002/2003.

redlock
10-06-2003, 07:00 AM
Yes, I have the same problem.

For example: 2002/2003
Search: "200"

search with "words begin" = 2002/2003
search with "exact words" = only the number 200
search with "any words part" = 2002/2003

Another example from the same website:
one page contains "481", "1995" and "2003"
every search finds only "481"

Or another example from the same website:
one page contains "481" and "477"
every search finds nothing!

Strange.

Charter
10-06-2003, 03:44 PM
Hi. Actually, that's the way I think it's supposed to work, although I can see where it can be improved. Here's what it's doing:

For a search with "words begin" => 2002/2003, this is because 2002/2003 begins with [space]200, whereas the 200 in 2003 is actually preceded by [space]2002/, so 2003 on its own is not found.

For a search with "exact words" => only the number 200, this is because it's looking for [space]200[space] and nothing else.

For a search with "any words part" => 2002/2003, this is because it doesn't care what is at the beginning or end, just as long as 200 is in there somewhere.
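
The three behaviours described above can be sketched as plain string comparisons against a single indexed keyword. This is an illustration only: the function name sketchKeywordMatches and the mode strings are assumptions made for this example, and PhpDig's real search code may work differently.

```php
<?php
// Hypothetical sketch, not PhpDig's implementation: decides whether one
// indexed keyword matches a query under each of the three search modes.
function sketchKeywordMatches(string $keyword, string $query, string $mode): bool
{
    switch ($mode) {
        case 'words begin':
            // The keyword must start with the query.
            return strpos($keyword, $query) === 0;
        case 'exact words':
            // The keyword must be identical to the query.
            return $keyword === $query;
        case 'any words part':
            // The query may appear anywhere inside the keyword.
            return strpos($keyword, $query) !== false;
    }
    throw new InvalidArgumentException("Unknown search mode: $mode");
}

// "2002/2003" is treated as one keyword because "/" is kept by the indexer.
var_dump(sketchKeywordMatches('2002/2003', '200', 'words begin'));
var_dump(sketchKeywordMatches('2002/2003', '200', 'exact words'));
var_dump(sketchKeywordMatches('2002/2003', '2003', 'words begin'));
var_dump(sketchKeywordMatches('2002/2003', '2003', 'any words part'));
```

Under these assumptions a search for 2003 only succeeds with "any words part", because the keyword as a whole begins with 2002/ rather than with 2003.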