|
10-23-2003, 01:54 AM | #1 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Words after SMALL_WORDS_SIZE not indexed
On pages where a short word that is excluded by SMALL_WORDS_SIZE = 2 is joined to other words with a hyphen (-), NONE of the words after the hyphen are indexed.
Example (for Demo 1.6.2): in If-Modified, "Modified" is NOT found on this page, although the other words on the page are indexed: http://httpd.apache.org/docs/misc/perf-tuning.html
Okay, Modified is in the index, but NOT this occurrence of "Modified" (and no other word that follows a - is found either)!
Another example to test: add the words or-juzutuziopa to a page and index that page. juzutuziopa is NOT found, and or-juzutuziopa is not found either; juzutuziopa is not in the keyword table!
Any hints?
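To make the expected behaviour concrete, here is a minimal sketch in Python (not PhpDig's PHP code). It assumes that words of SMALL_WORDS_SIZE characters or fewer are skipped; the helper name tokenize and the keep_hyphens flag are hypothetical and only illustrate the two plausible outcomes: either the hyphen separates words and the long part survives, or the hyphenated compound is kept as one word.

```python
import re

SMALL_WORDS_SIZE = 2  # assumption: words of this length or shorter are skipped


def tokenize(text, keep_hyphens):
    """Hypothetical tokenizer; this is not PhpDig's phpdigEpureText()."""
    if keep_hyphens:
        # Treat a hyphenated compound such as "or-juzutuziopa" as one word.
        pattern = r"[a-z0-9]+(?:-[a-z0-9]+)*"
    else:
        # Treat the hyphen as a separator between words.
        pattern = r"[a-z0-9]+"
    words = re.findall(pattern, text.lower())
    # Drop the small words after splitting.
    return [w for w in words if len(w) > SMALL_WORDS_SIZE]


split_tokens = tokenize("If-Modified or-juzutuziopa", keep_hyphens=False)
joined_tokens = tokenize("If-Modified or-juzutuziopa", keep_hyphens=True)
print(split_tokens)   # ['modified', 'juzutuziopa']
print(joined_tokens)  # ['if-modified', 'or-juzutuziopa']
```

In both variants juzutuziopa ends up in the result in some form, which is why losing it entirely looks like a bug rather than a consequence of SMALL_WORDS_SIZE.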
__________________
-Roland- :: Test PhpDig 1.6.2 here :: - :: Test-Search for (little) Intelligent Php-Dig Fuzzy :: |
10-23-2003, 09:00 AM | #2 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Indexing and the exclusion of small words happen in:
admin\robot_functions.php (line 873)
admin\robot_functions.php (line 913)
The function phpdigEpureText($text,$min_word_length=2,$encoding = PHPDIG_ENCODING) is in libs\phpdig_functions.php (line 213).
or-juzutuziopa must be indexed as one word! It could be a name, a city, or something else, I think.
-Roland- Last edited by Rolandks; 10-24-2003 at 08:54 AM. |
10-24-2003, 12:50 AM | #3 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
I have tried it on another machine: or-juzutuziopa is indexed and works with PHP 4.3.0, so this is again a PHP >= 4.3.2 problem!
Hmm, please move this thread to Bugs. -Roland- |
10-27-2003, 08:41 AM | #4 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
I am not an expert in regular expressions, but I think this is the reason for all the bugs that involve ereg_replace in PHP >= 4.3.2:
libs\phpdig_functions.php (line 213): PHP Code:
Can anyone change ALL ereg_replace calls to an SGML-conformant version, because the behaviour has changed since PHP 4.3.2? Thanks -Roland- |
11-06-2003, 08:24 AM | #5 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Hello?!
Does no one have an idea why, when words are separated with a -, ALL words after the - are NOT indexed in PHP >= 4.3.2 but are indexed in PHP < 4.3.2? Example: no-index-this. It's important - thanks. -Roland- |
11-07-2003, 02:12 PM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. When you run the following, what do you see when you look at the HTML source?
PHP Code:
Code:
or-juzutuziopa <- orig text<br>
or-juzutuziopa <- new text<br>
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
11-08-2003, 07:54 AM | #7 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Okay, I get the same result.
See this search: x-compress is NOT found. "compress" is in the keyword table because the word "compress" occurs elsewhere in the pages. Try adding or-juzutuziopa to one of the Apache pages and reindexing the site. If you are using PHP 4.3.2 or 4.3.3 on the server, the word juzutuziopa is NOT indexed and NOT in the keyword table. But with PHP 4.3.1 or PHP 4.3.0 it is indexed. I don't know why. -Roland- Last edited by Rolandks; 11-08-2003 at 08:16 AM. |
11-08-2003, 08:27 AM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. In search_function.php find:
PHP Code:
PHP Code:
Of course, remove any "word" wrapping in the above code.
|
11-08-2003, 09:21 AM | #9 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
In search_function.php? I do NOT find this PHP code (if (eregi("[^[:alnum:]^ ....) anywhere in the complete PhpDig code.
Why search_function.php? The words after the - are NOT indexed! I think the problem is in admin\robot_functions.php! -Roland- |
11-08-2003, 09:26 AM | #10 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Oh I see. I was going off of the example search posted above.
I use the code above so it now allows dashes in the searches. Not indexed is the problem, as you posted. Silly me.
|
11-08-2003, 10:34 AM | #11 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Try running the following code (remove any "word" wrapping if necessary).
PHP Code:
Code:
My t-shirt is blue. A<---<br><br>
my t-shirt is blue. B<---<br><br>
my t-shirt is blue. C<---<br><br>
my t-shirt is blue. D<---<br><br>
t-shirt blue. E<---<br><br>
t-shirt blue F<---<br><br>
t-shirt blue G<---<br><br>
|
11-08-2003, 01:11 PM | #12 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Hmm, a difficult problem - just the same
Code:
My t-shirt is blue. A<---<br><br>
my t-shirt is blue. B<---<br><br>
my t-shirt is blue. C<---<br><br>
my t-shirt is blue. D<---<br><br>
t-shirt blue. E<---<br><br>
t-shirt blue F<---<br><br>
t-shirt blue G<---<br><br>
-Roland- |
11-10-2003, 05:54 AM | #13 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Okay, I found the problem.
t-shirt is indexed in the keyword table as: t-shirt
or-juzutuziopa is indexed in the keyword table as: or-juzutuziopa
BUT if you search for t-shirt or or-juzutuziopa, you get:
"t", are too short words and were ignored.
"or", are too short words and were ignored.
AND a search for shirt or juzutuziopa returns empty results.
So the problem is in search_function with PHP 4.3.2 or 4.3.3!
With PHP 4.3.0 / 4.3.1, if you search for t-shirt you get: Results 1-2, 2 total, on "t-shirt" (0.48 seconds). If you search for shirt you get an empty result.
-Roland- Last edited by Rolandks; 11-10-2003 at 06:09 AM. |
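The mismatch described in post #13 can be sketched as follows. This is a Python illustration under stated assumptions, not PhpDig's search_function.php: the keyword_table contents, the search helper, the split_hyphen flag, and the exact-match lookup are all hypothetical, and SMALL_WORDS_SIZE = 2 is assumed to mean that terms of two characters or fewer are ignored.

```python
SMALL_WORDS_SIZE = 2  # assumption: terms of this length or shorter are ignored

# Hypothetical index contents: compounds were stored with their hyphen.
keyword_table = {"t-shirt", "or-juzutuziopa", "blue"}


def search(query, split_hyphen):
    """Return (ignored_terms, matched_keywords) for a query string."""
    text = query.lower()
    if split_hyphen:
        # Query side treats "-" as a separator, unlike the index side.
        text = text.replace("-", " ")
    ignored, hits = [], []
    for term in text.split():
        if len(term) <= SMALL_WORDS_SIZE:
            ignored.append(term)   # reported as "too short"
        elif term in keyword_table:
            hits.append(term)      # exact-match lookup (assumption)
    return ignored, hits


split_result = search("t-shirt", split_hyphen=True)
whole_result = search("t-shirt", split_hyphen=False)
print(split_result)  # (['t'], [])
print(whole_result)  # ([], ['t-shirt'])
```

When the query is split on the hyphen, "t" is dropped as too short and "shirt" matches nothing, because only the compound "t-shirt" was stored. When the hyphen is kept, the query matches the stored keyword.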
11-10-2003, 06:13 AM | #14 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. First apply the patch in post five of this thread, and then apply the patch in post eight above, and make sure that in search_function.php the following line is commented out.
PHP Code:
|
11-10-2003, 06:35 AM | #15 |
Purple Mole
Join Date: Sep 2003
Location: Kassel, Germany
Posts: 119
|
Okay, thanks. The patch from post five has been included for many weeks, and the line is also commented out. But what does this mean?
Quote:
Quote:
|