What about the text_content file


marb
04-09-2004, 08:35 AM
I have installed phpdig more than once, in different directories.
The files in the text_content directory (directory set to 777) have mode 644 in dir a and 666 in dir b.
Dir b has the keep alive text, dir a does not.
How can there be differences?
With spidering, text_content is growing very big.
Q.1 Is it possible to put the text_content file in the DB as well?
Q.2 What happens if I empty the text_content file after
spidering? What is the function of that file?


Marten :)

Charter
04-10-2004, 01:33 PM
Hi. For question one, only the first words of a page are currently stored in the table. For question two, the text_content directory is used in the search process. You can set define('CONTENT_TEXT',1); to zero in the config file so that just the first words are used in the search results, but the highlighting currently doesn't work the same as when using the text_content directory.
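As a minimal sketch of the setting described above, the relevant line in the phpdig config file might look like this. The comments are an interpretation of this post, not taken from the phpdig documentation, and the value 1 is assumed to be the shipped setting only because the post speaks of changing it to zero.

```php
<?php
// Sketch only: the CONTENT_TEXT switch as described in this thread.
// 1 = use the text_content directory in the search process (assumed current setting)
// 0 = use just the first words of each page, as stored in the table
define('CONTENT_TEXT', 0);
```

With the value at 0, highlighting in the search results does not currently work the same way as it does with the text_content directory.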

marb
04-12-2004, 03:24 AM
Charter wrote:
Hi. For question one, only the first words of a page are currently stored in the table.
Thanks for the reply.
Is it possible to store more words in the table rather than using text_content?

The reason is that I use phpdig on more than one URL with one DB.
If I use text_content (it's a nice option), I must copy that file
to the other URL every time after a new spider run to keep them equal.
That's not a big problem, but the file grows bigger and bigger.
In my case it's now about 450 MB for more than 18000 URLs.

Marten :)

renehaentjens
04-12-2004, 11:42 PM
In my config file, I have:
define('SUMMARY_LENGTH',50000); // instead of 500
define('CONTENT_TEXT',0);