#1 |
Purple Mole
Join Date: Aug 2004
Location: North Island New Zealand
Posts: 170
|
Grep command for finding Saga zero length
Charter, I am still having problems with zero-content descriptions for websites that have been indexed.
It appears the site id and spider id are out of sequence; would you know a way of bringing them back into sequence? I have tried using the grep command to look for files named SAGA, or that contain Saga, within the text_content directory, but as yet nothing has shown itself. When listing the files in the text_content directory, a few have a file length of one. Shall I delete the files that are just one byte long, or is there an easier way to get the database running okay again? Heaps of regards, Dave A |
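For the record, the commands I have been trying are along these lines (a sketch only; the setup step just creates a sample text_content directory so the commands can be run anywhere, on the real server you would run them from the PhpDig directory itself):

```shell
# Demo setup with invented sample files (skip on a real install):
mkdir -p text_content
printf 'Saga zero length test' > text_content/1.txt
printf '\n' > text_content/2.txt      # a one-byte file, like those in the thread

# Files whose contents mention "saga" (case-insensitive):
grep -ril 'saga' text_content/

# Files whose names contain "saga":
find text_content/ -iname '*saga*'

# Files exactly one byte long, the deletion candidates:
find text_content/ -type f -size 1c
```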
#2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Try running the following query:
Code:
$query = "SELECT ".PHPDIG_DB_PREFIX."spider.spider_id AS spider_spider_id,
                 ".PHPDIG_DB_PREFIX."spider.site_id AS spider_site_id,
                 ".PHPDIG_DB_PREFIX."sites.site_id AS site_site_id
          FROM ".PHPDIG_DB_PREFIX."spider
          LEFT JOIN ".PHPDIG_DB_PREFIX."sites
            ON (".PHPDIG_DB_PREFIX."sites.site_id = ".PHPDIG_DB_PREFIX."spider.site_id)
          ORDER BY ".PHPDIG_DB_PREFIX."spider.spider_id ASC";
Code:
+------------------+----------------+--------------+
| spider_spider_id | spider_site_id | site_site_id |
+------------------+----------------+--------------+
|                1 |              1 |            1 |
|                2 |              1 |            1 |
|                3 |              2 |            2 |
|                4 |              2 |            2 |
|                5 |              2 |            2 |
|                6 |              1 |            1 |
|                7 |              1 |            1 |
|                8 |              1 |            1 |
|                9 |              1 |            1 |
|               10 |              1 |            1 |
|               11 |              1 |            1 |
|               12 |              1 |            1 |
|               13 |              1 |            1 |
|               14 |              1 |            1 |
|               15 |              1 |            1 |
|               16 |              1 |            1 |
+------------------+----------------+--------------+
16 rows in set (0.00 sec)
Code:
text_content> ls
1.txt   2.txt   3.txt   4.txt   5.txt   6.txt   7.txt   8.txt   9.txt
10.txt  11.txt  12.txt  13.txt  14.txt  15.txt  16.txt  keepalive.txt
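The LEFT JOIN keeps every spider row even when no matching sites row exists, so a NULL site_site_id is what flags a spider entry whose site_id has fallen out of sequence. A minimal sketch of the same check, using Python's sqlite3 as a stand-in for MySQL, with PHPDIG_DB_PREFIX assumed empty and invented sample data (the orphaned site_id 9 is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE spider (spider_id INTEGER, site_id INTEGER)")
cur.execute("CREATE TABLE sites (site_id INTEGER)")
cur.executemany("INSERT INTO spider VALUES (?, ?)",
                [(1, 1), (2, 1), (3, 2), (4, 9)])   # site_id 9 has no sites row
cur.executemany("INSERT INTO sites VALUES (?)", [(1,), (2,)])

# Same shape as the diagnostic query above: every spider row survives
# the LEFT JOIN; a NULL in the last column marks an out-of-sequence entry.
rows = cur.execute("""
    SELECT spider.spider_id AS spider_spider_id,
           spider.site_id   AS spider_site_id,
           sites.site_id    AS site_site_id
    FROM spider
    LEFT JOIN sites ON sites.site_id = spider.site_id
    ORDER BY spider.spider_id ASC
""").fetchall()
for row in rows:
    print(row)

# Just the orphaned spider ids:
orphans = cur.execute("""
    SELECT spider_id FROM spider
    LEFT JOIN sites ON sites.site_id = spider.site_id
    WHERE sites.site_id IS NULL
""").fetchall()
print("orphaned spider ids:", [r[0] for r in orphans])
```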
__________________
Responses are offered on a voluntary, as-time-is-available basis; no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email. Thank you for your understanding. |
#3 |
Purple Mole
Join Date: Aug 2004
Location: North Island New Zealand
Posts: 170
|
Thanks Charter!
Thanks for that Charter,
You gave brilliant support with that reply to my question. It did take an age to process, but then the database is quite large now. The problem is now resolved.

I upgraded the server with an extra gig of RAM, and the speed of the searches has increased by around 30%, which is really good. I now run a cron job on the server every 60 minutes that reports back the state of the file system; often a few tmp files need removing, and the speed comes back up.

The speed issue with PhpDig can be helped by an increase in memory: with the RAID system under Linux, it can produce large tmp files if memory usage rises a heap. Memory usage on my server was often around 90% or higher, which made it write temp files to the hard disk, and the extra RAM has helped its speed a heap. Linux Strike needed its kernel changed and updated to see the free memory increase, but it is now flying.

Thanks for your great help and assistance. All the best, Dave A |
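For anyone wanting the same kind of hourly report, a sketch (the script path and the exact checks are illustrative, not my actual setup):

```shell
# Hypothetical hourly cron entry (crontab -e); the script path is an
# assumption for illustration:
#   0 * * * * /usr/local/bin/fs-report.sh

# What such a report script might run:
df -h                                      # free space per filesystem
find /tmp -maxdepth 1 -type f -mmin +60    # tmp files older than an hour
```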
Thread | Thread Starter | Forum | Replies | Last Post |
zero length files! | Dave A | Troubleshooting | 0 | 01-10-2006 01:09 AM |
Num_page zero length | Dave A | Troubleshooting | 3 | 11-18-2005 03:37 AM |
Using phpdig for finding copyright infringements | leto | How-to Forum | 0 | 09-22-2005 12:31 AM |
Spider not finding anything. | nvahalik | Troubleshooting | 2 | 01-25-2005 01:41 PM |
finding dead links | manute | How-to Forum | 2 | 01-14-2004 03:00 PM |