PhpDig.net

02-22-2004, 05:31 PM   #1
tibabs
Green Mole
 
Join Date: Feb 2004
Posts: 3
An easy way to boost PhpDig?

Hi,

For the past two days I have been trying to index 20,000 HTML pages into a database whose phpdig_engine table already holds close to 6,000,000 records.
Unfortunately, it takes a very long time: indexing a single HTML file can take more than 1 minute, whereas it took around 3 seconds two weeks ago.
After investigating my system (XP), my database limits (MySQL), and my folder size (20,000 HTML files), I found the solution.
I hope it helps somebody else.

Using a time-tracking function, I discovered that the time-consuming code is the "Optimizing" phase of the spider.php file (PhpDig V1.8.0). So just comment out these 4 lines, run your own optimization process on a schedule that suits you (every 5,000 digs, for example), and enjoy your newly boosted PhpDig.

=== Code to comment out in spider.php
//print "Optimizing tables...".$br;
//@mysql_query("OPTIMIZE TABLE ".PHPDIG_DB_PREFIX."spider",$id_connect);
@mysql_query("OPTIMIZE TABLE ".PHPDIG_DB_PREFIX."engine",$id_connect);
@mysql_query("OPTIMIZE TABLE ".PHPDIG_DB_PREFIX."keywords",$id_connect);


Remarks: I only use PhpDig to insert new HTML files. There are no updates and no deletes, so the original PhpDig optimization phase matters less for me.
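
The "optimize every 5,000 digs" idea could be sketched as below. This is only an illustration under assumptions: `optimize_statements()`, `should_optimize()`, `$run_query` and `$dig_count` are hypothetical names, not part of PhpDig, and the `phpdig_` prefix is inferred from the phpdig_engine table name above. The SQL is passed to a callback so the sketch does not depend on any particular MySQL extension.

```php
<?php
// Hypothetical helper (not part of PhpDig): builds the OPTIMIZE TABLE
// statements for the three tables that spider.php optimizes.
function optimize_statements($prefix)
{
    $statements = array();
    foreach (array('spider', 'engine', 'keywords') as $table) {
        $statements[] = 'OPTIMIZE TABLE ' . $prefix . $table;
    }
    return $statements;
}

// Hypothetical helper: true when the running dig count is a positive
// multiple of $every.
function should_optimize($dig_count, $every = 5000)
{
    return $every > 0 && $dig_count > 0 && $dig_count % $every === 0;
}

// Usage sketch: $run_query stands in for whatever executes SQL in your setup.
$run_query = function ($sql) {
    echo $sql, "\n";
};
$dig_count = 10000; // assumed to be tracked by your own batch script
if (should_optimize($dig_count)) {
    foreach (optimize_statements('phpdig_') as $sql) {
        $run_query($sql);
    }
}
```

The counter has to be kept by whatever script launches the digs, since each spider run does not know how many runs came before it.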

Regards.
tibabs.
02-22-2004, 05:33 PM   #2
tibabs
Green Mole
 
Join Date: Feb 2004
Posts: 3
Oops ... I forgot to comment out 2 lines in my post.

=== Code to comment out in spider.php
//print "Optimizing tables...".$br;
//@mysql_query("OPTIMIZE TABLE ".PHPDIG_DB_PREFIX."spider",$id_connect);
//@mysql_query("OPTIMIZE TABLE ".PHPDIG_DB_PREFIX."engine",$id_connect);
//@mysql_query("OPTIMIZE TABLE ".PHPDIG_DB_PREFIX."keywords",$id_connect);


Regards,
tibabs.
02-25-2004, 05:33 AM   #3
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
I have the same problem.
I've started indexing many sites and noticed the engine slowing down. My engine table contains 1/2 million records.
Indexing each page takes several seconds (up to 1 min).

But your solution is strange, because optimization is done only once, at the end of site spidering.

Any suggestions?
02-25-2004, 04:24 PM   #4
tibabs
Green Mole
 
Join Date: Feb 2004
Posts: 3
Hi,

In my case I'm only indexing different HTML pages that are not necessarily linked together.
It seems, as you said, that the optimization phase is done only once ... but ...

Here is what I can suggest:
Sol #1) Try commenting out the optimization phase and look at the result ==> 5 minutes
Sol #2) Use the phpdigTimer class to profile the source, or use other profiling tools such as http://www.pear.php.net/package/Benchmark. ==> 1 hour

I think you can find out where the trouble is coming from quite quickly (1 hour) and then fix it.
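
For Sol #2, a rough idea of what per-phase timing can look like is sketched below. `PhaseTimer` is a hypothetical stand-in written for this example; it is not the phpdigTimer API and not the PEAR Benchmark API, and the `usleep()` calls are placeholders for the real indexing and optimizing work.

```php
<?php
// Hypothetical stand-in for a profiling timer; this is NOT phpdigTimer.
class PhaseTimer
{
    private $starts = array();
    private $totals = array();

    // Record the start time of a named phase.
    public function start($phase)
    {
        $this->starts[$phase] = microtime(true);
    }

    // Add the elapsed time since start() to the phase's running total.
    public function stop($phase)
    {
        if (!isset($this->starts[$phase])) {
            return; // stop() without a matching start() is ignored
        }
        $elapsed = microtime(true) - $this->starts[$phase];
        unset($this->starts[$phase]);
        if (!isset($this->totals[$phase])) {
            $this->totals[$phase] = 0.0;
        }
        $this->totals[$phase] += $elapsed;
    }

    // Accumulated seconds per phase, largest total first.
    public function report()
    {
        $totals = $this->totals;
        arsort($totals);
        return $totals;
    }
}

$timer = new PhaseTimer();
$timer->start('indexing');
usleep(1000);  // placeholder for the real indexing work
$timer->stop('indexing');
$timer->start('optimizing');
usleep(20000); // placeholder for the OPTIMIZE TABLE calls
$timer->stop('optimizing');

foreach ($timer->report() as $phase => $seconds) {
    printf("%-12s %.4f s\n", $phase, $seconds);
}
```

Wrapping each phase of a spider run this way shows which one dominates the total time before anything is commented out.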


Regards,
Thierry
02-25-2004, 11:41 PM   #5
alivin70
Orange Mole
 
Join Date: Sep 2003
Posts: 40
I see,
we have different problems.
I have to index large web sites; you have to index many single pages.

Optimization runs at the end of each spidering, even when indexing a single page.

Maybe it's possible to add a flag in config.php to disable automatic optimization and run it manually from the admin page.

If you do that hack, post it here.
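
A minimal sketch of such a flag, under stated assumptions: `PHPDIG_AUTO_OPTIMIZE`, `run_optimize()` and `maybe_optimize()` are hypothetical names that do not exist in PhpDig, and `$run_query` is a placeholder for whatever executes SQL in your installation. The spider would call `maybe_optimize()` at the end of a run, while an admin page would call `run_optimize()` directly.

```php
<?php
// Hypothetical flag: PhpDig's config.php does not define this constant.
if (!defined('PHPDIG_AUTO_OPTIMIZE')) {
    define('PHPDIG_AUTO_OPTIMIZE', false);
}

// Hypothetical helper: issues OPTIMIZE TABLE for the three tables through
// the $run_query callback and returns the statements it issued.
function run_optimize($prefix, $run_query)
{
    $issued = array();
    foreach (array('spider', 'engine', 'keywords') as $table) {
        $sql = 'OPTIMIZE TABLE ' . $prefix . $table;
        call_user_func($run_query, $sql);
        $issued[] = $sql;
    }
    return $issued;
}

// Hypothetical helper for the end of a spidering run: optimizes only when
// the flag is on, otherwise does nothing and returns an empty array.
function maybe_optimize($auto_optimize, $prefix, $run_query)
{
    if (!$auto_optimize) {
        return array();
    }
    return run_optimize($prefix, $run_query);
}

// Usage sketch: collect the SQL in $log instead of touching a database.
$log = array();
$run_query = function ($sql) use (&$log) {
    $log[] = $sql;
};
maybe_optimize(PHPDIG_AUTO_OPTIMIZE, 'phpdig_', $run_query); // spider side
run_optimize('phpdig_', $run_query);                         // manual, admin side
```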