Spidering through a script


bloodjelly
12-19-2003, 11:45 AM
Hi -

I'm trying to run the spider in the "background" through a PHP script, and I have this so far:

$GLOBALS['limit'] = 2;
$GLOBALS['url'] = "http://www.website.com/";
include '../search/admin/spider.php';
This seems to run the spider fine for the appropriate website, and the website is entered into the MySQL database, but searching doesn't work on it until I run an update manually.

Is there a better way to run the spider without manually entering the site to be spidered? Thanks!

Charter
12-19-2003, 12:58 PM
Hi. You can run PhpDig from the shell. Set the following constants in the config file to the desired search depth, create a text file with the full URLs, one per line, and run the command below.

define('SPIDER_MAX_LIMIT',2); //max recurse levels in spider
define('SPIDER_DEFAULT_LIMIT',2); //default value
define('RESPIDER_LIMIT',2); //recurse limit for update


php -f [PHPDIG_DIR]/admin/spider.php [file containing a list of URLs]

More shell indexing options can be found here (http://www.phpdig.net/navigation.php?action=doc#toc8).
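
For anyone who wants to generate the URL list from PHP rather than by hand, here is a rough sketch of the steps above. The function name build_spider_command and the paths passed to it are made up for illustration and are not part of PhpDig; the function only writes the list file and returns the command string, it does not run anything.

```php
<?php
// Hypothetical helper (not part of PhpDig): writes the URL list file and
// returns the shell command described above, without running it.
function build_spider_command(array $urls, string $phpdigDir, string $listFile): string
{
    // One full URL per line, as described in the post above.
    file_put_contents($listFile, implode("\n", $urls) . "\n");

    // escapeshellarg() quotes each path so spaces or shell characters are safe.
    return 'php -f ' . escapeshellarg(rtrim($phpdigDir, '/') . '/admin/spider.php')
        . ' ' . escapeshellarg($listFile);
}
```

The returned string can then be run from a shell or passed to exec(), assuming the php command-line binary is on the path.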

bloodjelly
12-19-2003, 06:37 PM
This will work well, but is there a way to do it through a PHP script? I basically want "Install.php" to do x, y and z, plus get PhpDig to spider a site, all in one execution without any user input. Can't I just feed spider.php the information it needs to spider the site? Thanks for the quick reply.

Durr... just remembered I can do it with exec("/usr/bin/php -f spider.php"); works well!

bloodjelly
03-15-2004, 12:56 PM
OK - I used this command to spider:

exec("/usr/bin/php -f /path/to/spider.php $site >> /dev/null &");

where $site = http://www.mysite.com/
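
A slightly more defensive way to build that command might look like the sketch below. The helper name build_background_spider_command is hypothetical, and the paths are the same placeholders as above. Two things differ from the original line and are assumptions on my part: each argument is passed through escapeshellarg(), and stderr is redirected along with stdout, so that nothing from the child process is left attached to the calling script.

```php
<?php
// Hypothetical helper: builds the background spider command for one site.
function build_background_spider_command(string $phpBin, string $spiderPath, string $site): string
{
    // Quote every argument so a URL containing & or ? cannot break the shell command.
    return escapeshellarg($phpBin) . ' -f ' . escapeshellarg($spiderPath)
        . ' ' . escapeshellarg($site)
        . ' > /dev/null 2>&1 &'; // discard stdout and stderr, run in the background
}

$cmd = build_background_spider_command('/usr/bin/php', '/path/to/spider.php', 'http://www.mysite.com/');
// exec($cmd); // uncomment to actually launch the spider
```

Quoting matters most when $site comes from user input, since an unquoted & in a query string would otherwise be read by the shell as a background operator.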

This worked great until I recently upgraded to a newer version of PHP, and now the command doesn't produce any results. I know there's a way to turn PHP as an executable on or off, but I can't find it in php.ini and I'm not sure where to look. Thanks for helping out.