Old 12-23-2003, 06:16 AM   #1
Charter
Cron Job on Linux/Apache

Hi. Say you want to run a cron job that spiders on the 1st and 15th of every month.

First, list the full URLs (e.g., http://www.domain.com) to be crawled, one per line, in a file called cronlist.txt. You can add or remove URLs in cronlist.txt whenever indexing is not running.
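As a minimal sketch of this step, the list can be created and sanity-checked from the shell. The temporary directory and the second URL are placeholders for illustration only, not anything PhpDig requires.

```shell
# Hypothetical scratch directory; in practice use the directory you chose for cronlist.txt.
demo_dir=$(mktemp -d)

# One full URL per line. Both hosts are placeholders.
cat > "$demo_dir/cronlist.txt" <<'EOF'
http://www.domain.com
http://www.example.org
EOF

# Count all entries, then count lines that do not start with http:// or https://.
# "|| true" keeps grep's "no match" exit status from aborting stricter shells.
entries=$(grep -c '' "$demo_dir/cronlist.txt")
malformed=$(grep -Evc '^https?://' "$demo_dir/cronlist.txt" || true)
echo "entries: $entries, malformed: $malformed"
```

A nonzero malformed count would point to a blank line or a URL missing its scheme.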

Then create a file called cronfile.txt that contains the following on one line, editing the full paths as needed:
Code:
0 0 1,15 * * /full/path/to/php -f /full/path/to/admin/spider.php /full/path/to/cronlist.txt >> /full/path/to/spider.log
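For readers unfamiliar with the schedule syntax: the first five fields of that line are minute, hour, day of month, month, and day of week, so "0 0 1,15 * *" means midnight on the 1st and 15th of every month. A small sketch that splits the line and labels the fields (the variable names are only for illustration):

```shell
entry='0 0 1,15 * * /full/path/to/php -f /full/path/to/admin/spider.php /full/path/to/cronlist.txt >> /full/path/to/spider.log'

set -f          # turn off filename expansion so the * fields stay literal
set -- $entry   # split the line on whitespace into $1, $2, ...
set +f

minute=$1; hour=$2; day_of_month=$3; month=$4; day_of_week=$5
shift 5
command_part="$*"   # everything after the five schedule fields

echo "minute=$minute hour=$hour day_of_month=$day_of_month month=$month day_of_week=$day_of_week"
echo "command: $command_part"
```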
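Finally, make sure that ABSOLUTE_SCRIPT_PATH is correctly set in the config file, and then type the following shell command, editing the full paths as needed:
Code:
/full/path/to/crontab /full/path/to/cronfile.txt
This installs cronfile.txt as the crontab of the user running the command, replacing any crontab that user already has. To install it for a different user, root can add -u username before the file name.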
When the cron job first runs, spider.log is created automatically at /full/path/to/spider.log, and spider output is appended to it on each run. If the file gets large, you can delete it when not indexing, or use ">" (without quotes) in place of ">>" to overwrite spider.log on each run.
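If you would rather not clear the log by hand, a size check is one option. This is only a sketch: the trim_log function name and the 1 MiB limit are assumptions, not part of PhpDig, and the demo runs against a throwaway file rather than the real spider.log.

```shell
# trim_log FILE MAX_BYTES: empty FILE if it is larger than MAX_BYTES.
trim_log() {
    [ -f "$1" ] || return 0              # nothing to do before the first cron run
    size=$(wc -c < "$1" | tr -d ' ')     # tr strips padding that some wc versions add
    if [ "$size" -gt "$2" ]; then
        : > "$1"                         # truncate in place, like ">" in the cron line
        echo "truncated $1 ($size bytes)"
    else
        echo "kept $1 ($size bytes)"
    fi
}

# Demo against a temporary file standing in for /full/path/to/spider.log.
demo_log=$(mktemp)
printf 'spidered one page\n' > "$demo_log"
trim_log "$demo_log" 1048576    # under the assumed 1 MiB limit: kept
trim_log "$demo_log" 5          # over a deliberately tiny limit: truncated
```

As with deleting the file, this is best run when no indexing is in progress.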

You may also replace "/full/path/to/cronlist.txt" (without quotes) in the cronfile.txt file with "http://www.domain.com" or "all" or "forceall" (without quotes) for different indexing options. If you have CRON_ENABLE set to true in the config file, you may use the cronfile.txt created by PhpDig in place of a manually created cronfile.txt file.
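To make the alternatives concrete, here is a sketch that prints the cron line for each kind of target mentioned above. The make_cron_line helper is hypothetical; the paths are the same placeholders as in the original entry and still need editing.

```shell
# make_cron_line TARGET: print the cron entry with TARGET as the spider.php argument.
make_cron_line() {
    printf '0 0 1,15 * * /full/path/to/php -f /full/path/to/admin/spider.php %s >> /full/path/to/spider.log\n' "$1"
}

make_cron_line /full/path/to/cronlist.txt   # crawl the URLs listed in the file
make_cron_line http://www.domain.com        # crawl a single URL
make_cron_line all                          # the "all" option
make_cron_line forceall                     # the "forceall" option
```

Only one of these lines would go into cronfile.txt for a given job.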

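To see that your cron job is set, type /full/path/to/crontab -l from shell. If you want to delete the cron job, type /full/path/to/crontab -r from shell; note that this removes that user's entire crontab, not just the spider entry. If -r is not recognized, check your system's crontab man page, as a few cron implementations use -d instead.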
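The listing can also be checked by a script instead of by eye. A minimal sketch, assuming the job's line mentions spider.php as in the example entry; has_spider_job is a made-up helper name, and the demo feeds it a sample listing rather than calling crontab.

```shell
# has_spider_job: read a crontab listing on stdin; succeed if an active
# (non-comment) line mentions spider.php.
has_spider_job() {
    grep -v '^[[:space:]]*#' | grep -q 'spider\.php'
}

# Sample listing standing in for the output of "crontab -l".
sample='# m h dom mon dow command
0 0 1,15 * * /full/path/to/php -f /full/path/to/admin/spider.php /full/path/to/cronlist.txt >> /full/path/to/spider.log'

if printf '%s\n' "$sample" | has_spider_job; then
    echo "cron job is set"
else
    echo "cron job not found"
fi
```

Against a real crontab, the same check would be /full/path/to/crontab -l | has_spider_job.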

A general cron tutorial can be found at http://www.linuxhelp.net/guides/cron/