Hi. Say you want to run a cron job that spiders at midnight on the 1st and 15th of every month.
First, make a list of the full URLs (e.g., http://www.domain.com) to be crawled, one per line, in a file called cronlist.txt. Only add or remove URLs in cronlist.txt when not indexing.
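As a minimal sketch of this step, the list can be written from the shell with a here-document. Both URLs are placeholders (the second one is made up for illustration), and the file is created in the current directory:

```shell
# Write the URLs to crawl, one per line, into cronlist.txt.
# Both URLs are placeholders; substitute the sites you actually index.
cat > cronlist.txt <<'EOF'
http://www.domain.com
http://www.example.com
EOF

# Print the number of lines as a quick sanity check.
wc -l < cronlist.txt
```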
Then create a file called cronfile.txt that contains the following on one line, editing the full paths as needed:
Code:
0 0 1,15 * * /full/path/to/php -f /full/path/to/admin/spider.php /full/path/to/cronlist.txt >> /full/path/to/spider.log
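If you would rather not hand-edit the line, a rough sketch like the one below builds it from variables. The variable names are made up for this example and every path is still a placeholder, so treat it as one possible approach rather than anything PhpDig ships:

```shell
# All paths below are placeholders; edit them to match your installation.
PHP_BIN=/full/path/to/php
SPIDER=/full/path/to/admin/spider.php
CRONLIST=/full/path/to/cronlist.txt
SPIDER_LOG=/full/path/to/spider.log

# Cron fields: minute=0, hour=0, day-of-month=1,15, month=*, day-of-week=*
printf '0 0 1,15 * * %s -f %s %s >> %s\n' \
    "$PHP_BIN" "$SPIDER" "$CRONLIST" "$SPIDER_LOG" > cronfile.txt

# Show the generated line so it can be checked before installing it.
cat cronfile.txt
```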
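Finally, make sure that ABSOLUTE_SCRIPT_PATH is correctly set in the config file, and then type the following shell command, editing the full paths as needed. Note that this replaces any existing crontab for the current user:
Code:
/full/path/to/crontab /full/path/to/cronfile.txt
To install the job for a different user, run the command as root and add "-u username" (without quotes), where username is that user's login name.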
When the cron job first runs, a file named spider.log is automatically created at /full/path/to/spider.log, and spider info is appended to this file on each run. If the file gets large, you may wish to delete it when not indexing, or use ">" (without quotes) in place of ">>" in cronfile.txt to overwrite spider.log each time.
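If you want to keep ">>" but still stop the log from growing without bound, something along these lines could work. trim_log is a hypothetical helper written for this post, not part of PhpDig, and the 1 MiB limit in the example call is an arbitrary choice:

```shell
# trim_log FILE MAX_BYTES: empty FILE if it is larger than MAX_BYTES.
# Hypothetical helper, not part of PhpDig.
trim_log() {
    # Nothing to do if the log has not been created yet.
    [ -f "$1" ] || return 0
    # Some wc implementations pad the count with spaces, so strip them.
    size=$(wc -c < "$1" | tr -d ' ')
    if [ "$size" -gt "$2" ]; then
        : > "$1"
    fi
}

# Example call (placeholder path, 1 MiB limit); run it only when not indexing.
# trim_log /full/path/to/spider.log 1048576
```

Emptying the file in place, rather than deleting it, keeps the same file and permissions for the next run.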
You may also replace "/full/path/to/cronlist.txt" (without quotes) in the cronfile.txt file with "http://www.domain.com" or "all" or "forceall" (without quotes) for different indexing options. If you have CRON_ENABLE set to true in the config file, you may use the cronfile.txt created by PhpDig in place of a manually created cronfile.txt file.
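To see that your cron job is set, type the following from shell:
Code:
/full/path/to/crontab -l
If you want to delete the cron job, type the following from shell. This removes the current user's whole crontab, and some cron implementations use "-d" (without quotes) in place of "-r":
Code:
/full/path/to/crontab -r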
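To check for the job from a script rather than by eye, a small filter over the crontab listing is one option. has_spider_job is a hypothetical helper made up for this post; it only looks for "spider.php" on a non-comment line, so it is a rough check and nothing more:

```shell
# has_spider_job: read crontab text on stdin and succeed if a
# non-comment line mentions spider.php.
# Hypothetical helper, not part of PhpDig.
has_spider_job() {
    grep -v '^[[:space:]]*#' | grep -q 'spider\.php'
}

# Example call (placeholder path):
# /full/path/to/crontab -l | has_spider_job && echo "spider job is set"
```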
A general cron tutorial can be found at
http://www.linuxhelp.net/guides/cron/