Hi Charter,
We are currently running the crawler through the browser interface, and intend to switch to the shell later when we set up the indexing as a cron job.
In either case, the spidering script accesses the pages with a user agent of "PHP/4.2.2" (the default before PHP version 4.3.0). The code in robot_functions.php allows the User-Agent header to be set, so why is it being overridden?
Nothing is set for user_agent in php.ini. Is there some other way to set the user agent to our liking?