Spidering issue with my site
pager
01-15-2004, 11:07 AM
Hello, I'm trying to set up phpdig for a web site. I can make it spider other web sites, but not mine.
I have tried both locally from the command line and remotely from another server.
Any time I try to spider it, the web page freezes for about 30 seconds after I click the "Dig This!" button and then goes to the result page with:
Spidering in progress...
SITE : http://dev.videx.com/
Exclude paths :
- @NONE@
No link in temporary table
links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !
[Back] to admin interface.
The site is, if you didn't notice ;) , dev.videx.com and I have managed to spider other servers in our domain (like www.videx.com).
I have removed the robots.txt file from the site, but I still have a .htaccess restricting use of the /search folder. Otherwise the site is a basic CSS / PHP based one on a Mac OS X 10.3 server, and I am using phpdig version 1.6.2.
I have modified my config file to not search through .css files, but still no luck.
Any suggestions?
pager
01-16-2004, 12:56 PM
Anyone? Anyone? Bueller?
Well, I've done some more searching and it turns out that the spidering will hang on any Mac OS X 10.3 site that I configure (including a default site with one web page!).
It works fine spidering Mac OS X 10.2 servers, however, so I think it has something to do with the Apache config on the server.
The site that I can't get phpdig to spider is http://dev.videx.com/ and it is running with the following config:
OS: Mac OS X 10.3
Apache: 1.3.28
PHP: 4.3.2
phpdig: 1.6.2
I have tried turning on error logging for php, but it never creates the file. My php.ini file is:
include_path=".:/Library/WebServer/php"
log_errors = On
error_log = ".:/Library/WebServer/log.txt"
error_reporting = E_ALL
Feel free to attempt to spider http://dev.videx.com/ and let me know if it works :)
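A note on the php.ini above: PHP's `error_log` directive expects the name of a single file, while the value shown starts with `.:` in the style of an `include_path` list. That may be why the log file is never created (the file also has to be writable by the web server user). The following is a minimal Python sketch of that check; `check_error_log` is a hypothetical helper written for this illustration, not part of PHP or phpdig.

```python
import posixpath


def check_error_log(value):
    """Return a list of likely problems with a php.ini error_log value.

    Hypothetical helper: it only applies two simple heuristics for a
    Unix-style server and does not touch the file system.
    """
    problems = []
    path = value.strip().strip('"')
    # include_path uses ':' to separate directories; error_log is one file.
    if ":" in path:
        problems.append("contains ':' (looks like a path list, not one file)")
    # A relative path is resolved against the server's working directory.
    if not posixpath.isabs(path):
        problems.append("not an absolute path")
    return problems


# Value from the post, then an absolute-path variant (an assumption, not a tested fix).
print(check_error_log('".:/Library/WebServer/log.txt"'))
print(check_error_log('"/Library/WebServer/log.txt"'))
```

If the heuristics flag the value, trying a plain absolute path for `error_log` would be a reasonable next step.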
Charter
01-18-2004, 07:28 AM
Hi. Below are the results at search depth one for http://dev.videx.com/. When you try to crawl this site, what shows up in your Apache log files?
links found : 17
http://dev.videx.com/
http://dev.videx.com/favicon.ico
http://dev.videx.com/index.html
http://dev.videx.com/products/index.html
http://dev.videx.com/News/index.html
http://dev.videx.com/about/index.html
http://dev.videx.com/products/downloads/manuals/accesscontrol/cyberaudit_manual.pdf
http://dev.videx.com/products/support.html
http://dev.videx.com/products/download.html
http://dev.videx.com/products/listing.html
http://dev.videx.com/news/tradeshows.html
http://dev.videx.com/news/careers.html
http://dev.videx.com/map.html
http://dev.videx.com/news/press.html
http://dev.videx.com/news/studies.html
http://dev.videx.com/about/privacy.html
http://dev.videx.com/about/contact.html
Optimizing tables...
Indexing complete !
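For anyone who wants to reproduce a depth-one link listing like the one above independently of phpdig, here is a minimal Python sketch. It is not how phpdig extracts links. The names `LinkCollector`, `extract_links`, and `fetch_links` are hypothetical, and the sketch only looks at `<a href>` attributes, so its results may differ from the list above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the links found in an HTML string, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def fetch_links(url, timeout=30):
    """Fetch a page and return its links; raises on timeout or HTTP error."""
    with urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_links(html, url)


# Offline demonstration on a small HTML fragment (made up for this example).
sample = '<a href="/products/index.html">Products</a> <a href="map.html">Map</a>'
print(extract_links(sample, "http://dev.videx.com/"))
```

Running `fetch_links("http://dev.videx.com/")` from the host where phpdig is installed would show whether that machine can fetch and parse the front page at all, separately from phpdig itself.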
pager
01-19-2004, 08:23 AM
I cleared my Apache logs, restarted Apache, and ran an index. Here are the results in the log files:
access log:
12.17.172.219 - - [19/Jan/2004:09:12:30 -0800] "GET / HTTP/1.1" 200 7404
error log:
Processing config directory: /etc/httpd/sites/*.conf
Processing config file: /etc/httpd/sites/0000_any_80_.conf
Processing config file: /etc/httpd/sites/virtual_host_global.conf
[Mon Jan 19 09:11:22 2004] [notice] Apache/1.3.28 (Darwin) PHP/4.3.2 configured -- resuming normal operations
[Mon Jan 19 09:11:22 2004] [notice] Accept mutex: flock (Default: flock)
It doesn't look very helpful to me.
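The access log line above does carry some information. It is in Apache's Common Log Format, and the `200 7404` at the end means the server answered a request for `/` successfully and sent 7404 bytes. That suggests the spider reached the server and received the front page, so the problem is more likely in what happens after the fetch. A small Python sketch for pulling those fields out of such a line follows; the regular expression is an assumption that covers only the common format.

```python
import re

# Common Log Format: host ident user [time] "request" status size
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return a dict of the log fields, or None if the line does not match."""
    match = CLF.match(line)
    return match.groupdict() if match else None


# The access log line quoted in the post.
line = '12.17.172.219 - - [19/Jan/2004:09:12:30 -0800] "GET / HTTP/1.1" 200 7404'
entry = parse_access_line(line)
print(entry["request"], entry["status"], entry["size"])
```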
I still can't index the site from other Mac OS X 10.3 servers. I timed the delay between clicking the "Dig This!" button and the spider page coming up with 0 results, and it is about 3 minutes and 20 seconds.
pager
01-19-2004, 09:13 AM
Well, I just updated my phpdig to 1.6.5 and tried out indexing the site.
It works up to a point with the web interface and then gives me the following message from the web browser:
Could not open the page “http://12.17.172.219/phpdig1/admin/spider.php” after trying for 60 seconds.
All the pages that it indexes up to that point are fine. I am going to try it from the command line, where the timeout should not apply.
pager
01-19-2004, 10:05 AM
Everything is working fine now with phpdig 1.6.5. Apparently something in the PHP code in 1.6.2 was causing the problem.
So, in case anyone wants to know, phpdig 1.6.5 works on Mac OS X 10.3.