PhpDig.net

01-22-2005, 12:00 PM   #1
ffe
Green Mole
 
Join Date: Jan 2005
Posts: 3
Yet Another indexing question

I have one server running RH 9.0, which hosts Apache, MySQL, and 5 virtual web sites. I am able to index 4 of the sites successfully. The last site indexes only 3-4 pages and then quits with no error or completion message. I suspect the failure is caused by HTML page content, such as an HTML coding error or an obsolete style.

My question is: Are there any known coding styles/tags, comments etc. in HTML that will cause the spider to terminate abnormally? My failing (spider) pages display and behave correctly with MSIE, Netscape 7.1, and Firefox 1.0.
01-22-2005, 12:21 PM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Given that your 'last site' works across browsers, I doubt it's an HTML issue. Without knowing more about this last site, all I can suggest is to select the 'no' radio button, set 'search depth' to a large value, set 'links per' to zero, and give it a whirl. Depending on this last site, you might try setting LIMIT_TO_DIRECTORY to false and PHPDIG_IN_DOMAIN to true, both in the config file.
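For reference, the two config settings mentioned above might look like the fragment below. This is only a sketch: it assumes the PhpDig config file sets both constants with define(), and the comments give a reading of what each value does rather than an authoritative description, so check the comments in your own copy of the file.

```php
<?php
// Sketch only: assumes config.php sets these constants with define().

// false: presumably stops restricting the spider to the starting directory.
define('LIMIT_TO_DIRECTORY', false);

// true: presumably lets the spider follow links to other hosts in the same domain.
define('PHPDIG_IN_DOMAIN', true);
```

If the config file already defines these constants, edit the existing lines instead of adding new ones. In PHP a second define() for the same name does not change the value.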
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
01-22-2005, 12:42 PM   #3
ffe
Green Mole
 
Join Date: Jan 2005
Posts: 3
Thanks for the feedback

Thanks for the interest in the question. I have read a number of the other posts looking for clues to the problem. I have tried all the options you mention, including making changes to config.php.

Does line length in the HTML files have any effect on the spider? Like a buffer overflow, perhaps?
01-22-2005, 01:41 PM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
How many MB is the max-sized page? What's the link to the site?
01-23-2005, 07:16 AM   #5
ffe
Green Mole
 
Join Date: Jan 2005
Posts: 3
Web address

None of the pages are particularly large; none is over 50 KB. The output of the indexing process is shown below. This happens every time.

Spidering in progress... [Stop spider]
SITE : http://tulare.homelinux.net/
Exclude paths :
- @NONE@
1:http://tulare.homelinux.net/index.html
(time : 00:00:05)
+ + + + + + + + + + + + + + + + + + + + +
level 1...
2:http://tulare.homelinux.net/Chance_Phelps.html
(time : 00:00:24)

3:http://tulare.homelinux.net/underway.html
(time : 00:00:29)



The status line at the bottom of the browser screen shows "Done".

Thanks for the interest.