PhpDig.net

Old 02-27-2004, 12:52 PM   #1
Nosmada
Orange Mole
 
Join Date: Dec 2003
Posts: 32
Missing files, indexer and strange output

The indexer has indexed all of my folders but has missed most of the files in each folder. So I tried reindexing just one folder to see whether it would pick up all the files, and this is what is happening. I don't know what it is doing after level 1 with all of those times in brackets. Should I let it keep going? What is happening, and why is it missing most files in each folder?

Duplicate of an existing document
1:http://www.posterbreak.com/americana...american.shtml
(time : 00:00:09)

Duplicate of an existing document
2:http://www.posterbreak.com/americana...tainment.shtml
(time : 00:00:24)

Duplicate of an existing document
3:http://www.posterbreak.com/americana...n-design.shtml
(time : 00:00:30)

Duplicate of an existing document
4:http://www.posterbreak.com/americana...-culture.shtml
(time : 00:00:36)

Duplicate of an existing document
5:http://www.posterbreak.com/americana/
(time : 00:00:49)

level 1...
(time : 00:01:09)

(time : 00:01:22)

(time : 00:01:36)

(time : 00:01:50)

(time : 00:02:05)

(time : 00:02:19)

(time : 00:02:33)

(time : 00:02:52)

(time : 00:03:07)

(time : 00:03:21)

(time : 00:03:35)

(time : 00:03:48)

(time : 00:04:03)

(time : 00:04:17)

(time : 00:04:31)

(time : 00:04:45)

(time : 00:04:59)

(time : 00:05:12)

(time : 00:05:25)

(time : 00:05:40)

(time : 00:05:55)

(time : 00:06:10)

(time : 00:06:24)

(time : 00:06:37)

(time : 00:06:51)

(time : 00:07:05)

(time : 00:07:19)

(time : 00:07:33)
__________________
Nosmada
Old 02-28-2004, 04:07 PM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Hmm, I've never seen all those times in parentheses like that before. Maybe this has something to do with your MySQL being down the other day? As for the files, are there links to them?
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 02-29-2004, 02:27 PM   #3
Nosmada
Orange Mole
 
Join Date: Dec 2003
Posts: 32
Are there links to which files?
Old 02-29-2004, 02:28 PM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. I mean links to the files that are not being crawled.
Old 02-29-2004, 02:37 PM   #5
Nosmada
Orange Mole
 
Join Date: Dec 2003
Posts: 32
Yes, there are links.

The first links in the folder are here, for example:

http://www.posterbreak.com/art/

Then, if you click one of those links, you will see a whole bunch of other links that are also in the same art folder, for example:

http://www.posterbreak.com/art/c1013...ovements.shtml
Old 02-29-2004, 03:15 PM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. It seems that PhpDig takes about ten minutes (depending on the machine, traffic, and so on) to work through the Keyword QuickFind links. Below is the output from crawling http://www.posterbreak.com/art/ at a search depth of one. PhpDig hits the Keyword QuickFind links first, and only after it gets through those does it hit the other links. Try setting a search depth of one and letting the spider run for a while. What do you get?

SITE : http://www.posterbreak.com/
Exclude paths :
- _private/
- cgi-bin/
- images/
- search/
- searchsite/
- templates/
1:http://www.posterbreak.com/art/
(time : 00:00:13)
+ + + + + + + + + + + + + + +
level 1...
2:http://www.posterbreak.com/art/c5964-museum-landscapes.shtml
(time : 00:11:05)

3:http://www.posterbreak.com/art/c101318251-museum-religious-art.shtml
(time : 00:11:17)

4:http://www.posterbreak.com/art/c6461-museum-still-life.shtml
(time : 00:11:28)

5:http://www.posterbreak.com/art/c101310996-museum-tours.shtml
(time : 00:11:41)

6:http://www.posterbreak.com/art/c101319580-special-mediums.shtml
(time : 00:11:52)

7:http://www.posterbreak.com/privacy.shtml
(time : 00:11:59)

8:http://www.posterbreak.com/art/c10135861-art-by-nationality.shtml
(time : 00:12:10)

9:http://www.posterbreak.com/art/c101319277-four-centuries.shtml
(time : 00:12:27)

10:http://www.posterbreak.com/art/c101319312-museum-floral.shtml
(time : 00:12:38)

11:http://www.posterbreak.com/art/c101319275-museum-figurative.shtml
(time : 00:12:49)

12:http://www.posterbreak.com/art/c101319282-museum-artists.shtml
(time : 00:12:59)

13:http://www.posterbreak.com/art/c101319594-museum-abstract.shtml
(time : 00:13:10)

14:http://www.posterbreak.com/art/c101319274-art-movements.shtml
(time : 00:13:22)

15:http://www.posterbreak.com/contact.shtml
(time : 00:13:33)

16:http://www.posterbreak.com/
(time : 00:13:39)

No link in temporary table

--------------------------------------------------------------------------------

links found : 16
http://www.posterbreak.com/art/
http://www.posterbreak.com/art/c5964-museum-landscapes.shtml
http://www.posterbreak.com/art/c101318251-museum-religious-art.shtml
http://www.posterbreak.com/art/c6461-museum-still-life.shtml
http://www.posterbreak.com/art/c101310996-museum-tours.shtml
http://www.posterbreak.com/art/c101319580-special-mediums.shtml
http://www.posterbreak.com/privacy.shtml
http://www.posterbreak.com/art/c10135861-art-by-nationality.shtml
http://www.posterbreak.com/art/c101319277-four-centuries.shtml
http://www.posterbreak.com/art/c101319312-museum-floral.shtml
http://www.posterbreak.com/art/c101319275-museum-figurative.shtml
http://www.posterbreak.com/art/c101319282-museum-artists.shtml
http://www.posterbreak.com/art/c101319594-museum-abstract.shtml
http://www.posterbreak.com/art/c101319274-art-movements.shtml
http://www.posterbreak.com/contact.shtml
http://www.posterbreak.com/
Optimizing tables...
Indexing complete !
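
To see where the time goes in output like this, the gaps between consecutive time stamps can be computed. The following is only a minimal sketch in Python, not part of PhpDig: the helper names `elapsed_seconds` and `gaps` are made up for illustration, and it assumes the `(time : HH:MM:SS)` format shown in the output above.

```python
import re

# Hypothetical helpers for reading spider output; they are not part of PhpDig.
# Assumes time stamps are printed exactly as "(time : HH:MM:SS)".
TIME_RE = re.compile(r"\(time : (\d{2}):(\d{2}):(\d{2})\)")


def elapsed_seconds(log_text):
    """Return every time stamp in the log as seconds since the crawl started."""
    return [int(h) * 3600 + int(m) * 60 + int(s)
            for h, m, s in TIME_RE.findall(log_text)]


def gaps(log_text):
    """Return the number of seconds between consecutive time stamps."""
    stamps = elapsed_seconds(log_text)
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# First lines of the output above, copied as-is.
sample = """1:http://www.posterbreak.com/art/
(time : 00:00:13)
+ + + + + + + + + + + + + + +
level 1...
2:http://www.posterbreak.com/art/c5964-museum-landscapes.shtml
(time : 00:11:05)

3:http://www.posterbreak.com/art/c101318251-museum-religious-art.shtml
(time : 00:11:17)
"""

print(gaps(sample))  # [652, 12]
```

The first gap is 652 seconds, nearly eleven minutes, while later pages take roughly ten to fifteen seconds each. That long first gap is consistent with the spider working through the QuickFind links before it reaches the other pages, although the output alone does not prove it.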
Old 03-03-2004, 12:48 AM   #7
Nosmada
Orange Mole
 
Join Date: Dec 2003
Posts: 32
Hi Charter,

Thanks for looking into it. It seems that many files in the folder are still missing from the index. Since I don't really want the indexer to follow the related search results, maybe I should comment them out (for the indexer, that is) so that it won't spend time following them?
Old 03-03-2004, 01:17 AM   #8
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. On the page http://www.posterbreak.com/art/ there should be sixteen links at a search depth of one, not counting JavaScript, offsite, or excluded-path links. The exclude/include comments work as described in this thread, so maybe consider putting the QuickFind links in a separate file and including that file in the shtml files using something like the following:
Code:
<!--#include file="filename.shtml" -->
Then, when indexing, you could 'turn off' the QuickFind links by including a file that contains only a space instead. Just an idea.
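
One way to do the switching is with a small script run before and after indexing. This is only a sketch in Python under stated assumptions: the function `set_quickfind` and the file names `quickfind.shtml` (the file named in the include directive) and `quickfind-links.html` (a master copy of the links) are hypothetical, and nothing here is a PhpDig feature.

```python
import tempfile
from pathlib import Path


def set_quickfind(include_file, links_file, enabled):
    """Fill the included file with the real links, or with a lone space.

    include_file: the file the shtml pages pull in via the include directive.
    links_file:   a master copy of the QuickFind links (hypothetical name).
    """
    include_path = Path(include_file)
    if enabled:
        content = Path(links_file).read_text(encoding="utf-8")
    else:
        content = " "  # a single space, so the include still resolves
    include_path.write_text(content, encoding="utf-8")


# Demonstration in a temporary directory, so no real site files are touched.
with tempfile.TemporaryDirectory() as tmp:
    links = Path(tmp) / "quickfind-links.html"
    include = Path(tmp) / "quickfind.shtml"
    links.write_text("<!-- QuickFind links go here -->\n", encoding="utf-8")

    set_quickfind(include, links, enabled=False)  # before running the spider
    blank_for_indexing = include.read_text(encoding="utf-8")

    set_quickfind(include, links, enabled=True)   # after indexing finishes
    restored = include.read_text(encoding="utf-8")
```

Keeping a separate master copy means the links are never lost when the included file is blanked. Visitors will see pages without QuickFind links while the spider runs, so this fits best when indexing happens at a quiet time.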
All times are GMT -8.

