PhpDig.net

02-07-2005, 05:21 AM   #1
kristian
Green Mole
Join Date: Jan 2005
Posts: 9
How to index one page and nothing else

Hi

I would like to control the indexing process when I index my dynamic pages. Basically, I have generated a list of all the URLs that I would like to index:

(...)
http://localhost/anatomi/index.php?v...ng=praeparater
http://localhost/anatomi/index.php?v...ng=praeparater
http://localhost/anatomi/index.php?v...ng=praeparater
http://localhost/anatomi/index.php?v...ng=praeparater
(...)

But when I paste these into the box and start spidering, it finds lots of duplicate pages that have already been indexed. I have set the "search depth" to 0 and the "links per" to 0.

Please, if anyone can help me with this...
__________________
Kristian W.
www.beatnik.dk
www.mediabits.dk
02-07-2005, 09:30 AM   #2
Charter
Head Mole
Join Date: May 2003
Posts: 2,539
Use zero, zero, and also choose no.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
02-07-2005, 10:19 AM   #3
kristian

Thanks for the quick reply! I tried with zero, zero, but I'm not sure about the No-option. I will try it later this week.
02-11-2005, 12:03 AM   #4
kristian
I get the same problem with zero, zero, and "no" for the "Use values from Update sites table if present and use default values if values absent from table" option.

It still checks all the other links that have been indexed previously. Could it be some other setting? In the config.php maybe?
Attached Files
File Type: txt config.txt (21.8 KB, 20 views)
02-11-2005, 03:10 AM   #5
kristian
Just to clarify: I would like to index one file only and not update all the other files/URLs.

Example URL: http://localhost/anatomi/index.php?v...ng=praeparater

There are thousands of pages, and it takes a very long time if it has to check/update all the URLs that have already been indexed. I know that they haven't changed anyway.

I'm using command line as it seems to be more stable.
02-11-2005, 06:03 AM   #6
Charter
Is your tempspider table empty?
02-11-2005, 06:46 AM   #7
kristian
Yes, it's empty.

It also says "Temporary table : 0 Entries"

Info:
I'm using PhpDig v.1.8.7
Safe-mode: Off
allow_url_fopen is enabled
02-11-2005, 06:53 AM   #8
Charter
>> I'm using command line as it seems to be more stable.

Missed that. Try the following config options.
Code:
define('SPIDER_MAX_LIMIT',0);          //max recurse levels in spider
define('LINKS_MAX_LIMIT',0);           //max links per each level
02-11-2005, 07:28 AM   #9
kristian

I have tried with these settings:

define('SPIDER_MAX_LIMIT',0); //max recurse levels in spider
define('RESPIDER_LIMIT',0); //recurse respider limit for update
define('LINKS_MAX_LIMIT',0); //max links per each level
define('RELINKS_LIMIT',0); //recurse links limit for an update

Same result.

__________________

Another question:
At some point I will need to spider some pages with iframes. I got that to work earlier, when I set the depth to 1 and links per to 10. I was using the web interface, and I have also modified config.php so I can dig iframes.

Now I can't really use the web interface because it wants to index/update everything all the time. And when it does, it crashes (sometimes with an Apache error) or otherwise just stops. It doesn't do that with the command line.

I have tried to get PhpDig to index the content of the iframes using the command line and these settings in config.php: define('SPIDER_MAX_LIMIT',1); and define('LINKS_MAX_LIMIT',10);. But it didn't index the iframes. Should I try other settings, or is it not possible to do from the command line?

Any help is very welcome. I'm sorry, I think this is probably a hard case...
02-11-2005, 07:56 AM   #10
Charter
Go to the admin panel, and click the update sites link. Make sure that links and depth are both zero. Also, there is a mod here that you may find useful, although it could need tweaking. You might just try updating one page from the admin panel: click the site, update button, blue arrow, and then green check mark next to the page. PhpDig doesn't index iframe tags. Maybe you modded the robot_functions.php file to include iframe?
02-17-2005, 03:01 AM   #11
kristian
I found a way to index the iframe content: I index the folders where the iframe content files are located and put some JavaScript in the content files. The JavaScript redirects the user to the right page.
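The script itself was not posted, so the following is only a minimal sketch of what such a redirect might look like, not the code actually used on this site. It assumes the framing page is /anatomi/index.php and that it accepts the content file's path in a query parameter called "content". Both the PARENT_PAGE constant and that parameter name are hypothetical and would need to match the real site.

```javascript
// Hypothetical sketch: send a visitor who lands on a bare iframe content file
// (for example from a search result) to the page that is meant to frame it.

// Assumption: the framing page lives here and reads the content file's path
// from a "content" query parameter. Both are made up for illustration.
const PARENT_PAGE = "/anatomi/index.php";

// Build the URL of the framing page for a given content-file path.
function parentUrlFor(contentPath) {
  return PARENT_PAGE + "?content=" + encodeURIComponent(contentPath);
}

// In a browser, redirect only when this file is NOT being shown inside a frame.
// When the file is already inside the iframe, window.top differs from
// window.self and nothing happens.
if (typeof window !== "undefined" && window.top === window.self) {
  window.location.replace(parentUrlFor(window.location.pathname));
}
```

The frame check matters: without it, the content file would also redirect when loaded inside its iframe. A spider that only fetches the HTML and does not run scripts would still see the content file's own text, so the file can be indexed while human visitors end up on the framing page.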

The problem with indexing only one page might be due to the fact that the URLs in my site list have query strings in them, plus maybe some config settings. I don't know...
This is not a big problem for me now, as I have indexed all the pages. Thanks for all the help and for a great search tool!
All times are GMT -8.

