|
02-07-2005, 05:21 AM | #1 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
How to index one page and nothing else
Hi
I would like to control the indexing process when I index my dynamic pages. Basically, I have generated a list of all the URLs that I would like to index: (...) http://localhost/anatomi/index.php?v...ng=praeparater http://localhost/anatomi/index.php?v...ng=praeparater http://localhost/anatomi/index.php?v...ng=praeparater http://localhost/anatomi/index.php?v...ng=praeparater (...) but when I paste these into the box and start spidering, it finds lots of duplicate pages that have already been indexed. I have set the "search depth" to 0 and the "links per" to 0. Please, if anyone can help me with this... |
02-07-2005, 09:30 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Use zero, zero, and also choose no.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
02-07-2005, 10:19 AM | #3 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
Thanks for the quick reply! I tried with zero, zero, but I'm not sure about the No-option. I will try it later this week.
|
02-11-2005, 12:03 AM | #4 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
I get the same problem with zero, zero, and "no" for the "Use values from Update sites table if present and use default values if values absent from table" option.
It still checks all the other links that have been indexed previously. Could it be some other setting? In the config.php maybe? |
02-11-2005, 03:10 AM | #5 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
Just to clarify: I would like to index one file only and not update all the other files/URLs.
Example URL: http://localhost/anatomi/index.php?v...ng=praeparater There are thousands of pages, and it takes a very long time if it has to check/update all the URLs that have already been indexed. I know that they haven't changed anyway. I'm using the command line as it seems to be more stable. |
02-11-2005, 06:03 AM | #6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Is your tempspider table empty?
|
02-11-2005, 06:46 AM | #7 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
Yes, it's empty.
It also says "Temporary table : 0 Entries"
Info:
I'm using PhpDig v.1.8.7
Safe-mode: Off
allow_url_fopen is enabled |
02-11-2005, 06:53 AM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
>> I'm using command line as it seems to be more stable.
Missed that.. try the following config options. Code:
define('SPIDER_MAX_LIMIT',0); //max recurse levels in spider
define('LINKS_MAX_LIMIT',0); //max links per each level
|
02-11-2005, 07:28 AM | #9 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
I have tried with these settings:
define('SPIDER_MAX_LIMIT',0); //max recurse levels in spider
define('RESPIDER_LIMIT',0); //recurse respider limit for update
define('LINKS_MAX_LIMIT',0); //max links per each level
define('RELINKS_LIMIT',0); //recurse links limit for an update
Same result.
__________________
Another question: At some point I will need to spider some pages with iframes. I got that to work earlier, when I set the depth to 1 and links per to 10. I was using the web interface, and I have also modified config.php so I can dig iframes. Now I can't really use the web interface because it wants to index/update everything all the time. And when it does, it crashes or stops (sometimes with an Apache error, otherwise it just stops). It doesn't do that with the command line.
I have tried to get PhpDig to index the content of the iframes using the command line and these settings in config.php: define('SPIDER_MAX_LIMIT',1); and define('LINKS_MAX_LIMIT',10); but it didn't index the iframes. Should I try other settings, or is it not possible to do from the command line? Any help is very welcome. I'm sorry, I think this is probably a hard case... |
02-11-2005, 07:56 AM | #10 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Go to the admin panel, and click the update sites link. Make sure that links and depth are both zero. Also, there is a mod here that you may find useful, although it could need tweaking. You might just try updating one page from the admin panel: click the site, update button, blue arrow, and then green check mark next to the page. PhpDig doesn't index iframe tags. Maybe you modded the robot_functions.php file to include iframe?
|
02-17-2005, 03:01 AM | #11 |
Green Mole
Join Date: Jan 2005
Posts: 9
|
I found a way to index the iframe content: I index the folders where the iframe content files are located and put some JavaScript in the content files. The JavaScript redirects the user to the right page.
The problem with indexing one page only might be due to the fact that the URLs in my site list have query strings in them, plus maybe some config settings. I don't know... This is not a big problem for me now as I have indexed all the pages. Thanks for all the help and for a great search tool! |
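A follow-up note on the query-string theory from post #11: one way to test it is to check whether the URL list contains entries that differ only in the order of their query parameters, since those point to the same page but look like different URLs. The sketch below is only an illustration under stated assumptions. The functions canonical_url() and find_duplicates() are hypothetical helpers, not part of PhpDig, the parameter names a and b are placeholders for the truncated ones in the thread, and nothing here reflects how PhpDig itself compares URLs. It needs PHP 7.4 or later.

```php
<?php
// Hypothetical helper (not part of PhpDig): reduce a URL to a canonical form
// so that two URLs differing only in query-parameter order compare as equal.
function canonical_url(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // leave anything we cannot parse untouched
    }

    $query = [];
    parse_str($parts['query'] ?? '', $query);
    ksort($query); // sort by parameter name so the order no longer matters

    $canonical = ($parts['scheme'] ?? 'http') . '://' . strtolower($parts['host']);
    if (isset($parts['port'])) {
        $canonical .= ':' . $parts['port'];
    }
    $canonical .= $parts['path'] ?? '/';
    if ($query) {
        $canonical .= '?' . http_build_query($query);
    }
    return $canonical;
}

// Hypothetical helper: group URLs by canonical form and keep only the
// groups that contain more than one original URL.
function find_duplicates(array $urls): array
{
    $groups = [];
    foreach ($urls as $url) {
        $url = trim($url);
        $groups[canonical_url($url)][] = $url;
    }
    return array_filter($groups, fn($group) => count($group) > 1);
}

// Made-up example list; the real parameter names are elided in the thread.
$urls = [
    'http://localhost/anatomi/index.php?a=1&b=2',
    'http://localhost/anatomi/index.php?b=2&a=1',
    'http://localhost/anatomi/index.php?a=3&b=2',
];

foreach (find_duplicates($urls) as $canonical => $members) {
    echo $canonical, "\n  ", implode("\n  ", $members), "\n";
}
```

If this reports any groups, trimming the list down to one URL per group before pasting it into the box would at least rule out the list itself as the source of the duplicates. If it reports none, the cause is more likely in the spider settings discussed above.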