PhpDig.net

10-11-2003, 09:21 AM   #1
alivin70
New feature proposal: targeted indexing

JyGius and I are working on a new feature.
I hope to get some feedback from other developers before we start.

We want to re-index a big website very often, and we want to introduce a trick to dramatically reduce crawling.
It's not possible to use the modification dates of files to select the modified ones, because there are lots of dynamic pages.

For example, I can have news pages generated with an ID:
news.php?nid=10001
news.php?nid=10002
news.php?nid=10003
news.php?nid=10004
news.php?nid=10005
.....
news.php?nid=20000

but only the last 4 have been modified since the last visit.

How can the crawler know that?

Our idea is to add a directive to robots.txt that gives the URL of a text file listing the modified or newly created pages, each with its timestamp.
For example:
1056987466 news.php?nid=20001
1056987853 news.php?nid=20002
1056988465 news.php?nid=20003
1056995765 news.php?nid=20004
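
A minimal sketch of how a file in that format could be parsed. It is written in Python purely for illustration (PhpDig itself is PHP), and it assumes one entry per line: a Unix timestamp, whitespace, then the URL. The function name parse_change_list is made up for this example.

```python
def parse_change_list(text):
    """Parse lines of '<unix timestamp> <url>' into (timestamp, url) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue  # assumption: ignore malformed lines instead of aborting
        try:
            timestamp = int(parts[0])
        except ValueError:
            continue  # first field is not a number, skip the line
        entries.append((timestamp, parts[1]))
    return entries


sample = """\
1056987466 news.php?nid=20001
1056987853 news.php?nid=20002
1056988465 news.php?nid=20003
1056995765 news.php?nid=20004
"""
entries = parse_change_list(sample)
```

Skipping malformed lines is only one possible choice; a stricter crawler could reject the whole file instead.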

So PhpDig reads that directive, loads the text file, parses it, and digs only the pages modified after the last visit, without following links.
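
The selection step could look roughly like this. Again a Python sketch, not PhpDig code: the function name urls_to_dig, the last_visit value, and the www.example.com base URL are all hypothetical, and the entries are assumed to be already parsed into (timestamp, url) pairs.

```python
from urllib.parse import urljoin


def urls_to_dig(entries, last_visit, base_url):
    """Return absolute URLs of entries modified strictly after last_visit."""
    return [urljoin(base_url, path) for ts, path in entries if ts > last_visit]


entries = [
    (1056987466, "news.php?nid=20001"),
    (1056987853, "news.php?nid=20002"),
    (1056988465, "news.php?nid=20003"),
    (1056995765, "news.php?nid=20004"),
]
last_visit = 1056988000  # hypothetical time of the previous crawl
to_dig = urls_to_dig(entries, last_visit, "http://www.example.com/")
# Only these URLs would be fetched and indexed; links found in them are not followed.
```

With this hypothetical last_visit, only the entries for nid=20003 and nid=20004 are selected.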

The text file must be created and maintained by the website's software. Obviously, this applies to portals that are totally database-driven.
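
On the website side, keeping the file up to date could be as simple as appending one line whenever a page is created or modified. A rough Python sketch, assuming the same "timestamp URL" line format; the function name record_change and the changes.txt file name are hypothetical.

```python
import os
import tempfile
import time


def record_change(list_path, url, timestamp=None):
    """Append one '<unix timestamp> <url>' line to the change list."""
    if timestamp is None:
        timestamp = int(time.time())  # default to the current time
    with open(list_path, "a", encoding="utf-8") as f:
        f.write(f"{timestamp} {url}\n")


# Demo against a temporary file (a real site would use a fixed, published path).
demo_path = os.path.join(tempfile.mkdtemp(), "changes.txt")
record_change(demo_path, "news.php?nid=20001", 1056987466)
record_change(demo_path, "news.php?nid=20002", 1056987853)
with open(demo_path, encoding="utf-8") as f:
    demo_contents = f.read()
```

A real site would probably also need to trim old entries so the file does not grow forever; that is left out of this sketch.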

If robots.txt doesn't contain that directive, PhpDig can crawl the site as usual.
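
A sketch of the directive lookup with the fallback, in Python for illustration. The directive has no name yet, so "Changes-list" below is only a placeholder, and find_change_list_url and the example.com URL are hypothetical as well.

```python
def find_change_list_url(robots_txt, directive="changes-list"):
    """Return the directive's value, or None if robots.txt does not contain it."""
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")  # split at the first colon only
        if sep and name.strip().lower() == directive and value.strip():
            return value.strip()
    return None


with_directive = """\
User-agent: *
Disallow: /admin/
Changes-list: http://www.example.com/changes.txt
"""
without_directive = "User-agent: *\nDisallow: /admin/\n"

modes = []
for robots in (with_directive, without_directive):
    url = find_change_list_url(robots)
    # No directive found: fall back to the usual full crawl.
    modes.append("targeted" if url else "full crawl")
```

Splitting at the first colon keeps the "http://" in the value intact. Stripping everything after "#" is a simplification that would also cut a URL containing a fragment.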



If you have any ideas, please post them here.

Alivin70

All times are GMT -8.

