Indexing single page as its own site


bloodjelly
10-13-2005, 12:16 PM
Lots of dynamically-generated websites offer a good deal of content by changing variables in the URI query string. It'd be neat if there were a way to index each of these pages as if it were its own site.

An example is Myspace. Users aren't viewed by subdirectory or subdomain, but by a "friendID" variable, like this: http://profile.myspace.com/index.cfm?fuseaction=user.viewProfile&friendID=28915943

The California Gas Prices website also has a similar structure, with URIs like this: http://www.californiagasprices.com/index.aspx?s=Y&fuel=A&area=SAN%20FRANCISCO

Now, these pages change enough on their own that it'd be nice to index each one as a site rather than just as a page within a site (so that they could be searched separately).

Is this possible?

Charter
10-14-2005, 11:06 AM
PhpDig is currently domain based for indexing, with a storage scheme as described in this (http://www.phpdig.net/forum/showpost.php?p=4797&postcount=8) post. There are spots in the code that specifically bust up a link, and then the parts of the link are stored in the tables. It is possible that an option could be added to not do this, whether for queries or directories, and treat the link as its own site, but then other links on the page become an issue. Right now the domain is the focal point, and links relate to the domain. If the focal point is changed to a specific link in a domain, rather than the domain itself, other links in the site won't relate to it and would probably need to be ignored.
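
To make "busting up a link" concrete, here is a rough sketch in Python (not PhpDig's actual PHP code) of how a URL might be split into a site part, a directory path, and a file-plus-query part. The function name `split_link` and the exact three-way split are assumptions for illustration; the linked post describes the real storage scheme.

```python
from urllib.parse import urlsplit
import posixpath


def split_link(url):
    """Break a URL into (site, directory path, file plus query string).

    Hypothetical helper: the three-part split is an illustrative guess,
    not PhpDig's real table layout.
    """
    parts = urlsplit(url)
    site = f"{parts.scheme}://{parts.netloc}/"

    # Separate the directory from the file name, e.g. "/a/b/c.php".
    directory, filename = posixpath.split(parts.path)
    directory = directory.strip("/")
    path = directory + "/" if directory else ""

    # Keep the query string attached to the file part.
    file_part = filename + ("?" + parts.query if parts.query else "")
    return site, path, file_part


site, path, file_part = split_link(
    "http://profile.myspace.com/index.cfm"
    "?fuseaction=user.viewProfile&friendID=28915943"
)
print(site)       # the domain is the focal point
print(repr(path)) # empty here, since index.cfm sits at the root
print(file_part)  # the file name with its query string
```

With a split like this, every profile page lands under the same site value, which is why a single link can't easily act as its own site without changing how the other links on the page are related to it.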

bloodjelly
10-14-2005, 11:44 AM
Hi Charter,

Thanks for your response. What you're saying completely makes sense, but couldn't you have phpDig treat the individual page as a site, and then only follow links that shared a vital characteristic? (If friendID=x, if area=san_francisco). Maybe the variable to track could be specified for each unique domain?
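
Roughly what I have in mind, sketched in Python rather than in phpDig's own code: given the page being treated as a "site" and a link found on it, only follow the link if it is on the same host and carries the same value for the tracked variable. The function name `shares_key` and the candidate links are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def shares_key(seed_url, candidate_url, key):
    """Return True if both URLs are on the same host and have the same
    value(s) for the query variable `key`. Hypothetical helper."""
    seed = urlsplit(seed_url)
    cand = urlsplit(candidate_url)
    if seed.netloc.lower() != cand.netloc.lower():
        return False

    # parse_qs decodes percent-escapes, so "SAN%20FRANCISCO" is
    # compared as "SAN FRANCISCO".
    seed_vals = parse_qs(seed.query).get(key)
    cand_vals = parse_qs(cand.query).get(key)
    return seed_vals is not None and seed_vals == cand_vals


seed = ("http://www.californiagasprices.com/index.aspx"
        "?s=Y&fuel=A&area=SAN%20FRANCISCO")

# Made-up links of the kind that might appear on the seed page.
links = [
    "http://www.californiagasprices.com/index.aspx?fuel=B&area=SAN%20FRANCISCO",
    "http://www.californiagasprices.com/index.aspx?fuel=A&area=OAKLAND",
]

to_follow = [link for link in links if shares_key(seed, link, "area")]
print(to_follow)
```

The variable to track (`"area"` here, `"friendID"` for Myspace) would be the per-domain setting.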

Charter
10-15-2005, 09:39 AM
If you want to index just one page of a site, you can stick the full link, with query string and all, in the admin panel text box, set search depth to zero, links per to one, use no, and dig. If you want to restrict indexing to certain link characteristics, and you are familiar with regular expressions, check out this (http://www.phpdig.net/forum/showthread.php?t=1684) thread. Maybe a new field in a table for certain link characteristics would be a good addition, but right now you'd have to edit the config file each time for different sites.
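
As a rough idea of what a regular expression restriction could look like, here is a sketch in Python. It is not the PhpDig config setting from the linked thread; the pattern and the name `is_allowed` are assumptions chosen to match the California Gas Prices example from earlier in this thread.

```python
import re

# Assumed pattern: accept only index.aspx links on this host whose
# query string contains area=SAN%20FRANCISCO as a complete variable.
ALLOWED = re.compile(
    r"^http://www\.californiagasprices\.com/index\.aspx"
    r"\?(?:.*&)?area=SAN%20FRANCISCO(?:&|$)"
)


def is_allowed(url):
    """Return True if the link matches the restriction pattern."""
    return ALLOWED.search(url) is not None


print(is_allowed(
    "http://www.californiagasprices.com/index.aspx"
    "?s=Y&fuel=A&area=SAN%20FRANCISCO"
))
print(is_allowed(
    "http://www.californiagasprices.com/index.aspx"
    "?s=Y&fuel=A&area=LOS%20ANGELES"
))
```

A different site would need a different pattern, which is the same limitation as having to edit the config file for each site.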