Manually set title for spidered page


bvr
11-17-2005, 08:20 PM
I've got a list of about 120 external article links that I need to spider, but every page has the same title. So when a search is performed, you can't tell which article is which in the results. What I'd like is a title override on the admin panel page, which would just be an <input type="text" name="overridetitle">. If it's not blank, its value would replace the title found by the parsing robot.

Any ideas on how I'd go about doing this?


Thank you in advance,

Alex

Charter
11-18-2005, 02:41 AM
Hmm, a textbox such as that would be applied to all pages, unless you index page by page, though maybe I'm misunderstanding. A better solution might be to set the title based on another HTML tag such as in this (http://www.phpdig.net/forum/showthread.php?t=1883) thread.

bvr
11-18-2005, 04:17 PM
Sorry, I guess I wasn't clear in my explanation of the problem.

- I have 110 external links to pages that house scientific articles that I need to spider
- Each page has the same title tag value as the next one (<title>same for each article</title>)

Problem: After spidering all 110 pages and running a search against them, I get lots of results, but they all have the same title. The desired result is for each result returned by search.php to have the title of the article as its heading.

I had a look at the thread you mentioned. Most of the article titles are wrapped in <h2>, but unfortunately that's only the case about 80% of the time, not 100%. So that won't work.
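For reference, the <h2> approach with a fallback to <title> could be sketched roughly as below. This is only an illustration written for modern PHP (7+): guess_title() is a hypothetical helper, not a PhpDig function, and it does not show where PhpDig itself extracts titles. It also makes the limitation above concrete: pages without an <h2> still end up with the shared <title> text.

```php
<?php
// Hypothetical helper, not part of PhpDig: prefer the text of the first <h2>
// and fall back to the page's <title> when no usable <h2> is found.
function guess_title(string $html): string
{
    foreach (['h2', 'title'] as $tag) {
        // Match the first <tag ...>...</tag> pair, case-insensitively, across newlines.
        $pattern = '#<' . $tag . '\b[^>]*>(.*?)</' . $tag . '>#is';
        if (preg_match($pattern, $html, $m)) {
            // Drop nested markup such as <em>, decode entities, trim whitespace.
            $text = trim(html_entity_decode(strip_tags($m[1]), ENT_QUOTES, 'UTF-8'));
            if ($text !== '') {
                return $text;
            }
        }
    }
    return ''; // neither tag produced any text
}
```

Called on a page that has an <h2>, this returns the heading text; on the roughly 20% of pages without one, it returns whatever is in <title>, which here is the same string for every article.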

Proposed Solution: What I'm thinking is that I keep the default <title> capture but have an optional override text field in the 'phpdig admin' page. So essentially, on this page I'd add a text field: <input type='text' name='manual_title'>

If this field is populated, only one URI (the first in the list) would be spidered and, instead of parsing the <title> tag for that URI, the title in $_POST['manual_title'] would be used.

I've been trying to get this to work, but what ends up happening is that the custom title is repeated a bunch of times in the search result listing. I guess it's being captured multiple times within a loop in robot_functions.php. All I really need is some code in robot_functions.php that looks for $_POST['manual_title'] and, if it's populated, uses its value instead of the parsed title.
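A minimal sketch of that rule, assuming the form field is named manual_title as above. Both functions are hypothetical helpers written for modern PHP (7+), not code from robot_functions.php, and the sketch does not show where in PhpDig they would need to be called. The idea is to decide the title once per page and to trim the URI list to its first entry whenever an override is present.

```php
<?php
// Hypothetical helper: use the manual title when one was submitted,
// otherwise keep the title the spider parsed from the page.
function resolve_title(array $post, string $parsedTitle): string
{
    $manual = (isset($post['manual_title']) && is_string($post['manual_title']))
        ? trim($post['manual_title'])
        : '';
    return $manual !== '' ? $manual : $parsedTitle;
}

// Hypothetical helper: with an override in place, spider only the first URI
// so the custom title is not attached to every page in the list.
function limit_urls(array $urls, array $post): array
{
    $hasOverride = resolve_title($post, '') !== '';
    return $hasOverride ? array_slice($urls, 0, 1) : $urls;
}
```

In an admin script this might be used as $title = resolve_title($_POST, $parsedTitle); where $parsedTitle stands for whatever the robot extracted from the <title> tag.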

I appreciate the help!


Alex

bvr
11-22-2005, 12:45 PM
I've implemented this functionality. If anyone's interested, let me know and I'll post it here.

Alex