Sorry, I guess I wasn't clear in my explanation of the problem.
- I have 110 external links to pages that house scientific articles that I need to spider
- Each page has the same title tag value as the next one (<title>same for each article</title>)
Problem: After spidering all 110 pages and running a search against them, I get lots of results, but they all have the same title. The desired result is for each result returned by search.php to use the article's own title as its heading.
I had a look at the post you mentioned. Most of the titles in the articles are wrapped in <h2>, but unfortunately that isn't the case 100% of the time (more like 80%), so that won't work.
Proposed Solution: What I'm thinking is that I keep the default <title> capture but add an optional override text field to the 'phpdig admin' page. Essentially, I'd add a text field to that page: <input type='text' name='manual_title'>
If this field is populated, only one URI (the first in the list) would be spidered, and instead of parsing the <title> tag for that URI, the value of $_POST['manual_title'] would be used as the title.
I've been trying to get this to work, but what ends up happening is that the custom title is repeated a bunch of times in the search result listing. I guess it's being captured multiple times within a loop in robot_functions.php. All I really need is some code in robot_functions.php that looks for $_POST['manual_title'] and, if it's populated, uses its value instead of parsing the <title> tag.
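To make it concrete, here is roughly the logic I have in mind, as a sketch only. pick_title() is a made-up helper name, not an existing phpdig function, and I'm assuming the title parsed from the <title> tag is available as a plain string at the point where it gets stored:

```php
<?php
// Hypothetical helper (not part of phpdig): decide which title to store for a page.
// $parsedTitle: the string parsed from the page's <title> tag.
// $post:        the submitted form data, normally $_POST.
function pick_title($parsedTitle, $post)
{
    $manual = '';
    if (isset($post['manual_title']) && is_string($post['manual_title'])) {
        // Ignore a field that contains only whitespace.
        $manual = trim($post['manual_title']);
    }
    // Use the override only when the admin field was actually filled in.
    return $manual !== '' ? $manual : $parsedTitle;
}
```

The idea would be to call something like pick_title($title, $_POST) once per spidered page, right where the title is saved, rather than inside whatever loop is currently repeating it.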
I appreciate the help!
Alex