PhpDig.net

Title of the results - how to change from <phpdig:page_link/> (http://www.phpdig.net/forum/showthread.php?t=1060)

bforsyth 07-11-2004 06:30 AM

Title of the results - how to change from <phpdig:page_link/>
 
Hi - I have installed the app and indexed the site. The one thing I would like to be able to change is the text that is held in <phpdig:page_link/>. Currently it displays the relative URL of the resource that the search returns, e.g.:


25. [43.30 %] ?page=issueView&issueid=62

Strangely, if the search term is on the home (index) page, the title tag of the page is displayed in its place.

I would like for each of the returned results to display the contents of the title tag. Any ideas why this is the case?

All of the pages are generated by dynamically including another page in the index page, e.g. www.mysite.com/index.php?page=articleView etc. However, the title tags for each page are dynamically generated, so they are always different.

Many thanks in advance

Ben

Charter 07-11-2004 07:03 AM

Hi. In robot_functions.php titles are found with this bit of code:
PHP Code:

//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
    $title = $regs[1];
}
else {
    $title = "";
}


Also in robot_functions.php titles are set with this bit of code:
PHP Code:

//set the title in order <title>, filename, or unknown
if (isset($doc_title) && $doc_title) {
    $titre_resume = $doc_title;
}
elseif (isset($file) && $file) {
    $titre_resume = $file;
}
else {
    $titre_resume = "Untitled";
}


Perhaps check your dynamic titles and see if they are found.
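
The fallback order in the second snippet can be sketched as a standalone function. This is only an illustration: pick_result_title is a hypothetical name, not a PhpDig function, and it assumes the extracted title and the file part of the URL are passed in as plain strings.

```php
<?php
// Hypothetical helper (not part of PhpDig): mirrors the
// "<title>, filename, or unknown" order shown above.
function pick_result_title(?string $doc_title, ?string $file): string
{
    if (isset($doc_title) && $doc_title) {
        return $doc_title;      // a non-empty <title> wins
    }
    if (isset($file) && $file) {
        return $file;           // otherwise fall back to the file part of the URL
    }
    return "Untitled";          // last resort
}

// An empty title falls through to the file name, which would explain
// a result that is displayed as "?page=issueView&issueid=62".
echo pick_result_title("", "?page=issueView&issueid=62"), "\n";
echo pick_result_title("Issue 62", "?page=issueView&issueid=62"), "\n";
```

If the title really is empty for the dynamic pages, the first call shows the behaviour described in the question.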

bforsyth 07-14-2004 09:04 PM

All sorted now
 
The strangest thing was happening. On inspecting the first_words field of the digSpider table, I found that the spider wasn't actually crawling each page. It took ages to find out why. To make the pages W3C compliant, I was writing the dynamic URLs like:

index.php?page=articleView&amp;articleId=336

A browser knows to render &amp; as &, but the spider does not, so it was just getting my built-in 'Page cannot be found' error for every article.
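
What a browser does with such a link can be sketched with PHP's built-in html_entity_decode(). The function name href_to_url is made up for this example; it is not how PhpDig handles links, just an illustration of the decoding step that was apparently missing.

```php
<?php
// Hypothetical helper: turn an href attribute value, as written in the
// HTML source, into the URL that should actually be requested.
function href_to_url(string $href): string
{
    // &amp; inside an attribute value stands for a literal & in the URL
    return html_entity_decode($href, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

$href = 'index.php?page=articleView&amp;articleId=336';
echo href_to_url($href), "\n"; // index.php?page=articleView&articleId=336
```

Requesting the undecoded string instead sends a parameter literally named amp;articleId, so the site never sees articleId at all.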

Charter 07-15-2004 08:15 AM

Hi. What version of PhpDig are you using?

bforsyth 07-15-2004 09:08 AM

Using 1.8.0

Charter 07-15-2004 09:12 AM

Do you still have a page using &amp; so I can test on it?

bforsyth 07-15-2004 09:21 AM

Sure thing - I have reverted the code back for you:

See:
http://cgasson.truth.posiweb.net/new...&issue=current

All of the links on that page have &amp; between the GET parameters.

I will probably revert back to the working version in 24 hours or so.

Charter 07-15-2004 10:11 AM

Thanks, test over... This &amp; issue was fixed as of version 1.8.1; I believe that's the version. Using version 1.8.3, when indexing http://cgasson.truth.posiweb.net/new/index.php with LIMIT_TO_DIRECTORY set to true, a search depth of one, and links per set to twenty, the output and table content follow (note it finds a max of [search depth * links per + 1] links, all within the new/ directory):

Spidering in progress...

--------------------------------------------------------------------------------
SITE : http://cgasson.truth.posiweb.net/
Exclude paths :
- @NONE@
1:http://cgasson.truth.posiweb.net/new/index.php
(time : 00:00:08)
+ + + + + + + + + + + + + + + + + + + +
level 1...
2:http://cgasson.truth.posiweb.net/new/index.php?page=eventView
(time : 00:00:28)

3:http://cgasson.truth.posiweb.net/new/index.php?page=reportSelect
(time : 00:00:35)

4:http://cgasson.truth.posiweb.net/new/index.php?page=freeTrial
(time : 00:00:43)

5:http://cgasson.truth.posiweb.net/new/index.php?page=subscribe
(time : 00:00:49)

6:http://cgasson.truth.posiweb.net/new/index.php?page=projects
(time : 00:00:56)

7:http://cgasson.truth.posiweb.net/new/index.php?page=archiveView
(time : 00:01:02)

8:http://cgasson.truth.posiweb.net/new/index.php?page=issueView&issue=current
(time : 00:01:09)

9:http://cgasson.truth.posiweb.net/new/index.php?page=about
(time : 00:01:16)

10:http://cgasson.truth.posiweb.net/new/index.php?page=advertise
(time : 00:01:22)

11:http://cgasson.truth.posiweb.net/new/index.php?page=press
(time : 00:01:29)

12:http://cgasson.truth.posiweb.net/new/index.php?page=links
(time : 00:01:35)

13:http://cgasson.truth.posiweb.net/new/index.php?page=articleSearch
(time : 00:01:44)

14:http://cgasson.truth.posiweb.net/new/index.php?page=articleView&articleId=336
(time : 00:01:50)

15:http://cgasson.truth.posiweb.net/new/index.php?page=articleView&articleId=349
(time : 00:01:57)

16:http://cgasson.truth.posiweb.net/new/index.php?page=articleView&articleId=332
(time : 00:02:03)

17:http://cgasson.truth.posiweb.net/new/index.php?page=userPwdReminder
(time : 00:02:10)

18:http://cgasson.truth.posiweb.net/new/index.php?page=contact
(time : 00:02:16)

19:http://cgasson.truth.posiweb.net/new/index.php?page=privacy
(time : 00:02:23)

20:http://cgasson.truth.posiweb.net/new/index.php?page=terms
(time : 00:02:31)

21:http://cgasson.truth.posiweb.net/new/index.php?page=copyright
(time : 00:02:37)

No link in temporary table

--------------------------------------------------------------------------------

links found : 21
http://cgasson.truth.posiweb.net/new/index.php
http://cgasson.truth.posiweb.net/new/index.php?page=eventView
http://cgasson.truth.posiweb.net/new/index.php?page=reportSelect
http://cgasson.truth.posiweb.net/new/index.php?page=freeTrial
http://cgasson.truth.posiweb.net/new/index.php?page=subscribe
http://cgasson.truth.posiweb.net/new/index.php?page=projects
http://cgasson.truth.posiweb.net/new/index.php?page=archiveView
http://cgasson.truth.posiweb.net/new/index.php?page=issueView&issue=current
http://cgasson.truth.posiweb.net/new/index.php?page=about
http://cgasson.truth.posiweb.net/new/index.php?page=advertise
http://cgasson.truth.posiweb.net/new/index.php?page=press
http://cgasson.truth.posiweb.net/new/index.php?page=links
http://cgasson.truth.posiweb.net/new/index.php?page=articleSearch
http://cgasson.truth.posiweb.net/new/index.php?page=articleView&articleId=336
http://cgasson.truth.posiweb.net/new/index.php?page=articleView&articleId=349
http://cgasson.truth.posiweb.net/new/index.php?page=articleView&articleId=332
http://cgasson.truth.posiweb.net/new/index.php?page=userPwdReminder
http://cgasson.truth.posiweb.net/new/index.php?page=contact
http://cgasson.truth.posiweb.net/new/index.php?page=privacy
http://cgasson.truth.posiweb.net/new/index.php?page=terms
http://cgasson.truth.posiweb.net/new/index.php?page=copyright
Optimizing tables...
Indexing complete !

Table content:
Code:

+-----------------+------------------------------------------+
| path            | file                                     |
+-----------------+------------------------------------------+
| new/            | index.php                                |
| new/            | index.php?page=eventView                 |
| new/            | index.php?page=reportSelect              |
| new/            | index.php?page=freeTrial                 |
| new/            | index.php?page=subscribe                 |
| new/            | index.php?page=projects                  |
| new/            | index.php?page=archiveView               |
| new/            | index.php?page=issueView&issue=current   |
| new/            | index.php?page=about                     |
| new/            | index.php?page=advertise                 |
| new/            | index.php?page=press                     |
| new/            | index.php?page=links                     |
| new/            | index.php?page=articleSearch             |
| new/            | index.php?page=articleView&articleId=336 |
| new/            | index.php?page=articleView&articleId=349 |
| new/            | index.php?page=articleView&articleId=332 |
| new/            | index.php?page=userPwdReminder           |
| new/            | index.php?page=contact                   |
| new/            | index.php?page=privacy                   |
| new/            | index.php?page=terms                     |
| new/            | index.php?page=copyright                 |
+-----------------+------------------------------------------+
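
The 21 links in this run agree with the bound quoted before the output. As a quick check, assuming the bound is exactly search depth times links per, plus one for the start page (max_links is a name invented here, not a PhpDig function):

```php
<?php
// Hypothetical helper: upper bound on links found, using the formula
// quoted in the post (search depth * links per + 1).
function max_links(int $search_depth, int $links_per): int
{
    return $search_depth * $links_per + 1;
}

echo max_links(1, 20), "\n"; // 21, the number of links listed in the run above
```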


Charter 07-15-2004 10:48 AM

The test did bring about another issue though... blank titles in the search results.

In robot_functions.php find:
PHP Code:

//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
    $title = $regs[1];
}
else {
    $title = "";
}


and replace with:
PHP Code:

//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
    $title = trim($regs[1]);
}
else {
    $title = "";
}
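
A short sketch of why the trim() matters, on the assumption that the blank titles came from pages whose <title> held only whitespace. Such a string is truthy in PHP, so a later check of the form "if ($doc_title)" accepts it as a real title. The variable name is illustrative.

```php
<?php
$raw = " \n ";                 // what a whitespace-only <title> would capture

var_dump((bool) $raw);         // bool(true): treated as a real title
var_dump((bool) trim($raw));   // bool(false): the filename fallback can apply
```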



bforsyth 07-15-2004 03:08 PM

Hey Charter - I have to hand it to you, your enthusiasm for this product is amazing - just a re-affirmation of everything I love about the open source community.

To hit the code that is producing the &amp; problem, you would have had to set your search depth to 3 or 4. This will index about 370 links. I will leave the offending code up for another day or so if you want to try and replicate the problem that I was having.

Thanks for the heads up on the empty title tags - I had only produced the code to dynamically generate the <title> for the article pages. The rest will be done shortly.

Charter 07-15-2004 05:05 PM

Hi. The page at http://cgasson.truth.posiweb.net/new/index.php has the following link in it:

<a href="index.php?page=issueView&amp;issue=current">Current Issue</a>

This link was followed, the content indexed, and the link stored in the database table as:

http://cgasson.truth.posiweb.net/new/index.php?page=issueView&issue=current

Upgrade and the problem should go away. ;)

bforsyth 07-15-2004 08:43 PM

Hey Charter - thanks. I will have a go at upgrading to version 1.8.3 and test.

Incidentally, I was looking at the code change that you posted to deal with untitled documents (3 posts up ^). The way that it seems to work now is that if there is no title, then the URL is displayed in place of the title in the search results:

PHP Code:

//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
    $title = trim($regs[1]);
}
else {
    $title = "";
}


If I were to change this to:

PHP Code:

//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
    $title = trim($regs[1]);
}
else {
    $title = "Untitled";
}


would this display "Untitled" as the title in the search results? (Only asking because I am away from my computer and can't test it!)

Thanks again for all of your support.

Charter 07-15-2004 08:53 PM

PHP Code:

//extracts title
if ( eregi("<title *>([^<>]*)</title *>",$text,$regs) ) {
    $title = trim($regs[1]);
}
else {
    $title = "Untitled";
}
if (!($title)) { $title = "Untitled"; } // account for a regex match on an empty title
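
Note that eregi() was deprecated in PHP 5.3 and removed in PHP 7.0, so on a current PHP the same logic needs preg_match(). A minimal sketch, not taken from PhpDig; the function name extract_title is an assumption made for this example:

```php
<?php
// Hypothetical stand-alone version of the snippet above using preg_match().
function extract_title(string $text): string
{
    $title = "";
    // same pattern as the eregi() call; the i flag makes it case-insensitive
    if (preg_match('#<title *>([^<>]*)</title *>#i', $text, $regs)) {
        $title = trim($regs[1]);
    }
    // covers both "no <title> found" and "<title> matched but empty"
    return $title !== "" ? $title : "Untitled";
}

echo extract_title("<html><TITLE> Current Issue </TITLE></html>"), "\n"; // Current Issue
echo extract_title("<title>   </title>"), "\n";                          // Untitled
echo extract_title("<p>no title here</p>"), "\n";                        // Untitled
```

One difference from the original check: "!($title)" also treats a title of "0" as empty, while the strict comparison here keeps it.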


