Displayed Results formatting etc


rafarspd
12-04-2003, 03:31 AM
Hi everyone

'Snippets' displays the search words found in [meta name='KEYWORDS'] as well as those found in the page text.

'Summary' displays what is in [meta name='DESCRIPTION'], then continues with text from the body of the page until it reaches the display length set in [define('SUMMARY_DISPLAY_LENGTH',150);].
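
As a rough model of the behaviour just described (this is not PhpDig's own code, which is PHP), the summary could be pictured like this in JavaScript. The function name buildSummary and its parameters are made up for the illustration; only the limit of 150 comes from the config line.

```javascript
// Hypothetical model of the summary described above: meta description first,
// then body text, cut off at the configured display length.
const SUMMARY_DISPLAY_LENGTH = 150; // value from define('SUMMARY_DISPLAY_LENGTH',150);

function buildSummary(metaDescription, bodyText, maxLength = SUMMARY_DISPLAY_LENGTH) {
  // Join the two sources, skipping whichever one is empty.
  const combined = [metaDescription, bodyText]
    .map((part) => (part || "").trim())
    .filter((part) => part.length > 0)
    .join(" ");
  // Truncate to the display length.
  return combined.slice(0, maxLength);
}
```

With a short description, the rest of the 150 characters is filled from the page body; with a description longer than 150 characters, no body text would be shown at all.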

If a search word is in [meta name='KEYWORDS'] but not within the text of a page, that page will still be listed, showing the snippet from the KEYWORDS. I am getting pages returned with two instances of the search word but no instances in the content, showing a relevance of 81.65% when in fact the relevance is 0%.

Is it possible to have the search word 'snippets' displayed without the [meta name='KEYWORDS']?
(Perhaps I should put the [meta name='KEYWORDS'] between <!-- phpdigExclude --> and <!-- phpdigInclude --> tags, or should I make sure that [meta name='KEYWORDS'] only has words that are contained in the text of the document?)

If you wish to use PhpDig as a page content search engine, then you really only need the text contained in the viewable page to be indexed.

In summary, it looks like I am asking two things.
1. Is it possible to have the [meta name='DESCRIPTION'] displayed without the page text?
2. Can a page be ignored if there are no instances of the search word on the page, even if the word appears in [meta name='KEYWORDS']? (I suppose this will affect the ranking percentage; perhaps [meta name='KEYWORDS'] should be ignored.)

Unfortunately I also have another problem. As an example, look at 'http://www.rafars.org/information/council.html'. Search words taken from the page down to 'GENERAL SECTRETARY' return the page, but 'TECHNICAL OFFICER' and anything below it do not appear to return a search result.

Perhaps I expect too much!

Charter
12-04-2003, 08:17 AM
Hi. For the first portion, parts one and/or two, you might try commenting out array_push($text,$add_text); in the chunk of code given in this (http://www.phpdig.net/showthread.php?threadid=250) thread.

For the second portion, there seems to be an extraneous <!-- in the HTML.

rafarspd
12-06-2003, 09:17 AM
Yes, there was an extra <!--, caused by commenting out a portion of text that included some JavaScript.
I have now removed that script and it will be stored in a text file until I wish to use it again.
------------------------

Answer to the missing search results?

I have implemented a method of stopping email address spiders from extracting email addresses from a page. It uses JavaScript.

Example:-

<script language="javascript">
<!-- // Hide
var showtext = "text-to-show"; var mp1 = "email-part1"; var mp2 = "email-part2.org"; document.write("<a href=" + "mail" + "to:" + mp1 + "@" + mp2 + ">" + showtext + "</a>") //-->
</script>

I believe that this JavaScript is what is causing PhpDig to fail to index portions of the text on the page.

I have succeeded in getting the required indexing by implementing the following:-

<!-- phpdigExclude -->
<script language="javascript">
<!-- // Hide
var showtext = "text-to-show"; var mp1 = "email-part1"; var mp2 = "email-part2.org"; document.write("<a href=" + "mail" + "to:" + mp1 + "@" + mp2 + ">" + showtext + "</a>") //-->
</script>
<!-- phpdigInclude -->
<font color="#FFFFFF">text-to-show</font>

The last line is in fact dummy text for PhpDig to find, which replaces the text inside the JavaScript. I have used a colour of white, which blends in with my background.

Not the best way, but acceptable as a workaround.
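
One possible variation, sketched here on the unconfirmed assumption that the <!-- // Hide ... //--> markers inside the script are what upset the indexing: build the link in a small function and leave the HTML comment markers out altogether. The name buildMailLink is hypothetical, and whether PhpDig then indexes the rest of the page correctly has not been tested.

```javascript
// Hypothetical helper: assembles the mailto link from parts so the full
// address never appears in the page source. No <!-- --> markers are used.
function buildMailLink(showText, part1, part2) {
  const address = part1 + "@" + part2;
  return '<a href="' + "mail" + "to:" + address + '">' + showText + "</a>";
}

// In a browser, write the link into the page where the script tag sits.
if (typeof document !== "undefined") {
  document.write(buildMailLink("text-to-show", "email-part1", "email-part2.org"));
}
```

The link text is still produced by the script, so it would not become searchable this way; the dummy text would still be needed if 'text-to-show' has to be found by a search.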

Any other ideas?

Charter
12-06-2003, 12:33 PM
Hi. You could also set define('CHUNK_SIZE',2048); in the config file to a lower value. Something like define('CHUNK_SIZE',200); should catch all of the titles. There is a trade-off: a smaller chunk size means more processing. Of course, you could leave it the way you have it now too.
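
To put a rough number on that trade-off, here is a small JavaScript sketch. It assumes a page is handled in pieces of CHUNK_SIZE characters, which is only a reading of the setting's name rather than a description of PhpDig's internals; chunkCount and the 20000-byte page size are invented for the example.

```javascript
// Hypothetical illustration: number of fixed-size chunks needed to cover a page.
function chunkCount(pageBytes, chunkSize) {
  return Math.ceil(pageBytes / chunkSize);
}

const pageBytes = 20000; // assumed page size, for illustration only
console.log(chunkCount(pageBytes, 2048)); // 10
console.log(chunkCount(pageBytes, 200));  // 100
```

Under that assumption, dropping from 2048 to 200 means roughly ten times as many pieces to process for the same page.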