PhpDig.net


Charter 10-05-2003 10:59 AM

New Features Inquiry
 
Hi. Besides the information posted on these forums, I've been thinking about perhaps integrating my commercial search script, or rather parts of it, into PhpDig. You can demo the commercial search script here. It has boolean and phrase searching capabilities, but it is not GPL and it does not index. If I do integrate it into PhpDig, then the integrated parts would then become GPL. Anyway, what I am wondering is if there is anything in particular from the commercial search script that you'd like to see in PhpDig. If nothing, then I'll forget this idea, but if there is something, it'd be good to know in case I decide to undertake the task.
;)

alivin70 10-06-2003 12:41 AM

Re: New Features Inquiry
 
Quote:

Originally posted by Charter
Hi. Besides the information posted on these forums, I've been thinking about perhaps integrating my commercial search script, or rather parts of it, into PhpDig. You can demo the commercial search script here. It has boolean and phrase searching capabilities, but it is not GPL and it does not index. If I do integrate it into PhpDig, then the integrated parts would then become GPL. Anyway, what I am wondering is if there is anything in particular from the commercial search script that you'd like to see in PhpDig. If nothing, then I'll forget this idea, but if there is something, it'd be good to know in case I decide to undertake the task.
;)

There are some very interesting features in your search engine.
I have a list of new features we are working on.
For example, "ad links".
My idea is to integrate PhpDig with a text banner server. We just need to create a "hook" between the words searched by the user and the keywords of the banners.

To allow easy integration with many ad servers I propose these steps:
1) Allocate space on the right side of the PhpDig results page. This column should collapse (or be absent) if there are no "sponsored links" or the feature is disabled.
2) Create a "hook function" inside PhpDig, written for a specific ad server. This function fetches the ads from the ad server and displays them.
Everything else about the ads is then handled by the ad server: counts, click statistics, and so on.

In this way it's easy to integrate PhpDig with any ad server. I have my own, but it is possible to do it with PhpAdsNew or others.
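
A minimal sketch of the two steps above, written in Python rather than PhpDig's PHP purely for illustration. The names `ADS`, `demo_ad_hook`, and `render_sponsored_column` are hypothetical and not part of PhpDig or any ad server; a real hook would query the ad server instead of a local dictionary.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for an ad server: banner keyword -> ad texts.
ADS: Dict[str, List[str]] = {
    "hosting": ["Cheap PHP hosting"],
    "search": ["Site search consulting"],
}


def demo_ad_hook(search_words: List[str]) -> List[str]:
    """Hook for one specific (here: fake) ad server.

    Returns the ads whose keyword matches one of the searched words.
    """
    ads: List[str] = []
    for word in search_words:
        ads.extend(ADS.get(word.lower(), []))
    return ads


def render_sponsored_column(
    query: str, hook: Optional[Callable[[List[str]], List[str]]] = None
) -> str:
    """Render the right-hand column, or nothing at all.

    The column collapses (empty string) when the feature is disabled
    (no hook) or when the hook returns no ads for this query.
    """
    if hook is None:
        return ""
    ads = hook(query.split())
    if not ads:
        return ""
    return "Sponsored links:\n" + "\n".join(f"- {ad}" for ad in ads)
```

Swapping ad servers then means writing another function with the same signature as `demo_ad_hook`; the results page itself does not change.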

:D

alivin70 10-06-2003 12:48 AM

Re: New Features Inquiry
 
Quote:

Originally posted by Charter
Hi. Besides the information posted on these forums, I've been thinking about perhaps integrating my commercial search script, or rather parts of it, into PhpDig. You can demo the commercial search script here. It has boolean and phrase searching capabilities, but it is not GPL and it does not index. If I do integrate it into PhpDig, then the integrated parts would then become GPL. Anyway, what I am wondering is if there is anything in particular from the commercial search script that you'd like to see in PhpDig. If nothing, then I'll forget this idea, but if there is something, it'd be good to know in case I decide to undertake the task.
;)

A feature that I would improve in PhpDig is the calculation of page relevance.
We are studying the algorithms that do that.
If you developed your own algorithm, you have the right skills to help us.

We can work together to define new, more powerful rules to calculate the "ranking" of a page.
We have to develop extensible code so we can add new features as we hack Google's algorithms ;)

Let me know what you think about it.
Alivin70

Rolandks 10-06-2003 04:25 AM

Hey,
boolean and phrase searching is a good idea :)

But I think the current ranking is OK and not so important, because my site statistics show that users usually search for one or two words. And the Google ranking algorithms are not interesting for ONE website; what ranking would you create for ONE search word?
"Hook function" and ad server: hmm, I don't know who needs this and for what. Does it work internationally (US, Europe, etc.)?

My favorite feature is word suggestions for user errors: documnetation must find documentation, and Downlaod must find Download. It works well; the problem is joined words: manageroperating does not suggest "manager operating" like Google does.

See my "Test Intelligent Php-Dig Fuzzy" in my signature, or this thread for the full story:
http://www.phpdig.net/showthread.php?s=&threadid=77

I think it would not be difficult to include this as phpDig results table tags.
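
As a rough illustration of the suggestion idea (not the code behind the test page mentioned above), the sketch below uses Python's standard `difflib` to pick the closest indexed word and a naive two-way split for joined words. The `VOCABULARY` list and both function names are assumptions made for the example; in practice the vocabulary would come from the indexed keywords.

```python
import difflib
from typing import List, Optional, Tuple

# Hypothetical list of words known to the index.
VOCABULARY: List[str] = ["documentation", "download", "manager", "operating"]


def suggest(word: str, vocabulary: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the most similar known word, or None if nothing is close enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def split_compound(word: str, vocabulary: List[str]) -> Optional[Tuple[str, str]]:
    """Try to split a joined word into two known words."""
    known = set(vocabulary)
    word = word.lower()
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in known and right in known:
            return left, right
    return None
```

With this vocabulary, `suggest("documnetation", VOCABULARY)` gives `"documentation"` and `split_compound("manageroperating", VOCABULARY)` gives `("manager", "operating")`. Comparing against every indexed word is slow on a large index, so this is only a starting point.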

Not important :confused:

alivin70 10-06-2003 04:42 AM

Quote:

Originally posted by Rolandks
Hey,
boolean and phrase searching is a good idea :)

But I think the current ranking is OK and not so important, because my site statistics show that users usually search for one or two words. And the Google ranking algorithms are not interesting for ONE website; what ranking would you create for ONE search word?
"Hook function" and ad server: hmm, I don't know who needs this and for what. Does it work internationally (US, Europe, etc.)?


I'm interested in it, and so is Charter, I guess ;)

Anyway I'll do that and release it GPL for Phpdig :D


Quote:

My favorite feature is word suggestions for user errors: documnetation must find documentation, and Downlaod must find Download. It works well; the problem is joined words: manageroperating does not suggest "manager operating" like Google does.

See my "Test Intelligent Php-Dig Fuzzy" in my signature, or this thread for the full story:
http://www.phpdig.net/showthread.php?s=&threadid=77

I think it would not be difficult to include this as phpDig results table tags.

Not important :confused:

That's a great idea and I agree with you.
I've already read the thread and seen your test page.
Thanks for letting me discover the nice SOUNDEX() function and its relatives. I will help to develop this feature, if I can.
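
For readers who have not met Soundex: the sketch below is a plain Python version of the classic four-character American Soundex code, shown only to illustrate the idea. The built-in database and PHP functions may differ in details, so treat the exact codes as an assumption of this example.

```python
# Consonant groups of classic American Soundex.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of a word ('' for no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate equal codes
        code = _CODES.get(ch, "")  # vowels map to "" and reset prev
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]
```

Here `soundex("Downlaod")` equals `soundex("Download")`, so that typo is caught. `soundex("documnetation")` and `soundex("documentation")` differ, however, because the swapped letters change which consonants are adjacent, so Soundex on its own will not cover every user error.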

The reason I need certain features is that I'm building not a single-site search engine, but a "few sites" search engine,
where "few" means 10-20, depending on PhpDig's capacity and speed.

Alivin70

PS: Please read my post about documenting the code ASAP, so that no work gets lost.

druesome 10-15-2003 07:05 AM

Hi All,

I would gladly help in developing an algorithm for PHPDig. I want to find out first though, where in the scripts is the variable $weight being computed? I'm not that satisfied with the current relevance ranking. I want to give more weight/importance to the titles than the text. Thank you.

alivin70 10-15-2003 08:05 AM

Quote:

Originally posted by druesome
Hi All,

I would gladly help in developing an algorithm for PHPDig. I want to find out first though, where in the scripts is the variable $weight being computed? I'm not that satisfied with the current relevance ranking. I want to give more weight/importance to the titles than the text. Thank you.

Hi drue,
I'm also interested in hacking the page weighting, but I haven't started yet.

Maybe the documentation on my website could help you find
the relevant piece of code.
Look at this thread for more details.

I think it would be useful to add more parameters to adjust the weight of a result.
I'm not completely sure, but at the moment it seems possible to change the relative weight of a page when the keyword is found in the title. Looking at config.php I found:
define('TITLE_WEIGHT',3); //relative title weight

We could add weights for meta keywords or for other parameters.

The best approach is to put the weighting method in a function or class that can be developed separately by a person or a team. That function could also be easily customized for special purposes.
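
One possible shape for such a separate weighting function, sketched in Python for illustration only. The default `title` value mirrors the `TITLE_WEIGHT` of 3 quoted above; the `keywords` and `body` weights, the `Weights` class, and `score_page` are hypothetical, and this is not how PhpDig actually computes `$weight`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Weights:
    title: int = 3      # same value as TITLE_WEIGHT in config.php
    keywords: int = 2   # assumed weight for meta keywords
    body: int = 1       # assumed baseline weight for body text


def _count(term: str, text: str) -> int:
    """Count whole-word, case-insensitive occurrences of term in text."""
    return re.findall(r"\w+", text.lower()).count(term.lower())


def score_page(term: str, title: str, meta_keywords: str, body: str,
               weights: Optional[Weights] = None) -> int:
    """Weighted sum of the term's occurrences in each part of the page."""
    w = weights or Weights()
    return (w.title * _count(term, title)
            + w.keywords * _count(term, meta_keywords)
            + w.body * _count(term, body))
```

Because all tuning lives in `Weights`, someone who wants titles to count more can pass `Weights(title=10)` without touching the scoring code.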

In the future we can think about implementing the simplest of Google's page ranking algorithms, for example the weight associated with links: if page A contains a link named "word" to page B and you search for "word" in Google, you will find page B before A, even if page B doesn't contain the keyword "word".
That's reasonable and is the basis of Google's power!
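
The link-text idea can be sketched as follows, again in Python and only as an illustration. The `(source, target, anchor_text)` tuple format, the `link_weight` parameter, and the function name are assumptions; a spider would have to record anchor texts while crawling for this to work.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def anchor_text_scores(links: Iterable[Tuple[str, str, str]], term: str,
                       link_weight: int = 2) -> Dict[str, int]:
    """Credit the TARGET of each link whose anchor text contains the term.

    links: (source_url, target_url, anchor_text) tuples collected by the spider.
    """
    scores: Dict[str, int] = defaultdict(int)
    for _source, target, anchor in links:
        if term.lower() in re.findall(r"\w+", anchor.lower()):
            scores[target] += link_weight
    return dict(scores)
```

Given links `A -> B` named "word" and `C -> B` named "another word", page B scores 4 for the query "word" even if the text of B never mentions it.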



bye for now
Alivin70

druesome 10-16-2003 04:16 AM

Hey Alvin,

I think I figured out a hack that gives a higher score to a result if the query terms match the title. I will share it with everyone soon, because it's still kind of sloppy, but it does the job and I'm quite happy with it. What I'll try next is to give each site a pagerank, much like Google's, and to make it have some effect on the search results. Later, and wish me luck.

alivin70 10-16-2003 04:55 AM

Quote:

Originally posted by druesome
Hey Alvin,

I think I figured out a hack that gives a higher score to a result if the query terms match the title. I will share it with everyone soon, because it's still kind of sloppy, but it does the job and I'm quite happy with it. What I'll try next is to give each site a pagerank, much like Google's, and to make it have some effect on the search results. Later, and wish me luck.

I wish you lots of luck! :)

Anyway, what do you mean by pagerank? A number calculated by the spider (Google style) or assigned by the administrator (dumb but simpler)?
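
For the "calculated by the spider" variant, a textbook power-iteration PageRank looks roughly like the Python sketch below. It is a simplified illustration under stated assumptions (a damping factor of 0.85, a fixed number of iterations, a small in-memory link graph), not druesome's hack and not Google's production algorithm. The administrator-assigned variant would simply be a hand-written dictionary of page to number.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Simplified PageRank; links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Pages without outgoing links spread their rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank
```

The ranks sum to 1, and a page that many others link to ends up with a higher value, which could then be multiplied into the normal relevance score.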

sid 10-16-2003 09:02 PM

Hi, I'd like to see the boolean capabilities and the "" phrase search, please.

Can't wait to see the next version of PHPDIG!

Wayne McBryde 10-27-2003 07:23 PM

I would really like to see an option where you install the software, for those of us who know very little about installing scripts on our servers. Of course this option would not be free, but I would pay a reasonable amount to have you install it.

Thanks

pittster 11-04-2003 10:31 AM

I'm thinking of adding a feature to log commonly searched keywords and provide a report that could be emailed or viewed online.

This is beneficial to site administrators so they can make commonly searched for items more visible on the site.
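
A small sketch of what such a log could look like, in Python for illustration. The `SearchLog` class and its methods are hypothetical; a real version would store the counts in the database and send the report by email or show it in the admin pages.

```python
import re
from collections import Counter


class SearchLog:
    """Counts searched keywords and produces a plain-text report."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, query: str) -> None:
        """Log every word of one search query (case-insensitive)."""
        self.counts.update(re.findall(r"\w+", query.lower()))

    def report(self, top: int = 10) -> str:
        """Return the most searched keywords, one 'word: count' per line."""
        return "\n".join(f"{word}: {n}" for word, n in self.counts.most_common(top))
```

Calling `record()` once per search and `report()` once a day would already show which items deserve a more visible place on the site.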

If it is already in the works please let me know

drjoju 11-05-2003 01:38 AM

Hi all!

I think some people are not focusing on the final objective of PhpDig: a search and index engine!!

If you think this is the most important objective, then the new features must be:

1.- the boolean capabilities and the "" exact phrase.
2.- Add new file types. If necessary.
3.- The Rolandks idea of word suggestion. Good Idea.
4.- Repair bugs and modify the spider to sniff local directories. (It doesn't work for me, or I don't know how to do it.)
5.- Integrate new external engines. wvware for example.
6.- Add a commit hook system to index new files without reindex.
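
Point 6 could be approached by remembering a content hash per document and re-indexing only what changed. The following Python sketch shows that idea under simple assumptions (documents held in memory, SHA-1 as the fingerprint, a hypothetical `changed_documents` helper); it is not an existing PhpDig feature.

```python
import hashlib
from typing import Dict, List


def changed_documents(documents: Dict[str, str],
                      seen_hashes: Dict[str, str]) -> List[str]:
    """Return the URLs that are new or modified since the last run.

    documents:   {url: current content}
    seen_hashes: {url: SHA-1 hex digest from the previous run}; updated in place.
    """
    changed: List[str] = []
    for url, content in documents.items():
        digest = hashlib.sha1(content.encode("utf-8")).hexdigest()
        if seen_hashes.get(url) != digest:
            changed.append(url)
            seen_hashes[url] = digest
    return changed
```

Only the returned URLs would be handed to the indexer, so an unchanged site costs one hash per page instead of a full reindex.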

As you can see there is a lot of work.

I know that a registered version exists, but I believe in the GPL and open source.

Best regards!

alivin70 11-05-2003 01:58 AM

Quote:

Originally posted by drjoju
Hi all!
[...]
1.- the boolean capabilities and the "" exact phrase.
2.- Add new file types. If necessary.
3.- The Rolandks idea of word suggestion. Good Idea.
4.- Repair bugs and modify the spider to sniff local directories. (It doesn't work for me, or I don't know how to do it.)
5.- Integrate new external engines. wvware for example.
6.- Add a commit hook system to index new files without reindex.
[...]
I agree, 1) is the most important.
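
To make point 1 concrete, here is a deliberately small Python sketch of query matching with implicit AND, a leading minus for NOT, and quoted exact phrases. OR is left out to keep it short. The query syntax and the `matches` function are assumptions for this example and do not describe Charter's commercial script.

```python
import re
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercased words of a text."""
    return re.findall(r"\w+", text.lower())


def matches(query: str, text: str) -> bool:
    """True if text satisfies every part of the query.

    Query parts: word (must occur), -word (must not occur),
    "exact phrase" (words must occur next to each other, in order).
    """
    haystack = " " + " ".join(tokens(text)) + " "
    for negated, phrase, word in re.findall(r'(-?)(?:"([^"]*)"|(\S+))', query):
        needle = " ".join(tokens(phrase or word))
        if not needle:
            continue  # nothing searchable in this part
        found = f" {needle} " in haystack
        if found == bool(negated):
            return False  # required part missing, or excluded part present
    return True
```

For the text "PhpDig is a PHP search engine", the query `"search engine" php` matches, while `"engine search"` and `php -search` do not.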

4 is easy: if your web server is not public, configure Apache to give web access to the files you need and spider them with PhpDig.
Be careful with permissions; use .htaccess if you want to protect your directories.

3 is a great idea, but quite difficult. I hope Rolandks will give us good news soon.

6: I proposed that feature and am thinking about its implementation. I will let you know as soon as I have some news.

2 needs an external parser (like those for PDF or Word files); you can propose some if you know any.

5: I didn't understand what you mean... :(

drjoju 11-05-2003 11:13 AM

Hi Alivin70,

With point 5 I mean that there are other engines for parsing files, like wvware.sourceforge.net, which parses doc files.

Best regards.


All times are GMT -8.