#16 |
Green Mole
Join Date: Apr 2004
Posts: 16
|
A better spidering UI. It seems to be a little buggy: sometimes, while it's actually spidering, it doesn't show that work is in progress. Some users don't know better and exit it, which causes problems.
#17 |
Green Mole
Join Date: Apr 2004
Location: home
Posts: 7
|
I love the script, but I can't use it because, for some odd reason, I can't get phpdig to fully parse php and cgi pages. It seems to only want to parse www.domain.com/page.php and not www.domain.com/page.php?837983* etc. So my suggestions are as follows:
1.) Better support for fully parsing dynamic links.
2.) Like mentioned before, show the full URL in the admin URL window.
3.) A bulk spidering option, so we can just copy and paste a bunch of URLs to be spidered.
1.a) Better support for parsing dynamic pages.
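To make the dynamic-link request concrete, here is a minimal Python sketch (not PhpDig code) of the difference between keying crawled pages on the path alone and keying them on path plus query string. The function name `crawl_key` and the second query value are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def crawl_key(url, keep_query=True):
    """Return the key a crawler could use to decide whether a URL is new.

    Hypothetical helper: with keep_query=False every ?query variant of a
    page collapses into one entry, which is the behaviour complained about.
    """
    parts = urlsplit(url)
    query = parts.query if keep_query else ""
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", query, "")
    )


urls = [
    "http://www.domain.com/page.php",
    "http://www.domain.com/page.php?837983",
    "http://www.domain.com/page.php?837984",  # assumed second dynamic page
]

with_query = {crawl_key(u) for u in urls}
without_query = {crawl_key(u, keep_query=False) for u in urls}
print(len(with_query), "distinct pages with the query kept")
print(len(without_query), "distinct pages with the query ignored")
```

Keeping the query string treats each dynamic page as its own document, at the cost of possibly indexing the same content under several session-style parameters.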
__________________
me = love phpdig |
#18 |
Green Mole
Join Date: Apr 2004
Posts: 12
|
As others have said, better bot functions such as:

Spider Functions:
- Ability to index exact URLs instead of having to index the raw domain. This is a total pain when trying to index pages from a Yahoo store or Hometown addy.
- Ability to approve and edit listings that are harvested by the spider.
- Ability to ignore the base URL, in order to harvest only off-domain links (handy for spidering link farms and DMOZ directories).

Results Display Functions:
- Ability to ignore/not harvest page text but only Meta Tags.
- Ability to assign weights to displayed URLs.
- Adult Filter.

External Functions:
- Ability to import and export XML feeds from other engines such as Google API, SearchFeed and RevenuePilot.
- Jig and Alvin's keyword-driven ad system: I third the motion to integrate it into the next Dig release.

----------------------------------------------------
Completed Mods:
- Visitor tracking system module including IP; delivers stats in many formats.
- Who's Online in realtime.
- Download counter module (we have a downloadable IE toolbar that interfaces with PhpDig, allowing searches from anywhere on the net).
- Automated template changer: configurable to change skins automatically at any set time of year, currently set to change skins like Google does (Christmas theme, 4th of July theme, etc.).

Mod WorkBoard:
- Integrated SQL functions in Admin, such as Backup/Restore.
- Integrated External Results function (as stated above: SearchFeed, RevenuePilot, etc.).
- Multi URL Add in Admin: will allow a text list to be pasted into Admin that the spider will index incrementally (in other words, paste 30 domains into admin and go make a sammich, it'll do the rest).

I never sleep lol
#19 |
Green Mole
Join Date: Feb 2004
Posts: 3
|
Soundex & Unicode
Hello !
Here is the place to say: "PhpDig is Wwwooooonnnnddddeeeerrrrrfffffuuuulllllllll !!!!!!!!!!!!!!!!!!"
It would be very nice if the features below were integrated:
- Unicode support
- Soundex support
Thanks!
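For readers unfamiliar with Soundex, the following is a minimal Python sketch of the classic American Soundex code, which maps similar-sounding names to the same four-character key. It only illustrates the idea behind the request and says nothing about how PhpDig would implement it.

```python
# Consonant groups of the classic American Soundex scheme.
CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex key of a word ("" if it has no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels (and anything uncoded) give ""
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # both print R163
```

A search engine could store such a key next to each indexed keyword and fall back to key matches when an exact match finds nothing.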
#20 |
Green Mole
Join Date: Apr 2004
Location: is
Posts: 3
|
I also think phpdig is awesome...
One thing I'd like to see is options in the config file to completely ignore meta tags during indexing, and/or when displaying result snippets.
#21 |
Green Mole
Join Date: Mar 2004
Posts: 1
|
add ons
Integrated PDF document indexing would be nice.
It is an essential feature for CMS in the US, particularly those in the .edu and .gov domain space.
#22 |
Green Mole
Join Date: Dec 2003
Posts: 4
|
XML data import/export would be a good tool.
#23 |
Green Mole
Join Date: Mar 2004
Posts: 1
|
Multiple Indexes on same site, unicode
I agree with the others: I'm quite impressed by phpDig. Keep up the good work.
I would vote for 2 additions that many other people have also mentioned:
1. Multiple Indexes for One Domain
This would allow me to index my sites, which often have multiple language versions, without having to create multiple installs. However, some people have suggested structuring sites like this:
http://site.com/en/whatever.php
http://site.com/fr/whatever.php
But keep in mind that it doesn't always work that way. On some of my projects, the URL remains the same, but a variable like $_COOKIE['lang'] is set.
2. Unicode Support
Thanks, and good luck with the updates.
#24 |
Green Mole
Join Date: Mar 2004
Posts: 10
|
1. Sponsor records support. Search displays results from a limited (server option) number of sponsor records first, and normal search results follow.
2. Side bar sponsor links for advertisement support, drawn from search results on MySQL advertisement records.
3. Allow PDF-only indexing: crawl through the web, but index .pdf files only.
4. Allow HTTP indexing from a file that has a list of target URLs. Prevent timeouts on this page by giving more status feedback.
5. Allow crawling a single branch only.
I am new to using this wonderful tool. I hope my thoughts are not off the wall.
#25 |
Orange Mole
Join Date: Mar 2004
Posts: 48
|
1. Indexing duplicate descriptions and keywords causes false search results. See thread.
2. Reduce duplicates in the keywords table through more intelligent indexing. See thread.
3. Admin approval for the spider to index external URLs. See thread.
4. Better support for PHP sessions.
5. Ability to set the weight of data in the config file. For example: Title [1-5], Description [1-5], Keywords [1-5], Content [1-5].
6. Soundex support or a similar function.
7. Implement the many mods in the mods forum and spread throughout the various forums, either as new features or as options that can be turned on or off in the config file. This will make it much easier for people to move to new versions without having to compare each line of code for differences.
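As an illustration of suggestion 5, here is a minimal Python sketch of how per-field weights in the 1-5 range could feed a relevance score. The weight values, the `score` function and the sample pages are all hypothetical; this is not how PhpDig ranks results.

```python
# Hypothetical config values, each in the suggested 1-5 range.
WEIGHTS = {"title": 5, "description": 3, "keywords": 2, "content": 1}


def score(page, term):
    """Sum weighted occurrence counts of `term` over a page's fields."""
    term = term.lower()
    total = 0
    for field, weight in WEIGHTS.items():
        words = page.get(field, "").lower().split()
        total += weight * words.count(term)
    return total


# Made-up sample pages, keyed by a short name.
pages = {
    "a": {"title": "Spider setup", "content": "how to run the spider"},
    "b": {"title": "Templates", "content": "spider spider spider"},
}

ranked = sorted(pages, key=lambda name: score(pages[name], "spider"), reverse=True)
print(ranked)  # page "a" ranks first: one title hit outweighs three content hits
```

With weights exposed in the config file, an admin could tune how much a title match counts relative to body text without touching the indexing code.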
#26 |
Green Mole
Join Date: Oct 2003
Posts: 17
|
Hi.
This is a summary of the additions we have made (me and alivin70):
1) search.php
If someone searches using a form, this form has a static site id: <input type="hidden" name="site" value="2">, where 2 => www.phpdig.org. But if you delete www.phpdig.org from the admin area, you have to update the site's value (2 in our example). Now you can put <input type="hidden" name="site" value="phpdig.org"> in the form, and phpdigGetSiteidFromUrl() gets the correct site id.
2) url.php
When a user clicks on a result, url.php logs: a - the position of the clicked result, b - the url (redundant), c - the query, d - the date. Very useful for statistics.
3) admin/index.php
When you start the spider, you have to select a limit > 0, because with some sites the option 0 doesn't start the spider.
4) admin/limit_upd.php
Cron management via web. It also manages the max number of pages per site; useful when you don't want to index all the pages of a web site (oh my God... this site has thousands of pages).
5) admin/robot_functions.php
Fixed a bug that occurred when indexing PDFs. We have added a lot of logging at the end. Among other things, you get statistics on the clicks made by users...
6) admin/spider.php
When you start the spider you can tell it the max number of pages to index. How long does it take to index a huge web site? Using the admin area you can see which indexing runs have been interrupted and which have completed.
7) includes/config.php
You can configure: a - sponsored links (show or don't show), b - cron.
8) libs/function_phpdig_form.php
Added form elements.
9) libs/phpdig_functions.php
Added phpdigGetSiteidFromUrl().
10) libs/search_functions.php
- The big HTML page at the end is in a separate file to save parsing time.
- Sponsored links
- ....
11) libs/time.php
A function for logging the duration of various events.
12) templates/phpdig2.html
Template modification.
13) sql/init.sql
Other tables.
14) libs/google.php
If a user searches for something in a given web site and there are no results... Google, help us please :-)

All our additions should be fully explained... but if you are curious, you can understand them through the diffs I sent you. Additional explanations will be coming soon...
Bye bye.
JyGius
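The site-id lookup described in item 1 can be pictured with the following minimal Python sketch. It is not the actual phpdigGetSiteidFromUrl() code (which is PHP and is not shown in this thread); the `SITES` table, the helper names and the choice to ignore a leading "www." are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the sites table: site id -> site URL.
SITES = {2: "http://www.phpdig.org/", 7: "http://example.com/"}


def host_of(value):
    """Extract a comparable host name from a full URL or a bare domain."""
    value = value.strip().lower()
    if "//" not in value:
        value = "//" + value  # lets urlsplit treat a bare domain as the host
    host = urlsplit(value).hostname or ""
    # Assumption: "www.phpdig.org" and "phpdig.org" name the same site.
    return host[4:] if host.startswith("www.") else host


def site_id_from_url(value, sites=SITES):
    """Return the id of the site whose host matches `value`, or None."""
    wanted = host_of(value)
    for site_id, site_url in sites.items():
        if host_of(site_url) == wanted:
            return site_id
    return None


print(site_id_from_url("phpdig.org"))
```

Looking the id up by domain means a search form keeps working even if the site is deleted and re-added under a new numeric id.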
#27 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi all, and thanks for the suggestions. Thread closed.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |