PhpDig.net

Charter 05-16-2004 11:52 AM

Version 1.8.1 Alpha
 
Hi. Download PhpDig 1.8.1 alpha <removed>. What do you think?

A couple of notes on feedback:

Please, no mod requests here, just feedback on what's done so far in the alpha version itself. Also, please, no "what else are you going to add" questions, as I'm not sure, but read on for possibilities.

What's in the alpha version:

Some things from this thread include "did you mean X" instead, different keyword storage, search by site or directory, click tracking, cron job management, limit spider to max of Y number of links per depth per site, and other config options.

Other things include removal of '-' index pages, RSS feeds by search, updated robots.txt reading, reading of base href tags for indexing, added tis-620 support, allowing some extra characters in URLs, bug fixes, and possible https support.

Remember this is an alpha version - some things might not work as expected. Suggestion: install in a test directory rather than overwriting your current version.

Some things that may or may not be added:

Banner abilities, simultaneous spiders, add your site form, GET request modification, admin panel changes, different authentication method, thumbnail support, different searching options, allow for more than one encoding, treat directories within a site as different domains, and whatever else.

Something that probably won't be added:

Multi-byte support - Why? See this and other functions where it says, "This function is EXPERIMENTAL. The behaviour of this function, the name of this function, and anything else documented about this function may change without notice in a future release of PHP. Use this function at your own risk."

Note: version 1.8.1 alpha contains three new tables (clicks, site_page, and sites_days_upd) that you will need to create. See the init_db.sql file for these tables, and remember to add your PhpDig prefix to the tables if you used one.

Special thanks to all who made suggestions and contributions!!!

EDIT: PhpDig version 1.8.1 released.

sktest 05-16-2004 12:53 PM

Hi,

I have downloaded the alpha version.
When spidering, I get the following errors:

Warning: parse_url(http://?modul=): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=aboutme): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=projekte): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=bilder): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=sonstiges): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=wohnort): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=gastbuch): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=verkaufe): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=impressum): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=kontakt): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skdownloader): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=simpleamp): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=simpleampskins): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skcoverdesigner): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=vbruntime): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skdeineip): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skcam): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skscreenmatrix): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skscreenhypnotic): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=sknetsender): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skgta2cheater): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skvirtualdrive): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skspider): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=sksendlater): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=skclassroom): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=mjh-pong): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=uebersicht): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=leer): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=leer): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Warning: parse_url(http://?modul=verkaufe): Unable to parse url in /home/pub_hogus_de/www/search/admin/robot_functions.php on line 1479

Sorry for my bad English. I come from Germany :-)

Charter 05-16-2004 03:36 PM

Hi. I see the issue. Until I get a fix, just set define('PHPDIG_IN_DOMAIN',false); in the config file.
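
A possible reading of those warnings, offered as an assumption because line 1479 of robot_functions.php is not shown in this thread: a query-only link such as ?modul=aboutme ended up prefixed with http:// but no host, and parse_url() cannot handle a URL without a host. The sketch below shows one way to resolve such a link against the URL of the page it was found on. The helper name resolve_link and the example.com host are hypothetical, the syntax is modern PHP rather than the PHP of 2004, and the helper ignores fragments and ../ segments.

```php
<?php
// Hypothetical helper for illustration; not part of PhpDig.
// Resolves a link found on a page against that page's own URL.
function resolve_link(string $pageUrl, string $href): ?string
{
    $base = parse_url($pageUrl);
    if (!is_array($base) || empty($base['scheme']) || empty($base['host'])) {
        return null; // without a scheme and host there is nothing to resolve against
    }
    $origin = $base['scheme'] . '://' . $base['host']
            . (isset($base['port']) ? ':' . $base['port'] : '');
    $path = $base['path'] ?? '/';

    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href; // already absolute
    }
    if ($href === '') {
        return $origin . $path; // empty link points back at the page itself
    }
    if ($href[0] === '?') {
        return $origin . $path . $href; // query-only link: keep the page's path
    }
    if ($href[0] === '/') {
        return $origin . $href; // root-relative link
    }
    // Relative path: swap the last path segment for the link.
    return $origin . substr($path, 0, strrpos($path, '/') + 1) . $href;
}

// A hostless URL like the ones in the warnings above.
var_dump(parse_url('http://?modul=aboutme'));

// Resolving against the page URL keeps the host (example.com is a placeholder).
echo resolve_link('http://www.example.com/index.php', '?modul=aboutme'), "\n";
```

Checking the parse_url() result before using it is what keeps a malformed link from turning into a page of warnings.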

vinyl-junkie 05-16-2004 07:10 PM

Quote:

Originally posted by Charter
Hi. I see the issue. Until I get a fix, just set define('PHPDIG_IN_DOMAIN',false); in the config file.

When I did what you suggested here, even with a search depth of 10, I only got the root indexed. When I set this value to true, as sktest was doing, I get those parse errors.

Isn't alpha testing fun? :D

sktest 05-17-2004 02:57 AM

Yes, I have the same problem as vinyl-junkie.

Wayne McBryde 05-19-2004 06:39 PM

All I get is:
Unable to connect to database : Check the connection script.

I installed 1.8.1 on a new, clean website. All it has is index.html. I uploaded the .zip file and unzipped it. I tried to run http://domain.com/search/admin/install.php, but there is no install.php file. I ran the index.php file and got the error above.

Then I removed all PhpDig files and repeated the above process with 1.8.0. I ran http://domain.com/search/admin/install.php and it worked fine, no errors. I spidered 2 websites, no errors. Then I uploaded 1.8.1 again and unzipped it. I ran http://domain.com/search/admin/ and got the error again.

What am I missing?

Charter 05-19-2004 07:54 PM

Hi. Looks like I forgot to include the install file in the zip. :eek:

I'll post an update once I get some kinks worked out of the alpha version.

bloodjelly 05-21-2004 11:13 AM

I was wondering about that.:)

shinji 06-06-2004 06:57 AM

Hi,

On some sites I get this error:

HTTP/1.1 404 Not Found See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for explanation.

HTTP/1.1 404 Not Found See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for explanation.

It appears between the "Spidering in progress..." message and the spidering itself.
(I tried it with many domains; such errors appear on about half of the sites.)

One example domain: http://www.otaku-forum.net/

sktest 06-08-2004 04:26 AM

@Charter: When will the next release come out?

shinji 06-08-2004 07:41 AM

Another bug(?) I've found:

Whenever someone searches for something and clicks on one of the results, they get the htpassword prompt "Administration-1736 PhpDig" and the result

Charter 07-04-2004 02:13 PM

Hi. In the <removed> file are two replacement files: the spider.php and robot_functions.php files. If you are alpha testing PhpDig version 1.8.1, then just copy over the old alpha files with those in the attached file, and let me know how it goes. Thanks.

EDIT: PhpDig version 1.8.1 released.

vinyl-junkie 07-04-2004 03:01 PM

I didn't get any errors (well, one minor one - see below) but it only indexed 28 pages. I have almost 1,500 pages that get indexed with 1.8.0! FWIW, I made sure my tables were totally empty before I started.

The minor error:
I had been getting some strange 404 errors in my server log which I hadn't been able to figure out - a missing location.href file. I figured out where that was on my pages (based on the message that comes out on this new version of phpdig - thanks for that!), so I did this to my pages:
PHP Code:

<!-- phpdigExclude -->
<script><!--
// Get me out of this frame
if (window!=window.top)
top.location.href=location.href;
// -->
</script>
<!-- phpdigInclude -->

except the new 1.8.1 is still trying to spider that and gives me a 404.
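
One plausible cause of that 404, stated as an assumption because PhpDig's actual link-matching pattern is not shown in this thread: a pattern that looks for href= anywhere in the markup would also match the href=location.href assignment inside the script, and so would treat location.href as a file to fetch. The pattern below is hypothetical and only illustrates the effect.

```php
<?php
// Hypothetical pattern for illustration only; PhpDig's real one may differ.
$html = <<<'HTML'
<a href="page.html">Page</a>
<script><!--
if (window!=window.top)
top.location.href=location.href;
// -->
</script>
HTML;

// Naive extraction: anything after "href=" up to a quote, space, ">" or ";".
preg_match_all('/href\s*=\s*["\']?([^"\'\s>;]+)/i', $html, $matches);

// $matches[1] holds every captured "link", including the one from the script.
print_r($matches[1]);
```

With this pattern the captured list contains both page.html and location.href, which would explain a request for a missing location.href file.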

Charter 07-04-2004 03:11 PM

Hi. A 'links_per' (links per depth) set to zero means to crawl all links at each search depth. A 'links_per' set to ten means to crawl at most ten links per depth. To try to crawl all pages, just set 'search depth' to ten and 'links_per' to zero. The phpdigExclude/phpdigInclude comments work as in this post, meaning that PhpDig will follow whatever it deems a link. With version 1.8.1, the 404s now show onscreen, whereas they didn't before. Thanks for the feedback.
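
The interaction of the two settings can be sketched as a breadth-first walk over a small in-memory link graph. This is not PhpDig's spider: the function name crawl_plan, the sample graph, and the choice to apply the cap to the combined list of new links at each depth are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch, not PhpDig's spider. $graph maps a page to the
// pages it links to. A $linksPer of zero means no cap at any depth.
function crawl_plan(array $graph, string $start, int $searchDepth, int $linksPer): array
{
    $seen = [$start => true];
    $level = [$start];      // pages found at the previous depth
    $visited = [$start];    // everything that would be crawled, in order

    for ($depth = 1; $depth <= $searchDepth && $level; $depth++) {
        $next = [];
        foreach ($level as $page) {
            foreach ($graph[$page] ?? [] as $link) {
                if (!isset($seen[$link])) {
                    $next[$link] = true; // de-duplicate within this depth
                }
            }
        }
        $next = array_keys($next);
        if ($linksPer > 0) {
            $next = array_slice($next, 0, $linksPer); // cap links at this depth
        }
        foreach ($next as $link) {
            $seen[$link] = true;
        }
        $visited = array_merge($visited, $next);
        $level = $next;
    }
    return $visited;
}

$graph = [
    '/'  => ['/a', '/b', '/c'],
    '/a' => ['/a1', '/a2'],
    '/b' => ['/b1'],
];

echo count(crawl_plan($graph, '/', 10, 0)), "\n"; // no cap: every reachable page
echo count(crawl_plan($graph, '/', 10, 2)), "\n"; // at most two new links per depth
```

In this sketch a non-zero cap drops pages at each level, so a small 'links_per' can leave a large site mostly unindexed even with a deep search depth.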

vinyl-junkie 07-04-2004 06:10 PM

OK, I did as you suggested. First I cleared out my tables, set the "links-per" and search depth as you suggested, then re-spidered my site. It might be looking for the proverbial needle in the haystack to find the differences between 1.8.0 and 1.8.1, but I'm still a bit short of the pages that should have been indexed.

1.8.0 - 1,514 pages
1.8.1 - 1,262 pages

If it would help, I can nose around and find some pages that weren't picked up by 1.8.1.

