Bugs and missing Features in V. 1.6.2


Rolandks
09-09-2003, 12:07 AM
Okay, here are the bugs I found. (Tested: W2K, IIS 5, PHP 4.3.1, MySQL 4.0.14, 2 URLs, 12 subdirectories, 1450 pages)

1.) BUG: CGI timeout on big websites (major)

Desc.: In the current version the spider tries to index the complete site in one run. If you have NO root server (webspace at a provider), you can't change the web server's CGI timeout. The default in Apache and IIS is 300 seconds, but in 300 seconds spider.php does NOT index more than 100 pages, so you get a CGI timeout.
Solution: spider.php must stop and start a new session before the 300 seconds are up, because this is the default in all web servers with safe_mode.

One way is to have the script call itself via $PHP_SELF and use queries with LIMIT:

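// Read the batch offset from the query string; default to 0 on the first call.
$Start = isset($_GET['Start']) ? (int) $_GET['Start'] : 0;
$Count = 50; // rows per request

// here are the queries, e.g.: ("SELECT * FROM tabelle LIMIT $Start,$Count");
// MySQL's LIMIT takes offset,row_count, so the second value is the batch size.

// Reload the script with the next offset so each request stays short.
$Start += $Count;
echo "<script>location.href = \"{$_SERVER['PHP_SELF']}?Start=$Start\";</script>";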

Or perhaps there is a better solution, but I think this bug is major.
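
The self-reload idea above never says when to stop, so here is a minimal sketch of that part. It assumes the total number of rows is known up front; the helper names next_batch_url and read_start are hypothetical and not part of PhpDig.

```php
<?php
// Hypothetical helper (not part of PhpDig): returns the URL for the next
// batch, or null once every row has been covered.
function next_batch_url($self, $start, $batchSize, $total)
{
    $next = $start + $batchSize;
    if ($next >= $total) {
        return null; // nothing left, so stop reloading
    }
    return $self . '?Start=' . $next;
}

// Hypothetical helper: reads the offset from the request parameters and
// forces it to a non-negative integer.
function read_start($params)
{
    $start = isset($params['Start']) ? (int) $params['Start'] : 0;
    return max(0, $start);
}
```

In spider.php this would probably be used as $Start = read_start($_GET); the batch query would then run with LIMIT $Start,50, and the redirect would only be echoed when next_batch_url($_SERVER['PHP_SELF'], $Start, 50, $total) does not return null.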


2.) Statistics error, time not shown (normal, should be fixed)
One field is missing from the install SQL: add "l_time" to init_db.sql.

Change:

# Structure of table 'logs'
CREATE TABLE logs (
l_id mediumint(9) NOT NULL auto_increment,
l_includes varchar(255) NOT NULL default '',
l_excludes varchar(127) default NULL,
l_num mediumint(9) default NULL,
l_mode char(1) default NULL,
l_ts timestamp(14) NOT NULL,
l_time timestamp(14) NOT NULL,
PRIMARY KEY (l_id),
UNIQUE KEY l_id (l_id),
KEY l_includes (l_includes),
KEY l_excludes (l_excludes)
);

But even after this change the time is not written to this field: "l_time" is always empty: 000000000.


3.) Special characters are not shown in statistics - DB fields l_includes AND l_excludes (minor)
If you have European characters from ISO-8859-1 (ä, ö, ü, ß, or Spanish á, é), they are not displayed in the statistics fields:
Table: logs, fields: l_includes AND l_excludes --> The search was "münchen", but the statistics display "munchen".

4.) Exclude paths before digging a URL (feature)
At the moment, if you enter a new URL, all link paths are indexed. You can exclude paths only after the first index run.
Feature: After pressing "Dig This URL" you should first get a list of all subdirectories and be able to exclude some of them, for example if you have 300 pages of documentation in a subdirectory and do not want to index them. Otherwise you press "Index ALL".

5.) Size and maxlength not in config.php (minor feature)
At the moment the "size" and "maxlength" of the search input field can't be changed.
Put these 2 fields as variables in config.php. (They must be set in libs\function_phpdig_form.php, line 47):

"<input type='text' class='phpdiginputtext' name='query_string' SIZE='$input_size' MAXLENGTH='$input_length' ....."

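A minimal sketch of how this could look, assuming the two settings live in config.php as $input_size and $input_length (the names from the snippet above; the values here are only examples). The helper search_input is hypothetical and not an existing PhpDig function.

```php
<?php
// Assumed config.php settings; the values are examples only.
$input_size = 30;
$input_length = 50;

// Hypothetical helper: builds the search field from the two settings.
// Casting to int keeps stray config values out of the markup.
function search_input($input_size, $input_length)
{
    return "<input type='text' class='phpdiginputtext' name='query_string'"
         . " SIZE='" . (int) $input_size . "'"
         . " MAXLENGTH='" . (int) $input_length . "'/>";
}
```

The form code would then print search_input($input_size, $input_length) instead of a hard-coded tag.
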
Nice that PhpDig is being continued :)
Thanks -Roland-

Iltud
09-13-2003, 05:26 AM
Hello,

About your second point, the column "l_time":

The l_time column in the logs table isn't a TIMESTAMP.

According to the script /sql/update_db_to_1_6_1.sql, this column should be a FLOAT and not a TIMESTAMP as you said.

That's probably why "l_time" is always empty for you: 000000000.


/sql/update_db_to_1_6_1.sql says:

ALTER TABLE logs ADD l_time FLOAT DEFAULT '0' NOT NULL ;



Thanks,
Nicolas.

Rolandks
09-14-2003, 11:21 PM
Okay, thanks. That came from a wrong posting in the old English forum.

ALTER TABLE logs ADD l_time FLOAT DEFAULT '0' NOT NULL ;
is right.

R.

laurentxav
01-21-2004, 05:44 AM
Hello,

I want to know if the major bug "CGI timeout on big websites" still exists.
I've read the whole history of changes made to PhpDig but found no info about it.

I must crawl a site with more than 3000 files! Is there any problem?

Thanks,

Laurent

Rolandks
01-23-2004, 07:01 AM
The CGI timeout only occurs if you can't change the default setting on your web server! The default CGI timeout for a PHP script is 300 seconds, but 3000 files need 1-2 hours, so if you don't have a root server you can't do this.

-roland-