PhpDig.net

Skop 09-17-2003 11:59 PM

Indexing by command line interface
 
Hi,

I installed PhpDig 1.6.2 on a Linux machine and am now trying to index from the command line.

PHP Code:

/usr/bin/php4 -[path]/search/admin/spider.php forceall >> /tmp/phpdig.log 

Nothing happened! The phpdig.log contains something like

PHP Code:

848: old priority 0, new priority 18

and the indexing (reindexing of existing hosts) doesn't work.

Any ideas?

Thanks a lot.
JS

Charter 09-18-2003 07:19 AM

Hi. Here are some suggestions.


If CGI mode, perhaps try the following:
Code:

#!/usr/bin/php4 -f [path]/search/admin/spider.php forceall >> /tmp/phpdig.log

If PHP is not in CGI mode and can be run from anywhere, cd to the search dir and try the following:
Code:

php -f admin/spider.php forceall > phpdig.log

If this is the first time indexing, change forceall to http://www.domain.com


In the config file, change the following value to 1 if you are updating before seven days have passed:
PHP Code:

define('LIMIT_DAYS',7); //default days before reindex a page 

To start over and index from scratch, do the following:
  1. empty all the PhpDig database tables
  2. delete all files that may be in the temp dir
  3. delete all files in the text_content dir except keepalive.txt
  4. run spider.php from a browser or command prompt
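
Steps 2 and 3 above can be scripted. The following is a minimal shell sketch, not something shipped with PhpDig: the function name clean_phpdig_dirs is made up for illustration, and both directory paths are arguments you supply, since where the temp and text_content dirs live depends on the install. It relies on find with -maxdepth, which GNU and BSD find support. Emptying the database tables (step 1) is not covered here.

```shell
# Sketch only: clean_phpdig_dirs is a hypothetical helper, and both
# directory paths come from the caller, not from PhpDig itself.
clean_phpdig_dirs() {
    temp_dir=$1   # the PhpDig temp dir of your install
    text_dir=$2   # the PhpDig text_content dir of your install

    # Refuse to run if either directory is missing, so nothing is
    # deleted in the wrong place because of a mistyped path.
    [ -d "$temp_dir" ] && [ -d "$text_dir" ] || {
        echo "missing directory" >&2
        return 1
    }

    # Step 2: remove every regular file directly inside the temp dir.
    find "$temp_dir" -maxdepth 1 -type f -exec rm -f {} +

    # Step 3: remove every regular file in text_content except keepalive.txt.
    find "$text_dir" -maxdepth 1 -type f ! -name keepalive.txt -exec rm -f {} +
}
```

Call it with the two directories of your own install, for example clean_phpdig_dirs "$TEMP_DIR" "$TEXT_CONTENT_DIR", and then run spider.php as in step 4.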
Before running spider.php from the command prompt, in the config file, change the following to one if only one level is wanted:
PHP Code:

define('SPIDER_MAX_LIMIT',20); //max recurse levels in spider
define('SPIDER_DEFAULT_LIMIT',3); //default value
define('RESPIDER_LIMIT',4); //recurse limit for update 


Skop 09-19-2003 01:17 AM

Quote:

Originally posted by Charter

Code:

php -f admin/spider.php forceall > phpdig.log
If this is the first time indexing, change forceall to http://www.domain.com

Nothing, nothing happened. I took a look at the spider.php source, and I think the program hangs on line 80:

PHP Code:

    print @exec('renice 18 '.getmypid()).$br;

I also tried emptying the tables etc. as you described, but the database stays empty and spider.php doesn't work.

Thanks a lot.

Rolandks 09-19-2003 03:29 AM

Hmm,
the command line is somewhat difficult. It also took me many attempts until it worked.
I think it should be changed in one of the next versions to work better with all operating systems, because it is important that it works well when you want to index frequently changing content sites daily with cron jobs or Windows tasks.

Read this:
http://www.phpdig.net/showthread.php?s=&threadid=56

-Roland-

Skop 09-19-2003 05:20 AM

Quote:

[...]
Read this:
http://www.phpdig.net/showthread.php?s=&threadid=56

-Roland-

I read this, but unfortunately it doesn't help me ;) Now I'll try to hack the code a little... If you have other ideas, I'm here! :D

Thanks a lot
JS

Charter 09-19-2003 07:06 AM

Hi. It looks like the renice command is working as 848: old priority 0, new priority 18 appears in the log file, but you could try commenting that line out. The renice command is for setting the priority of the spidering process.

Are there any files besides keepalive.txt in the text_content dir?

Skop 09-19-2003 07:37 AM

Quote:

Originally posted by Charter
[...]
The renice command is for setting the priority of the spidering process.

Are there any files besides keepalive.txt in the text_content dir?

I commented out this line but, as I wrote, nothing happened.

The text_content dir is empty (except keepalive.txt, 2 bytes).

For now I have this workaround: I use lynx to call the update page:


PHP Code:

lynx -dump -auth=yourlogin:yourpwd '[url]/pathtosearch/admin/update.php?site_id=XXX&exp=1' >/tmp/uotput 2>/tmp/erroroutput 

this works :cool:
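
For running this from cron against several sites, the lynx call above could be wrapped along the lines of the sketch below. The function names update_site and update_sites and the FETCH variable are hypothetical, and the base URL, credentials and site ids are placeholders to fill in. Only -dump and -auth, as used in the command above, are passed to lynx.

```shell
# Sketch only: names and arguments here are illustrative, not part of PhpDig.
# FETCH can be overridden with another command name for a dry run.
FETCH=${FETCH:-lynx}

# Request update.php for one site, mirroring the lynx command above.
update_site() {
    base_url=$1   # address that serves the PhpDig search directory
    auth=$2       # login:password for the admin area
    site_id=$3    # site_id value as passed to update.php
    "$FETCH" -dump -auth="$auth" "$base_url/admin/update.php?site_id=$site_id&exp=1"
}

# Update several sites in turn; a failure is reported on stderr
# and the loop continues with the next site_id.
update_sites() {
    base_url=$1
    auth=$2
    shift 2
    for id in "$@"; do
        update_site "$base_url" "$auth" "$id" ||
            echo "update failed for site_id=$id" >&2
    done
}
```

Keep in mind that credentials passed on the command line can be seen by other users of the machine in the process list while lynx is running.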

JS

Charter 09-19-2003 07:50 AM

Quote:

Originally posted by Skop
PHP Code:

lynx -dump -auth=yourlogin:yourpwd '[url]/pathtosearch/admin/update.php?site_id=XXX&exp=1' >/tmp/uotput 2>/tmp/erroroutput 


Great! Glad it's working. Interesting that lynx will work but php won't. Are you able to do the following from the command prompt?
Code:

php -f test.php

where test.php contains the following:
PHP Code:

<?php
echo "test";
?>


Skop 10-14-2003 02:23 AM

Quote:

Originally posted by Charter
Code:

php -f test.php

Hi, sorry for the late answer. I tried what you suggested, and it works. I think the problem is in the spider.php file and how it gets its input from STDIN.

