PhpDig.net


DrKamikaze83 02-17-2004 01:13 AM

Spider test for me
 
Can somebody spider http://www.ebay.com and http://www.dovebid.com and show me the result?

Spidering doesn't work for me.

thanks
Alex

DrKamikaze83 02-17-2004 05:36 AM

Hi,

I have tried it on many sites on the Internet and it doesn't work.

As a last resort I tried it on my localhost.
On my localhost everything works wonderfully.

I don't know what the problem is.

Charter 02-17-2004 11:16 AM

Hi. Do you get any errors when trying to index online? Is safe_mode set to on?

DrKamikaze83 02-17-2004 11:08 PM

In phpinfo safe mode is off, but maybe there is something in the script that I have forgotten to change.

Spidering online is not possible. For any site on the Internet it only detects the host, like www.ebay.com, and no pages. It doesn't matter which page I try; it's always the same.

I have read and tried all the threads about safe_mode, but I can't get it to work.
Please help me.


thanks
Alex

Charter 02-17-2004 11:49 PM

Hi. Set up a small three-page demo like the one below, then index the main.html page using a search depth of one, and wait several minutes before touching the browser. What do you see onscreen after several minutes?

http://www.domain/testdir/main.html

<html>
<body>
main page
<a href="page1.html">page1</a>
<a href="page2.html">page2</a>
</body>
</html>

http://www.domain/testdir/page1.html

<html>
<body>
page one
</body>
</html>

http://www.domain/testdir/page2.html

<html>
<body>
page two
</body>
</html>
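
As a rough cross-check of what a depth-one crawl of this demo should discover, the sketch below pulls the links out of main.html with Python's standard library and resolves them against the page URL. It is not PhpDig's own link extraction. The names LinkCollector and links_in are made up for illustration, and http://www.domain/testdir/ is the placeholder address from the demo above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# The main page from the demo above.
MAIN_HTML = """<html>
<body>
main page
<a href="page1.html">page1</a>
<a href="page2.html">page2</a>
</body>
</html>"""


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_in(html, base_url):
    """Return absolute URLs for all links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    # Relative links are resolved against the URL of the page they came from.
    return [urljoin(base_url, href) for href in parser.hrefs]


found = links_in(MAIN_HTML, "http://www.domain/testdir/main.html")
for url in found:
    print(url)
```

If the spider reports fewer links than this for the demo, the problem is more likely in fetching the pages than in the pages themselves.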

DrKamikaze83 02-18-2004 12:39 AM

Hi, I tried it, but it didn't work. I started PhpDig from my localhost.


site (spidering): http://maggiv8.funpic/Test/main.html


result:
Spidering in progress...

--------------------------------------------------------------------------------
SITE : http://maggiv8.funpic.de/
Exclude paths :
- @NONE@
No link in temporary table

--------------------------------------------------------------------------------

links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !
--------------------------------------------------------------------------------
[Back] to admin interface.


What can I try next?


Regards
Alex

Charter 02-18-2004 12:54 AM

Hi. Did you configure the connect.php file that is online and try to crawl http://maggiv8.funpic.de/Test/main.html from online? The database variables in the online connect.php file need to match the online database.

DrKamikaze83 02-18-2004 12:58 AM

I don't understand what I should do now.

I have only uploaded the three test files. Everything else, like the database and PhpDig, is on localhost on my PC.

Regards
Alex

Charter 02-18-2004 01:06 AM

Hi. Perhaps try editing your hosts file like in this thread or in this thread.

DrKamikaze83 02-18-2004 01:28 AM

Hi Charter,

I looked at the two threads.

I think this one is the problem: http://www.phpdig.net/showthread.php?threadid=514
I didn't understand what oscure is describing.

Can you give me an exact description of what I have to do?


Thanks
Alex

DrKamikaze83 02-19-2004 03:53 AM

Hi,

I have now uploaded everything to this site: http://maggiv8.funpic.de/
From that site I spidered www.ebay.com.

Results:

Warning: set_time_limit,getmyuid,getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/spider.php on line 16


Spidering in progress...

Warning: set_time_limit,getmyuid,getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 97

--------------------------------------------------------------------------------
SITE : http://www.ebay.com/
Exclude paths :
- help/confidence/
- help/policies/
- disney/

Warning: getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 655
1:http://www.ebay.com/
(time : 00:00:08)
+ +
level 1...

Warning: getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 655
2:http://www.ebay.com/mainc1.html?ssPageName=VisitorPage
(time : 00:00:20)
+

Warning: getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 655
3:http://www.ebay.com/PayPal/
(time : 00:00:27)

level 2...

Warning: getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 655
4:http://www.ebay.com/es/
(time : 00:00:38)

No link in temporary table

--------------------------------------------------------------------------------

links found : 4
http://www.ebay.com/
http://www.ebay.com/mainc1.html?ssPageName=VisitorPage
http://www.ebay.com/PayPal/
http://www.ebay.com/es/
Optimizing tables...
Indexing complete !




Now I need to get it to work on my PC. Please help me.


Thanks
Alex

Charter 02-19-2004 06:40 AM

Hi. The warnings from your online account are because your host has disabled certain functions. You can remove set_time_limit from line 16 of spider.php and from line 97 of robot_functions.php and remove the commented out line 655 in robot_functions.php.
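
To see at a glance which files and lines the warnings point at, they can be summarized with a small script. This is only a sketch in Python, not part of PhpDig. The name disabled_calls is hypothetical, and the pattern assumes the warnings look exactly like the ones pasted above.

```python
import re

# Three of the warning lines from the spider output above.
WARNINGS = """\
Warning: set_time_limit,getmyuid,getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/spider.php on line 16
Warning: set_time_limit,getmyuid,getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 97
Warning: getmypid,dl,leak() has been disabled for security reasons in /usr/export/www/vhosts/funnetwork/hosting/maggiv8/admin/robot_functions.php on line 655
"""

# Matches the function list, the script path, and the line number.
PATTERN = re.compile(
    r"Warning: (?P<funcs>[\w,]+)\(\) has been disabled for security reasons "
    r"in (?P<path>\S+) on line (?P<line>\d+)"
)


def disabled_calls(text):
    """Return a sorted list of (file name, line number) pairs."""
    places = set()
    for match in PATTERN.finditer(text):
        file_name = match.group("path").rsplit("/", 1)[-1]
        places.add((file_name, int(match.group("line"))))
    return sorted(places)


places = disabled_calls(WARNINGS)
for file_name, line in places:
    print(f"{file_name}: line {line}")
```

Repeated warnings for the same line collapse into one entry, so the list stays short even for a long crawl.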

As to crawling from your PC, perhaps try editing your Hosts file. Just do a search for the Hosts file and then add a line to the file with a text editor, something like the following:
Code:

127.0.0.1          localhost
put.the.ip.here    maggiv8.funpic.de
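
After editing the file, it may help to confirm that the new line reads the way you expect. The sketch below is a minimal Python parser for hosts-file text. The name parse_hosts is hypothetical, and 192.0.2.10 is a documentation-only placeholder address, not the real address of maggiv8.funpic.de.

```python
def parse_hosts(text):
    """Map each host name in hosts-file text to its address.

    This sketch keeps the first entry it sees for a name.
    """
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)
    return table


# Placeholder content; replace 192.0.2.10 with the real IP of the site.
SAMPLE = """\
# sample hosts file
127.0.0.1          localhost
192.0.2.10         maggiv8.funpic.de   # placeholder address
"""

table = parse_hosts(SAMPLE)
print(table.get("localhost"))
print(table.get("maggiv8.funpic.de"))
```

To check a real file, read its contents and pass them to parse_hosts. On Windows NT-family systems the file is usually found at %SystemRoot%\system32\drivers\etc\hosts, though that may differ with your setup.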


DrKamikaze83 02-19-2004 06:56 AM

Is the hosts file one of the

HOST-RESOURCES-(TYPES/MIB)

files, or is it one of the http_vhost files?

Does it matter where in the file the line is written?


Thanks
Alex

Charter 02-19-2004 07:05 AM

Hi. I've seen it as just Hosts, no extension, but I'm not sure with your OS/setup. The first entry should probably be the localhost one, but again it might depend on your OS/setup.

DrKamikaze83 02-19-2004 07:13 AM

What do you mean by OS? Operating system? I have Win2000 and Apache Server 1.3.29!


All times are GMT -8.
