PhpDig.net

links found : 0 w/ example (http://www.phpdig.net/forum/showthread.php?t=987)

squatty 06-09-2004 04:58 AM

My apologies for starting a new thread. I know this topic has been covered many times, but since I'm giving a real-world example I thought it best to keep this discussion separate.

Now…for the problem…

I’m trying to index a public web site my company is affiliated with. I’m ONLY interested in indexing the http://www.ophsource.org/periodicals/ophtha portion of the site.

The site does use robots.txt; however, the section I’m interested in indexing is NOT disallowed. The home page (http://www.ophsource.org) also includes links to /periodicals/ophtha. I’ve tried setting LIMIT_DAYS to 0, setting the index depth to 10, and emptying the database (all suggestions from other threads). However, I consistently get "links found : 0".

My question is twofold:

1) Can anyone tell me why I can’t index the site and/or help me find a workaround?

2) Can anyone tell me how to index ONLY the /periodicals/ophtha subdirectory of the site?


SITE : http://www.ophsource.org/
Exclude paths :
- article/
- medline/
- search/
- user/
- claim/
- ecommerce/
- retrieve/
- webfiles/

Starting to index web pages...
No link in temporary table

links found : 0
...Was recently indexed

Optimizing tables...

Indexing complete !
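
For anyone who wants to double-check the robots.txt claim independently of PhpDig, here is a minimal Python sketch. It assumes the "Exclude paths" printed above correspond one-to-one to Disallow rules under "User-agent: *"; the real robots.txt on the site may differ, and the helper names build_robots_txt and is_allowed are made up for this example. It uses only the standard library's urllib.robotparser and does not touch the network.

```python
from urllib.robotparser import RobotFileParser

# Assumption: these are the "Exclude paths" from the indexing log,
# treated here as Disallow rules for every user agent.
EXCLUDED = ["article/", "medline/", "search/", "user/",
            "claim/", "ecommerce/", "retrieve/", "webfiles/"]


def build_robots_txt(paths):
    """Rebuild a hypothetical robots.txt from the excluded paths."""
    lines = ["User-agent: *"]
    lines += ["Disallow: /" + p for p in paths]
    return "\n".join(lines)


def is_allowed(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots = build_robots_txt(EXCLUDED)
# The section of interest versus one of the excluded directories.
print(is_allowed(robots, "http://www.ophsource.org/periodicals/ophtha"))
print(is_allowed(robots, "http://www.ophsource.org/search/quick"))
```

If the first check comes back allowed, robots.txt is probably not what is stopping the spider, and the cause is more likely in how the server answers the spider's requests.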

Charter 06-19-2004 05:33 AM

Hi. First, uncomment the line //print $answer."<br>\n"; in robot_functions.php, then index again and see what is printed onscreen. Second, PhpDig currently spiders all allowed links, but once the spider is done you can exclude specific directories from further indexing in the admin panel.

squatty 06-19-2004 10:50 AM

Thanks for the response! I tried what you suggested and still cannot index the site. This is what I saw on the indexing page...

Server: Microsoft-IIS/5.0
Date: Sat, 19 Jun 2004 18:48:46 GMT
Content-Type: text/plain
Accept-Ranges: bytes
Last-Modified: Tue, 04 Nov 2003 14:33:30 GMT
ETag: "0d9c79de0a2c31:816"
Content-Length: 179



HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 19 Jun 2004 18:48:48 GMT
Content-Type: text/plain
Accept-Ranges: bytes
Last-Modified: Tue, 04 Nov 2003 14:33:30 GMT
ETag: "0d9c79de0a2c31:819"
Content-Length: 179


--------------------------------------------------------------------------------
SITE : http://www.ophsource.org/
Exclude paths :
- article/
- medline/
- search/
- user/
- claim/
- ecommerce/
- retrieve/
- webfiles/


No link in temporary table

--------------------------------------------------------------------------------

links found : 0
...Was recently indexed

Optimizing tables...

Charter 06-21-2004 05:00 AM

Hi. Check that the server for the site allows HEAD requests, because a HEAD request sent to the site gets no response before the connection is closed:

> telnet www.ophsource.org 80
Trying 129.35.xx.xxx...
Connected to www.ophsource.org.
Escape character is '^]'.
HEAD / HTTP/1.0

Connection closed by foreign host.
>
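
The same HEAD check can be scripted instead of typed into telnet. The following is a rough Python sketch, not anything PhpDig itself does: the function names are invented for illustration, and the request mirrors the telnet session exactly (HTTP/1.0, no Host header). A result of None means the server closed the connection without sending a status line, which is what the session above shows.

```python
import socket


def build_head_request(path="/"):
    """Build the same bare HEAD request that was typed into telnet."""
    return ("HEAD " + path + " HTTP/1.0\r\n\r\n").encode("ascii")


def parse_status(raw):
    """Return the HTTP status code, or None if no valid status line arrived."""
    if not raw:
        return None  # connection closed with no reply at all
    first_line = raw.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = first_line.split()
    if len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1].isdigit():
        return int(parts[1])
    return None


def head_status(host, port=80, timeout=10.0):
    """Send a HEAD request to host:port and return the parsed status code."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_head_request())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return parse_status(b"".join(chunks))
```

Calling head_status("www.ophsource.org") would run the check against the site from this thread. A server that resets the connection rather than closing it cleanly may raise an exception from recv instead of returning None, so treat either outcome as "HEAD not answered".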

Also check that allow_url_fopen is On, either in the php.ini file or in the phpinfo() output.

