PhpDig.net

06-09-2004, 04:58 AM   #1
squatty
Green Mole
 
Join Date: Jun 2004
Posts: 2
links found : 0 w/ example

My apologies for starting a new thread. I know this topic has been covered many times over. However, since I'm giving a real-world example, I thought it best to isolate this discussion.

Now…for the problem…

I’m trying to index a public web site my company is affiliated with. I’m interested in indexing ONLY the http://www.ophsource.org/periodicals/ophtha portion of the site.

The site does use robots.txt; however, the section I’m interested in indexing is NOT disallowed. The home page (http://www.ophsource.org) also includes links to /periodicals/ophtha. I’ve tried setting 'LIMIT_DAYS' to 0, setting the index depth to 10, and emptying the database (all suggestions from other threads). However, I consistently get "links found : 0".

My question is twofold:

1) Can anyone tell me why I can’t index the site and/or help me find a workaround?

2) Can anyone tell me how to index ONLY the /periodicals/ophtha subdirectory of the site?


SITE : http://www.ophsource.org/
Exclude paths :
- article/
- medline/
- search/
- user/
- claim/
- ecommerce/
- retrieve/
- webfiles/

Starting to index web pages...
No link in temporary table

links found : 0
...Was recently indexed

Optimizing tables...

Indexing complete !
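
The claim that /periodicals/ophtha is not disallowed can be checked independently of PhpDig. The following is a minimal Python sketch, not part of PhpDig. It assumes the site's robots.txt consists of one Disallow rule per exclude path listed in the log above, applied to all user agents; the real file is not quoted in this thread. The helper name is_allowed is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: reconstructed from the "Exclude paths" listing in the log above;
# the actual robots.txt of the site is not quoted in this thread.
EXCLUDED = ["article", "medline", "search", "user",
            "claim", "ecommerce", "retrieve", "webfiles"]
ASSUMED_ROBOTS_TXT = "User-agent: *\n" + "".join(
    f"Disallow: /{path}/\n" for path in EXCLUDED
)


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if robots_txt permits `agent` to fetch `url` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(agent, url)


# The section of interest versus one of the excluded paths.
print(is_allowed(ASSUMED_ROBOTS_TXT, "http://www.ophsource.org/periodicals/ophtha"))
print(is_allowed(ASSUMED_ROBOTS_TXT, "http://www.ophsource.org/search/results"))
```

If the first check passes against the real robots.txt as well, robots.txt is probably not what is producing "links found : 0", and the cause lies elsewhere.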
06-19-2004, 05:33 AM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. First, uncomment the line //print $answer."<br>\n"; in robot_functions.php, then run the indexer again and see what is printed onscreen. Second, PhpDig currently spiders every link it is allowed to follow, but once the spider is done, you can exclude specific directories from further indexing in the admin panel.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
06-19-2004, 10:50 AM   #3
squatty
Green Mole
 
Join Date: Jun 2004
Posts: 2
Thanks for the response! I tried what you suggested and still cannot index the site. This is what I saw on the indexing page:

Server: Microsoft-IIS/5.0
Date: Sat, 19 Jun 2004 18:48:46 GMT
Content-Type: text/plain
Accept-Ranges: bytes
Last-Modified: Tue, 04 Nov 2003 14:33:30 GMT
ETag: "0d9c79de0a2c31:816"
Content-Length: 179



HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Sat, 19 Jun 2004 18:48:48 GMT
Content-Type: text/plain
Accept-Ranges: bytes
Last-Modified: Tue, 04 Nov 2003 14:33:30 GMT
ETag: "0d9c79de0a2c31:819"
Content-Length: 179


--------------------------------------------------------------------------------
SITE : http://www.ophsource.org/
Exclude paths :
- article/
- medline/
- search/
- user/
- claim/
- ecommerce/
- retrieve/
- webfiles/


No link in temporary table

--------------------------------------------------------------------------------

links found : 0
...Was recently indexed

Optimizing tables...
06-21-2004, 05:00 AM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Check that HEAD requests are allowed on the server for the site, because a HEAD request sent to the site gives the following:

> telnet www.ophsource.org 80
Trying 129.35.xx.xxx...
Connected to www.ophsource.org.
Escape character is '^]'.
HEAD / HTTP/1.0

Connection closed by foreign host.
>

Also check that allow_url_fopen is On, either in the php.ini file or in the phpinfo() output.
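
The telnet test above can be repeated from a script. This is a minimal Python sketch, not PhpDig code, assuming plain HTTP on port 80. The function names build_head_request, send_head, and describe are hypothetical. By default it sends the same bare request as the telnet session, with no Host header; a Host header can optionally be added, since some servers expect one.

```python
import socket
from typing import Optional


def build_head_request(path: str = "/", host: Optional[str] = None) -> bytes:
    """Build a raw HTTP/1.0 HEAD request; host=None matches the telnet session."""
    lines = [f"HEAD {path} HTTP/1.0"]
    if host is not None:
        lines.append(f"Host: {host}")  # optional, not sent in the telnet session
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_head(server: str, port: int = 80, timeout: float = 10.0,
              send_host_header: bool = False) -> bytes:
    """Send the HEAD request and return everything the server sends back."""
    request = build_head_request("/", server if send_host_header else None)
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(request)
        try:
            while True:
                data = sock.recv(4096)
                if not data:  # server closed the connection
                    break
                chunks.append(data)
        except ConnectionResetError:
            pass  # treat a reset like a close without further data
    return b"".join(chunks)


def describe(response: bytes) -> str:
    """Summarise a response: its status line, or a note that nothing came back."""
    if not response:
        return "connection closed without a response"
    return response.split(b"\r\n", 1)[0].decode("latin-1")


# Example (performs a real network request, so it is left commented out):
# print(describe(send_head("www.ophsource.org")))
```

An empty response corresponds to the "Connection closed by foreign host." result shown above. A status line such as a 200 or 405 would instead indicate that the server answers HEAD requests in some form.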