#16
Green Mole
Join Date: Dec 2003
Posts: 16
First, thanks for all your help.
Real Web Host: I can't remove that because I have files and directories that must not be crawled. On the other problem, it is crawling now, but even though that domain has a redirect on it, there are still directories under that domain, and PhpDig is still not looking at them at all. It just skipped that domain and went on to the others. So I guess I may just have to do those subdirectories manually, like I did before.
#17
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. Add the following to the top of the robots.txt file and then make the code change listed in this thread.
Code:
    User-agent: PhpDig
    Disallow:
    # whatever else below this
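As a rough illustration of why the empty `Disallow:` line helps, here is a minimal Python sketch using the standard library's `urllib.robotparser`. It shows how a standards-following parser treats a PhpDig-specific group with an empty `Disallow:` next to a catch-all group. The `/private/` rule and the `example.com` URLs are hypothetical stand-ins for whatever else is in the file, and PhpDig's own robots.txt handling is PHP code that may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the PhpDig group from the post above,
# followed by an assumed catch-all rule for every other crawler.
ROBOTS_TXT = """\
User-agent: PhpDig
Disallow:

User-agent: *
Disallow: /private/
"""


def can_fetch(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("PhpDig", "OtherBot"):
        url = "http://example.com/private/page.html"
        print(agent, can_fetch(agent, url))
```

Under this parser, an empty `Disallow:` in the PhpDig group means nothing is blocked for PhpDig, while every other agent still falls through to the catch-all group and stays blocked from `/private/`.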
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
#18
Green Mole
Join Date: Dec 2003
Posts: 16
Is there any way around the other problem, where that domain is not being read because it gets redirected?
#19
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. Post fifteen on the first page of this thread should deal with the redirect.
#20
Green Mole
Join Date: Dec 2003
Posts: 16
OK, I made that change and put in the main domain. The output is below. Notice that it does not even try to get the subdirectories under the main domain: it gets nothing there and then goes on to number 2, which is the redirect domain name, so it gets nothing from the main domain name.

    SITE : http://www.mansfield-tx.gov/
    Exclude paths :
    - @NONE@
    1:http://www.mansfield-tx.gov/
    (time : 00:00:00)
    Ok for http://www.ci.mansfield.tx.us/ (site_id:49)
    No link in temporary table
    --------------------------------------------------------------------------------
    links found : 1
    http://www.mansfield-tx.gov/
    --------------------------------------------------------------------------------
    SITE : http://www.ci.mansfield.tx.us/
    Exclude paths :
    - @NONE@
    2:http://www.ci.mansfield.tx.us/

It is still running as we speak, over 50 minutes now, and it is on number 41.
#21
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. PhpDig can't index subdirectories or files if there are no links to them. The only thing PhpDig sees at http://www.mansfield-tx.gov/ is the page below, so with the changes made in this thread, the only place PhpDig can go is http://www.ci.mansfield.tx.us, and it then follows the links from there.
Code:
    <html>
    <head>
    <meta http-equiv="refresh" content="0;url=http://www.ci.mansfield.tx.us">
    </head>
    </html>
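To make the point concrete, here is a small Python sketch (not PhpDig's own code, which is PHP) that scans a page the way a simple link-following crawler might: it collects `<a href>` links and the target of a `meta refresh` tag. The class and function names are hypothetical. Run against the page above, it finds no ordinary links at all, only the refresh target.

```python
from html.parser import HTMLParser


class PageLinks(HTMLParser):
    """Collect <a href> links and the meta refresh target from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.refresh_target = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # content looks like "0;url=http://host/": delay, then the target URL
            _, _, rest = (attrs.get("content") or "").partition(";")
            key, _, value = rest.strip().partition("=")
            if key.strip().lower() == "url":
                self.refresh_target = value.strip().strip("'\"")


def scan(html):
    """Return (refresh_target, links) for an HTML string."""
    parser = PageLinks()
    parser.feed(html)
    return parser.refresh_target, parser.links


if __name__ == "__main__":
    page = (
        '<html> <head> <meta http-equiv="refresh" '
        'content="0;url=http://www.ci.mansfield.tx.us"> </head> </html>'
    )
    print(scan(page))
```

Since the link list is empty, any subdirectories under the first domain are invisible to a crawler unless some page links to them.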
#22
Green Mole
Join Date: Dec 2003
Posts: 16
OK, I understand that.
#23
Green Mole
Join Date: Dec 2003
Posts: 16
Just to follow up: I made the temp index.html file and it works; it is getting pages now. For some reason, when it was done with the domain name, it got the same pages again using the IP.
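The post does not say what the temp index.html contained. Assuming it was a plain page of links to the directories that should be crawled, a sketch like the following Python helper could generate one. The function name and the `/parks/` and `/library/` paths are hypothetical placeholders, not paths from the actual site.

```python
from html import escape


def build_link_page(paths, title="Temporary crawl index"):
    """Build a bare HTML page with one link per path, for a crawler to follow."""
    items = "\n".join(
        '<li><a href="{0}">{1}</a></li>'.format(escape(p, quote=True), escape(p))
        for p in paths
    )
    return (
        "<html>\n"
        "<head><title>" + escape(title) + "</title></head>\n"
        "<body>\n<ul>\n" + items + "\n</ul>\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical directories; replace with the ones that need indexing.
    print(build_link_page(["/parks/", "/library/"]))
```

Relative paths such as these keep the crawl on whichever host name the crawler started from.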
#24
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. Are you crawling from the shell or from the browser interface, and with FTP on or off? Is there a link somewhere that uses the IP instead of the domain name?
#25
Green Mole
Join Date: Dec 2003
Posts: 16
From the IE browser, with FTP on.
I figure there are links with the IP in his files. I'm not sure, so we'll let him look at them.
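One way to check for such links, sketched here in Python under the assumption that the hrefs have already been pulled out of the site's files, is to flag every link whose host part is a literal IP address. The function name is hypothetical and `192.0.2.10` is a documentation-only example address, not the site's real IP.

```python
import ipaddress
from urllib.parse import urlparse


def links_using_ip(hrefs):
    """Return the hrefs whose host is a literal IP address rather than a name."""
    found = []
    for href in hrefs:
        host = urlparse(href).hostname
        if host is None:
            continue  # relative link, no host part to check
        try:
            ipaddress.ip_address(host)
        except ValueError:
            continue  # host is a domain name
        found.append(href)
    return found


if __name__ == "__main__":
    sample = [
        "http://www.ci.mansfield.tx.us/index.html",
        "http://192.0.2.10/index.html",  # example address only
        "/relative/page.html",
    ]
    print(links_using_ip(sample))
```

Any link this reports would let a crawler reach the same pages under a second host name, which would explain the duplicate results.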