PhpDig.net


Siava 05-14-2004 12:38 AM

Indexing local phpbb forum
 
Hello to all!

I have a problem indexing my local phpBB forum.
I created a robots.txt file and added
Disallow: /forum/posting.php
and similar lines for other PHP files, BUT during spidering I see that these files are still indexed and result pages are generated for them!
The robots.txt file uses Mac encoding, the robot_function file is updated, and robots.txt is located in the root directory of the site.
:bang:
(Sorry for my poor English.)

Pulsar-san 05-14-2004 03:23 AM

Your robots.txt is not in the root of your website. When I request
hrrp://siava.spb.ru/robots.txt I get a 404 page instead of the contents of the robots.txt file.

vinyl-junkie 05-14-2004 03:24 AM

Your robots.txt needs to look like this:
PHP Code:

User-agent: *
Disallow: /forum

Hope this helps. :)

Siava 05-14-2004 04:21 AM

Pulsar-san
No, this topic is about a LOCAL forum installed on my computer on a local network ;)

vinyl-junkie
I know, but I need to index some files in the forum folder (viewtopic.php, viewforum.php, and so on).

My robots.txt excludes the folders and files that I don't want indexed:

User-agent: *
Disallow: /forum/admin
... other folders ... (I'm skipping the other folders)
Disallow: /forum/posting.php
... other files ....

Disallowing folders works fine, but these "other files" are not disallowed and are still being indexed :( Why??

During spidering I see:

Exclude paths :
- forum/admin
........................(I'm skipping the other folders)
- forum/posting\.php
.....................(and other PHP files with \.)

What is posting\.php??? Why the \.??

Pulsar-san 05-14-2004 05:20 AM

Oops! Sorry, I missed the "local".

If it is "\.", I'd say the dot is escaped; it is just a missing stripslashes().
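
The backslash in the "Exclude paths" output is what regular-expression escaping of a literal dot looks like. The sketch below is in Python rather than PhpDig's PHP, and it assumes (the thread does not confirm this) that the spider escapes each excluded path before using it as a pattern. The names excluded_path and as_pattern are illustrative only.

```python
import re

# Path from the robots.txt line "Disallow: /forum/posting.php",
# written without the leading slash as in the spider's output.
excluded_path = "forum/posting.php"

# Escape regex metacharacters so the dot only matches a literal dot.
# Assumption: PhpDig does something comparable before listing the path.
as_pattern = re.escape(excluded_path)
print(as_pattern)

# The escaped pattern matches the real file name...
print(bool(re.match(as_pattern, "forum/posting.php")))   # True
# ...but not a name where another character replaces the dot.
print(bool(re.match(as_pattern, "forum/postingXphp")))   # False
```

If that assumption holds, the "\." is only a display detail and does not by itself explain why posting.php is still indexed.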

Now, as for why it is indexed: from what I know,
/posting.php?mode=newtopic&f=7
and
/posting.php?mode=newtopic&f=5
are considered two different pages by spiders, so
Disallow: /forum/posting.php
will only forbid access to posting.php specifically, without any parameters.

I'm not sure about that, but this is what I have understood.
I have not seen any way to use wildcards in file names in robots.txt,
except, perhaps, adding posting.php to the
PHP Code:

// regular expression to ban useless external links in index
define('BANNED','^ad\.|banner|doubleclick'); 
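
For comparison, the robots exclusion convention treats a Disallow value as a path prefix, so a parser that follows it would also block posting.php when a query string is attached. A minimal sketch in Python (not PhpDig's own PHP code), using the standard library's urllib.robotparser. The host name localhost and the sample URLs are placeholders, and whether PhpDig's parser behaves the same way is not established in this thread.

```python
from urllib.robotparser import RobotFileParser

# Rules modelled on the robots.txt quoted in the thread.
robots_txt = """\
User-agent: *
Disallow: /forum/admin
Disallow: /forum/posting.php
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

base = "http://localhost"  # placeholder host for a local install
paths = [
    "/forum/posting.php",
    "/forum/posting.php?mode=newtopic&f=7",
    "/forum/posting.php?mode=newtopic&f=5",
    "/forum/viewtopic.php?t=1",
]

for path in paths:
    # can_fetch() compares the URL path (plus query) against each rule prefix.
    verdict = "allowed" if parser.can_fetch("*", base + path) else "blocked"
    print(path, verdict)
```

With this parser all three posting.php URLs are blocked and viewtopic.php stays allowed, so a spider that still indexes posting.php is probably not reading or applying the rule at all.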


Siava 05-14-2004 07:47 AM

Pulsar-san
Quote:

Disallow: /forum/posting.php will only forbid access to posting.php specifically, without any parameters.
:(

Quote:

Except, perhaps, adding posting.php in the
PHP Code:

define('BANNED','^ad\.|banner|doubleclick|posting.php'); 

Yes, I added this file to the BANNED rule in the config, but it is still being indexed :D
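
To see what the extended BANNED pattern would match on its own, here is a small check in Python (not PhpDig's PHP). It assumes the pattern is applied as an unanchored search against each link, which the thread does not confirm, and the sample links are made up. Note also that the config comment describes BANNED as a filter for external links, so it may never be tested against the forum's own pages.

```python
import re

# Pattern copied from the define() above.
BANNED = r"^ad\.|banner|doubleclick|posting.php"

# Hypothetical links, for illustration only.
links = [
    "/forum/posting.php?mode=newtopic&f=7",
    "/forum/viewtopic.php?t=1",
]

for link in links:
    # search() looks for any alternative anywhere in the link.
    print(link, bool(re.search(BANNED, link)))

# The unescaped dot matches any character, not just ".".
print(bool(re.search(BANNED, "/forum/postingXphp")))   # True

# Escaping the dot restricts the match to the literal file name.
STRICT = r"^ad\.|banner|doubleclick|posting\.php"
print(bool(re.search(STRICT, "/forum/postingXphp")))   # False
```

So the pattern itself does match posting.php links; if the file is still indexed, the likelier cause is where the spider applies BANNED, not how the expression is written.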


All times are GMT -8.