
Indexing local phpbb forum


Siava
05-14-2004, 12:38 AM
Hello to all!

I have a problem indexing my local phpBB forum.
I created a robots.txt and included in it
Disallow: /forum/posting.php
and other php files, BUT during spidering I see that these files are still indexed, and pages are generated for them!
robots.txt uses Mac encoding, the robot_function file is updated, and robots.txt is located in the root directory of the site.
:bang:
(sorry for my poor English)

Pulsar-san
05-14-2004, 03:23 AM
Your robots.txt is not in the root of your website: when I type
hrrp://siava.spb.ru/robots.txt I get a 404 page instead of the contents of the robots.txt file.

vinyl-junkie
05-14-2004, 03:24 AM
Your robots.txt needs to look like this:

User-agent: *
Disallow: /forum/

Hope this helps. :)

Siava
05-14-2004, 04:21 AM
Pulsar-san
No, the topic is about a LOCAL forum installed on my computer on a local network ;)

vinyl-junkie
I know, but I need to index some files in the forum folder (viewtopic.php, viewforum.php and so on).

My robots.txt excludes the folders and files that I don't want indexed:

User-agent: *
Disallow: /forum/admin
... other folders ... (I'm skipping the other folders)
Disallow: /forum/posting.php
... other files ....

With the disallowed folders all is OK, but these "other files" are not disallowed and they are being indexed :( Why??

During spidering I see:

Exclude paths :
- forum/admin
........................ (I'm skipping the other folders)
- forum/posting\.php
..................... (and other php files with \.)

What is posting\.php ??? Why with \. ??

Pulsar-san
05-14-2004, 05:20 AM
Oops! Sorry, I missed the "local".

If it is "\.", I'd say that the dot is escaped; it is just a missing stripslashes().
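
The "\." in the exclude list is what a regular-expression escape of the path would produce. Here is a minimal Python sketch, assuming (the thread does not confirm it) that the spider turns each Disallow path into a regex pattern before matching. The helper name to_exclude_pattern is hypothetical and is not part of the spider.

```python
import re


def to_exclude_pattern(disallow_path: str) -> str:
    """Hypothetical helper: escape a robots.txt path for use inside a regex."""
    # Drop the leading slash, as in the "Exclude paths" listing, then escape
    # regex metacharacters so that "." matches only a literal dot.
    # Note: Python versions before 3.7 would also escape the "/".
    return re.escape(disallow_path.lstrip("/"))


pattern = to_exclude_pattern("/forum/posting.php")
print(pattern)

# The escaped pattern still matches the real file name, with or without params.
matched = re.match(pattern, "forum/posting.php?mode=newtopic&f=7") is not None
print(matched)
```

If that assumption holds, the backslash is only a display detail and should not by itself stop the rule from matching.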

Now, as for why it is indexed: from what I know,
/posting.php?mode=newtopic&f=7
and
/posting.php?mode=newtopic&f=5
are considered as 2 different pages by spiders, so
Disallow: /forum/posting.php
will only forbid access to specifically posting.php, without any params.

I'm not sure about that, but this is what I have understood.
I have not seen any way to use wildcards in file names in robots.txt,
except, perhaps, adding posting.php to the
// regular expression to ban useless external links in index
define('BANNED','^ad\.|banner|doubleclick');
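
On the question of whether Disallow: /forum/posting.php also covers URLs with parameters: in the robots.txt convention, Disallow values are matched as path prefixes. The sketch below uses Python's standard urllib.robotparser to show that behaviour. It shows how that parser treats the rule, not necessarily how the spider discussed here does, and the ?t=1 parameter is only an illustrative value.

```python
from urllib.robotparser import RobotFileParser

# Rules taken from the thread (the elided folders and files are left out).
robots_txt = """\
User-agent: *
Disallow: /forum/admin
Disallow: /forum/posting.php
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

urls = (
    "/forum/posting.php",
    "/forum/posting.php?mode=newtopic&f=7",
    "/forum/viewtopic.php?t=1",  # illustrative query string
)

# can_fetch() returns False when a Disallow prefix matches the URL.
results = {url: parser.can_fetch("*", url) for url in urls}
for url, allowed in results.items():
    print(url, "allowed" if allowed else "blocked")
```

With a prefix-matching parser, both posting.php URLs are blocked and viewtopic.php stays allowed, so no wildcard is needed for the parameters.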

Siava
05-14-2004, 07:47 AM
Pulsar-san
Disallow: /forum/posting.php will only forbid access to specifically posting.php, without any params. :(

Except, perhaps, adding posting.php in the
define('BANNED','^ad\.|banner|doubleclick|posting.php');
Yes, I added this file to the config and made the BANNED rule, but this file is still successfully indexed :D
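
A quick check of the extended pattern, sketched in Python under the assumption that the spider searches for the pattern anywhere in each link (the thread does not say how BANNED is applied). The function name is_banned and the ?t=1 parameter are hypothetical.

```python
import re

# Pattern copied from the define() above, with posting.php appended.
BANNED = r"^ad\.|banner|doubleclick|posting.php"
banned_re = re.compile(BANNED)


def is_banned(link: str) -> bool:
    """Hypothetical check: True if the pattern matches anywhere in the link."""
    return banned_re.search(link) is not None


print(is_banned("posting.php?mode=newtopic&f=7"))
print(is_banned("/forum/posting.php?mode=newtopic&f=5"))
print(is_banned("viewtopic.php?t=1"))

# The unescaped "." matches any character, so "posting\.php" would be stricter.
print(is_banned("postingXphp"))
```

Under that assumption the pattern does match the posting.php links, so the regex itself is probably not the cause. The config comment describes BANNED as a filter for external links, so one possibility is that it is not applied to the forum's own pages. Nothing in the thread confirms this.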