Wildcard for banned external links?
I was looking over this part in the config and wondered if there is a way to use a wildcard such as banner* so it works for banners also or other plurals.
Code:
// regular expression to ban useless external links in index
What does the ^ represent? What does the \. represent?
^ anchors the match to the start of the string: the characters following it must appear at the very beginning.
\. escapes the period, so it matches a literal "." rather than any character.
Putting this regular expression back together and interpreting it in English, it means: an expression that begins with the characters "ad." (without the quotes), followed by one of the following words:
banner
banners
doubleclick
links
forum
affiliates
Expressed another way, it's looking for one of the following strings of characters:
ad.banner
ad.banners
ad.doubleclick
ad.links
ad.forum
ad.affiliates
Hope this helps.
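The two pieces asked about, the ^ anchor and the escaped period, can be tried on their own. The following is only a sketch: it uses preg_match in place of the eregi call that appears in this thread (the ^ and \. syntax means the same in both), and the function names are made up for illustration.

```php
<?php
// Returns true when $host begins with the literal characters "ad."
// preg_match stands in for the eregi call used in the thread.
function starts_with_ad_dot(string $host): bool
{
    // ^  anchors the match at the start of the string
    // \. matches a literal period (an unescaped . matches any character)
    return preg_match('/^ad\./i', $host) === 1;
}

// Same pattern with the period left unescaped, for comparison.
function starts_with_ad_any(string $host): bool
{
    return preg_match('/^ad./i', $host) === 1;
}
```

With hypothetical host names, starts_with_ad_dot('ad.example.com') is true, while 'bad.example.com' fails because "ad." is not at the start, and 'adserver.example.com' fails because no period follows "ad". The unescaped version accepts 'adserver.example.com', since the bare . matches the "s".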
Thanks vinyl-junkie,
You explained that very well. I want to ban links like "links", as in a links page or links/index.html, or forum directories. Will I have to add a new line with a brand-new define and then simply imitate what BANNED is doing? Line 1264 in robot_functions.php is the only reference to BANNED that I found:
Code:
if ($regs[5] && $regs[5] != $localdomain && !eregi(BANNED,$regs[5]) && ereg('[a-z]+',$regs[5])) {
Would line 1264 in robot_functions.php then be written this way?
Code:
if ($regs[5] && $regs[5] != $localdomain && !eregi(BANNED,$regs[5]) && !eregi(BANNED2,$regs[5]) && ereg('[a-z]+',$regs[5])) {
A little info: 800 sites at level 1 depth has me at a 20 MB database size. Time to downsize and then decide whether to get more database space if needed.
Thanks again for your reply.
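A second define would work, but because | separates alternatives, appending more words to one pattern should give the same result. The sketch below mirrors the shape of the condition on line 1264 with preg_match standing in for eregi and ereg, which were removed in PHP 7. The values of BANNED and BANNED2 and both function names are assumptions for illustration, not the contents of the real config.

```php
<?php
// Assumption: illustrative values, not the ones shipped in the config.
define('BANNED', '^ad\.|banner|doubleclick');
// Hypothetical second list, as proposed in the post above.
define('BANNED2', 'links|forum');

// Mirrors the quoted condition. '#' is used as the delimiter,
// so the patterns themselves must not contain '#'.
function is_indexable_external_host(string $host, string $localdomain): bool
{
    return $host
        && $host != $localdomain
        && preg_match('#' . BANNED . '#i', $host) !== 1
        && preg_match('#' . BANNED2 . '#i', $host) !== 1
        && preg_match('#[a-z]+#', $host) === 1;
}

// One merged alternation bans the same hosts as the two separate checks.
function is_banned_merged(string $host): bool
{
    return preg_match('#' . BANNED . '|' . BANNED2 . '#i', $host) === 1;
}
```

Judging from the comparison with $localdomain, $regs[5] looks like the host part of the link. If that is so, these patterns are tested against host names rather than against paths such as links/index.html, which is worth verifying before counting on them to skip links directories.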
I think I gave you some incorrect information about just what BANNED means. I've been struggling to learn regular expressions myself. The | separates complete alternatives, so the ^ applies only to the first one. BANNED is looking for any one of the following:
"ad." (without the quotes) at the beginning of the string, or
"banner" (without the quotes) anywhere in the string, or
"banners" (without the quotes) anywhere in the string, or
"doubleclick" (without the quotes) anywhere in the string, or
"links" (without the quotes) anywhere in the string, or
"forum" (without the quotes) anywhere in the string, or
"affiliates" (without the quotes) anywhere in the string
Just wanted to set that straight.
That was the way I was reading it. Thank you so much for clarifying it for me. I went to php.net and saw an example of what you are now saying it means.
I will start crawling all over again and see if it ignores links directories now. I'm trying very hard to reduce the size of the MySQL database by getting rid of non-informative links.
Thanks again.
P.S. I don't mind getting a response even if only some of the information is correct. Any response to a question is much appreciated. :) Thanks for being here.
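The corrected reading of BANNED given earlier in the thread can be checked with a short script, which also bears on the original wildcard question. This is a sketch under assumptions: the pattern is rebuilt from the alternatives listed in the thread rather than copied from the config, preg_match is used in place of eregi, and the constant and function names are invented for the example.

```php
<?php
// Assumption: rebuilt from the alternatives listed in the thread,
// not copied from the real config file.
const BANNED_AS_LISTED = '/^ad\.|banner|banners|doubleclick|links|forum|affiliates/i';
// The first reading in the thread would need explicit parentheses:
const BANNED_GROUPED = '/^ad\.(banner|banners|doubleclick|links|forum|affiliates)/i';

// True when $pattern matches anywhere in $host.
function matches(string $pattern, string $host): bool
{
    return preg_match($pattern, $host) === 1;
}
```

With hypothetical hosts, matches(BANNED_AS_LISTED, 'www.mybanners.example.com') is true because "banner" may appear anywhere, whereas BANNED_GROUPED rejects that host and accepts only names such as 'ad.banner.example.com'. Since "banner" is unanchored it already covers "banners", so no wildcard is needed for plurals. In a regular expression, banner* does not mean "banner plus anything": the * applies to the final r only, so it matches "banne" followed by zero or more r's, and matches('/banner*/i', 'banne.example.com') is true.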