#1 |
Green Mole
Join Date: Jan 2004
Posts: 3
|
Exclude links with certain URL variables
Hi there,
Every page on my website has a link to a printer-friendly version of the same page, done with [thispage.php?print=y]. I need to exclude these links from the spidering process, but without excluding other URL variables such as [news.php?story=11]. Basically, I need a way to tell the spidering process not to follow links containing a specific string (in this case '?print=y'). I can't find an existing feature for this, so can someone point me to the right function and how to modify it? Thanks
#2 |
Green Mole
Join Date: Jan 2004
Posts: 2
|
Quote:
$content['file'] = preg_replace("'print=y'si", "", $content['file']);
(line before: $url = eregi_replace("([a-z0-9])[/]+... )
This strips "print=y" from the URL. The bad thing is that you get duplicates when searching (the pages without "print" and the pages with "print"), because only the URL is filtered. Let's keep looking...
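For anyone trying the stripping route, here is a minimal sketch of removing a single query variable without leaving a dangling "?" or "&" behind. The helper name stripQueryParam is hypothetical and not part of PhpDig; it only cleans the URL string and does nothing about the duplicate index entries mentioned above.

```php
<?php
// Hypothetical helper, NOT a PhpDig function: drops one query variable from a URL.
function stripQueryParam($url, $param)
{
    // Split into the part before the first "?" and the query string.
    $parts = explode('?', $url, 2);
    if (count($parts) < 2) {
        return $url; // no query string, nothing to strip
    }
    parse_str($parts[1], $query);      // query string -> associative array
    unset($query[$param]);             // remove the unwanted variable
    $rest = http_build_query($query);  // rebuild what is left
    return $rest === '' ? $parts[0] : $parts[0] . '?' . $rest;
}

echo stripQueryParam('thispage.php?print=y', 'print'), "\n";
echo stripQueryParam('news.php?story=11&print=y', 'print'), "\n";
```

The first call should give thispage.php and the second news.php?story=11, so other variables such as story survive.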
#3 |
Green Mole
Join Date: Jan 2004
Posts: 3
|
Thanks, that's a useful start.
I'm looking at function phpdigExplore in robot_functions.php, but I can't figure it out yet.
#4 |
Green Mole
Join Date: Jan 2004
Posts: 3
|
Got it!
In robot_functions.php, I've added a test at the end of function phpdigDetectDir. This is how I've done it for the test I need, showing lines 537 onwards. My addition is at line 543:

//test the exclude with robots.txt
if (phpdigReadRobots($exclude,$link['path'].$link['file']) == 1
    || isset($exclude['@ALL@'])
   ) {
    $link['ok'] = 0;
}
//exclude if specific variable set
if (strpos($link['file'],'print=y') !== false) {
    $link['ok'] = 0;
}
//print "<pre>"; print_r($link); print "</pre>\n";
return $link;
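If more than one string needs excluding, the same test can be pulled out into a small standalone check. This is only a sketch: linkIsExcluded and $excludeStrings are hypothetical names, not part of PhpDig, and the function would still have to be called from phpdigDetectDir by hand.

```php
<?php
// Hypothetical list: any link whose file part contains one of these is skipped.
$excludeStrings = array('print=y');

// Hypothetical helper, NOT a PhpDig function.
function linkIsExcluded($file, $excludeStrings)
{
    foreach ($excludeStrings as $needle) {
        // strpos() returns 0 for a match at the very start of the string,
        // so compare against false instead of relying on truthiness.
        if (strpos($file, $needle) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(linkIsExcluded('thispage.php?print=y', $excludeStrings)); // bool(true)
var_dump(linkIsExcluded('news.php?story=11', $excludeStrings));    // bool(false)
```

Note that this is a plain substring match, so 'print=y' would also catch a link containing print=yes.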
#5 |
Green Mole
Join Date: Jan 2004
Posts: 2
|
I got it too... somehow
I edited "search_function.php" a bit. It is a bit messy, so I won't post it here. Anyway, it works pretty well, though not perfectly. This feature would be a nice addition to future versions. I have different language versions, so I don't want to remove search results permanently.
#6 |
Green Mole
Join Date: Feb 2004
Posts: 1
|
Use config.php constant?
Found this in the config.php file:
PHP Code:
I have the same problem, but I have not tested this possible solution yet. I will be back with the result. // JoNtE