spidering links but not their text


haxored
01-13-2004, 06:41 AM
I've got a bunch of pages with a navigation menu on them, as well as some text I don't want indexed.

I want to exclude everything but a <div> tag that contains all the text I want to index. However...

it seems that if I do that, none of the links in my navigation menu are spidered. That is not helpful.

How can I get phpDig to spider my entire site, but only index one section of every page?

Charter
01-14-2004, 10:32 AM
Hi. You might try using PHPDIG_EXCLUDE_COMMENT and PHPDIG_INCLUDE_COMMENT from the config file, each on its own line, to exclude a portion of a page. Depending on your navigation menu, the menu might not be getting indexed because PhpDig excludes certain tags from the index. If this is the case, you might try setting up a simple HTML page with links to your site, and then crawl this page. Once the crawl is done, you can delete the simple page from the admin panel.
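
As a rough sketch of the first suggestion, the page below wraps a navigation menu in the two comments, each on its own line. It assumes the two config constants are set to the comment strings used in the example later in this thread; the menu markup, the id "content", and the URLs are made up for illustration.

```html
<html>
<body>
<!-- phpdigExclude -->
<!-- Hypothetical navigation menu: its text should be left out of the index -->
<ul>
  <li><a href="http://www.example.com/products.html">Products</a></li>
  <li><a href="http://www.example.com/contact.html">Contact</a></li>
</ul>
<!-- phpdigInclude -->
<div id="content">
Only this text should end up in the index.
</div>
</body>
</html>
```

The second suggestion, a temporary page that only lists links, could look roughly like this. Again, the URLs are placeholders; the page would be crawled once and then deleted from the admin panel.

```html
<html>
<body>
<!-- Hypothetical seed page: one plain link per page the spider should reach -->
<a href="http://www.example.com/index.html">Home</a>
<a href="http://www.example.com/products.html">Products</a>
<a href="http://www.example.com/contact.html">Contact</a>
</body>
</html>
```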

MaXius
01-27-2004, 02:23 AM
I have a similar problem.

I have a pulldown menu (div tags), and PhpDig pulls all the menu options into the search results, so most words you would search for are on all pages, in a big ugly mess.

Does PhpDig still crawl the links within a PHPDIG_EXCLUDE_COMMENT area even though it doesn't put these into its database?

Thanks

Charter
01-27-2004, 09:58 AM
Hi. The exclude/include comments are for omitting parts of a page from indexing.

MaXius
01-27-2004, 12:49 PM
Yeah, gathered that... does it still spider the links within an excluded part, though?

Ta

Charter
01-27-2004, 05:01 PM
Hi. In the simple example below, PhpDig works as follows:

<html>
<body>
This text is indexed
<!-- phpdigExclude -->
<a href="http://www.this_link.com/is_followed.html">This text is ignored</a>
This text is also ignored
<!-- phpdigInclude -->
</body>
</html>

To change this behavior, the phpdigExplore and/or phpdigIndexFile functions in robot_functions.php could be modified.