limit search to contents of HTML tags?
Hi all,
I'm testing PhpDig for the first time, and while this forum is a great resource, I've trawled through all the messages and can't find a solution to my problem, so any help would be greatly appreciated. Say I have a number of HTML files with the same structure, e.g. articles with a title in <h2></h2> tags, a sub-heading in <h3></h3> tags and the main content in <p class="main"></p> paragraphs. Is it possible to set up PhpDig so that, for example, users can query the title text only? Or is there an indexing solution to this issue? Thanks in advance.
Welcome to the forum, beesman. :D
Searching by title within a page is not something phpdig was designed to do. I don't know how much interest there would be in it, but phpdig could probably be modified fairly easily to search by web page titles. That's probably the only change of this kind that Charter would be willing to make. Hope this helps.
Hi, & thanks for the speedy reply :)
Just say no if it's a request too far, but could you point me in the direction of the relevant file and/or chunk of code that I'd have to play with? Many thanks
Look at admin/spider.php. That is probably what you'd need to modify to make the kind of search index you want.
Hope this helps. :) |
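For anyone following along, here is a minimal sketch of the kind of extraction the original question describes: pulling the text out of every <h2> element so it could be stored or indexed separately. The function name extractTitles is hypothetical and not part of PhpDig, and where such a call would fit inside admin/spider.php is left open here.

```php
<?php
// Hypothetical helper, NOT part of PhpDig: collect the text of every
// <h2> element in a page so it could be indexed on its own.
function extractTitles(string $html): array
{
    // i = case-insensitive, s = let "." span newlines, ".*?" = non-greedy
    preg_match_all('~<h2[^>]*>(.*?)</h2>~is', $html, $matches);

    $titles = [];
    foreach ($matches[1] as $inner) {
        // Drop nested markup, decode entities, collapse runs of whitespace
        $text = html_entity_decode(strip_tags($inner), ENT_QUOTES, 'UTF-8');
        $titles[] = trim(preg_replace('/\s+/', ' ', $text));
    }
    return $titles;
}

// Sample page using the structure from the first post
$page = '<h2>First &amp; foremost</h2><h3>Sub</h3><p class="main">Body</p>';
echo implode("\n", extractTitles($page)), "\n";
```

A regex is enough for pages that all share one simple template, as described above. For arbitrary HTML a real parser such as PHP's DOMDocument would be the safer choice.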
@beesman: you were one day ahead of me in posting this question. I will look into it, but I'm a PHP newbie. If you or somebody else writes a solution, I'd like to use it too.
|
Look at the phpdigCleanHtml function in the robot_functions.php file.
|
I placed this in robot_functions.php at line 161
Code:
$text = eregi_replace("<td[^>]*>.*</td>"," ",$text);
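Two things are worth noting about that line. First, eregi_replace() was deprecated in PHP 5.3 and removed in PHP 7, so it will not run on a current PHP install. Second, the ".*" is greedy: if the string being cleaned holds several cells, a single match runs from the first <td> to the last </td> and also swallows any text between them. Below is a sketch of a preg_replace() version with a non-greedy match. The function name stripTableCells is made up for illustration and is not part of PhpDig; nested tables are not handled.

```php
<?php
// Hypothetical stand-in for the eregi_replace() line above; not PhpDig code.
function stripTableCells(string $text): string
{
    // i = case-insensitive (like eregi), s = "." also matches newlines,
    // ".*?" stops at the nearest </td> instead of the last one.
    // Fall back to the unchanged input if preg_replace() reports an error.
    return preg_replace('~<td[^>]*>.*?</td>~is', ' ', $text) ?? $text;
}

$sample = '<td>menu</td>Article text<td class="nav">links</td>';
echo '[', stripTableCells($sample), ']', "\n";
```

With the greedy pattern, the whole sample string would be replaced by a single space, so "Article text" would never reach the index.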
Did you reindex?
|
Yes, emptied the database and reindexed.
|
So you have the following?
PHP Code:
|
Thanks Charter, phpdigExclude and phpdigInclude do it for me! I hadn't seen that feature until now.
:santa: |
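For readers who have not used these markers: as far as I understand it, they are HTML comments of the form <!-- phpdigExclude --> and <!-- phpdigInclude --> placed in the page itself, and the spider skips whatever sits between them. The sketch below only illustrates that effect on a string. It is not PhpDig's own code, the function name dropExcludedRegions is hypothetical, and the exact marker spelling and placement rules should be checked against the PhpDig documentation.

```php
<?php
// Illustration only, NOT PhpDig's implementation: remove everything from an
// exclude marker up to the next include marker (or to the end of the input).
function dropExcludedRegions(string $html): string
{
    $pattern = '~<!--\s*phpdigExclude\s*-->.*?(?:<!--\s*phpdigInclude\s*-->|\z)~s';
    // Fall back to the unchanged input if preg_replace() reports an error.
    return preg_replace($pattern, '', $html) ?? $html;
}

$page = "<h2>Title</h2>\n"
      . "<!-- phpdigExclude -->\n<td>navigation</td>\n<!-- phpdigInclude -->\n"
      . "<p class=\"main\">Body text</p>";
echo dropExcludedRegions($page), "\n";
```

Wrapping the unwanted table cells in the markers avoids patching robot_functions.php at all, which is presumably why it worked where the regex edit did not.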