If you run a search at
http://www.fact.co.uk/phpdig/search.php, you will see that HTML sometimes gets included in the page results.
The only cause I can think of is that some of the pages contain JavaScript with escape characters, which I had to add so that the site validates as XHTML.
The actual stripping of tags doesn't change the results; I think the problem is in the reading of the file, because when I echo out the contents of the $content_file variable, it "breaks" at certain points.
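To illustrate the kind of markup involved, here is a minimal standalone PHP sketch. It is not PhpDig's code: the function name clean_for_index and the sample page are made up for illustration, and it assumes the escaping in question is a commented-out CDATA wrapper around the script. It removes whole script blocks before calling strip_tags, in case the script content is what trips up the indexing.

```php
<?php
// Hypothetical helper (not part of PhpDig): drop <script> blocks first,
// then strip the remaining tags and tidy the whitespace.
function clean_for_index($html)
{
    // Remove each <script>...</script> block, including any CDATA
    // wrapper inside it. "s" lets "." match newlines, "i" ignores case.
    $html = preg_replace('#<script\b[^>]*>.*?</script>#is', ' ', $html);

    // Strip whatever tags are left and collapse runs of whitespace.
    $text = strip_tags($html);
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Made-up sample page: XHTML-style script with a CDATA wrapper.
$sample = <<<HTML
<p>Before</p>
<script type="text/javascript">
//<![CDATA[
if (a < b) { document.write("<b>hi<\/b>"); }
//]]>
</script>
<p>After</p>
HTML;

echo clean_for_index($sample), "\n"; // prints "Before After"
```

If the text already looks broken before any stripping happens, as it does when I echo $content_file, then a step like this would not help and the fault would be earlier, in the reading of the file.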
The script works great aside from this, and I would be very grateful for any advice.