searching PDF files


bcunico
02-21-2006, 07:43 PM
I would like to be able to search the PDF files on my web site. They live in a folder named SECURE, located directly under $HOME. This is a "secure folder" that requires a userid and password to access, and I'm hoping that won't be an obstacle to searching it. I'm new to PhpDig (version 1.9) and I don't know where to begin. I was hoping I could set up the spider to search ONLY my $HOME/SECURE folder rather than my entire web site. Whenever I try to specify a URI other than $HOME, it defaults back to $HOME. Also, the spider does not find my PDF files. Help!

bcunico
02-24-2006, 02:14 AM
OK, after reading other posts about "external binaries", I have had some success. As I mentioned in my first post, the folder with my PDF files is secure and requires a userid and password. BTW, I'm on a Linux system and my PhpDig is version 1.8.8, not 1.9. If I turn the security feature off, the spider finds my PDF files, but when I actually do a search, no results are found. I copied a version of "pdftotext" into my PhpDig folder and I believe the permissions are correct. Now my questions are: why don't I get matches when I do a search, and is there a way to leave my folder secure and still have PhpDig index it? I hope someone replies while my 30 days of viewing hidden posts is still valid!
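
One way to narrow down the "found but no matches" symptom is to check whether pdftotext returns any text at all for one of the PDFs, since a file that yields no words leaves nothing to index. The following is only a minimal sketch and not part of PhpDig: the function names, the `./pdftotext` location and `sample.pdf` are assumptions made for illustration.

```php
<?php
// Hypothetical diagnostic, not part of PhpDig: run pdftotext on one PDF
// and report how many words come back.

// Build the shell command; the trailing "-" asks pdftotext to write the
// extracted text to standard output instead of a .txt file.
function pdftotextCommand(string $binary, string $pdfPath): string
{
    return escapeshellarg($binary) . ' ' . escapeshellarg($pdfPath) . ' -';
}

// Count the words in the extracted text (0 means nothing to index).
function countWords(string $text): int
{
    return str_word_count($text);
}

$binary = './pdftotext'; // assumed location: the PhpDig folder
$pdf    = 'sample.pdf';  // assumed test file

if (is_executable($binary) && is_readable($pdf)) {
    $text = (string) shell_exec(pdftotextCommand($binary, $pdf));
    echo countWords($text) . " words extracted\n";
} else {
    echo "pdftotext is not executable or the PDF is not readable\n";
}
```

If this reports zero words for a PDF that clearly contains text, the problem is more likely in the conversion step (for example a scanned, image-only PDF or a binary built for another system) than in the search itself.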

bcunico
02-24-2006, 02:23 AM
One more thing... I would like to be able to index a file that contains the full PATH to each PDF file. That would be much easier for me to administer because the actual PDF files change from time to time.

bcunico
02-24-2006, 02:40 AM
Back in 2003, a green mole named Chazter came up with some code that I think I need to implement, but I'm not sure which PHP file the code goes in. Here's what he wrote:
======

I hope this helps for future reference. In one of my PHP pages, I created a variable in PHP to hold my list of PDF files to be indexed from an array and put that variable in a hidden html tag.

====================
<?php

// Note: the mysql_* functions used here were removed in PHP 7;
// on a current PHP version use mysqli or PDO instead.

// Create query
$sql = "SELECT * FROM newsdetail";
$mysql_result = mysql_query($sql, $connection);

$num_rows = mysql_num_rows($mysql_result);

// Initialize $listURLAll variable
$listURLAll = "";

// Check to see if the query returns records
if ($num_rows != 0) {

    // If records exist, loop over them one row at a time
    while ($row = mysql_fetch_array($mysql_result)) {

        // $pID represents a specific category of PDF files
        $pID = $row["NewsDetail_ID"];

        // Optional switch statement. I have pdf files in different locations
        switch ($pID) {
            case 1:
                $dir = "filings/";
                break;
            case 2:
                $dir = "advisories/";
                break;
            case 3:
                $dir = "headlines/";
                break;
            case 4:
                $dir = "newsletter/";
                break;
            case 5:
                $dir = "press/";
                break;
            case 6:
                $dir = "reports/";
                break;
            default:
                // Unknown category: don't reuse the previous row's folder
                $dir = "";
                break;
        }

        $pTitle = $row["NewsDetail_Title"];
        $pFile = $row["NewsDetail_FileName"];
        $pext = ".pdf";

        // $listURL represents a URL path of a pdf file per record
        $listURL = "<a href=$dir$pFile$pext target=blank>$pTitle</a><br>";

        // $listURLAll represents a list of URL paths, appended after each record pass
        $listURLAll = $listURLAll . $listURL;

    } // End While

} // End If
?>

<!-- hidden text tag, the VALUE represents the $listURLAll to be indexed for the PHPdig spider. -->

<input type="hidden" name="hiddenURL" value="<?php echo $listURLAll ?>">
==============

I have an htm file (pdf-list.htm) with a complete list of the URLs I'd like PhpDig to search. Is there any way to combine my file with Chazter's code?
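
One possible way to combine the two, as a rough sketch only: read pdf-list.htm instead of querying a database, pull out the links that end in .pdf, and build the same kind of `$listURLAll` string that Chazter's loop produces. The helper names `pdfLinksFromHtml` and `buildLinkList` are made up for this illustration and are not part of PhpDig; the sketch also assumes a current PHP with the DOM extension enabled and that pdf-list.htm sits next to the script.

```php
<?php
// Hypothetical helper (not part of PhpDig): collect the PDF links
// found in an HTML list page such as pdf-list.htm.
// Returns an array of href => link title.
function pdfLinksFromHtml(string $html): array
{
    $links = [];
    $doc = new DOMDocument();
    // Suppress the warnings that loosely written HTML tends to trigger.
    @$doc->loadHTML($html);
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        // Keep only links whose path ends in .pdf (case-insensitive).
        if ($href !== '' && preg_match('/\.pdf$/i', $href)) {
            $title = trim($anchor->textContent);
            // Fall back to the file name when the link has no text.
            $links[$href] = $title !== '' ? $title : basename($href);
        }
    }
    return $links;
}

// Hypothetical helper: turn href => title pairs into a list of links,
// one per line, similar to what Chazter's loop appends per record.
function buildLinkList(array $links): string
{
    $out = '';
    foreach ($links as $href => $title) {
        $out .= '<a href="' . htmlspecialchars($href) . '">'
              . htmlspecialchars($title) . "</a><br>\n";
    }
    return $out;
}

$listFile = 'pdf-list.htm'; // assumed to sit next to this script
$html = is_readable($listFile) ? (string) file_get_contents($listFile) : '';
$listURLAll = $html !== '' ? buildLinkList(pdfLinksFromHtml($html)) : '';
echo $listURLAll;
```

This version prints the links as ordinary page content rather than placing them in a hidden input, because the quoted href attributes would otherwise end the input's value attribute early. If pdf-list.htm already consists of plain links, the extra script may not be needed at all.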