Add PDF files to be indexed


chazter
09-30-2003, 01:45 PM
Maybe I'm not reading the documentation correctly, but I'll go ahead and ask another question since I can't seem to grasp this yet.

I realize that I can index a PDF file if it is linked from a page, but how do I index PDF files that aren't linked from any page? For example, on my test site:

http://www.ricalliance.org/newrica/news/subnewsarchive.php?ID=1&Title=Filings

This page has a list of PDF files from the last 30 days, but there is another link called ARCHIVES.

If you click on the ARCHIVES link, it will take you to

http://www.ricalliance.org/newrica/news/subnewsarchive.php?ID=1&Title=Filings

On this page a user must specify how far back they need to go for a list of PDF files. Once that is specified, it returns a results page with the PDF files that fall into that range.

It's those PDF files I would love to have indexed.

The question is, how do I do it? They all reside in one directory on my website.

I appreciate any assistance. Thanks in advance.

Charter
10-01-2003, 07:08 PM
If you have access to shell, you could make a text file with the full URL to each PDF, each URL on one line. Otherwise, you should be able to type the full URL to a PDF file into the browser interface to crawl one PDF at a time.
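
As a rough illustration of the text-file approach, here is a minimal sketch in Python (not PhpDig code) that lists the PDFs in one directory and writes one full URL per line. The function names, the directory path, and the base URL are all hypothetical placeholders; how the resulting file is then handed to the indexer is not covered here.

```python
from pathlib import Path
from urllib.parse import quote


def build_pdf_url_list(pdf_dir, base_url):
    """Return one full URL per PDF found directly in pdf_dir, sorted by file name."""
    base = base_url.rstrip("/")
    urls = []
    for path in sorted(Path(pdf_dir).iterdir()):
        # Match the extension case-insensitively so .PDF files are included.
        if path.is_file() and path.suffix.lower() == ".pdf":
            # Percent-encode spaces and other unsafe characters in the file name.
            urls.append(f"{base}/{quote(path.name)}")
    return urls


def write_url_list(urls, out_file):
    """Write the URLs to out_file, one URL per line."""
    Path(out_file).write_text("".join(url + "\n" for url in urls), encoding="utf-8")


# Example call (hypothetical paths, adjust to your own site):
# urls = build_pdf_url_list("/var/www/site/pdfs", "http://www.example.com/pdfs")
# write_url_list(urls, "pdf_urls.txt")
```

Only files directly inside the directory are listed; subdirectories would need a recursive walk.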

chazter
10-02-2003, 07:54 AM
Originally posted by Charter
If you have access to shell, you could make a text file with the full URL to each PDF, each URL on one line. Otherwise, you should be able to type the full URL to a PDF file into the browser interface to crawl one PDF at a time.

At first your suggestion didn't make sense, but after a night's sleep I was able to figure out what you were saying.

What I did was something similar to your solution. In my PHP page I created an array variable that captured all the PDF files and their associated URLs. I then linked that variable to a hidden form tag. PhpDig spidered that particular page and was able to index all of my URLs and the associated PDFs from the hidden tag. This way, if I ever add to my PDF table, the page updates automatically.

Thanks for pointing me in the right direction.

Have a great day.
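
The exact markup used is not shown in this thread, so the following is only a guess at the shape of the hidden-form idea described above, sketched in Python rather than PHP. The function name `render_hidden_form` and the field name `pdf_url[]` are made up for illustration, and whether a given crawler follows URLs found in hidden fields depends on the crawler.

```python
from html import escape


def render_hidden_form(pdf_urls, field_name="pdf_url[]"):
    """Return an HTML form fragment with one hidden input per PDF URL."""
    fields = [
        # escape() guards against &, <, > and quotes breaking the attribute value.
        f'  <input type="hidden" name="{escape(field_name)}" value="{escape(url)}">'
        for url in pdf_urls
    ]
    return "\n".join(["<form>", *fields, "</form>"])


# Example call with a hypothetical URL list, e.g. rows read from the PDF table:
# print(render_hidden_form(["http://www.example.com/pdfs/filing1.pdf"]))
```

Because the fragment is generated from the current list of PDFs each time the page is requested, new entries appear without further edits, which matches the automatic update mentioned above.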

Charter
10-05-2003, 10:26 AM
Great, glad it's working. :)

If you would, could you write up what you did and post it in the Mod Submissions (http://www.phpdig.net/forumdisplay.php?forumid=24) forum in case others might want to try it?

chazter
10-07-2003, 07:43 AM
Sure thing. I hope it makes sense. I posted it as "Add PDF files to be indexed - Solution".