For me it indexes only the title of the PDF file, the time of indexing, and the weight of the PDF file in the keywords table of the database; none of the PDF's content ends up in the database.
This is strange, because when I index a site with PDF files it does seem to extract the text, see below:
Is result test http an array: 1
What is result test http status: PDF
Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: /usr/bin/pstotext
Does parse pdf exist: 1
Is parse pdf executable: 1
Command is: /usr/bin/pstotext -cork ../admin/temp/13874292.tmp
Result contains: Array ( [0] => Hébergement [1] => Facture [2] => partners -- 5 Sq de tuile_ 78000 Versailles -- Tél. / Fax : 0666666666 -- Email :
contact@partners.com [3] => SARL au capital de 3000# -- Siret545454445RCS Versailles -- APE 222Z -- Web :
www.partners.com [4] => [5] => FACTURE [6] => partners CLIENT [7] => 5 Sq de tuile Adzd MAdzNdzAS [8] => 78000 Versailles [9] => Tél./fax. : 01 3226222626 [10] => Prestation : Hébergement [11] => Facture du: 01/04/2004 au 31/06/2004 [12] => N° de Facture: 12122/66 [13] => Article Objet Quantité [14] => / [15] => Slots [16] => Prix [17] => unitaire / [18] => Trimestre [19] => Montant TVA [20] => Hébergement Serveur [21] => Total HT 122.36 [22] => Total TVA 23.61 [23] => Total TTC 122.00 [24] => A payer 122.00 EUROS [25] => Mode de paiement : A réception de facture [26] => )
Return value is: 0
5:
http://monsiteweb.fr/pdf/01123SOC2004013.PDF
(time: 00:01:49)
No links in the temporary table
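
To check the extraction step separately from the database step, the command from the log can be rerun by hand. The following is only a minimal sketch in Python, not the indexer's own code: `run_extractor` and `lines_to_text` are hypothetical helper names, and it assumes the extractor prints the text to standard output, as the log above suggests.

```python
import subprocess


def run_extractor(command):
    """Run an external text extractor given as an argument list.

    Returns (output_lines, return_code), which corresponds to the
    "Result contains" array and the "Return value" in the log.
    """
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        errors="replace",  # assumption: tolerate bytes that do not decode cleanly
    )
    return completed.stdout.splitlines(), completed.returncode


def lines_to_text(lines):
    """Drop empty lines and collapse runs of whitespace into single spaces."""
    return " ".join(" ".join(line.split()) for line in lines if line.strip())
```

Calling `run_extractor(["/usr/bin/pstotext", "-cork", "../admin/temp/13874292.tmp"])` would repeat the command from the log, provided the temporary file still exists. If it returns text and a return value of 0, as in the log, the problem is more likely in the step that stores the words than in the extraction itself.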