07-07-2006, 01:08 AM   #1
c4x
PDF/X-PDF "Bug" in 1.8.9 and below

Hello!

There is a bug in how PhpDig handles the different Content-Type values a server can send for PDF files.

One of our servers responds with "Content-type: application/x-pdf" (see below).
PhpDig only accepts application/pdf, and a PDF is parsed only if that exact header is found.

application/x-pdf is not a different format, just another label for the same files, apparently used as an IE fix (try Google).



Maybe someone should change line 510 (v1.8.8) of robot_functions.php:

Change:
PHP Code:
else if ($regs[2] == 'pdf' && PHPDIG_INDEX_PDF == true) { 
To:
PHP Code:
else if (($regs[2] == 'pdf' || $regs[2] == 'x-pdf') && PHPDIG_INDEX_PDF == true) { 
Then everything works fine for both Content-Type values the server may send.
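
For anyone who would rather not hard-code the second subtype into the condition, here is a minimal standalone sketch of the same check. It is not PhpDig code: the function name is_pdf_content_type() and the regular expression are my own assumptions, and I have not checked how robot_functions.php actually fills $regs. It only illustrates accepting both spellings, ignoring case and any parameters after a semicolon.

```php
<?php
// Hypothetical helper (not part of PhpDig): returns true when a
// Content-Type header value describes a PDF under either common label.
function is_pdf_content_type($header)
{
    // Capture type and subtype. The "Content-type:" prefix is optional,
    // matching is case-insensitive, and parameters such as
    // "; charset=..." are left out because ";" is not in the class.
    $pattern = '#^\s*(?:content-type:\s*)?([a-z0-9.+-]+)/([a-z0-9.+-]+)#i';
    if (!preg_match($pattern, $header, $regs)) {
        return false;
    }

    // Subtypes treated as PDF; extend this list if a server sends another.
    $pdf_subtypes = array('pdf', 'x-pdf');

    return strtolower($regs[1]) === 'application'
        && in_array(strtolower($regs[2]), $pdf_subtypes, true);
}
```

Called with the header from our server, is_pdf_content_type('Content-type: application/x-pdf') returns true, and so does plain 'application/pdf'. Something like 'text/html' returns false. Keeping the accepted subtypes in one array means the condition itself does not have to be touched again if another variant turns up.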



Here is a sample response from our Apache server:

Quote:
Status Code: 200
Date: Fri, 07 Jul 2006 08:22:09 GMT
Server: Apache/2.0.46 (Red Hat)
Last-modified: Thu, 06 Jul 2006 10:40:59 GMT
Etag: "12b081e-a3044-622f64c0"
Accept-ranges: bytes
Content-length: 667716
Connection: close
Content-type: application/x-pdf


Greetings, MR