URLs containing single quotes


mmaattttt
08-26-2004, 06:06 PM
Hi everyone,

I'm trying to index a server that has a bunch of MS Word documents on it. The indexing works fine until it reaches a file or directory name with a single quote in it. For example, when trying to index http://www.domain.com/pathtodocs/John's Resume/Resume.doc it fails with a 404 error because it requests http://www.domain.com/pathtodocs/Johns Resume/Resume.doc instead (confirmed in the Apache logs).

Obviously quotes etc. are being stripped out for sanity, but has anyone encountered this scenario before, and what would be the best way to work around it? I'm happy to modify the PHP code if someone can point me in the right direction, but convincing 150 people to avoid using quotes, and finding and renaming every existing file with a single quote in its name, would be an unlikely task.

Look forward to any good suggestions!

Cheers,

Matt
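To make the symptom concrete, here is a minimal sketch of what the requested path would look like if the single quote were percent-encoded rather than stripped. It is written in Python purely for illustration and is not phpDig's code; the helper name `encode_path` is hypothetical.

```python
from urllib.parse import quote, unquote

def encode_path(path: str) -> str:
    """Percent-encode a URL path, keeping '/' as the separator (hypothetical helper)."""
    return quote(path, safe="/")

original = "/pathtodocs/John's Resume/Resume.doc"
encoded = encode_path(original)

# The quote becomes %27 and the space becomes %20, so nothing is lost.
print(encoded)           # /pathtodocs/John%27s%20Resume/Resume.doc
print(unquote(encoded))  # /pathtodocs/John's Resume/Resume.doc
```

A web server decodes %27 back to a single quote before looking up the file, so an encoded request of this form would reach the real file name instead of the stripped one.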

vinyl-junkie
08-26-2004, 08:40 PM
Short of finding and changing them all, as you say, you might try using a redirect in your .htaccess file, assuming you're on Linux. It would look like this for the example you gave:

Redirect "/pathtodocs/Johns Resume/Resume.doc" "http://www.domain.com/pathtodocs/John's Resume/Resume.doc"

(The paths are in double quotes because they contain spaces.) It's possible that doing this might throw phpdig into a loop, so you might want to watch it while it runs.

Let us know how it goes. :)

mmaattttt
08-26-2004, 09:03 PM
Alternatively, if anyone could point me to or provide a script that will scour a directory tree and rename files to remove the ', I might just go with that method. If I run that script periodically, anything that was missed gets renamed and then indexed on the next pass.

Bloody consultants! :D
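A rename script along the lines requested above could be sketched as follows. This is only one possible approach, in Python rather than PHP, and the function name `strip_quotes` and its `dry_run` flag are assumptions made for illustration. It walks the tree bottom-up so that renaming a directory does not invalidate paths that are still to be visited, and it skips any rename whose target name already exists.

```python
import os

def strip_quotes(root: str, dry_run: bool = True) -> list[tuple[str, str]]:
    """Remove single quotes from file and directory names under root.

    Returns the (old_path, new_path) pairs found. With dry_run=True
    nothing is renamed, so the plan can be reviewed first.
    """
    renames = []
    # topdown=False yields the contents of a directory before the
    # directory itself appears in its parent's listing.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if "'" not in name:
                continue
            old = os.path.join(dirpath, name)
            new = os.path.join(dirpath, name.replace("'", ""))
            if os.path.exists(new):
                # Never overwrite an existing file or directory.
                print(f"skipped (target exists): {old}")
                continue
            renames.append((old, new))
            if not dry_run:
                os.rename(old, new)
    return renames

# Example: review first, then apply (the path is a placeholder).
# for old, new in strip_quotes("/path/to/docs"):
#     print(old, "->", new)
# strip_quotes("/path/to/docs", dry_run=False)
```

Running it with the default `dry_run=True` first shows what would change. Note that renaming will break any existing links or shortcuts that point at the old names.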