getting '400 Bad Request' in some search results titles


jdell_nv
04-07-2005, 02:28 PM
Hi,

I saw another thread about this problem relating to XHTML and some other stuff, but I don't have any of that.

If you have a look at my search site http://casat.unr.edu/search/
and enter a search term of 'FRN' (you can also try 'PIC'), you will see results with '400 Bad Request' in the title. However, I've reviewed the pages very carefully and can't see any bad links or other problems that might cause this.

The only thing I can think to suspect is that I have 2 similar URLs:

http://casat.unr.edu/frn/
http://casat.unr.edu/frn.html

and I have the setting PHPDIG_DEFAULT_INDEX set to TRUE.

Any ideas?

Best Regards,
John

jdell_nv
04-07-2005, 02:32 PM
Also, I'm not seeing any errors in the spider.log

Charter
04-08-2005, 07:50 AM
It looks like the server is rejecting certain requests. I don't know why, but here's a quick and dirty way to get that text out of the title.

In the phpdigCleanHtml function of robot_functions.php find:

$title = trim($regs[1]);

And replace with something like the following:

$title = trim(str_replace("400 Bad Request","",$regs[1]));
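To see what that replacement does in isolation, here is a minimal sketch. The helper name clean_title and the sample titles are made up for illustration and are not part of PhpDig. Note that a title consisting only of the error text comes out empty, so this hides the symptom rather than fixing the request.

```php
<?php
// Hypothetical helper (not PhpDig code): mirrors the one-line replacement above.
function clean_title($rawTitle) {
    // Remove the error text wherever it appears, then trim leftover whitespace.
    return trim(str_replace("400 Bad Request", "", $rawTitle));
}

// Sample titles are assumptions, chosen only to show the two cases.
var_dump(clean_title("400 Bad Request")); // the whole title was the error text
var_dump(clean_title("  FRN Home  "));    // a normal title is only trimmed
```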

jdell_nv
04-08-2005, 10:55 AM
Cool! Thanks for the quick/dirty fix.

Do you have any suggestions for how I might try to debug the real cause of the problem? I'm happy to dig in and I'm fairly proficient with PHP. Server is Linux/Apache 1.3.33/PHP 4.3.10

Regards,
John

Charter
04-08-2005, 04:10 PM
Hmm, perhaps start with the phpdigGetUrl function, echo the requests, and try them by hand via shell.
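One way to try a request by hand from a PHP command-line script is sketched below. The helper names build_request and fetch_status_line are hypothetical, and the request format (HTTP/1.1 with a Host header) is an assumption; it is not necessarily what phpdigGetUrl sends, so substitute whatever the echoed requests actually contain.

```php
<?php
// Hypothetical debugging helpers (not PhpDig code).

// Build the raw request as a string so the exact bytes sent can be inspected.
function build_request($method, $host, $path) {
    return "$method $path HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n\r\n";
}

// Send the request and return the server's status line, or false on failure.
function fetch_status_line($method, $host, $path, $port = 80) {
    $fp = @fsockopen($host, $port, $errno, $errstr, 10);
    if (!$fp) {
        return false;
    }
    fwrite($fp, build_request($method, $host, $path));
    $status = fgets($fp, 1024); // first line of the response is the status line
    fclose($fp);
    return $status === false ? false : rtrim($status);
}
```

Calling fetch_status_line('HEAD', 'casat.unr.edu', '/frn/') and then the same for '/frn.html' would show whether either URL draws a 400 for this particular request shape. Comparing the string from build_request against the request echoed from phpdigGetUrl may point to the difference the server objects to.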