PhpDig.net


bigals 04-01-2004 02:17 AM

converted from html pages to php pages now no pages will index!!! help!!
 
I have recently converted all my HTML pages into PHP pages, and now PhpDig will not index any of them at all!

The pages are extremely important and need indexing, so how do I sort this out? Each page is only a little bit of code that calls a template page, which is then populated with data, and PhpDig doesn't seem to be able to spider these pages now. Can anyone explain a way round this?

cheers,

Alex.

Charter 04-01-2004 02:34 AM

Hi. What is the code from one of these PHP files? Does it have a header redirect? If so, try the ZIP file in this thread.

bigals 04-01-2004 02:37 AM

The code from the files is as follows:

PHP Code:

<?php
// Strip the script name from the current script location
$path = dirname($_SERVER['PHP_SELF']);

// Explode out the directories from the path
$dirs = explode("/", $path);
$numdirs = count($dirs) - 1;

// Directory closest to the PHP page
$region = $dirs[$numdirs];

// Directory before the region directory
$country = $dirs[$numdirs - 1];

// Fetch the populated template over HTTP and output it
$url = "http://www.mysite.com/templates/region_template.php";
$url .= "?region=".urlencode($region)."&country=".urlencode($country);
$file_output = file_get_contents($url);
echo $file_output;
?>

That's all that is in each index.php file, so how can these be indexed?

I'm not very knowledgeable, so could you possibly explain?

thanks.
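
For reference, the directory logic in the stub above can be checked on its own. This is a minimal sketch, assuming a request path like the west_midlands page that comes up later in the thread. The function name region_and_country() is hypothetical, and the type declarations need PHP 7 or later, which is newer than the PHP 4.3 server in this thread.

```php
<?php
// Hypothetical helper (not from the thread): returns the two values the
// index.php stub passes to region_template.php.
function region_and_country(string $phpSelf): array
{
    // dirname() drops the script name and leaves the directory path.
    $path = dirname($phpSelf);

    // Split into directory names; the leading "/" gives an empty first element.
    $dirs = explode('/', $path);
    $last = count($dirs) - 1;

    // Directory closest to the page, and the directory above it.
    return ['region' => $dirs[$last], 'country' => $dirs[$last - 1]];
}

$result = region_and_country('/database/world/uk/england/west_midlands/index.php');
echo $result['region'] . ' / ' . $result['country'] . "\n"; // west_midlands / england
```

So the stub itself contains no content or links. Everything a spider can index comes from whatever region_template.php returns for those two values.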

Charter 04-01-2004 03:12 AM

Hi. Are there links to these PHP files?

bigals 04-01-2004 03:20 AM

The links to the pages are generated by the pages themselves.

It's a directory of the UK.

I.e. an index page placed in a county folder will create the links to all the towns/cities within it. These links would be something like:

leicestershire/leicester/index.php

So the links only exist after the PHP page has been compiled. I think I read somewhere that PhpDig compiles all PHP and then spiders it afterwards.

The index.php pages become HTML in content, but only when compiled.

Hope that helps explain it!

Charter 04-01-2004 03:26 AM

Hi. I mean when you spider, are you spidering a page that has links to these PHP files, like a directory listing?

BTW, PhpDig doesn't compile PHP; it's compiled server-side. PhpDig checks and reads what is output from the server. ;)
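
To illustrate that point: a spider only ever works with the HTML the server sends back, so it can only follow links that appear in that output. The following is a simplified sketch of pulling href values out of a page's output. It is not PhpDig's actual parser; the function name extract_links() and the sample HTML are assumptions made for illustration.

```php
<?php
// Hypothetical helper: collects href values from anchor tags in HTML
// that has already been generated by the server.
function extract_links(string $html): array
{
    // Match <a ... href="..."> with single or double quotes.
    preg_match_all('/<a\s[^>]*href=["\']([^"\']+)["\']/i', $html, $matches);

    // $matches[1] holds the captured href values (empty if none matched).
    return $matches[1];
}

// Sample output, as a county index page might produce it (assumed markup).
$html = '<ul><li><a href="leicester/index.php">Leicester</a></li></ul>';
print_r(extract_links($html));
```

If the generated HTML for a page contains no anchors, or anchors whose paths do not resolve to real pages, the spider has nothing valid to follow from that page.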

bigals 04-01-2004 03:33 AM

No, I'm just spidering the main folder, i.e.

http://www.mysite.com/database/world/uk/

The first page it will find is an index.php page, which displays the countries within the UK. It's laid out like so:

world/
--------uk/index.php
--------uk/england/index.php
--------uk/england/west_midlands/index.php

Charter 04-01-2004 03:48 AM

Hi. So does the http://www.mysite.com/database/world/uk/england/west_midlands/index.php page call up the http://www.mysite.com/templates/region_template.php?region=west_midlands&country=england page? If so, what do you get when you uncomment //print $answer."<br>\n"; from the robot_functions.php file and then index?

bigals 04-01-2004 04:03 AM

yes that is what happens, bang on...

I tried uncommenting that line and it indexed just the main HTML pages as before, missing out the entire database folder (as this consists only of these index.php files and folders).

it was returning strange info like this:

Quote:

5:http://www.mysite.com/features/featurepage.html
(time : 00:00:17)
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Date: Thu, 01 Apr 2004 12:56:48 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_jk mod_ssl/2.8.4 OpenSSL/0.9.6 PHP/4.3.0 FrontPage/5.0.2.2510 mod_auth_pam_external/0.1 mod_perl/1.26
Last-Modified: Mon, 29 Mar 2004 11:21:42 GMT
ETag: "180433c-2156-406806c6"
Accept-Ranges: bytes
Content-Length: 8534
Content-Type: text/css
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Date: Thu, 01 Apr 2004 12:56:48 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_jk mod_ssl/2.8.4 OpenSSL/0.9.6 PHP/4.3.0 FrontPage/5.0.2.2510 mod_auth_pam_external/0.1 mod_perl/1.26
Last-Modified: Mon, 29 Mar 2004 11:21:42 GMT
ETag: "180433c-2156-406806c6"
Accept-Ranges: bytes
Content-Length: 8534
Content-Type: text/css
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
Each time it says HTTP/1.1 404 Not Found a few times at the end of each of these blocks, as you can see above.

I've commented that line out again now, so it's back to the way it was.

Charter 04-01-2004 04:16 AM

Hi. The 404 means PhpDig is not finding the pages. Are you using a base href tag? If so, there is some code in this thread to account for base href tags.
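
When reading debug output like the block quoted above, it can help to pull out just the status codes. This is a minimal sketch, assuming the status lines look like the ones in that output; the function name http_status_code() is hypothetical and is not part of PhpDig.

```php
<?php
// Hypothetical helper: extracts the numeric status code from an HTTP
// status line, or returns null for any other header line.
function http_status_code(string $line): ?int
{
    if (preg_match('#^HTTP/\d+\.\d+\s+(\d{3})#', $line, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Lines taken from the quoted output.
$lines = [
    'HTTP/1.1 404 Not Found',
    'HTTP/1.1 200 OK',
    'Content-Type: text/css',
];

foreach ($lines as $line) {
    $code = http_status_code($line);
    echo $line . ' => ' . ($code === null ? 'not a status line' : $code) . "\n";
}
```

A 200 means the server returned the requested resource; a 404 means the URL that was requested does not exist on the server, so the next step is to find out which URLs are being requested.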

bigals 04-01-2004 04:24 AM

No, I don't think I'm using base href. I'm not sure exactly what that means, but if I search for <BASE HREF in my template pages nothing is returned, so it isn't in any of my pages.

argh this is getting confusing!!

Charter 04-01-2004 04:43 AM

Hi. It seems that there may be a mislink somewhere in the new PHP code, maybe dealing with the $_SERVER['PHP_SELF'] variable. What do you get onscreen when you try the following?

In robot_functions.php right after:
PHP Code:

//print $answer."<br>\n"; 

stick the following:
PHP Code:

echo "Page: ".$host.$path."<br>\n";

and see what pages are generating the 404s on index.

bigals 04-01-2004 05:18 AM

I get all of this output. That double // looks a bit suspicious, and then it goes back to one /:

Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com//
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
Page: www.mysite.com/
+ Page: www.mysite.com/database/world//
Page: www.mysite.com/database/world//
Page: www.mysite.com/database/world//
Page: www.mysite.com/database/world//

Charter 04-01-2004 05:40 AM

Hi. The double slash is okay. It's removed when it needs to be removed. Maybe the thing to notice is that none of the pages have things like uk/england/west_midlands/index.php in them. Without actually seeing/testing your site, I doubt that I can get this narrowed down.
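
As an illustration of why a doubled slash in the debug output is harmless, here is one way such a path could be normalised before a request is made. This is only a sketch and is not taken from PhpDig's source; the function name collapse_slashes() is hypothetical.

```php
<?php
// Hypothetical helper: collapses runs of "/" into a single "/".
// Meant for host + path strings like those in the debug output; a full
// URL with a scheme ("http://") would need the scheme split off first.
function collapse_slashes(string $hostAndPath): string
{
    return preg_replace('#/{2,}#', '/', $hostAndPath);
}

echo collapse_slashes('www.mysite.com/database/world//') . "\n"; // www.mysite.com/database/world/
```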

bigals 04-01-2004 05:49 AM

1 Attachment(s)
OK, well here's one of the pages that an index.php page is replaced with.

This might be more help to you, as you can then see how things are accessed by my pages. All the templates work the same way.

I hope this can help! :)

See attachment (it's a PHP file).

