
Indexing Issue


tajmahal
01-21-2005, 10:07 AM
I apologize for starting another thread on what seems to be a very common topic, but I have looked through the other threads and have not been able to resolve my problem. PhpDig won't index any site I give it. First I put in my own domain, and it picked up only my home page and one subdirectory. I have tried to re-index the whole site, without success. I also tried indexing other websites and couldn't get any pages from them at all. I have chmod-ed all of the required directories to 777, and my config file is as follows:

define('SEARCH_DEFAULT_LIMIT',20);
define('SPIDER_MAX_LIMIT',100);
define('RESPIDER_LIMIT',100);
define('LINKS_MAX_LIMIT',100);
define('RELINKS_LIMIT',100);
define('LIMIT_TO_DIRECTORY',false);
define('PHPDIG_IN_DOMAIN',true);

Also, my control panel gives me:
Spidering in progress... [Stop spider]
SITE : http://www.mysite.org
Exclude paths :
- @NONE@
links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !

Any ideas?

Dave A
01-21-2005, 01:49 PM
It may be because you have recently indexed the web site. You could try deleting the domain from the admin panel and then re-indexing it.
There are some settings in config.php that relate to re-indexing. For my part, I usually drop the domain, tidy up the index, keywords, and common words, and then spider it again.

tajmahal
01-21-2005, 03:14 PM
I tried deleting and re-indexing, but to no avail; I'm getting similar results. One thing I forgot to mention: my pages are all .shtml. Would that make any difference?

Another question:
The footers on my website hold links to other areas of the website, and I would very much like the footer to be spidered. Unfortunately, I insert the footer into the pages via Server-Side Includes (SSI). Is there any way I can spider these dynamically generated links?

Charter
01-21-2005, 07:02 PM
The SHTML pages and SSI should not matter. You'd need to spider your site from a page with links to other pages. Starting from the main Flash page, without links elsewhere, won't index the site, as PhpDig follows links. Enter a link from an inner page in the textbox, click the 'no' radio button, set 'Search depth' to a large value, set 'Links per' to zero, and give that a shot. Also, as an aside, the home.shtml page seems to be missing an end DIV tag somewhere.
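To illustrate why the start page matters, here is a rough Python sketch of link-following. It is not PhpDig's actual code; the `LinkCollector` class, the `extract_links` helper, and the example.org URLs are made up for the example. A page that only embeds a Flash movie gives a crawler no anchors to follow, which matches the "links found : 0" result, while an inner page with ordinary links does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (illustrative only)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical pages: a Flash-only front page and an inner page with links.
flash_only = '<html><body><object data="intro.swf"></object></body></html>'
inner_page = (
    '<html><body><a href="about.shtml">About</a> '
    '<a href="/news/index.shtml">News</a></body></html>'
)

print(extract_links(flash_only, "http://www.example.org/"))
print(extract_links(inner_page, "http://www.example.org/home.shtml"))
```

The first call finds nothing to follow, so a spider started there stops right away; the second finds two URLs it can queue.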

tajmahal
01-22-2005, 05:20 AM
Thanks! The spider seems to be working well now. How often will it respider? By the way, how'd you know about that div tag?

Charter
01-22-2005, 12:26 PM
PhpDig respiders as often as you do it, or as often as you set a cron job to do it. As for the DIV tag, I am magic... ha, ha... just a view of the HTML source is all.
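For the cron route, a crontab entry along these lines is the general idea. Treat it as a sketch: the install path and log file are placeholders, and the `php -f spider.php all` form is how I recall the PhpDig documentation describing command-line spidering (run from the admin directory), so check the docs for your version before relying on it.

```
# m  h  dom mon dow  command
# Respider all indexed sites at 03:30 every Sunday.
# /path/to/phpdig/admin and the log path are placeholders, not real paths.
30 3 * * 0 cd /path/to/phpdig/admin && php -f spider.php all >> /path/to/phpdig-spider.log 2>&1
```

You would add the line with `crontab -e`, or through your host's control panel if it has a cron page.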

tajmahal
01-23-2005, 06:24 AM
Set a cron job to do it? :confused: Can I have a link to the proper documentation on that? And by the way, magic man, how'd you know my domain in the first place?

jmitchell
01-23-2005, 06:26 AM
You hit the nail on the head: Charter is magic :D

tajmahal
02-19-2005, 11:03 AM
OK, so when my cron job re-indexes, does it delete the old index and then make a new one? If I rearrange some pages by moving them to different folders, both the new page and the old page seem to be listed in the index. Is there any way around this?