virtually no results at all


mistafeesh
11-11-2003, 04:26 PM
I've just tried to index my first site, and I'm really disappointed!

It only seems to have indexed the first page, which is the only static page. All the rest are PHP, and it's all within a frameset.

The site can be accessed at http://www.act4business.com/test.html

Any ideas?

mistafeesh
11-12-2003, 03:08 AM
Just sorted it, kinda.

I followed the instructions on this thread (http://www.phpdig.net/showthread.php?s=&threadid=185) to delete the index and reindex. I guess something went pear-shaped the first time around.

Only thing is, it seems to have only used a few keywords. I tried searching for keywords I knew were in there, and it came up with no results. I looked in the database, and it's only indexed 30 words.

Charter
11-12-2003, 07:26 AM
Hi. In the admin panel, click the site, then the update button, and then the blue arrow to see a list of indexed links.

How many levels did you crawl? What are SMALL_WORDS_SIZE and MAX_WORDS_SIZE set to in the config file?
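
To show why those two settings matter for the keyword count, here is a rough Python sketch of a length filter. It is not PhpDig's code: the function name is made up, and the boundary rule (skip words of SMALL_WORDS_SIZE characters or fewer, and words longer than MAX_WORDS_SIZE) is an assumption.

```python
import re

# Assumed values, mirroring the two config settings asked about above.
SMALL_WORDS_SIZE = 2
MAX_WORDS_SIZE = 30


def keep_keywords(text, small=SMALL_WORDS_SIZE, large=MAX_WORDS_SIZE):
    """Return lowercase words longer than `small` and no longer than `large`.

    Hypothetical helper: the exact boundaries are an assumption,
    not taken from the indexer's source.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if small < len(w) <= large]
```

With these assumed boundaries, short words such as "an", "it" and "of" never reach the keyword table, so raising the small size shrinks the index quickly.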

mistafeesh
11-12-2003, 07:53 AM
Hi.

The small and max word sizes are fine (2 and 30). I set it to crawl 20 levels deep.

There is a calendar on the site, and it has mostly just indexed that. I need to stop it indexing the calendar.

One part of this is quite tricky. I have a 'mini-calendar' built into the nav frame.
Is it possible to forbid it from following links in the format "../nav.php?month=blah"?
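
To pin down which links that means, here is a small Python sketch of the rule. It only illustrates the pattern and is not a PhpDig feature; the function name is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def is_mini_calendar_link(url):
    """True for links to nav.php that carry a `month` query parameter.

    Hypothetical helper used only to describe the link format above.
    """
    parts = urlsplit(url)
    # Compare only the last path segment so "../nav.php" and "/nav.php" both match.
    script = parts.path.rsplit("/", 1)[-1]
    query = parse_qs(parts.query, keep_blank_values=True)
    return script == "nav.php" and "month" in query
```

So "../nav.php?month=blah" would be skipped, while a plain link to nav.php would still be followed.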

Also, how can I get it to re-index without losing the exclusion list?

Charter
11-12-2003, 09:08 AM
Hi. Just curious, how long does the crawl take with 20 levels, or do you stop the crawl?

You could try using the PHPDIG_EXCLUDE_COMMENT and PHPDIG_INCLUDE_COMMENT values set in the config to exclude part of a page, with each comment on its own line. Alternatively, you could move certain scripts to their own directory and exclude that directory from the admin panel or via a robots.txt file.
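
As a rough illustration of the comment approach, the Python sketch below drops every line between an exclude comment and the next include comment. It is not the indexer's own code: the function name is made up, and the two marker strings are assumptions, so check the actual values in your config file.

```python
EXCLUDE_MARK = "<!-- phpdigExclude -->"  # assumed marker text, check your config
INCLUDE_MARK = "<!-- phpdigInclude -->"  # assumed marker text, check your config


def strip_excluded(html):
    """Drop lines between an exclude marker line and the next include marker line.

    A marker only counts when it is the sole content of its line.
    """
    kept = []
    skipping = False
    for line in html.splitlines():
        marker = line.strip()
        if marker == EXCLUDE_MARK:
            skipping = True
        elif marker == INCLUDE_MARK:
            skipping = False
        elif not skipping:
            kept.append(line)
    return "\n".join(kept)
```

In this sketch a marker that shares a line with other markup is ignored, which is why each comment should sit on its own line.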

In the admin panel, if you click a site, click the update button, and click a red exclude symbol for a certain path, that path shouldn't be included in a reindex. To reindex, click a site, click the update button, and click a green check mark.

mistafeesh
11-12-2003, 02:59 PM
Thanks.

I've just tried the comments. They seem to work. It's still indexing the calendar somehow, but not nearly as much.

The results are better but not particularly satisfactory. There are still an awful lot of keywords not picked up. Is there a way I can increase the number of keywords?

Crawling 20 levels deep took a few minutes on my local server, but that server has only one client, so it may well take a lot longer on a busier one.

Charter
11-12-2003, 03:01 PM
Hi. What's the link to your search page? What happens if you index, say, http://www.act4business.com/nav.php directly?

mistafeesh
11-12-2003, 03:01 PM
Also, is there a way I can exclude the navigation frame from the results, but still follow the links on it?

Charter
11-12-2003, 03:05 PM
Hi. To delete a link, go to the control panel, click a site, click the update button, click a blue arrow, and on the right-hand side of the screen, click the red X to delete a link.

mistafeesh
11-12-2003, 04:19 PM
Sorry, didn't notice the cross-post.

I have two copies of the site at the moment. One on my local network, which I am working on, and one on the internet, which will go live very soon.
I haven't yet implemented a search page on the internet version of the site, but the two copies are otherwise near identical (apart from one or two small tweaks). So I can't point you to the search page yet!
I'll try to upload the search bits shortly, just after I've done some other tweaks...