Indexing Help...I am missing something
Hello All:
I have not had any luck finding posts exactly like this and need a little help. Unfortunately my site is on an intranet, so I cannot provide a link for you to review; I will do the best I can to explain.

I am running PhpDig v.1.8.7. All of my data is stored in a directory on the site that is broken into a year directory and then a month directory, so it looks like this:

-->Archives -->2005 -->January

Each month directory contains a bunch of HTML files that are listed in an HTML file called published.html, which is also in the month directory.

Everything seems to go fine when I set up PhpDig, and the index database looks fine. However, when I search, every link takes you to the published.html file and not to the HTML page that has the data you really want. What am I doing wrong? Am I choosing something wrong in the search depth?

When I enter what should be indexed, I put in something like this:

http://archive/archives/1990/jan/published.html
http://archive/archives/1990/feb/published.html
http://archive/archives/1990/mar/published.html
http://archive/archives/1990/apr/published.html
http://archive/archives/1990/may/published.html
http://archive/archives/1990/jun/published.html
http://archive/archives/1990/jul/published.html
http://archive/archives/1990/aug/published.html
http://archive/archives/1990/sep/published.html
http://archive/archives/1990/oct/published.html

Is that wrong? Any help or advice that anyone could offer would be GREATLY appreciated, and I thank you in advance!

Tom Scholle
tjscholle@cbs.com
|
So each published.html page contains links to other pages in the archives/year/month/ directory? Try setting LIMIT_TO_DIRECTORY to false and PHPDIG_IN_DOMAIN to true (both in the config file) and then, from the admin panel, use a large search depth, set links per to zero, and use the no option.
|
no luck...
I am afraid everything still comes back pointing to published.html. Should I change how I set the dig from http://archive/archives/1990/jan/published.html to http://archive/archives/1990/jan? Would that fix it? Am I limiting it too much? |
What does the HTML from one of the published.html files look like? Just attach one of them, if you will, so I can have a look-see. Also, if you can, attach a screenshot showing the trouble area. This will help me get a better understanding.
|
1 Attachment(s)
I have added a zip file with a published.html and a screen shot of the results. I hope that helps. Thank you for your help!
|
When you click the published.html link, like the one shown in the screenshot, where are you taken? Also attach one of those 16634f0b.html type files so I can look-see and test.
|
Here you go....
1 Attachment(s)
I guess at this point it would help you to know that these pages get created by our NRCS (newsroom computer system). This is an archive of a show's rundown.
When I click on a link like the one in the screenshot, I am taken directly to the published.html for that month and not to the story it is referencing. I hope I am answering the questions correctly here... Again, I thank you for this help! Tom |
1 Attachment(s)
Okay, I did a test using the following setup:
Code:
http://www.phpdig.net/temp/published.html
Code:
<A HREF="16634f0b.html">chinese orch </A><BR>
PhpDig printed out the following:
Spidering in progress... [Stop spider]
SITE : http://www.phpdig.net/
Exclude paths :
- @NONE@
1:http://www.phpdig.net/temp/published.html
(time : 00:00:06)
+ level 1...
2:http://www.phpdig.net/temp/16634f0b.html
(time : 00:00:16)
No link in temporary table
links found : 2
http://www.phpdig.net/temp/published.html
http://www.phpdig.net/temp/16634f0b.html
Optimizing tables...
Indexing complete !
[Back] to admin interface.

A test search on orch yielded the attached image.

What happens if you directly index the following:
http://archive/archives/????/???/16634f0b.html
(replacing the ?'s with year and month info)

If you want to see 16634f0b.html, what do you type in your browser:
http://archive/archives/YYYY/MMM/16634f0b.html
or something else?
|
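If it helps to check by hand which absolute URL each relative link in a published.html should resolve to, here is a minimal Python sketch. It is not part of PhpDig and does not reproduce how PhpDig resolves links internally; the class and function names (`LinkCollector`, `resolve_links`) are made up for illustration. It uses only the standard library, with the base URL taken from the first post and the link from the test above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> matches.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def resolve_links(base_url, html):
    """Return each link found in html as an absolute URL relative to base_url."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


# Base URL from the original question; the snippet is the link used in the test.
base = "http://archive/archives/1990/jan/published.html"
snippet = '<A HREF="16634f0b.html">chinese orch </A><BR>'

for url in resolve_links(base, snippet):
    print(url)
```

If the URL printed for a story does not open that story in a browser, the links in published.html are probably not plain relative links into the same directory, and that would be worth knowing before changing any more PhpDig settings.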