PhpDig.net (http://www.phpdig.net/forum/index.php)
-   How-to Forum (http://www.phpdig.net/forum/forumdisplay.php?f=33)
-   -   suggestions? (http://www.phpdig.net/forum/showthread.php?t=1286)

mdrdlp 09-10-2004 08:36 AM

suggestions?
 
Ok... here is the situation: I have a site that I need a fairly customized search on. It's not a traditional spider job like the rest, since I have content that needs to be excluded, so my spidering ability is extremely limited.

So... I have excluded the sections I don't want phpdig to index, and can run this... however, those sections are, for the most part, the only way the spider can travel from link to link. (Please don't ask... it's a loooong explanation.)

What I need to do is this: I have a list (in a spreadsheet... it can easily be dropped into a db) of the exact files (both dynamic and static) that I need spidered and indexed. That list has 1,598 pages. Entering them one at a time will take me forever (literally). Is there any way to use the info I already have laid out to automate this process? And if so, how?

vinyl-junkie 09-10-2004 09:48 AM

Welcome to the forum, mdrdlp. :D

If you have the exact URLs in a spreadsheet, just copy and paste them into a text file, then spider from shell. The phpdig documentation gives an example of how to index specific pages like that.

Hope this helps. :)
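
For reference, a minimal sketch of that approach, assuming spider.php accepts either a single URL or the path of a plain-text list of links (one per line), as the phpdig shell-indexing documentation describes; the file names and paths here are placeholders:

Export the spreadsheet column to a text file, e.g. urls.txt:

http://www.my-site-url.com/section/page-one.html
http://www.my-site-url.com/page.php?id=2

Then run the spider from the admin directory (it usually needs to be launched from in there so its relative config includes resolve):

cd httpdocs/find/admin
php -f spider.php /full/path/to/urls.txt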

mdrdlp 09-10-2004 09:59 AM

Just tried that... simple enough. From the command line, I put in:
php -f httpdocs/find/admin/spider.php http://www.my-site-url.com



and was met with:
PHP Warning: Function registration failed - duplicate name - imap_open in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_popen in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_reopen in Unknown on line 0
PHP Warning: Function registration failed - duplicate name - imap_close in Unknown on line 0

and that error list went on for pages. It says it completed successfully at the end, but nothing was actually indexed.
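
A hedged side note on those warnings: "Function registration failed - duplicate name - imap_*" usually means the imap extension is being loaded twice, for example compiled into PHP and also pulled in by an extension= line in php.ini. The warnings themselves are generally harmless and are separate from whether the spider indexed anything. If that is the cause, commenting out the duplicate load silences them:

; php.ini -- drop the dynamic load if imap is already compiled into PHP
;extension=imap.so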

mdrdlp 09-10-2004 10:49 AM

There is always an easy way to do things... sometimes you just have to break it to find it. :eek: :)

Did a sample run on a couple of URLs to see what data was being stored in the db, and where. Took the 'spider' table, dumped it locally, pasted in my URL list and gave the rows IDs, then pushed it back up. Then I just hit the 'update site' link in the web admin interface... and done. :banana: :banana: :banana:
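
A rough sketch of that round trip with the standard MySQL client tools; the database name, user, and table name are placeholders (the spider table may carry your phpdig table prefix), and the exact columns to fill in come from whatever the sample run put there:

# pull the spider table down
mysqldump -u dbuser -p phpdig_db spider > spider.sql
# ...append INSERT rows for the 1,598 URLs here, with ids continuing past the existing max...
# push it back up
mysql -u dbuser -p phpdig_db < spider.sql

Then the 'update site' link in the web admin does the actual indexing, as described above.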


Thank you for trying to help, though. I sincerely appreciate it!

