Where to store the url of a webpage for indexing and searching?

Dear Java gurus,
We have a set of HTML files stored in a file system. We can use Lucene to index those files with two fields, "path" and "content". A Lucene search then returns the relevant content and its path (the path in the file system).
As each of these HTML files is a real web page, we also know its URL on the Internet. However, I don't know where to store this real URL so that Lucene indexes not only the path and content but also the URL. If this is possible, the search results could display the URL as well.
Do you have any ideas about this?
This is the last obstacle for us in developing a small Google-like search engine. We already have a crawler that works well.
Thanks for any suggestions.
Pengyou

pengyou wrote:
However, if I just want to store the HTML files in a file system for quick test purposes, how can I store the URL and keep a link to the related file?

jschell wrote:
Instead of storing the content in the database, you store a file system location (either a path or a file URL). That file system location is where you store the content.

pengyou wrote:
Indeed, the file system location is where I store the content that I crawled from the Internet. However, it is not the initial URL from which I crawled this content. I would like to store the initial URL of this content too. This is still a problem.

jschell wrote:
No it isn't.
You have two pieces of data: URL and content.
If you want to store the content on the file system then your database table would have two columns: url and file_location.
You then do the following:
1. Save the content to the file system. Derive a file path from that process.
2. Write a record to the database consisting of two data items: the URL and that file path.
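
The two steps above could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: the class name PageStore, its method names, and the page-N.html naming scheme are hypothetical, and an in-memory Map stands in for the database table with the url and file_location columns.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: keeps the original URL alongside the file location.
class PageStore {
    private final Path root;
    // Assumption: this map stands in for a table with columns url, file_location.
    private final Map<String, Path> urlToFile = new LinkedHashMap<>();

    PageStore(Path root) {
        this.root = root;
    }

    // Step 1: save the content and derive a file path.
    // Step 2: record the (url, file path) pair.
    Path save(String url, String html) throws IOException {
        Files.createDirectories(root);
        Path file = urlToFile.get(url);
        if (file == null) {
            // New URL: derive a fresh file name from a simple counter.
            file = root.resolve("page-" + (urlToFile.size() + 1) + ".html");
        }
        Files.write(file, html.getBytes(StandardCharsets.UTF_8));
        urlToFile.put(url, file);
        return file;
    }

    // Look up where the content for a URL was stored, or null if unknown.
    Path fileFor(String url) {
        return urlToFile.get(url);
    }

    // Reverse lookup: given a path (e.g. from a search hit), find the URL.
    String urlFor(Path file) {
        for (Map.Entry<String, Path> entry : urlToFile.entrySet()) {
            if (entry.getValue().equals(file)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```

With such a mapping, a search hit that carries only the "path" field can be translated back to the original URL via urlFor. Alternatively, the URL could presumably be added to each Lucene document as a third stored field next to "path" and "content", so that it comes back directly with the search result.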

Similar Messages

  • Where to store the servlet class files ?

    If, I store the class files for servlets under WEB-INF/classes folder,
              i get file not found exception while using WL 6.1 sp2. But, if i store
              the class file under DefaultWebApp folder, it works fine.
              Any help about where to store the class files for servlets would be
              great help.
              Thanks.
              hiren
              

    Copy the servlet class into the DefaultWebApp/WEB-INF/classes directory.
              Configure the servlet in the web.xml deployment descriptor.
              <servlet>
              <servlet-name></servlet-name>
              <servlet-class></servlet-class>
              </servlet>
              <servlet>
              <servlet-name></servlet-name>
              <servlet-class></servlet-class>
              </servlet>
              <servlet-mapping>
              <servlet-name></servlet-name>
              <url-pattern></url-pattern>
              </servlet-mapping>
              hiren dossani wrote:
              > If, I store the class files for servlets under WEB-INF/classes folder,
              > i get file not found exception while using WL 6.1 sp2. But, if i store
              > the class file under DefaultWebApp folder, it works fine.
              > Any help about where to store the class files for servlets would be
              > great help.
              >
              > Thanks.
              >
              > --
              > hiren
              

  • Where to store the attachments before sending a mail

    can any one help me
    when user clicks on attachfiles where to store the attachments before sending an email
    thanks in advance

  • Where to store the servlet classes

    can anyone tell me where to store the servlet classes(inside a package)in oracle9iAS used with oracle8i database ?
    Also how to deploy an ejb in oracle9iAS. should we have to use oracle8i deployment guide or any other procedure??
    thanks in advance

  • Where Firefox stores the FLV files in windows 7

    Where does Firefox store FLV files in Windows 7?

    FLV files played by the Flash plugin are usually streamed and not saved to the hard drive.
    They may be kept in the memory cache during the playing.
    Video DownloadHelper: https://addons.mozilla.org/firefox/addon/3006

  • Where microsoft stores the information of each file? like .....

    I would like to know where microsoft stores the information about each file in my computer,
    because I want to search a specific folder in my computer that contains more than 400 files
    and stores the name of each file and its size and the modified date of it in an
    Excel sheet?
    Please, tell how to do it
    it's urgent

    Firstly, the platform independence of Java is maintained (in part) by letting the JVM negotiate with the operating system while the developer commands only the JVM.
    Secondly, Java's Security Manager would not be functioning correctly if you were able to access arbitrary memory in this way.
    Thirdly, do you have no compassion for Linux users?
    It is possible to conduct this search using the Java Core API.
    Bamkin
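
As a rough sketch of what the reply above hints at, the Java core API can list the name, size, and modified date of each file in a folder. The class name FolderReport and its methods are made up for illustration, and writing a CSV file (which Excel can open) is an assumption about what the poster needs rather than something stated in the thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: builds a CSV report of the files directly inside a folder.
class FolderReport {

    // Returns a header line followed by one line per regular file (no subfolders).
    static List<String> toCsv(Path folder) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add("name,size_bytes,last_modified");
        List<Path> files;
        try (Stream<Path> entries = Files.list(folder)) {
            files = entries.filter(Files::isRegularFile)
                           .sorted()
                           .collect(Collectors.toList());
        }
        for (Path file : files) {
            String name = file.getFileName().toString();
            lines.add(quote(name) + "," + Files.size(file) + ","
                    + Files.getLastModifiedTime(file));
        }
        return lines;
    }

    // Quote a CSV value, doubling any embedded double quotes.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Writes the report to a CSV file that a spreadsheet program can open.
    static void writeReport(Path folder, Path csvFile) throws IOException {
        Files.write(csvFile, toCsv(folder), StandardCharsets.UTF_8);
    }
}
```

Calling FolderReport.writeReport with the folder to scan and a target such as report.csv would produce one row per file; subfolders are skipped in this sketch.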

  • Any way to change the location where itunes stores the backup files?

    I am running low on my C: drive in XP. Any way to change the location where itunes stores the backups of my various devices?

    i am currently in the same situation. i purchased an external hard drive and set itunes to save to it in the preferences, and this worked, but only for new additions. all of my current music still plays from my mac HD and not the external. i am assuming i will need to re-add every song. the only plus is that when i transferred my songs to the external, it saved my settings. hopefully it saves the play counts.

  • TS3988 I changed my Apple ID to my new email address successfully for the iTunes and App store but it won't work for iCloud and it won't recognize my password. I read that this can't be done. How am I supposed to get into iCloud?

    I changed my Apple ID to my new email address successfully for the iTunes and App store but it won't work for iCloud and it won't recognize my password. I read that this can't be done. How am I supposed to get into iCloud? I plan on getting rid of my old email address which is my old Apple ID so how is that going to work?

    Same question Wish someone had replied!
    I changed my Apple ID to my new email and now cannot find any way to access icloud. Unfortunately I had allowed icloud to hijack my airbook files, so of course I am afraid I will lose them tomorrow when I exchange my iphone for a new one and cannot keep an icloud account i cannot access. What a poor sync system! Really atypical for apple!

  • Upgraded 2 FFv.4. Used 2 type a utility's name in2 url field & it would usually take me 2 that utility's website w/out knowing the url. Now takes me 2 crap search page. Why is that feature gone & how do I get it back?

    Upgraded 2 FFv.4. Used 2 type a utility's name in2 url field & it would usually take me 2 that utility's website w/out knowing the url. Now takes me 2 crap search page. Why is that feature gone & how do I get it back?

    Typing into url field used to send me to google search, now it's a total crap generic search. wtf? What happened? I pretty much hate v.4 right now and wish I never upgraded :P

  • I only see the phone numbers of my contacts when using iMessage on my MAC, whereas I see the names on my iPhone. Why, and how can I make it so I see the names (and pics) on my MAC?

    I only see the phone numbers of my contacts when using iMessage on my MAC, whereas I see the names on my iPhone. Why, and how can I make it so I see the names (and pics) on my MAC?

    Thanks Eric, but that was already turned on. I just now tried turning it off and on again, no change. Actually, I'm more interested in seeing the names than the pictures. I now only see the phone numbers and generic avatars where pics would be.

  • The URL that was set for my homepage redirects to a network authorization page even though I cannot connect to that network.

    I am no longer within range of a network that I was attempting to connect to as "guest", and now the guest registration page for the network overrides the URL that was set for my home page in Firefox. (I never actually established a connection to the internet on that network because the guest username and password repeatedly failed.) The redirect executes whenever I type the URL, www.yahoo.com, into the address bar, or when I follow a link from a search engine to that URL, or when I click the home button. How can I stop the redirect? I deleted all network locations that contained that network and cookies that contained the network name, but that did not stop the page redirection. I do not believe this to be a virus or malicious network because it was at a research institute. I looked in prefs.js for any keywords from that network but I did not find anything obvious.
    Thank you for your help.

    Easily fixed: clear the cache.

  • Which Mac Pro? More cores=slower speeds? And most of us know the speed matters or FPU for music and I don't understand the faster is for the least amount of procs. And while I get the whole rendering thing and why it makes sense.

    Which Mac Pro? More cores=slower speeds? And most of us know the speed matters or FPU for music and I don't understand the faster is for the least amount of procs. And while I get the whole rendering thing and why it makes sense.
    The above is what the bar says. It's been a while and wondered, maybe Apple changed the format for forums. Then got this nice big blank canvas to air my concerns. Went to school for Computer Science, BSEE, even worked at Analog Devices in Newton Massachusetts, where they make something for apple. 
    The bottom line is fast CPU = more FPU = more headroom, and I still can't figure out why more cores = slower clock speeds, unless it's to get us into a 6 core and then come out with faster cores down the road, or a newer Mac that uses the GPU. Also, I'm the guy who said a few years ago that Mac has an FCP that looks like iMovie on steroids. Having said that, I called the campus one day to ask them something, and while I used to work for Apple, I think she thought I still did, as she asked me, "HOW ARE THE 32 CORES/1 DIE COMING ALONG?" Not wanting to embarrass her, I said fine, fine, and then hung up. It makes the most sense, as I never quite got the 2, 6, 12 cores when for years everything from memory to CPUs has come in powers of 2: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16,384, 32,768. Wow. W-O-W. I will be using whatever I get with an Apollo Quad.
    Peace to all and hope someone can point us in THE RIGHT DIRECTION.  THANK YOU

    Thanks for your reply via email/msg. He wrote:
    If you are interested in the actual design data for the Xeon processor, go to the Intel site and the actual CPU part numbers are:
    Xeon 4 core - E5-1620 v2
    Xeon 6 core - E5-1650 v2
    Xeon 8 core - E5-1680 v2
    Xeon 12 core - E5-2697 v2
    I read that the CPU is easy to swap out, but I am sure something goes wrong at a certain point. Even if it is soldered on, they make material to absorb the solder, keeping your work area VERY clean.
    My question now is this: get an 8 core, then replace it with 2 3.7 QUAD CHIPS; what would happen?
    I also noticed that the 8 core Mac Pro is 3.0 when in fact they do have a 3.4 8 core chip, so 2 = 16? Or, if correct, wouldn't you be able to replace a QUAD CHIP WITH THAT? I'M SURE THEY ARE UP TO SOMETHING, AS 1) WE HAVE SEEN NO AUDIO FPU, OR PERHAPS I SHOULD CHECK OUT PC MAKERS' WINDOWS machines for SiSoft Sandra "B-E-N-C-H-M-A-R-K-S".
    SOMETHING'S UP AND I AM SURE WE'LL ALL BE PLEASED, AS the Mac Pro was announced last year, barely made the December mark, then was pushed to January, then February and now April.
    I would rather wait and have it done correctly than released too early only to have it benchmarked in audio and found to be slower in a few areas. The logical part of my brain is wondering what else I would have to swap out, as I am sure it would run, and fine for a while, then, poof....
    PEACE. I AM SURE APPLE WILL BLOW US AWAY. They have to figure out how to increase the power for 150 watts or make the GPU work, which in regard to FPU, I thought was NVIDIA?

  • I just downloaded the latest version of itunes for Windows and now itunes crashes every time i launch it

    I just downloaded the latest version of iTunes for Windows and now iTunes crashes every time I launch it. I have tried re-downloading and selecting Repair with no luck. Any suggestions?
    thanks

    For general advice see Troubleshooting issues with iTunes for Windows updates.
    The steps in the second box are a guide to removing everything related to iTunes and then rebuilding it which is often a good starting point unless the symptoms indicate a more specific approach. Review the other boxes and the list of support documents further down page in case one of them applies.
    Your library should be unaffected by these steps but there is backup and recovery advice elsewhere in the user tip.
    If you've already tried a complete uninstall and reinstall try opening iTunes in safe mode (hold down CTRL+SHIFT as you start iTunes) then going to Edit > Preferences > Store and turning off Show iTunes in the Cloud purchases. You may find iTunes will now start normally.
    tt2

  • TS3694 Ipad won't turn on at all, and is recognized by itunes as being in recovery mode- when I try to restore it stalls on the portion stating "preparing device for restore" and on the page it says "itunes is restoring the software." Been like this for a

    Ipad 4 won't turn on at all, and is recognized by itunes as being in recovery mode- when I try to restore it stalls on the portion stating "preparing device for restore" and on the page it says "itunes is restoring the software." Been like this for a day. I read the restore notes on apple, but this seems to be a little more unique than the standard recovery issues. Any ideas as to why I seem to keep stalling during the restore process?

    Couple of things I can think of before going to the Apple store.
    First, if you can, power off the iPad. Then connect it to the charger that came with the iPad and plug that into a known good wall outlet. Leave it there at least an hour then try to reset your device. Press and hold the Home and Sleep buttons simultaneously until the Apple logo appears. Let go of the buttons and let the device restart.
    Also, you mentioned you have the latest iTunes. But it would be good to check the actual version. If the iPad is running iOS 7 you need iTunes 11.1 or later.

  • Been using an ipad 2 without passcode for quite some time. While upgrading to ios 7, I enabled Find Iphone on the Ipad. It asked for passcode and I gave one. Now I forgot it.

    Been using an iPad 2 without a passcode for quite some time. While upgrading to iOS 7, I enabled Find My iPhone on the iPad. It asked for a passcode and I gave one. Now I forgot it. I connected it to iTunes (on my Windows 8 system), with which I sync, and that too required an update, as the old version of iTunes cannot read iOS 7. The problem is that iTunes does not want to get updated. There is some vague error about not having the right to write a file in the ProgramData folder, whereas as an admin I have full rights. Next I downloaded the install file from the iTunes site; no luck, it would not install. Then I tried to remove iTunes; it does not want to go. Am at wits' end... Is the only option to restore it as a new iPad?

    Place the iPad into recovery mode and restore it with iTunes on your computer.
    http://support.apple.com/kb/HT1808
