Mmap or malloc for 20GB file?

Greetings,
I have to process a flat file of about 17TB, and in the process I need to perform source-to-target value lookups, with the values stored in a pre-processed hash file. The hash lookup table, for the 838 million or so records, will be approximately 20GB. Hardware is a SunFire (E25 or something) with about 8 CPUs allocated (SPARC), Solaris 10, 32GB RAM, and about 35TB of disk array. I will be compiling with the -xarch=v9b parameter for full 64-bit executables.
The 17TB input file will be a sequential read, so even though it is big, it should not be a problem. What I am trying to determine is how best to access the hash lookup table? Since good hash functions have the quality that they disperse the keys evenly throughout the table, lookups will require random access to the entire range of the table for the duration of the run.
What I am wondering is, would it be better to try and mmap the lookup table, or just malloc the 20GB at runtime and let the VM system handle it? Also, since the input file can be split for processing, it would be advantageous to be able to run the program multiple times concurrently, each with its own chunk of the input. Thus, the lookup table would have to be accessed by the processes simultaneously as well (luckily as read-only). Unfortunately, the lookup table cannot be split like the input data, since any key could exist in any of the chunks.
Any insight would be greatly appreciated.
Thanks,
Matthew

If the lookup table has to be shared among processes, then mmap with PROT_READ and
MAP_SHARED is the choice.
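
A minimal sketch of that mapping, for illustration only. The names table_map, table_open and table_close are hypothetical and not from the original post; the layout of the hash file is not specified in this thread, so the sketch only maps the bytes and leaves the lookup itself out. In a 64-bit build size_t is wide enough for a 20GB mapping.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a read-only mapping of the lookup file. */
typedef struct {
    const unsigned char *base; /* start of the mapping, NULL if not mapped */
    size_t size;               /* length of the mapping in bytes */
} table_map;

/* Map the whole file at `path` read-only and shared.
 * Returns 0 on success and -1 on failure. */
int table_open(const char *path, table_map *out)
{
    assert(path != NULL && out != NULL);
    out->base = NULL;
    out->size = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    /* PROT_READ + MAP_SHARED: every process that maps the file this way
     * shares the same physical pages instead of holding a private copy. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    out->base = (const unsigned char *)p;
    out->size = (size_t)st.st_size;
    return 0;
}

/* Unmap the table and reset the handle. */
void table_close(table_map *t)
{
    if (t->base != NULL)
        munmap((void *)t->base, t->size);
    t->base = NULL;
    t->size = 0;
}
```

Each concurrent run would call table_open on the same file and index into base; pages are faulted in on first touch, so the first pass over a cold table pays for the disk reads.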
As I am writing this, I am wondering how it would work if I created an ISM segment using
shmget/shmat and read the hash file into the address returned by shmat
from one process; the rest of the processes could then just attach to that specific SHMID.
This way even the VM data structures are shared among processes. But you need to try
this out yourself to see how it works.
-surya
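
The shmget/shmat idea could be sketched roughly as follows. This is an untested outline, not a measured recommendation: the function names are hypothetical, and it assumes the Solaris SHM_SHARE_MMU attach flag is what requests ISM (on systems that do not define it, the sketch falls back to an ordinary attach). ISM pages are locked in memory, so a 20GB segment on a 32GB machine leaves limited room for everything else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Assumption: Solaris requests ISM through SHM_SHARE_MMU at attach time.
 * Where the flag is not defined, attach as plain System V shared memory. */
#ifdef SHM_SHARE_MMU
#define TABLE_ATTACH_FLAGS SHM_SHARE_MMU
#else
#define TABLE_ATTACH_FLAGS 0
#endif

/* Loader side: create a segment of `size` bytes.
 * Returns the SHMID to hand to the worker processes, or -1 on failure. */
int table_shm_create(size_t size)
{
    assert(size > 0);
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid < 0)
        perror("shmget");
    return shmid;
}

/* Loader and workers: attach the segment. Returns NULL on failure.
 * The loader then reads the hash file into the returned address;
 * workers only read from it. */
void *table_shm_attach(int shmid)
{
    void *p = shmat(shmid, NULL, TABLE_ATTACH_FLAGS);
    if (p == (void *)-1) {
        perror("shmat");
        return NULL;
    }
    return p;
}

/* Detach this process from the segment. Returns 0 on success. */
int table_shm_detach(const void *addr)
{
    return shmdt(addr);
}

/* Mark the segment for removal once the run is over. Returns 0 on success. */
int table_shm_remove(int shmid)
{
    return shmctl(shmid, IPC_RMID, NULL);
}
```

The loader would print or otherwise pass the SHMID to the workers, for example on their command line. Unlike a file mapping, the segment has to be filled once before any worker starts and removed explicitly afterwards.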

Similar Messages

  • 11.2.0.3 grid installation fails while selecting OCFS2 for ocr files

    We are installing an 11gR2 (11.2.0.3) cluster on a 64-bit system. We have an OCFS2 filesystem (version 1.6.3) for the shared devices.
    While selecting the OCR file locations, we get the following error:
    [INS-41321] Invalid oracle cluster register [OCR] Location
    Cause- The installer detects that the storage type of location is not supported for Oracle Cluster registery
    Action - Provide a supported storage location for the Oracle Cluster Registry
    Additional information
    /crp2db01/OCR/ocr_1 is not shared
    However, this mountpoint is shared across both nodes.
    Note: the 11.2.0.1 grid installation was successful and it accepted the above locations for OCR. However, we need an 11.2.0.3 cluster for the 11.2.0.3 database.

    As for your current problem, just because Oracle "allows" OCFS2 in a GRID environment, I would never suggest nor implement that. It adds a layer of complexity that is totally unnecessary when a GRID/ASM implementation performs circles around OCFS2. ASM is much easier to manage, maintain, expand and shrink than OCFS2. Especially at version 11.2.0.3. When working at a large telco a few years ago, we had a 300TB+ ASM environment. OCFS2 could not even begin to be that big. ASM will provide you a MUCH more stable environment than OCFS2. And with ASM there is a lot of "magic" that happens with OCR/Voting that makes your life MUCH easier. If you "require" shared application files, then use ASM/ACFS. It is a much better "volume manager" than OCFS2.
    Since you must present devices to the system for OCFS2, you should not have any problems doing the same for ASM. (and don't use ASMLib as it is going away and is not necessary - just make sure you use a partition that skips the first 1M (usually cylinder 1) and you should be good to go!)
    I also would not use a "shared ORACLE_HOME" on either ACFS or OCFS2. The biggest reason is that you lose the ability to do a "rolling" upgrade, and when you have a VLC, that becomes much more important than saving a few GB worth of storage.
    I would also pay attention to this:
    http://docs.oracle.com/cd/E11882_01/install.112/e22489/storage.htm#CDEDAHGB
    3.1.4.2 General Storage Considerations for Oracle RAC
    Use the following guidelines when choosing the storage options to use for each file type:
    You can choose any combination of the supported storage options for each file type provided that you satisfy all requirements listed for the chosen storage options.
    If you plan to install an Oracle RAC home on a shared OCFS2 location, then you must upgrade OCFS2 to at least version 1.4.1, which supports shared writable mmaps.
    Oracle recommends that you choose Oracle ASM as the storage option for database and recovery files.
    For Standard Edition Oracle RAC installations, Oracle ASM is the only supported storage option for database or recovery files.

  • Is there a size limit for uploading files?

    Is there a size limit for uploading files?

    You have max 20GB storage.
    Upload anything you want up to that limit - be it one file or many.
    The only limit after that is the speed of your internet connection since extremely large files can take hours to upload.

  • Getting outOfMemory while using Xpath for 6MB file

    Hi ,
    Requirement:
    I have thousands of xml files of variable size (mostly around 5MB), Total size is around 20GB .The structure of xml content is as follows.
    /*filename: xaaaa*/
    <file>
    <page>
        <title>AmericanSamoa</title>
        <id>6</id>
        <revision>
          <id>133452270</id>
          <timestamp>2007-05-25T17:12:06Z</timestamp>
          <contributor>
            <username>Gurch</username>
            <id>241822</id>
          </contributor>
          <minor />
          <comment>Revert edit(s) by [[Special:Contributions/Ngaiklin|Ngaiklin]] to last version by [[Special:Contributions/Docu|Docu]]</comment>
          <text xml:space="preserve">#REDIRECT [[American Samoa]]{{R from CamelCase}}</text>
        </revision>
      </page>
    </file>
    My task is to retrieve the ID, the filename in which it exists, and the position of the node in the page, and I have to write these to a file.
    ex: 6:xaaaa:1
    My approach:
    I am using Xpath for this. The code is as follows.
    /* XPathReader.java */
    package preprocess;
    import java.io.IOException;
    import javax.xml.XMLConstants;
    import javax.xml.namespace.QName;
    import javax.xml.parsers.*;
    import javax.xml.xpath.*;
    import org.w3c.dom.Document;
    import org.xml.sax.SAXException;
    public class XPathReader {
        private String xmlFile;
        private Document xmlDocument;
        private XPath xPath;
        public XPathReader(String xmlFile) {
            this.xmlFile = xmlFile;
            initObjects();
        }
        // Parses the whole XML file into an in-memory DOM tree.
        private void initObjects() {
            try {
                xmlDocument = DocumentBuilderFactory.
                   newInstance().newDocumentBuilder().
                   parse(xmlFile);
                xPath = XPathFactory.newInstance().
                   newXPath();
            } catch (IOException ex) {
                ex.printStackTrace();
            } catch (SAXException ex) {
                ex.printStackTrace();
            } catch (ParserConfigurationException ex) {
                ex.printStackTrace();
            }
        }
        // Compiles and evaluates one XPath expression against the DOM tree.
        public Object read(String expression,
                   QName returnType) {
            try {
                XPathExpression xPathExpression =
                   xPath.compile(expression);
                return xPathExpression.evaluate
                   (xmlDocument, returnType);
            } catch (XPathExpressionException ex) {
                ex.printStackTrace();
                return null;
            }
        }
    }
    /* XPathReaderTest.java */
    /* It takes a directory name as argument; this directory contains the XML files. */
    package preprocess;
    import java.io.*;
    import javax.xml.xpath.XPathConstants;
    import org.w3c.dom.*;
    public class XPathReaderTest {
        public XPathReaderTest() {
        }
        public static void main(String[] args) throws IOException {
            if (args.length <= 0) {
                System.out.println(
                    "Usage: java PreProcess dir_name");
                return;
            }
            String dir = null;
            if (args.length >= 1) dir = args[0];
            int indexno = 0;
            File directory = new File(dir);
            File[] files = directory.listFiles();
            FileWriter fstream = new FileWriter("index" + indexno + ".txt");
            BufferedWriter out = new BufferedWriter(fstream);
            XPathReaderTest xt = new XPathReaderTest();
            /*for (int index = 0; index < files.length; index++)
                System.out.println(files[index].toString());*/
            for (int index = 0, i = 1; index < files.length; index++) {
                /*if(index/100>indexno){
                    indexno++;
                    out.close();
                    fstream = new FileWriter("index"+indexno+".txt");
                    out = new BufferedWriter(fstream);
                }*/
                xt.extract(files[index].toString(), index, i, out);
                System.gc();
            }
            out.close();
        }
        public void extract(String completepath, int index, int i, BufferedWriter out)
                throws IOException {
            System.out.println(index + " " + completepath);
            XPathReader reader = new XPathReader(completepath);
            String separator = File.separator;
            int pos = completepath.lastIndexOf(separator);
            String temp_fname = completepath.substring(0, pos);
            pos = temp_fname.lastIndexOf(separator);
            String f_name = completepath.substring(pos + 1);
            i = 1;
            while (true) {
                String expression = "/file/page[" + i + "]/id";
                String id_value = (String) reader.read(expression, XPathConstants.STRING);
                // Compare string content, not references, to detect the end of the pages.
                if (id_value == null || id_value.isEmpty())
                    break;
                out.write(id_value + ":" + f_name + ":" + i + "\n");
                i++;
            }
        }
    }
    Problem:
    This code works fine for XML files < 6MB, but it gives an OutOfMemoryError for files of 6MB and above.
    I have tried with the -Xms256m -Xmx512m options.
    Please suggest a workaround, or any modification to the code that will resolve my problem.
    I am new to the Java world, so the root cause of the problem would be very helpful for me.
    Thanks

  • When i double click itunes it doesn't open it just comes up with an error saying " The itunes library.itl file cannot be found or created. The default location for this file is in the 'itunes' folder in the 'music' folder". How can i fix this?

    When I double-click iTunes it doesn't open; it just comes up with an error saying "The iTunes Library.itl file cannot be found or created. The default location for this file is in the 'iTunes' folder in the 'Music' folder". How can I fix this problem?

    Can anyone advise how to solve this issue?

  • I have 3 older ext. hard drives that I've utilized many times. Today while searching for old files, one of the three is no longer recognized by my PowerMac.  Any suggestions?

    I have 3 older ext. hard drives that I've utilized many times. Today while searching for old files, one of the three is no longer recognized by my PowerMac. The drive is not listed in Disk Utility.  Any suggestions?

    Is the computer in your equipment line:
    Dual Core Intel Xeon
    (which is not a PowerMac but a Mac Pro) the one you are asking about, or do you have an older PowerMac?
    If a Mac Pro, their forums are here:
    Mac Pro
    and, as Mac Pros have a totally different architecture from the pre-2005 Macs this forum covers, you may not have the same issues that can affect the older models. If someone didn't notice your equipment line, you could get advice that doesn't apply.
    If you really have a pre-2005 PowerMac, read on.
    If the stubborn external is USB and does not have its own power brick (i.e., it gets power only from the computer's USB ports, "bus powered"), it may not be getting enough power. As electric motors age, they can demand more power than when new, and the power available on any USB port is limited.
    The typical workarounds for making a computer recognize an aging, bus-powered USB drive are:
    Get a powered USB hub (one that has its own power brick).
    Get a "Y" USB cable: 1 Meter USB 2.0 A to 5 Pin Mini B Cable - Auxiliary USB "Y" Power Design for external hard drives.
    The second gets power from two USB ports on the computer, and often that's enough.
    Remember that the USB ports on your keyboard seldom provide enough power even for a thumb drive, so be sure to use the USB ports on the back of the computer.

  • File Server Role: Slow access for "opened files" and slow Explorer browsing

    Since we migrated our fileserver from Windows Server 2008 R2 to Windows Server 2012 we are facing two major problems:
    1. Opening files which are already opened by other users takes about 1 minute before the file actually opens. This is not only for Office files such as Excel and Word, but also for other (non-Office) files. Again, this problem only arises when the file is already opened by another user. There seems to be a sort of "lock" check time of about 45 to 60 seconds.
    2. The other problem is browsing via Explorer through the network drive (all clients are Windows 7 clients). Half of the time there is some kind of "hiccup" when displaying the contents of a folder. I cannot figure out a pattern, but if there is no "hiccup" then browsing is very fast (even in the busiest times of the working day). If there is a "hiccup", it can take about 50 seconds to display the contents of a folder.
    I suspect the SMB implementation / settings of Windows Server 2012 which are causing the problems...
    Things I tried:
    1. Changed the Oplocks wait time to 10 seconds (which is the minimum). The result is that opening files is indeed somewhat faster (still taking about 45 seconds).
    2. Disabled SMB2: the result is that browsing is fast and opening files is faster. BUT: we then face other problems, such as some files not opening at all. After a lot of complaints from the users, this setting was changed back to SMB2 enabled.
    3. Within the NIC card properties I disabled "QoS packet Scheduler", "Link-Layer Topology Discovery Mapper I/O Driver", "Link-Layer Topology Discovery Responder" and IPv6 (as we only use IPv4).
    None of the above gave promising results.
    The server is a dedicated fileserver (a virtual machine on vSphere 5.1).
    Please advise, since this is not workable and we have postponed the migration of the fileserver for our other location.

    Hi Dave,
    I suggest you disable all third party applications like Anti-Virus application to test if it could reduce the waiting time when accessing a file.
    Here are some related threads below that could be useful to you:
    DFS Slowness when Opening Microsoft Documents and Excel Spreadsheets
    http://social.technet.microsoft.com/Forums/windowsserver/en-US/61ec9a99-0027-44cb-815c-0da9276c1c96/dfs-slowness-when-opening-microsoft-documents-and-excel-spreadsheets?forum=winservergen
    Opening files over network takes long time
    http://social.technet.microsoft.com/Forums/windows/en-US/c8ddb65f-8a17-4cee-afd4-dfc09e99d562/opening-files-over-network-takes-long-time?forum=w7itpronetworking
    opening folder or file takes over a minute on Windows 2008R2 File server
    http://social.technet.microsoft.com/Forums/windowsserver/en-US/b9aa98c4-3ef7-4e6d-810d-6099e72b33f6/opening-folder-or-file-takes-over-a-minute-on-windows-2008r2-file-server?forum=winserverfiles
    Best Regards,
    Amy Wang

  • Variable subsitution for target file names

    Hi All,
    I am using variable substitution for dynamic file names. I am using multi-mapping for multiple files in the target, so I am not able to use dynamic configuration for the file names. Now I want to replace all the spaces in the filename with underscores.
    For example:
    My payload field value is "file name in the target file".
    My filename should then be "file_name_in_the_target_file".
    How can I achieve this using variable substitution?
    Regards,
    Ramalakshmi.G

    Use replaceString Function.
    file name
    Constant (" ")               --> replaceString -------> TargetField
    Constant ("_")
    Regards
    Ramesh

  • Advance select for source file in Sender File Adapter

    Hi
    I am trying to use the parameter 'Advanced Selection for Source File' on a sender file adapter to pick up files from multiple folders.
    My problem is that this parameter, which is listed in the SAP help, does not appear under the File Access Parameters.
    I am running PI 7.02 (NW702_07_Rel)
    Service pack 07
    Has anyone come across this before?
    any suggestions on how to do this?

    Hi,
    In PI7.0 the property exists. I have used it previously. I think SP was 13.
    Regards,
    Nutan
    Edited by: nutan champia on Nov 24, 2011 10:42 AM

  • How can i stop an error message that comes up when i am using word? the error message is "word is unable to save the Autorecover file in the location specified. Make sure that you have specified a valid location for Autoreover files in Preferences,-

    How can I stop an error message that comes up when I am using Word? The error message is "Word is unable to save the AutoRecover file in the location specified. Make sure that you have specified a valid location for AutoRecover files in Preferences,…"

    It sounds like if you open Preferences in Word there will be a place where you can specify where to store autorecover files. Right now it sounds like it's pointing to somewhere that doesn't exist.

  • Using relative path for in file/ftp adapter

    Hi All,
    How to have a relative path for file/ ftp adapter's inbound/outbound operation?
    Example: Consider $ORA_HOME = /home/oracle --> This environment variable can be different on different machines
    i want to drop a file in to $ORA_HOME/folder1/folder2 (Or poll for a file).
    <partnerLinkBinding name="FTP">
    <property name="wsdlLocation">FTP.wsdl</property>
    <property name="out_dir" type="LogicalDirectory">What do i write here???</property>
    <property name="retryInterval">60</property>
    </partnerLinkBinding>
    If I can't configure this in the partner link section or in the activation agent section, how else do I achieve this?
    I am using version 10.1.3.*.
    Thanks in advance.
    Roshan.

    You can achieve it using the deployment scripts if the directory changes depending on the environment.
    If you want to change it at run time, then you can set the JCA properties from variables at runtime.
    Regards,
    Ajay

  • Relative Path for JSP Files

    I use Liferay Portal. It has a defined directory structure for all portlets and expects files to be in their proper locations. One of these is that it expects all .jsp files to be in the /webroot/html directory.
    The problem is that all references in the struts-config.xml to .jsp files are assumed to be relative to the /html directory. So if, for instance, I set the path as /test/test.jsp then Liferay looks for the file in /html/test/test.jsp. But NitroX requires the full path, including the /html.
    I've gotten this to work on another competing product - in that product I set the Web Root to be the /html directory and it works great. That product allows you to set the path of the struts-config.xml separately. Is there any way to do something similar with NitroX? I tried changing the webroot in the .m7project file manually but it doesn't like it. It seems to insist that the struts-config.xml file is in the webroot.
    Thanks.

    NitroX currently supports only a war directory structure. When a project is created, the Web Application Root directory is computed as the directory containing the WEB-INF folder (which should contain the web.xml file).
    All web app resources (JSP files, image, css files, etc) are expected to be under the Web Application Root directory (at any nested directory level).
    We will enhance this behavior in the future to support arbitrary directory structures.

  • Attachments dont download from Gmail in Safari, They disappear,I even tried re-downloading Safari and that download disappeared, it is not in my downloads folder, I actually searched in finder for the file and it is nowhere on my computer.  Please help!

    Attachments dont download from Gmail in Safari, They disappear,I even tried re-downloading Safari and that download disappeared, it is not in my downloads folder, I actually searched in finder for the file and it is nowhere on my computer.  Please help!

    Oh my gosh I had the EXACT same problem, and for ages I couldn't figure out how to fix it until today. Here's what I did:
    First I went onto my computer, opened itunes, and un-installed tumblr, vine and kik. These were the apps I was having problems with (it said I had them on my phone but, like you, they didn't show).
    Then I went to the itunes store, searched for each one, and it said I could update them so I did (just FYI: for tumblr a window popped up saying "please click ok to confirm you are 17 years or older", so I did that also)
    When I went back to my phone and tried installing them again (still on my computer), it worked!
    I hope this helps, because it was incredibly frustrating. Good luck!

  • How do you open Garageband for iPad files with Garageband?

    I was led to believe that Garageband could open up Garageband For iPad files, but not vice versa.
    However when a friend sent me his Garageband For iPad file, and I tried to open it, a "compatibility updater" started running then stopped halfway through and told me Garageband was "unable to complete the download" and directed me to go to Apple help.
    Here I am.
    Anybody got any idea how to open (and work on) this file?
    Thank you in advance.
    -Dave-id

    Apparently there were two updaters I had to run first, before letting the compatibility updater do its thing.
    And since I got my Garageband for Mac through the App Store, I could only do my updating through that.
    So if anyone else is having that same problem, update, update, update!

  • How to get a list of file paths for all files used in a project

    I have a project in Premiere Pro CC which has a large number of bins. A sequence in one of these bins uses files from other bins. I am trying to find the locations of all of the files used in the project.
    1) Obviously I can select each clip in the timeline and show it in Finder, but there are a lot of clips.
    2) The video usage associated with each file in the project would help. However:
               1. I haven't found a way to display only clips that have video usage if not all of the bins have been expanded.
               2. Video usage shows usage for all sequences, so one would have to manually check the pull-down for each file to see if it is used in the sequence in question.
    3) I tried exporting the project to Final Cut Pro XML. The path URL gives me the information that I need. For some reason, however, when I do the export only one clip's information is there, not the information for all of the other clips in the project.
    4) I tried an export for SpeedGrade and all of the file names are there. However, the paths are not.
    Basically I want to find all of the files in the project and relocate them to a specific folder for that project.  There's got to be a way to do this but I'm not seeing it ....

    Thanks for the suggestion concerning the file path.  And certainly it would have been nice to have done this before beginning.  However this is a project that has been around for quite a while, and the files have been moved into different bins.  And now the project sequence is being revised.
    So the problem is, worded slightly differently, how can I search all of the bins for the files that are used just by this sequence, ignoring the files which are used by other sequences?  Or, how can I get a list of the file paths of the files that are used in the sequence?
