HashMap usage in a multithreaded environment

Based on my understanding, java.util.HashMap is not thread-safe. My question is: does anyone know exactly how a HashMap will behave in the following situation if no external synchronization is done?
//hashMapInstance has oldobj associated with "SOMEKEY"
Multiple Reader threads executing hashMapInstance.get("SOMEKEY")
One Writer thread executing hashMapInstance.put("SOMEKEY", newobj)
I understand the behaviour will be unpredictable, but does that mean the reader threads have a chance of getting null values?
Or does it mean that some readers might get oldobj and some might get newobj?

Why worry about what it might do... just fix it.
hashMapInstance = Collections.synchronizedMap(hashMapInstance);
...there, now you're thread-safe.
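As for the original question: without synchronization there is no guaranteed answer - readers may see oldobj or newobj (including stale values indefinitely), and concurrent structural changes can corrupt the map outright. A minimal sketch of the two usual fixes; note that ConcurrentHashMap (Java 5+) is an alternative the reply above doesn't mention, and it generally suits a many-readers/one-writer workload better than a fully synchronized wrapper:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeMapSketch {
    public static void main(String[] args) {
        // Option 1: synchronize every call through a wrapper. Readers then
        // see either oldobj or newobj for "SOMEKEY", never a corrupted map.
        Map<String, Object> hashMapInstance =
                Collections.synchronizedMap(new HashMap<String, Object>());

        // Option 2: ConcurrentHashMap lets readers proceed without blocking
        // while a writer updates the same key.
        Map<String, Object> concurrentInstance = new ConcurrentHashMap<String, Object>();

        hashMapInstance.put("SOMEKEY", new Object());
        concurrentInstance.put("SOMEKEY", new Object());
        System.out.println(hashMapInstance.get("SOMEKEY"));
        System.out.println(concurrentInstance.get("SOMEKEY"));
    }
}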

Similar Messages

  • Hashmap usage

    Dear JAVA Community,
    Could you please help me? I'm an old C++ programmer and I am trying to understand Java. Thank you very much.
    This works fine:
    HashMap myMap = new HashMap();
    for (int i = 0; i < someGivenNumber; i++) {
         myClass r = new myClass(x * i, y * i);
         myMap.put("r" + i, r);
    }
    Now I would like to access a method of the instance r2, e.g.:
    r2.print();
    Unfortunately, I have not been able to discover how to do this in the documentation. (Maybe because I am mentally handicapped by older experiences :-)
    Thank you very much for your kind descriptions.
    Emil from Vienna

    Alternatively, although I prefer the approach above:
    import java.util.HashMap;

    public class TestMethod {
        public static void main(String[] args) {
            HashMap myMap = new HashMap();
            for (int i = 0; i < 10; i++) {
                myMap.put("r" + i, new MyClass(5 * i, 5 * i));
            }
            // get() returns Object, so cast back to MyClass before calling print()
            ((MyClass) myMap.get("r2")).print();
        }
    }

    class MyClass {
        int i = 0;
        int j = 0;

        public MyClass(int i, int j) {
            this.i = i;
            this.j = j;
        }

        public void print() {
            System.out.println("i is " + i + " j is " + j);
        }
    }
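    A small variant, assuming Java 5 or later: declaring the map with generics removes the need for the cast (MyClass is the class defined above):

    import java.util.HashMap;

    public class TestMethodGenerics {
        public static void main(String[] args) {
            // The type parameters tell the compiler what get() returns,
            // so no (MyClass) cast is needed.
            HashMap<String, MyClass> myMap = new HashMap<String, MyClass>();
            for (int i = 0; i < 10; i++) {
                myMap.put("r" + i, new MyClass(5 * i, 5 * i));
            }
            myMap.get("r2").print();
        }
    }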

  • Segmentation fault in __memp_fget

    Hi bdb experts,
    My program encountered a segfault; the details are below:
    # lsb_release -a
    LSB Version: :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
    Distributor ID: RedHatEnterpriseServer
    Description: Red Hat Enterprise Linux Server release 5.5 (Tikanga)
    Release: 5.5
    Codename: Tikanga
    filesystem: ext2
    gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
    the BDB config:
    ==================
    set_cachesize 2 0 3
    env flag: DB_CREATE | DB_INIT_MPOOL | DB_THREAD
    db flag: DB_CREATE | DB_THREAD
    My Berkeley DB database was configured with 3 partitions; as configured above, it runs in a multithreaded environment:
    dbp->set_partition(dbp, 3, NULL, db_partition_index)
    ==================
    The coredump back trace:
    Program terminated with signal 11, Segmentation fault.
    #0 0x00002ad2db2a78b1 in __memp_fget (dbmfp=0x1f9301e0, pgnoaddr=0x46582324, ip=0x0, txn=0x0, flags=0, addrp=0x46582328) at ../src/mp/mp_fget.c:260
    260 if (bhp->pgno != *pgnoaddr || bhp->mf_offset != mf_offset)
    (gdb) bt
    #0 0x00002ad2db2a78b1 in __memp_fget (dbmfp=0x1f9301e0, pgnoaddr=0x46582324, ip=0x0, txn=0x0, flags=0, addrp=0x46582328) at ../src/mp/mp_fget.c:260
    #1 0x00002ad2db14adb7 in __bam_search (dbc=0x2aaab89d1430, root_pgno=1, key=0x46582b20, flags=12802, slevel=1, recnop=0x0, exactp=0x465826b4) at ../src/btree/bt_search.c:806
    #2 0x00002ad2db1305c4 in __bamc_search (dbc=0x2aaab89d1430, root_pgno=1, key=0x46582b20, flags=14, exactp=0x465826b4) at ../src/btree/bt_cursor.c:2804
    #3 0x00002ad2db12e170 in __bamc_put (dbc=0x2aaab89d1430, key=0x46582b20, data=0x46582af0, flags=20, pgnop=0x46582784) at ../src/btree/bt_cursor.c:2143
    #4 0x00002ad2db22fd95 in __dbc_iput (dbc=0x2aaab49c68e0, key=0x46582b20, data=0x46582af0, flags=20) at ../src/db/db_cam.c:2134
    #5 0x00002ad2db22fbf7 in __dbc_put (dbc=0x2aaab49c68e0, key=0x46582b20, data=0x46582af0, flags=20) at ../src/db/db_cam.c:2047
    #6 0x00002ad2db2c6b91 in __partc_put (dbc=0x2aaab5389810, key=0x46582b20, data=0x46582af0, flags=20, pgnop=0x465828b4) at ../src/db/partition.c:1055
    #7 0x00002ad2db22fd95 in __dbc_iput (dbc=0x2aaab5389810, key=0x46582b20, data=0x46582af0, flags=20) at ../src/db/db_cam.c:2134
    #8 0x00002ad2db22fbf7 in __dbc_put (dbc=0x2aaab5389810, key=0x46582b20, data=0x46582af0, flags=20) at ../src/db/db_cam.c:2047
    #9 0x00002ad2db22aad1 in __db_put (dbp=0x1f92db90, ip=0x0, txn=0x0, key=0x46582b20, data=0x46582af0, flags=20) at ../src/db/db_am.c:537
    #10 0x00002ad2db24488c in __db_put_pp (dbp=0x1f92db90, txn=0x0, key=0x46582b20, data=0x46582af0, flags=20) at ../src/db/db_iface.c:1640
    #11 0x000000000041be46 in bdb::put (this=0x1f92c0a0, key=0x2aaab800103c "Layout:http://emap3.mapabc.com/mapabc/maptile?v=w2.61&x=54&y=26&z=6",
    value=0x2aaab8001240 "256|256▒\232-N", vsize=11, ts=1311611596) at backend/bdb.cc:268
    #12 0x00000000004151bf in cache_process_add (cmd_no=2, req_head=0x2aaab8001010, req_buf=0x46582e90, res_head=0x2aaacc0008c0, res_buf=0x46582e70) at gate_cache.cpp:1061
    #13 0x00000000004121e0 in ub_process_cmdmap (cmd_map=0x508a60, cmd_no=2, req_head=0x2aaab8001010, req_buf=0x46582e90, res_head=0x2aaacc0008c0, res_buf=0x46582e70)
    at ../../../../../../public/ub/output/include/ub_proccmd.h:27
    #14 0x0000000000414245 in cache_cmdproc_callback () at gate_cache.cpp:1302
    #15 0x0000000000469701 in apool_consume (pool=0x1f92e950, data=0x1f92e858) at apool_native.cpp:39
    #16 0x000000000044407f in apoolworkers (param=0x1f92e858) at apool.cpp:533
    #17 0x00000033b100673d in start_thread () from /lib64/libpthread.so.0
    #18 0x00000033b04d3d1d in clone () from /lib64/libc.so.6
    My questions are:
    1) In a multithreaded environment, do the BDB put and get methods need some flag other than DB_THREAD, or any other configuration for the env and db open?
    2) Can the ext2 filesystem affect disk reads and writes for BDB?
    3) Does the number of threads in my program affect BDB put and get?
    4) What is a suitable page size for BDB on a 64-bit machine?

    Hello,
    What is the Berkeley DB version?
    For your question on page size please take a look at the
    documentation on, "Selecting a page size" at:
    http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/general_am_conf.html#am_conf_pagesize
    and for those on multithreaded environments please take a
    look at the documentation on, "Multithreaded applications" at:
    http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/program_mt.html
    I am not aware of any impacts of the ext2 filesystem. Perhaps
    someone else might have more information on that.
    Some other suggestions are to:
    0. Build with --enable-diagnostic to enable run-time
    debugging checks.
    1. Turn on verbose error messaging, as that often provides
    additional run-time error information. Please see the
    "Run-time error information" documentation at:
    http://download.oracle.com/docs/cd/E17076_02/html/installation/debug_runtime.html
    2. Collect db_stat -MA statistics to verify that the cache details
    look to be in order. See:
    http://download.oracle.com/docs/cd/E17076_02/html/api_reference/C/db_stat.html
    Thanks,
    Sandra

  • Handling collections in coherence cache

    hi,
    In a multithreaded environment, how does the Coherence cache handle a collection such as a HashMap when the large majority of method calls are read-only rather than structural changes?
    Are the read calls non-synchronized by Coherence? Is there any mechanism that handles the write calls differently from the read calls?
    Thanks.
    suvasis

    Hi Suvasis,
    Coherence caches are coherent, and use a minimal (or zero if possible) amount of synchronization for read-access.
    Coherence does support double-checked locking for read-heavy access:
    <tt>
    Object value = cache.get(key);
    if (value == null)
        {
        cache.lock(key, -1);
        try
            {
            value = cache.get(key);
            if (value == null)
                {
                value = {something};
                cache.put(key, value);
                }
            }
        finally
            {
            cache.unlock(key);
            }
        }
    // read-access to value
    Object x = value.getSomeAttribute();
    </tt>
    It should be noted that Coherence does not "observe" objects outside of the Coherence API calls (get/put/lock/unlock/etc). So once you "get" a local object instance from Coherence, Coherence doesn't pay attention to that local object until you explicitly "put" the modified object back into the cache.
    Jon Purdy
    Tangosol, Inc.

  • Volatile Keyword

    I am confused by the keyword "volatile". A few questions.
    I have seen two different descriptions for this keyword.
    1) Processors store data in their own registers for more efficient use. In multiprocessor environments, the "volatile" keyword will ensure that a piece of shared data is always picked up from the (common) memory location and not reused from the (private copy) register for that processor.
    If this is the case, then don't we need to use this keyword for all class-level (static) variables? Or does the static keyword ensure this? (A static member is supposed to hold a common copy for all instances - how does it internally work in a multiprocessor environment?)
    Is it possible that an instance may be partly serviced by one processor and partly by another in a multiprocessor environment? In that case, don't we need to declare all instance variables as "volatile"?
    2) "Volatile" keyword ensures that the data is "sequencially consistent" - meaning
    if we have --
    volatile int a = 5;
    volatile boolean flag = true;
    Then if flag is set to true, then a will already have been set to 5 always (this means this may not always be the case in the Runtime env, if the keyword is not volatile -- so in multithreading env..there can be problems if global variables are not "volatile"
    Am I on the right track? The two descriptions of "volatile" seem very different. Are they both correct? I also read that many JVMS dont implement this as of now..so where do we stand in the use\relevance of this keyword?
    Thanks,
    Mathew Samuel.

    > If this is the case, then don't we need to use this keyword for all class-level (static) variables? Or does the static keyword ensure this? (A static member is supposed to hold a common copy for all instances - how does it internally work in a multiprocessor environment?)

    First, let's correct one of your statements. You say, "A static member is supposed to hold a common copy for all instances." It may seem subtle, but a static variable is associated with the class, not any instance. There don't need to be any instances of a class for a static to exist. The point is that there is not a 'copy' for each instance.
    In a multi-threaded environment, each thread can cache its own data. This is for performance. It takes time to keep the threads from interfering with each other.

    > Is it possible that an instance may be partly serviced by one processor and partly by another in a multiprocessor environment? In that case, don't we need to declare all instance variables as "volatile"?

    If you synchronize access to variables, you don't need to declare things as volatile. volatile isn't used all that often. Mainly, it's for certain situations where full synchronization is not required.

    > 2) The "volatile" keyword ensures that the data is "sequentially consistent" - meaning, if we have:
    > volatile int a = 5;
    > volatile boolean flag = true;

    I can't really confirm or deny this.
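    For what it's worth, since the Java 5 memory model (JSR-133), description 2 does hold: a write to a volatile field happens-before every subsequent read that sees it, so plain writes made before the volatile write become visible as well. A minimal sketch (class and field names are illustrative, not from the thread):

    public class VolatileFlagDemo {
        static int a = 0;                      // plain field, published below
        static volatile boolean flag = false;  // volatile publication flag

        public static void main(String[] args) {
            Thread reader = new Thread(new Runnable() {
                public void run() {
                    while (!flag) { }               // spin until the writer publishes
                    System.out.println("a = " + a); // guaranteed to print 5 on Java 5+
                }
            });
            reader.start();
            a = 5;        // ordinary write...
            flag = true;  // ...made visible to the reader by the volatile write
        }
    }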

  • Flash Player lags in Win 7 x64

    Hello,
    I'm having an issue with Flash Player on my new desktop computer because it chops up playback. I'm talking about getting 1 FPS in a Flash movie that a much older computer can easily play with 20 FPS.
    I use Windows 7 Ultimate x64, Flash Player 11.0.1.152 64-bit, DirectX 11, and my graphics driver (AMD) is up to date as well.
    The problem seems to be related to CPU usage and multithreading (or the lack thereof). While other computers in my household use multiple cores for Flash playback, this one uses only the first of its four cores, and that core gets maxed out to 100% usage while a moderately demanding Flash movie is being played. However, this computer uses all of its CPU cores for other tasks and applications, so the issue seems to be limited to Flash Player only.
    So far I've tried reinstalling Flash Player, downgrading Flash Player, using the 32-bit version instead of 64-bit, and using different browsers, but the problem persisted in each case. I've also tried toggling hardware acceleration in Flash Player's settings, but that seemed to have no effect at all on either CPU or GPU usage, not even after a restart.
    At this time I'm kind of running out of ideas and I'd be grateful for any help. Thanks in advance!

    Nevermind, I've figured out that my processor was faulty. The problem is solved now.

  • Connection Strategy

    I've got a problem with the speed of connecting to AD servers when the server is not available.
    Normally, when I do a search for a user using the search method on the DirContext class and the server is available, everything goes smoothly. What I wanted to do was catch a ConnectException in case the server I am binding to is not available, so I tried a simple bind:
         String server = "<an IP address, but of course it does not exist or is actually down on the network>";
         String strUrl = "ldap://" + server + "<some valid URL>";
         hUserInfo = new HashMap();
         members = new Vector();
         env = new Hashtable(5, 0.75F);
         env.put("java.naming.ldap.version", "3");
         env.put("java.naming.referral", "throw");
         env.put("java.naming.security.authentication", "simple");
         env.put("java.naming.factory.initial", "com.sun.jndi.ldap.LdapCtxFactory");
         env.put("java.naming.provider.url", strUrl);
         env.put("java.naming.security.principal", strPrincipalName);
         env.put("java.naming.security.credentials", strPrincipalCreds);
         try {
             ctx = new InitialDirContext(env);
         } catch (NamingException ne) {
             System.err.println("Error\t: Naming Exception occured");
             ne.printStackTrace();
         } catch (Exception e) {
             System.err.println("Error\t: General Exception occured");
             e.printStackTrace();
         }
    I expected it to refuse binding to that server, but what happened was it accepted the bind even though the address I was passing was not available. It struck me that the only way to check was to do a directory search on the server I had bound to. My question is: is there a better way to see if the server is available, other than doing a directory search on it? "Better" in terms of how fast it would return the ConnectException, or any exception denoting that the server is not available. Hope you guys can help.

    Now I know why the ConnectException was happening: it was trying to reach the address, but the connection timed out:
    Error     : Naming Exception occured
    javax.naming.CommunicationException: <some IP address>:389 [Root exception is java.net.ConnectException: Connection timed out: connect]
         at com.sun.jndi.ldap.Connection.<init>(Unknown Source)
         at com.sun.jndi.ldap.LdapClient.<init>(Unknown Source)
         at com.sun.jndi.ldap.LdapClient.getInstance(Unknown Source)
         at com.sun.jndi.ldap.LdapCtx.connect(Unknown Source)
         at com.sun.jndi.ldap.LdapCtx.<init>(Unknown Source)
         at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(Unknown Source)
         at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(Unknown Source)
         at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(Unknown Source)
         at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(Unknown Source)
         at javax.naming.spi.NamingManager.getInitialContext(Unknown Source)
         at javax.naming.InitialContext.getDefaultInitCtx(Unknown Source)
         at javax.naming.InitialContext.init(Unknown Source)
         at javax.naming.InitialContext.<init>(Unknown Source)
         at javax.naming.directory.InitialDirContext.<init>(Unknown Source)
         at com.pwdReset.unlock.ADUnlock.<init>(ADUnlock.java:98)
         at com.pwdReset.unlock.TestADUnlock.main(TestADUnlock.java:30)
    Caused by: java.net.ConnectException: Connection timed out: connect
         at java.net.PlainSocketImpl.socketConnect(Native Method)
         at java.net.PlainSocketImpl.doConnect(Unknown Source)
         at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
         at java.net.PlainSocketImpl.connect(Unknown Source)
         at java.net.Socket.connect(Unknown Source)
         at java.net.Socket.connect(Unknown Source)
         at java.net.Socket.<init>(Unknown Source)
         at java.net.Socket.<init>(Unknown Source)
         at com.sun.jndi.ldap.Connection.createSocket(Unknown Source)
         ... 16 more
    So I guess the right exception I am looking for is probably not ConnectException, or ConnectException is the right one but I'm trying to find it the wrong way by letting it time out...
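    One option, assuming the Sun LDAP provider shown in the stack trace: it supports a connect-timeout environment property, so a dead server fails fast instead of waiting out the OS-level TCP timeout. A minimal sketch (the address is hypothetical and the 5-second value is arbitrary):

    import java.util.Hashtable;
    import javax.naming.NamingException;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;

    public class FastFailBind {
        public static void main(String[] args) {
            Hashtable env = new Hashtable();
            env.put("java.naming.factory.initial", "com.sun.jndi.ldap.LdapCtxFactory");
            env.put("java.naming.provider.url", "ldap://10.0.0.1:389"); // hypothetical address
            env.put("com.sun.jndi.ldap.connect.timeout", "5000");       // milliseconds
            try {
                DirContext ctx = new InitialDirContext(env);
                ctx.close();
            } catch (NamingException ne) {
                // An unreachable server now surfaces here within ~5s
                ne.printStackTrace();
            }
        }
    }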

  • Using J2SSH With MDBs

    Hi all,
    I have an MDB which receives remote location of file. I need to download that file from remote linux server to local file server. I am planning to use J2SSH library. Following are my options/thoughts:
    1. Create the SSHClient object at MDB startup, so there is one SSHClient (SSH connection) per MDB. In every onMessage() call, I will open an sftpChannel, download the file, and quit the channel. On MDB shutdown I will disconnect the SSHClient. The question is: with increased load, the number of SSH connections will increase. Is that a good approach?
    2. Create a singleton class at startup which creates the SSHClient. In every onMessage() call, I will get the SSHClient from the singleton class and use it to create an sftpChannel. Now I have only one SSH connection, used across all MDBs. How will it scale with increased load? Is it practical to use a single SSHClient under huge load? I am a bit nervous about the singleton approach because if the connection is dropped by some network glitch, I will have to recover in the singleton class.
    3. Create an MBean which creates the SSHClient. In every onMessage() call, I will call an MBean method which downloads the file using the SSH client. The difference here is that since the MBean is single-threaded, I can close and reconnect after n downloads, or reconnect in case of a connection failure, and I will not have multithreading problems. I am a bit worried about the scalability of the MBean approach.
    Please do share your thoughts - how have you used this in a multithreaded environment?
    Regards
    Chetan

    Chetan_ADP wrote:
    > Yes, files come from the same servers. Expected rate is huge, approx 5000 files per hour.

    That isn't clear. Given server X, how many files per hour do you expect on average and at peak?
    And what is the file size?
    If you have only a couple of files from a single server and the sizes are small then opening/closing is probably best.
    If you have many files and/or large sizes then a continuous connection, with appropriate error logic, is probably best.
    You might want to verify the transfer rate as well to make sure your network can even do that.
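    If the continuous-connection route wins, a rough sketch of option 2 with reconnect-on-failure; SshClientOps here is a hypothetical stand-in, not the actual J2SSH API, so substitute the real connect/disconnect calls:

    public class SharedSshClientHolder {
        // Hypothetical interface standing in for the real J2SSH client calls.
        public interface SshClientOps {
            boolean isConnected();
            void connect() throws Exception;
            void disconnect();
        }

        private final SshClientOps client;

        public SharedSshClientHolder(SshClientOps client) {
            this.client = client;
        }

        // Each onMessage() calls acquire(); a dropped connection (network
        // glitch, server restart) is recovered here instead of inside the MDB.
        public synchronized SshClientOps acquire() throws Exception {
            if (!client.isConnected()) {
                client.connect();
            }
            return client;
        }

        public synchronized void shutdown() {
            client.disconnect();
        }
    }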

  • How to configure ENV and DB for multithreaded application?

    Hi,
    From the documentation, I know DB_THREAD must be set for both the ENV and the DB, but I don't know which is the best choice for a multithreaded application when deciding between DB_INIT_LOCK and DB_INIT_CDB. In my application there may be multiple readers and writers at the same time; should I use DB_INIT_LOCK instead of DB_INIT_CDB? What other flags should I use?
    DB_INIT_CDB provides multiple-reader/single-writer access, while DB_INIT_LOCK should be used when multiple processes or threads are going to be reading and writing a Berkeley DB database.
    Thanks for your suggestions and answers.

    Thanks for the explanation,
    The Berkeley DB Concurrent Data Store product
    allows for multiple reader/single writer access
    to a database. This means that at any point in time,
    there may be either multiple readers accessing a
    database or a single writer updating the database.
    Berkeley DB Concurrent Data Store is intended for
    applications that need support for concurrent updates
    to a database that is largely used for reading.
    If you are looking to support multiple readers and
    multiple writers then take a look at the Transactional
    Data Store product
    (http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/transapp.html)
    In this case the Environment is typically opened with:
    DB_INIT_MPOOL, DB_INIT_LOCK, DB_INIT_LOG, and DB_INIT_TXN.
    Let me know if I missed any of your questions.
    Thanks,
    Sandra
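    For illustration, a rough sketch of the Transactional Data Store flags Sandra lists, via the Berkeley DB Java binding (com.sleepycat.db); this assumes the standard EnvironmentConfig setters, with the corresponding C flags noted alongside:

    import java.io.File;
    import com.sleepycat.db.Environment;
    import com.sleepycat.db.EnvironmentConfig;

    public class TdsEnvSketch {
        public static void main(String[] args) throws Exception {
            EnvironmentConfig config = new EnvironmentConfig();
            config.setAllowCreate(true);        // DB_CREATE
            config.setInitializeCache(true);    // DB_INIT_MPOOL
            config.setInitializeLocking(true);  // DB_INIT_LOCK
            config.setInitializeLogging(true);  // DB_INIT_LOG
            config.setTransactional(true);      // DB_INIT_TXN
            // DB_THREAD: free-threaded handles (exposed via setThreaded in the
            // versions I have seen; verify against your binding's javadoc)
            Environment env = new Environment(new File("/path/to/env"), config);
            // ... open databases with DatabaseConfig.setTransactional(true) ...
            env.close();
        }
    }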

  • Hashmap,objectoutputstream,CPU usage

    Hi Java gurus,
    I am developing an application that constantly writes a HashMap to a file through an ObjectOutputStream via serialization. With this approach, whenever I need to add a new record, I have to read the whole file into a HashMap, add the record to the HashMap, and then write the HashMap back to the file.
    Doing this drives my CPU usage to 100%, which makes my application slow.
    Please tell me a way to stop my CPU usage from going to 100%.
    Please advise whether I should change my flow; if yes, please tell me what I should be doing.
    Thank you in advance.

    You know, you might find it worthwhile to define your own, specialised HashMap rather than using the general-purpose one in the class library. Get the number of Objects down. I don't know what the key and value items actually are in your application, but consider unpacking them into primitives. Ideally, avoid even using arrays.
    So, let's say you want to hash from a string of up to four characters to an integer. Define an entry object like:
    class MyHashEntry {
      static final int TABLE_SIZE = 100000;
      static MyHashEntry[] map = new MyHashEntry[TABLE_SIZE];

      MyHashEntry synonym;   // next entry in this bucket's collision chain
      byte char0;
      byte char1;
      byte char2;
      byte char3;
      short value;

      public MyHashEntry(String key, int value) {
         byte[] chars = key.getBytes();
         char0 = chars.length < 1 ? 0 : chars[0];
         char1 = chars.length < 2 ? 0 : chars[1];
         char2 = chars.length < 3 ? 0 : chars[2];
         char3 = chars.length < 4 ? 0 : chars[3];
         this.value = (short) value;
         int col = (hashValue() & 0x7fffffff) % TABLE_SIZE;
         synonym = map[col];
         map[col] = this;
      }

      public int hashValue() {
        return (char0 << 24) ^ (char1 << 16) ^ (char2 << 8) ^ char3;
      }

      public boolean equals(Object obj) {
        if (obj instanceof String) {
         // ... do obvious comparison
        }
        // ... usual stuff
        return false;
      }

      public static MyHashEntry find(String key) {
         byte[] chars = key.getBytes();
         int hash = 0;
         for (int i = 0; i < 4; i++) {
           hash <<= 8;
           if (i < chars.length)
             hash ^= chars[i];
         }
         hash = (hash & 0x7fffffff) % TABLE_SIZE;
         MyHashEntry candidate = map[hash];
         while (candidate != null) {
           if (candidate.equals(key))
             return candidate;
           candidate = candidate.synonym;
         }
         return null;
      }
    }
    (Untested)
    This should reduce the number of Objects by a factor of three or four.
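    A hypothetical usage of the sketch above; note that find() only returns a hit once the elided equals() comparison is actually implemented:

    public class MyHashEntryDemo {
        public static void main(String[] args) {
            new MyHashEntry("ab", 42);               // constructor links itself into the table
            MyHashEntry e = MyHashEntry.find("ab");  // null until equals() is filled in
            System.out.println(e == null ? "not found" : "value = " + e.value);
        }
    }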

  • HashMap High Memory Usage, is this expected?

    I'm using this test class to demonstrate memory usage in hashmaps. I've done some preliminary testing by adding Runtime calls to get memory before and after, but what I'm immediately noticing is that I'm running out of memory before I hit the 200K mark.
    Error specifically is : Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    Am I doing anything horribly wrong here that is over-allocating memory? It doesn't seem to me like it should be hitting 75MB so quickly. How can I estimate the usage per HashMap here? What is each String key worth? I saw that there's a chart of primitive memory sizes somewhere...
    thanks in advance.
    package test;
    import java.util.*;

    /* Burns through 75MB, and runs out.
     * Is this expected behavior?
     */
    public class ALMemoryTest {
         public static void main(String[] args) {
              ArrayList<HashMap<String,Object>> al = new ArrayList<HashMap<String,Object>>();
              HashMap<String,Object> hm;
              for (int i = 0; i < 200000; i++) {
                   hm = new LinkedHashMap<String,Object>();
                   hm.put("original", i);
                   hm.put("times2", i * 2);
                   hm.put("stringkey", i + "string");
                   al.add(hm);
              }
              System.out.println(al.size());
         }
    }

    So I made another test class. This shows I can get up to 185K-ish elements or so...
    package test;
    import java.util.*;

    /* Burns through 75MB, and runs out.
     * Is this expected behavior?
     */
    public class ALMemoryTest {
         ArrayList<HashMap<String,Object>> al = new ArrayList<HashMap<String,Object>>();

         public double bytesToMbs(long bytes) {
              return bytes / 1048576.0;
         }

         public LinkedHashMap<String,Object> newHashMap(int i) {
              LinkedHashMap<String,Object> hm = new LinkedHashMap<String,Object>();
              hm.put("original", i);
              hm.put("times2", i * 2);
              hm.put("stringkey", i + "string");
              return hm;
         }

         public void totalMem() {
              System.out.println("total bytes: " + Runtime.getRuntime().totalMemory());
         }

         public void freeMem() {
              System.out.println("free bytes: " + Runtime.getRuntime().freeMemory());
         }

         public void arrayListTest(int max) {
              for (int i = 0; i < max; i++) {
                   al.add(newHashMap(i));
              }
              System.out.println(al.size());
         }

         public static void main(String[] args) {
              ALMemoryTest test = new ALMemoryTest();
              test.arrayListTest(100000);
              test.arrayListTest(50000);
              test.arrayListTest(20000);
              test.arrayListTest(10000);
              test.arrayListTest(5000);
              test.freeMem();
         }
    }
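    One rough way to answer the "usage per HashMap" question empirically is to snapshot used memory around a batch of allocations. As background (results vary by JVM): each small LinkedHashMap carries its own 16-slot table array, an entry object per mapping, boxed Integers, and a freshly built String, so a few hundred bytes per map - and hence ~75MB for 200,000 maps - is plausible. A sketch:

    import java.util.*;

    public class PerMapFootprint {
        public static void main(String[] args) {
            int n = 100000;
            Runtime rt = Runtime.getRuntime();
            List<HashMap<String,Object>> al = new ArrayList<HashMap<String,Object>>(n);
            System.gc();
            long before = rt.totalMemory() - rt.freeMemory();
            for (int i = 0; i < n; i++) {
                HashMap<String,Object> hm = new LinkedHashMap<String,Object>();
                hm.put("original", i);
                hm.put("times2", i * 2);
                hm.put("stringkey", i + "string");
                al.add(hm);
            }
            System.gc();
            long after = rt.totalMemory() - rt.freeMemory();
            // Rough per-map cost: map object + table array + entries + boxed
            // values + strings, averaged across n maps.
            System.out.println("approx bytes per map: " + (after - before) / n);
            System.out.println(al.size()); // keep 'al' live so it isn't collected early
        }
    }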

  • HashMap memory usage

    Hi,
    I am implementing an indexer/compressor for plain text files (text, query-log and URL files). The basic skeleton of the indexer is the Huffman codec, plus various add-ons to boost performance.
    Huffman is used on words (Huffword); the first operation I execute is a complete scan of the file to collect term frequencies, which I will use to generate the Huffman model. Frequencies are stored in a HashMap<String, Integer>.
    The main problem is the HashMap's size; I quickly run out of memory.
    In a query log of 300MB I collect around 1,700,000 String-Integer pairs; is it possible that I need a 512MB heap?

    Answer to your question: yes, if you are consuming lots of memory, you need lots of heap.
    Answer to the question you didn't ask: with that many unique words, attempting to assign each word a Huffman code will make your file larger. Huffman codes are only useful when you have a relatively small vocabulary, where an even smaller number of terms predominate. This allows you to use a small number of bits for the frequently-occurring items, and a large number of bits for the rarely-occurring items.
    In your case you're going to have an extremely broad tree, with most of the terms being leaf nodes. If I'm remembering correctly, it will have log2(x) + N bits for a leaf node (where N accounts for the non-overlapping leading bits of the few predominant words), so 24+ bits per word. Plus, you have to store your entire dictionary in the file to be used for reconstruction.
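    As a rough sanity check on those numbers: 2^20 = 1,048,576 and 2^21 = 2,097,152, so with ~1,700,000 distinct words even a perfectly balanced code already needs ceil(log2(1,700,000)) = 21 bits per leaf; skewing the tree to favor a few frequent words makes the rare leaves longer still, which is consistent with the 24+ bits estimate above.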

  • Orion-web.xml and resource-env-ref-mapping correct usage

    What happened to the resource-env-ref-mapping element in orion-web.xml? I have a 9.0.4.1 server running several .EAR files containing web applications that use this element to configure JMS-related items, but when deploying the .EAR to newer versions (9.0.5.2) of the server, the element no longer appears to be valid.
    Here it is in the 9.0.4 documentation:
    http://strogoff.unex.es/oradoc/form_y_report_10g/web.904/b10322/apdx_a.htm
    Any help would be appreciated.

    I should have been more clear about the issue. The error only occurs when running inside the embedded OC4J container of JDeveloper 10.1.2 build 1913. The error does NOT occur in JDeveloper 9.0.3. The application also runs fine when deployed to a 10.1.2 Enterprise application server. The error only occurs in JDeveloper.
    The following error occurs when validating my orion-web.xml file. The entry is:
    orion-web.xml:
    <orion-web-app>
    <resource-ref-mapping name="jms/mQueueConnectionFactory" location="jms/matchingQueueConnectionFactory"/>
    <resource-env-ref-mapping name="jms/mQueue" location="jms/matchingQueue"/>
    </orion-web-app>
    web.xml
    <resource-env-ref>
    <resource-env-ref-name>jms/mQueue</resource-env-ref-name>
    <resource-env-ref-type>javax.jms.Queue</resource-env-ref-type>
    </resource-env-ref>
    java.lang.IllegalArgumentException: Unrecognized parent-elem combination: interface oracle.jdeveloper.xml.oc4j.war.OrionWebApp - resource-env-ref-mapping
         at oracle.javatools.xml.bind.XMLBinding.throwUnrecognizedElem(XMLBinding.java:127)
         at oracle.jdeveloper.xml.j2ee.war.WebAppBinding.elem2intImpl(WebAppBinding.java:637)
         at oracle.javatools.xml.bind.XMLBinding.elem2int(XMLBinding.java:104)
         at oracle.javatools.xml.bind.XMLBinding.insertBetween(XMLBinding.java:88)
         at oracle.javatools.xml.bind.BindingContext.insertNewElement(BindingContext.java:121)
         at oracle.javatools.xml.bind.BindingContext.insertElem(BindingContext.java:95)
         at oracle.javatools.xml.bind.BindingContext.setElement(BindingContext.java:71)
         at oracle.javatools.xml.bind.SetImpl.callSetterForUniqueElem(SetImpl.java:66)
         at oracle.javatools.xml.bind.SetImpl.callSetter(SetImpl.java:57)
         at oracle.javatools.xml.bind.SetImpl.invoke(SetImpl.java:26)
         at oracle.javatools.xml.bind.ElementProxy.invoke(ElementProxy.java:35)
         at $Proxy10.setWebApp(Unknown Source)
         at oracle.jdevimpl.runner.oc4j.Oc4jWorkspaceConfig.ensureLocalPageReposRootIsSet(Oc4jWorkspaceConfig.java:633)
         at oracle.jdevimpl.runner.oc4j.Oc4jWorkspaceConfig.transmogrifyConfigFiles(Oc4jWorkspaceConfig.java:269)
         at oracle.jdevimpl.runner.oc4j.Oc4jWorkspaceConfig.configureAll(Oc4jWorkspaceConfig.java:114)
         at oracle.jdevimpl.runner.oc4j.Oc4jStarter.preStart(Oc4jStarter.java:618)
         at oracle.jdevimpl.runner.oc4j.Oc4jStarter.start(Oc4jStarter.java:268)
         at oracle.ide.runner.RunProcess.startTarget(RunProcess.java:756)
         at oracle.jdeveloper.runner.JRunProcess.startTarget(JRunProcess.java:461)
         at oracle.ide.runner.RunProcess$2.run(RunProcess.java:699)
         at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:178)
         at java.awt.EventQueue.dispatchEvent(EventQueue.java:454)
         at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:201)
         at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:151)
         at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:145)
         at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:137)
         at java.awt.EventDispatchThread.run(EventDispatchThread.java:100)

  • Bridge CS4 and Multicore usage?

    Hi everyone-
    I built a new machine a few months ago and am now in the process of reshuffling the hard drives, as I underestimated the size of the archive. The current specs of the machine are:
    Core i7 920 (stock)
    12 GB RAM
    Main drive: Seagate 1.5TB 7200rpm 7200.11
    Archive drive (new one): 2TB WD Green
    Photoshop/Bridge CS4
    One area that is particularly slow is Bridge when importing files. Most of the files are from a 1Ds MKIII, so it takes a while to generate previews. The photos get added to the slideshow fairly quickly and the thumbnails are extracted in pretty fast time as well. However, it takes a while for the full res previews to generate. For example, in a directory mix of about 7.5gig worth of 1Ds MKIII files and 1D MKII files, the machine currently takes close to 8 minutes before the previews are generated. I looked at the processor usage once and it never goes over 20%. Thinking my hard drive may be slow since it wasn't a CPU bottleneck, I installed 2 OCZ Vertex drives in a RAID0 configuration, moved the photos and cache onto that drive, but ended up seeing the same performance when the previews were generated. (the files were added to the slideshow and had the thumbnails extracted much faster however.)
    My question is: does Bridge support multicore processors like the Core i7? It appears it doesn't, based on my results, but perhaps there is something wrong with my setup.
    Thanks for any help,
    Chris

    Schlotkins wrote:
    One area that is particularly slow is Bridge when importing files.
    My question is does Bridge support multiprocessors like the Core i7? It appears it doesn't based on my results but perhaps there is something wrong with my setup.
    Thanks for any help,
    Chris
    Hi Chris,
    How are you importing the files?
    Bridge and Camera Raw both support multithreaded operations. If you are using the 'Get Photos from Camera...' menu option, then you are hitting a non-threaded process: that menu item calls a new process, and it is not multithreaded. It's there for mounting peripherals that otherwise don't mount via Explorer, and as one way of ingesting images from a camera.
    If you are speaking of just browsing to files and getting cached data, what setting do you have under Preferences> Advanced? If you have Generate Monitor-Size Previews on, what size is your monitor resolution? And you are not talking about using the Loupe for 100% in the Preview panel, right?
    I haven't broken down the overall size to image count; approximately how many images are we talking?
    regards,
    steve

  • Backing up and restoring EP 7.0 (With Usage type DI installed) on HP-UX

    Hello,
    I know of and have seen several blogs/forums about Portal/Java WebAS 640 backup and restore, and most of them say to back up Oracle and the filesystems (/usr/sap/SID/* and /sapmnt/SID/*, and of course Oracle).
    We have never tested a restore on our Portal system, which has DI (Development Infrastructure) installed, hence I need to know: is it enough to back up the above filesystems (and also /home/sidadm and /etc), then just restore the Oracle backup and those filesystems, and the system will be up and running with all the DI development we did?
    We are about to apply our first round of stacks on the EP system (NW04s) with the DI usage type, hence we want to test our restore in case anything goes wrong and we have to do it.
    Our env:
    EP/DI on HP UX - currently going to stack 15
    Oracle on AIX
    Your response is greatly appreciated.
    Thanks
    WA

    We tried simply restoring the filesystem and database, and without any extra effort we were able to start up Java.
