Optimizing cursor read performance

I frequently iterate through an 8GB table with a cursor (using the C API on a btree table), and find that I/O read performance can be quite poor on my production system, with iostat showing 5MB/second read speeds.
However, when I db_dump / db_load the table, read performance goes up significantly (60MB/second on my iMac, which I develop on).
When timed, I'm able to iterate over my 8GB (1.5 million record) table in 140 seconds after the dump/load. Before the dump/load, it took about 20 minutes. File size is not much reduced after the dump/load.
My suspicion is that the production database, which has constant reads/writes to it, is fragmented, and thus the data layout is not optimal for a btree cursor scan of the entire table.
So... my question is...
What should I do to maintain the highest level of read performance on a large database? Will DB->compact provide me with the same performance level that a full dump/load will?
I could periodically take my web site down to do a dump/load, but with 12GB of data, that would be significant downtime. I'm also nervous about doing a dump/load in a logging environment (i.e., not losing any data that is in the log but not yet in the tables).
Another question, to which I suspect the answer is "no": is there a way to read records out of a table in written-to-disk order, so as to optimize disk read-head performance? I.e., avoiding asking the disk head to move all over the table as the cursor iterates.
-john

db_dump (the utility itself), when executed with standard options, calls db->dump(). So what one has to look for is what __db_dump() itself is doing:
        /*
         * Get a cursor and step through the database, printing out each
         * key/data pair.
         */
        if ((ret = __db_cursor(dbp, NULL, &dbcp, 0)) != 0)
                return (ret);

        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        if ((ret = __os_malloc(dbenv, 1024 * 1024, &data.data)) != 0)
                goto err;
        data.ulen = 1024 * 1024;
        data.flags = DB_DBT_USERMEM;
        is_recno = (dbp->type == DB_RECNO || dbp->type == DB_QUEUE);
        keyflag = is_recno ? keyflag : 1;
        if (is_recno) {
                keyret.data = &recno;
                keyret.size = sizeof(recno);
        }

retry:  while ((ret =
            __dbc_get(dbcp, &key, &data, DB_NEXT | DB_MULTIPLE_KEY)) == 0) {
                DB_MULTIPLE_INIT(pointer, &data);
                for (;;) {
                        if (is_recno)
                                DB_MULTIPLE_RECNO_NEXT(pointer, &data,
                                    recno, dataret.data, dataret.size);
                        else
                                DB_MULTIPLE_KEY_NEXT(pointer,
                                    &data, keyret.data,
                                    keyret.size, dataret.data, dataret.size);
        ...

As you can see, it reads through the database using the DB_MULTIPLE interface to retrieve many key/data pairs per cursor call. Now this in itself, while performance oriented, may not explain an order-of-magnitude difference in the particular pattern you're seeing. What may explain such an immediate difference is the simple fact that after having scanned through the database once within your own application, you've already primed both the database cache and the OS's own filesystem cache.
What you're experiencing does not point to any kind of fragmentation; rather, it shows the positive effect of caching.
Since it would be atypical (making an educated assumption) for your application to do a full database traversal in normal use, the 20-minute full traversal on a cold cache is not really representative of what your application will see anyway.
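If you do need periodic full scans from your own code, you can read in bulk the same way db_dump does. Below is a minimal sketch against the public C API; the dbp handle, the buffer size, and the error handling are all illustrative, and it assumes the DBC->get/DBC->close method names of 4.6-and-later releases (older releases spell them c_get/c_close):

#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <db.h>

#define BULK_LEN (1024 * 1024)          /* 1MB bulk buffer, as db_dump uses */

/*
 * Bulk-scan sketch: one cursor call fills a large user-supplied buffer
 * with many key/data pairs, which are then walked in memory with
 * DB_MULTIPLE_KEY_NEXT.
 */
int
bulk_scan(DB *dbp)
{
        DBC *dbcp;
        DBT key, data;
        void *p, *kptr, *dptr;
        size_t klen, dlen;
        int ret;

        if ((ret = dbp->cursor(dbp, NULL, &dbcp, 0)) != 0)
                return (ret);

        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        if ((data.data = malloc(BULK_LEN)) == NULL) {
                (void)dbcp->close(dbcp);
                return (ENOMEM);
        }
        data.ulen = BULK_LEN;
        data.flags = DB_DBT_USERMEM;

        while ((ret = dbcp->get(dbcp, &key, &data,
            DB_NEXT | DB_MULTIPLE_KEY)) == 0) {
                DB_MULTIPLE_INIT(p, &data);
                for (;;) {
                        DB_MULTIPLE_KEY_NEXT(p, &data, kptr, klen, dptr, dlen);
                        if (p == NULL)
                                break;
                        /* process one key/data pair: kptr/klen, dptr/dlen */
                }
        }

        free(data.data);
        (void)dbcp->close(dbcp);
        return (ret == DB_NOTFOUND ? 0 : ret);
}

Bulk gets reduce the per-record cursor overhead, but they don't change the underlying disk access pattern; the caching effects described above still dominate.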
What you should do to maintain the highest level of read performance is quite simple:
1. Locality of reference (http://en.wikipedia.org/wiki/Locality_of_reference).
In the context of BDB this simply means choosing an appropriate key and (if necessary) a btree comparison function that places related data near, or on the same page as, other related data. Almost every application has some locality of reference. Even if there is an initial efficiency hit to locate and load a page, there will most likely be an opportunity to take advantage of LoR from that point on. Ask yourself: is your application making the most of LoR given the data and usage patterns available to it? A sketch of a custom comparison function follows below.
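For illustration only, here is a sketch of a custom comparison function. It assumes the 4.x-era three-argument DB->set_bt_compare callback (later releases add a fourth parameter) and a made-up key layout of a 4-byte group id followed by a 4-byte sequence number, both in host byte order (where the default lexical comparison would not cluster related keys on a little-endian machine):

#include <string.h>
#include <db.h>

/*
 * Compare by group id first, then by sequence number, so that all
 * records belonging to one group end up adjacent on the same btree
 * pages. The key layout here is hypothetical.
 */
static int
group_seq_compare(DB *dbp, const DBT *a, const DBT *b)
{
        u_int32_t ga, gb, sa, sb;

        memcpy(&ga, a->data, sizeof(ga));
        memcpy(&gb, b->data, sizeof(gb));
        if (ga != gb)
                return (ga < gb ? -1 : 1);
        memcpy(&sa, (u_int8_t *)a->data + sizeof(ga), sizeof(sa));
        memcpy(&sb, (u_int8_t *)b->data + sizeof(gb), sizeof(sb));
        return (sa < sb ? -1 : (sa > sb ? 1 : 0));
}

/* Registered before DB->open(), e.g.: dbp->set_bt_compare(dbp, group_seq_compare); */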
2. I/O costs. It may be possible to simply store less data by reducing redundancy within the data itself. Is there a common trend of repeated information from row to row? Here's a simplistic example:
key:data (year|make|occupation|location)
Sam:1982 Ford|Office Manager|Los Angeles
John:1992 Honda|Mechanic|San Francisco
Billie:1993 Honda|Pilot|Los Angeles
Susie:1989 Dodge|Office Manager|Detroit
Jimbo:2000 Ford|Policeman|San Francisco
Dale:1962 Dodge|Mechanic|New York
Almost every "column" there can be collapsed into an atomic integer type (uint8_t, uint16_t, uint32_t, etc.), which significantly cuts the cost of storing the row data itself. For instance, applied to the above data, even if we used a uint32_t for every column (when some could use a uint16_t or even a uint8_t), you're looking at 12-16 bytes per row, and the savings grow the more repetition is present among rows, since duplicate "columns" will share integer ids between rows. It's not free, per se: it transfers cost to the CPU when the integer id has to be expanded back into a byte string (when the data is actually used), and it shifts some I/O to another "resolution" database, albeit I/O you would mostly be doing anyway. However, since you are storing less data overall, you can take more advantage of the cache available to you and amortize the cost of decomposing/composing integer ids to byte strings. The actual CPU cost of calling a few additional functions is ridiculously small compared to waiting on I/O. If you're comfortable with the additional complexity and your data has a high level of redundancy, it's worth exploring; a sketch of the idea follows below.
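As a rough illustration of the idea (every name below is hypothetical), a small helper can intern each repeated string into a separate "resolution" database and hand back the compact integer id that the main row then stores instead of the string:

#include <string.h>
#include <db.h>

/*
 * Dictionary-encoding sketch: "dict" maps a string (e.g. an occupation
 * or a city name) to a u_int32_t id, allocating a fresh id the first
 * time a string is seen. *next_idp is the caller's id counter.
 */
int
intern_string(DB *dict, DB_TXN *txn, const char *s,
    u_int32_t *idp, u_int32_t *next_idp)
{
        DBT key, data;
        int ret;

        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        key.data = (void *)s;
        key.size = (u_int32_t)strlen(s);
        data.data = idp;
        data.ulen = sizeof(*idp);
        data.flags = DB_DBT_USERMEM;

        /* Already interned? The get fills *idp for us. */
        if ((ret = dict->get(dict, txn, &key, &data, 0)) == 0)
                return (0);
        if (ret != DB_NOTFOUND)
                return (ret);

        /* First sighting: assign and store a fresh id. */
        *idp = (*next_idp)++;
        data.size = sizeof(*idp);
        return (dict->put(dict, txn, &key, &data, 0));
}

Decoding goes the other way: a second, reverse-mapping database (id back to string) is consulted when the row is actually used, which is exactly the CPU and extra "resolution" I/O cost mentioned above.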
Another thing to explore is the possibility of database normalization (http://en.wikipedia.org/wiki/Database_normalization) for your data.
The common theme here, in both normalization and the previous method, is to eliminate any significant source of redundancy. If you have no redundancy and genuinely have 12GB of unrelated, row-to-row-unique data, then the only thing you can do is plan the database so that you exploit LoR, and if that offers no advantage, the last resort is to throw more cache and disk heads at it (a sketch of a cache-size change follows below).
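If it does come to that, growing the cache is a one-line configuration change made before the environment is opened. A sketch with purely illustrative numbers (2GB in a single region) and a flag set that assumes a transactional environment:

#include <db.h>

int
open_env_with_cache(DB_ENV **envp, const char *home)
{
        DB_ENV *env;
        int ret;

        if ((ret = db_env_create(&env, 0)) != 0)
                return (ret);
        /* set_cachesize(gbytes, bytes, ncache) must precede DB_ENV->open(). */
        if ((ret = env->set_cachesize(env, 2, 0, 1)) != 0 ||
            (ret = env->open(env, home, DB_CREATE | DB_INIT_MPOOL |
            DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_TXN, 0)) != 0) {
                (void)env->close(env, 0);
                return (ret);
        }
        *envp = env;
        return (0);
}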
All that aside, there is a stronger question evident in your first statement:
<i>I frequently iterate through an 8gb table with a cursor</i>
and that question is <b>why</b>? This should <b>not</b> be necessary for a typical application to do.

Similar Messages

  • How to improve the event log read performance under intensive event writing

    We are collecting etw events from customer machines. In our perf test, the event read rate can reach 5000/sec when there is no heavy event writing. However, the customer machine has very intensive event writing and our read rate dropped a lot (to 300/sec).
    I understand this is I/O bound, since event writes and reads race for the log file; this is also confirmed by the fact that whenever there is a burst of event writes, a dip in event reads happens at the same time. Therefore, the event reads cannot catch up with the event writes and the customer's logs lag behind.
    Note that most of the events are security events generated by windows (instead of customers).
    Is there a way to improve the event read performance under intensive event write? I know it is a hard question given the theory blocker just mentioned. But we will lose customers if there is no solution. Appreciate any clue very much!

    Hi Leonjl,
    Thank you for posting on MSDN forum.
    I am trying to invite someone who is familiar with this to come into this thread.
    Regards,

  • BDB read performance problem: lock contention between GC and VM threads

    Problem: BDB read performance is really bad when the size of the BDB crosses 20GB. Once the database crosses 20GB or near there, it takes more than one hour to read/delete/add 200K keys.
    After a point, of these 200K keys about 15-30K are new; this number should eventually come down, and after a point there should be no new keys at all.
    Application:
    Transactional Data Store application. Single threaded process, that's trying to read one key's data, delete the data and add new data. The keys are really small (20 bytes) and the data is large (grows from 1KB to 100KB)
    On one machine, I have a total of 3 processes running, with each process accessing its own BDB on a separate RAID1+0 drive. So, as far as I can tell, there should really be no disk I/O wait that's slowing down the reads.
    After a point (past 20GB), There are about 4-5 million keys in my BDB and the data associated with each key could be anywhere between 1KB to 100KB. Eventually every key will have 100KB data associated with it.
    Hardware:
    16 core Intel Xeon, 96GB of RAM, 8 drive, running 2.6.18-194.26.1.0.1.el5 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
    BDB config: BTREE
    bdb version: 4.8.30
    bdb cache size: 4GB
    bdb page size: experimented with 8KB, 64KB.
    3 processes, each process accesses its own BDB on a separate RAIDed(1+0) drive.
    envConfig.setAllowCreate(true);
    envConfig.setTxnNoSync(ourConfig.asynchronous);
    envConfig.setThreaded(true);
    envConfig.setInitializeLocking(true);
    envConfig.setLockDetectMode(LockDetectMode.DEFAULT);
    When writing to BDB: (Asynchrounous transactions)
    TransactionConfig tc = new TransactionConfig();
    tc.setNoSync(true);
    When reading from BDB (Allow reading from Uncommitted pages):
    CursorConfig cc = new CursorConfig();
    cc.setReadUncommitted(true);
    BDB stats: BDB size 49GB
    $ db_stat -m
    3GB 928MB Total cache size
    1 Number of caches
    1 Maximum number of caches
    3GB 928MB Pool individual cache size
    0 Maximum memory-mapped file size
    0 Maximum open file descriptors
    0 Maximum sequential buffer writes
    0 Sleep after writing maximum sequential buffers
    0 Requested pages mapped into the process' address space
    2127M Requested pages found in the cache (97%)
    57M Requested pages not found in the cache (57565917)
    6371509 Pages created in the cache
    57M Pages read into the cache (57565917)
    75M Pages written from the cache to the backing file (75763673)
    60M Clean pages forced from the cache (60775446)
    2661382 Dirty pages forced from the cache
    0 Dirty pages written by trickle-sync thread
    500593 Current total page count
    500593 Current clean page count
    0 Current dirty page count
    524287 Number of hash buckets used for page location
    4096 Assumed page size used
    2248M Total number of times hash chains searched for a page (2248788999)
    9 The longest hash chain searched for a page
    2669M Total number of hash chain entries checked for page (2669310818)
    0 The number of hash bucket locks that required waiting (0%)
    0 The maximum number of times any hash bucket lock was waited for (0%)
    0 The number of region locks that required waiting (0%)
    0 The number of buffers frozen
    0 The number of buffers thawed
    0 The number of frozen buffers freed
    63M The number of page allocations (63937431)
    181M The number of hash buckets examined during allocations (181211477)
    16 The maximum number of hash buckets examined for an allocation
    63M The number of pages examined during allocations (63436828)
    1 The max number of pages examined for an allocation
    0 Threads waited on page I/O
    0 The number of times a sync is interrupted
    Pool File: lastPoints
    8192 Page size
    0 Requested pages mapped into the process' address space
    2127M Requested pages found in the cache (97%)
    57M Requested pages not found in the cache (57565917)
    6371509 Pages created in the cache
    57M Pages read into the cache (57565917)
    75M Pages written from the cache to the backing file (75763673)
    $ db_stat -l
    0x40988 Log magic number
    16 Log version number
    31KB 256B Log record cache size
    0 Log file mode
    10Mb Current log file size
    856M Records entered into the log (856697337)
    941GB 371MB 67KB 112B Log bytes written
    2GB 262MB 998KB 478B Log bytes written since last checkpoint
    31M Total log file I/O writes (31624157)
    31M Total log file I/O writes due to overflow (31527047)
    97136 Total log file flushes
    686 Total log file I/O reads
    96414 Current log file number
    4482953 Current log file offset
    96414 On-disk log file number
    4482862 On-disk log file offset
    1 Maximum commits in a log flush
    1 Minimum commits in a log flush
    160KB Log region size
    195 The number of region locks that required waiting (0%)
    $ db_stat -c
    7 Last allocated locker ID
    0x7fffffff Current maximum unused locker ID
    9 Number of lock modes
    2000 Maximum number of locks possible
    2000 Maximum number of lockers possible
    2000 Maximum number of lock objects possible
    160 Number of lock object partitions
    0 Number of current locks
    1218 Maximum number of locks at any one time
    5 Maximum number of locks in any one bucket
    0 Maximum number of locks stolen by for an empty partition
    0 Maximum number of locks stolen for any one partition
    0 Number of current lockers
    8 Maximum number of lockers at any one time
    0 Number of current lock objects
    1218 Maximum number of lock objects at any one time
    5 Maximum number of lock objects in any one bucket
    0 Maximum number of objects stolen by for an empty partition
    0 Maximum number of objects stolen for any one partition
    400M Total number of locks requested (400062331)
    400M Total number of locks released (400062331)
    0 Total number of locks upgraded
    1 Total number of locks downgraded
    0 Lock requests not available due to conflicts, for which we waited
    0 Lock requests not available due to conflicts, for which we did not wait
    0 Number of deadlocks
    0 Lock timeout value
    0 Number of locks that have timed out
    0 Transaction timeout value
    0 Number of transactions that have timed out
    1MB 544KB The size of the lock region
    0 The number of partition locks that required waiting (0%)
    0 The maximum number of times any partition lock was waited for (0%)
    0 The number of object queue operations that required waiting (0%)
    0 The number of locker allocations that required waiting (0%)
    0 The number of region locks that required waiting (0%)
    5 Maximum hash bucket length
    $ db_stat -CA
    Default locking region information:
    7 Last allocated locker ID
    0x7fffffff Current maximum unused locker ID
    9 Number of lock modes
    2000 Maximum number of locks possible
    2000 Maximum number of lockers possible
    2000 Maximum number of lock objects possible
    160 Number of lock object partitions
    0 Number of current locks
    1218 Maximum number of locks at any one time
    5 Maximum number of locks in any one bucket
    0 Maximum number of locks stolen by for an empty partition
    0 Maximum number of locks stolen for any one partition
    0 Number of current lockers
    8 Maximum number of lockers at any one time
    0 Number of current lock objects
    1218 Maximum number of lock objects at any one time
    5 Maximum number of lock objects in any one bucket
    0 Maximum number of objects stolen by for an empty partition
    0 Maximum number of objects stolen for any one partition
    400M Total number of locks requested (400062331)
    400M Total number of locks released (400062331)
    0 Total number of locks upgraded
    1 Total number of locks downgraded
    0 Lock requests not available due to conflicts, for which we waited
    0 Lock requests not available due to conflicts, for which we did not wait
    0 Number of deadlocks
    0 Lock timeout value
    0 Number of locks that have timed out
    0 Transaction timeout value
    0 Number of transactions that have timed out
    1MB 544KB The size of the lock region
    0 The number of partition locks that required waiting (0%)
    0 The maximum number of times any partition lock was waited for (0%)
    0 The number of object queue operations that required waiting (0%)
    0 The number of locker allocations that required waiting (0%)
    0 The number of region locks that required waiting (0%)
    5 Maximum hash bucket length
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Lock REGINFO information:
    Lock Region type
    5 Region ID
    __db.005 Region name
    0x2accda678000 Region address
    0x2accda678138 Region primary address
    0 Region maximum allocation
    0 Region allocated
    Region allocations: 6006 allocations, 0 failures, 0 frees, 1 longest
    Allocations by power-of-two sizes:
    1KB 6002
    2KB 0
    4KB 0
    8KB 0
    16KB 1
    32KB 0
    64KB 2
    128KB 0
    256KB 1
    512KB 0
    1024KB 0
    REGION_JOIN_OK Region flags
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Lock region parameters:
    524317 Lock region region mutex [0/9 0% 5091/47054587432128]
    2053 locker table size
    2053 object table size
    944 obj_off
    226120 locker_off
    0 need_dd
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Lock conflict matrix:
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Locks grouped by lockers:
    Locker Mode Count Status ----------------- Object ---------------
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Locks grouped by object:
    Locker Mode Count Status ----------------- Object ---------------
    Diagnosis:
    I'm seeing way too much lock contention on the Java Garbage Collector threads and also the VM thread when I strace my java process, and I don't understand the behavior.
    We are spending more than 95% of the time trying to acquire locks and I don't know what these locks are. Any info here would help.
    Earlier I thought overflow pages were the problem, as the 100KB data size was exceeding all overflow page limits, so I implemented a duplicate-keys scheme, chunking my data to fit within the overflow page limits.
    Now I don't see any overflow pages in my system, but I still see bad BDB read performance.
    $ strace -c -f -p 5642 --->(607 times the lock timed out, errors)
    Process 5642 attached with 45 threads - interrupt to quit
    % time     seconds  usecs/call     calls    errors syscall
    98.19    7.670403        2257      3398       607 futex
     0.84    0.065886           8      8423           pread
     0.69    0.053980        4498        12           fdatasync
     0.22    0.017094           5      3778           pwrite
     0.05    0.004107           5       808           sched_yield
     0.00    0.000120          10        12           read
     0.00    0.000110           9        12           open
     0.00    0.000089           7        12           close
     0.00    0.000025           0      1431           clock_gettime
     0.00    0.000000           0        46           write
     0.00    0.000000           0         1         1 stat
     0.00    0.000000           0        12           lseek
     0.00    0.000000           0        26           mmap
     0.00    0.000000           0        88           mprotect
     0.00    0.000000           0        24           fcntl
    100.00    7.811814                 18083       608 total
    The above stats show that there is too much time spent locking (futex calls) and I don't understand that because
    the application is really single-threaded. I have turned on asynchronous transactions so the writes might be
    flushed asynchronously in the background but spending that much time locking and timing out seems wrong.
    So, there is possibly something I'm not setting or something weird with the way JVM is behaving on my box.
    I grepped for futex calls in one of my strace log snippets and I see that a VM thread grabbed the mutex the maximum number of times (223), followed by the Garbage Collector threads. The following are the lock counts and thread pids within the process:
    These are the 10 GC threads (each thread has grabbed the lock about 85 times on average):
      86 [8538]
      85 [8539]
      91 [8540]
      91 [8541]
      92 [8542]
      87 [8543]
      90 [8544]
      96 [8545]
      87 [8546]
      97 [8547]
      96 [8548]
      91 [8549]
      91 [8550]
      80 [8552]
    VM Periodic Task Thread" prio=10 tid=0x00002aaaf4065000 nid=0x2180 waiting on condition (Main problem??)
     223 [8576] ==> grabbing a lock 223 times -- not sure why this is happening…
    "pool-2-thread-1" prio=10 tid=0x00002aaaf44b7000 nid=0x21c8 runnable [0x0000000042aa8000] -- main worker thread
       34 [8648] (main thread grabs futex only 34 times when compared to all the other threads)
    The load average seems OK, though my system thinks it has very little memory left, and I think that is because it's using a lot of memory for the file system cache?
    top - 23:52:00 up 6 days, 8:41, 1 user, load average: 3.28, 3.40, 3.44
    Tasks: 229 total, 1 running, 228 sleeping, 0 stopped, 0 zombie
    Cpu(s): 3.2%us, 0.9%sy, 0.0%ni, 87.5%id, 8.3%wa, 0.0%hi, 0.1%si, 0.0%st
    Mem: 98999820k total, 98745988k used, 253832k free, 530372k buffers
    Swap: 18481144k total, 1304k used, 18479840k free, 89854800k cached
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    8424 rchitta 16 0 7053m 6.2g 4.4g S 18.3 6.5 401:01.88 java
    8422 rchitta 15 0 7011m 6.1g 4.4g S 14.6 6.5 528:06.92 java
    8423 rchitta 15 0 6989m 6.1g 4.4g S 5.7 6.5 615:28.21 java
    $ java -version
    java version "1.6.0_21"
    Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
    Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
    Maybe I should make my application a Concurrent Data Store app as there is really only one thread doing the writes and reads. But I would like
    to understand why my process is spending so much time in locking.
    Can I try any other options? How do I prevent such heavy locking from happening? Has anyone seen this kind of behavior? Maybe this is
    all normal. I'm pretty new to using BDB.
    If there is a way to disable locking that would also work as there is only one thread that's really doing all the job.
    Should I disable the file system cache? One thing is that my application does not utilize the cache very well: once I visit a key, I don't visit that key again for a very long time, so it's very possible that the key has to be read again from disk.
    It is possible that I'm thinking about this completely wrong, focusing too much on locking behavior, and the problem is elsewhere.
    Any thoughts/suggestions etc are welcome. Your help on this is much appreciated.
    Thanks,
    Rama

    Hi,
    Looks like you're using BDB, not BDB JE, and this is the BDB JE forum. Could you please repost here?:
    Berkeley DB
    Thanks,
    mark

  • Read performance of a generically typed hashed table

    Hi,
        I'm curious to know how the read performance of a hashed table will be affected in the following scenario. I have a class with many static attributes of hashed table type. All these hashed tables have one thing in common: they have a field "GUID" as unique key. The processing logic is the same for all tables and hence I made the code generic. For example, I have an importing parameter with which I can "derive" the name of my class attribute.
    field-symbols: <fs_table> type hashed table,
                           <fs_line> type any.
    lv_class_attr = derive_name( iv_object_name ).
    assign (lv_class_Attr) to <fs_table>.
    read table <fs_table> with table key ('GUID') assigning <fs_line>.
    Will this code block give me the same performance that I would get if I use specifically typed hash tables? The code inspector gives me a low performance warning. But what will actually happen at runtime?
    Regards,
    Arun Prakash

    Hi,
    It is very simple: you defined the table without a unique key. A specifically typed hashed table requires a unique key definition; in your case it is a generically typed hashed table, so when the internal hashed table is built, the unique key definition is not known.
    When reading data it will therefore work like a standard table, using a sequential read, because there is no unique key definition.

  • Double cursor read of database

    Hi,
    What is meant by double cursor read select statements.
    Can you please provide specific examples?

    Are you sure you don't mean "parallel cursor read of internal tables?"
    Rob

  • Poor NAS reading performance in OS X - need help

    Ok, this one is a mystery to me and it's driving me nuts.
    I have the following computers on my gigabit network:
    - iMac (Mid 2010) 27" i7 (Dual Boot configuration, Win7)
    - iMac (Early 2006) 20" Core Duo
    - MacBook (Mid 2007)
    - Dell XPS M1710 Gaming Laptop running Win7
    The NAS in question is a DROBO FS equipped with three 2TB Western Digital drives giving me 3.75TB usable storage.
    All computers can see and access the DROBO. All computers can mount and use all the shares.
    Both the Dell and iMac under Windows 7 seem to be able to write and read the DROBO at speeds as one would expect. Everything works great.
    Read: ~35Mbyte/s
    Write: ~25Mbyte/s
    Once I access the DROBO in OS X however something seems to be causing problems.
    Read: ~2Mbyte/s
    Write: ~25Mbyte/s
    As you can see, the read performance is way slower than it should be. Additionally, it's not a consistent flow of data. Sometimes it will only push 500kb/s then jump back up to around 2Mbyte/s and so forth. It's as if there is something corrupting it or really working against the flow of data.
    Since I have my iTunes library stored on the DROBO it makes it virtually unusable and DROBO tech support so far has been useless.
    If anyone has any ideas about what I can try, please let me know since I'm at my wits end.

    I'm having the exact same problem the original poster described. Drobo works fine (quickly) for both reading and writing on my Windows 7 PC. I can write quickly from my Mac Mini to the Drobo, but can only get 1-2MB/second read speeds. I too have tried direct-connecting the Drobo to the Mini, and that provided the fast speeds I expected. However, I don't think it's a switch issue, because I have both a D-Link gigabit switch and an Airport Extreme base (wired) between the Windows 7 PC and the Drobo. Additionally, I've tried hooking the Mini and Drobo both up to the Airport Extreme (in case the D-Link switch was doing something strange), but got the same issue.
    The Mini has all its software updates, the drobo has both firmware and software updated completely. Any suggestions would be most appreciated, the inability to stream movies is making the Drobo almost useless for me.

  • Low read performance from a USB 2.0 drive

    I experience a terribly low read performance from a USB 2.0 hard drive on a fresh Arch Linux installation with the latest stock kernel (2.6.23.1-7). Reading 200 MB takes around 3 minutes whereas under Windows with the same controller and drive it takes around 10 seconds.
    [gali@neutrino ~]$ sudo hdparm -t /dev/sdb1
    /dev/sdb1:
    Timing buffered disk reads: 6 MB in 4.89 seconds = 1.23 MB/sec
    It's a FAT32 volume.
    [gali@neutrino ~]$ cat /proc/mounts | grep sdb1
    /dev/sdb1 /media/GALI_EXT vfat rw,nosuid,nodev,noexec,uid=1000,gid=102,fmask=0133,dmask=0022,codepage=cp437,iocharset=utf8 0 0
    The ehci_hcd module is loaded and the device manifests itself as USB 2.0.
    [gali@neutrino ~]$ cat /sys/bus/usb/devices/7-1/speed
    480
    [gali@neutrino ~]$ sudo lsusb -v -s 007:
    Bus 007 Device 002: ID 04b4:6830 Cypress Semiconductor Corp. USB-2.0 IDE Adapter
    Device Descriptor:
    bLength 18
    bDescriptorType 1
    bcdUSB 2.00
    bDeviceClass 0 (Defined at Interface level)
    bDeviceSubClass 0
    bDeviceProtocol 0
    bMaxPacketSize0 64
    idVendor 0x04b4 Cypress Semiconductor Corp.
    idProduct 0x6830 USB-2.0 IDE Adapter
    bcdDevice 2.40
    iManufacturer 0
    iProduct 57 Cypress AT2+LP
    iSerial 44 DEF10C1DEAF7
    bNumConfigurations 1
    Configuration Descriptor:
    bLength 9
    bDescriptorType 2
    wTotalLength 39
    bNumInterfaces 1
    bConfigurationValue 1
    iConfiguration 0
    bmAttributes 0xc0
    Self Powered
    MaxPower 2mA
    Interface Descriptor:
    bLength 9
    bDescriptorType 4
    bInterfaceNumber 0
    bAlternateSetting 0
    bNumEndpoints 3
    bInterfaceClass 8 Mass Storage
    bInterfaceSubClass 6 SCSI
    bInterfaceProtocol 80 Bulk (Zip)
    iInterface 0
    Endpoint Descriptor:
    bLength 7
    bDescriptorType 5
    bEndpointAddress 0x02 EP 2 OUT
    bmAttributes 2
    Transfer Type Bulk
    Synch Type None
    Usage Type Data
    wMaxPacketSize 0x0200 1x 512 bytes
    bInterval 0
    Endpoint Descriptor:
    bLength 7
    bDescriptorType 5
    bEndpointAddress 0x86 EP 6 IN
    bmAttributes 2
    Transfer Type Bulk
    Synch Type None
    Usage Type Data
    wMaxPacketSize 0x0200 1x 512 bytes
    bInterval 0
    Endpoint Descriptor:
    bLength 7
    bDescriptorType 5
    bEndpointAddress 0x81 EP 1 IN
    bmAttributes 3
    Transfer Type Interrupt
    Synch Type None
    Usage Type Data
    wMaxPacketSize 0x0004 1x 4 bytes
    bInterval 10
    Device Qualifier (for other device speed):
    bLength 10
    bDescriptorType 6
    bcdUSB 2.00
    bDeviceClass 0 (Defined at Interface level)
    bDeviceSubClass 0
    bDeviceProtocol 0
    bMaxPacketSize0 64
    bNumConfigurations 1
    Device Status: 0x0001
    Self Powered
    Bus 007 Device 001: ID 0000:0000
    Device Descriptor:
    bLength 18
    bDescriptorType 1
    bcdUSB 2.00
    bDeviceClass 9 Hub
    bDeviceSubClass 0 Unused
    bDeviceProtocol 1 Single TT
    bMaxPacketSize0 64
    idVendor 0x0000
    idProduct 0x0000
    bcdDevice 2.06
    iManufacturer 3 Linux 2.6.23-ARCH ehci_hcd
    iProduct 2 EHCI Host Controller
    iSerial 1 0000:07:00.2
    bNumConfigurations 1
    Configuration Descriptor:
    bLength 9
    bDescriptorType 2
    wTotalLength 25
    bNumInterfaces 1
    bConfigurationValue 1
    iConfiguration 0
    bmAttributes 0xe0
    Self Powered
    Remote Wakeup
    MaxPower 0mA
    Interface Descriptor:
    bLength 9
    bDescriptorType 4
    bInterfaceNumber 0
    bAlternateSetting 0
    bNumEndpoints 1
    bInterfaceClass 9 Hub
    bInterfaceSubClass 0 Unused
    bInterfaceProtocol 0 Full speed (or root) hub
    iInterface 0
    Endpoint Descriptor:
    bLength 7
    bDescriptorType 5
    bEndpointAddress 0x81 EP 1 IN
    bmAttributes 3
    Transfer Type Interrupt
    Synch Type None
    Usage Type Data
    wMaxPacketSize 0x0004 1x 4 bytes
    bInterval 12
    Hub Descriptor:
    bLength 9
    bDescriptorType 41
    nNbrPorts 4
    wHubCharacteristic 0x000a
    No power switching (usb 1.0)
    Per-port overcurrent protection
    TT think time 8 FS bits
    bPwrOn2PwrGood 10 * 2 milli seconds
    bHubContrCurrent 0 milli Ampere
    DeviceRemovable 0x00
    PortPwrCtrlMask 0xff
    Hub Port Status:
    Port 1: 0000.0503 highspeed power enable connect
    Port 2: 0000.0100 power
    Port 3: 0000.0100 power
    Port 4: 0000.0100 power
    Device Status: 0x0003
    Self Powered
    Remote Wakeup Enabled
    Another interesting thing, which may or may not be related to this issue, is that issuing ANY read or write operation on the drive makes JACK (running a FreeBoB driver for a FireWire audio interface connected to the same USB 2.0/FireWire controller) collapse. But the low read performance happens regardless of whether JACK is running or not.
    Does anyone have any clue what to check? Thanks in advance.
    Last edited by galiyosha (2007-11-11 21:34:17)

    lilsirecho wrote:Can you change to a different  usb controller?  It may be defective.
    It is probably not defective; it works fine under Windows and it worked fine as well on another machine with an older Arch Linux installation.
    lilsirecho wrote:
    Perhaps the current supplied by the usb controller is inadequate.
    Perhaps a powered hub added if you have one will alleviate the problem....EDITED
    It is an external HDD enclosure, so it is self-powered.
    lilsirecho wrote:One last gasp...check /etc/mkinitcpio.conf for hooks "usb"...
    Well, usb hook is not enabled, but as I don't use the drive for booting, I see no reason for it to be. The required modules (usbcore, ehci_hcd, usb_storage, ide_core) are loaded. Without them, the drive would probably not be mountable at all.
    Thanks for the effort, but none of these tips seems to point to the cause.

  • Extremely Poor Read Performance

    Hey guys,
    For a work project, I have been instructed to use a Berkeley DB as our data storage mechanism. Admittedly, I know little about BDB, but I've been learning more in the past day as I am reading up on it. I'm hoping, though, that even if no one can help me with the problem I am having, they can at least tell me if what I am seeing is typical/expected, or definitely wrong.
    Here's what I got:
    - Parent table A - Has 0 or 1 key for table B, and 0 or 1 key for table C
    - Table B
    - Table C
    For purpose of discussion, let's ignore table C as it is logically the same as Table B.
    Table B has 25 million rows, keyed by a 34-36 digit string, and a payload of 500-1000 bytes.
    Table A has 26 million rows, 25 million of which reference the 25 million rows in Table B.
    My question is not on the merits of why the data is structured the way it is, but rather about the performance I am seeing, so please refrain from questions such as "why is your data structured that way - can you structure it another way?" I know I can do that - again I just want to know what other people are experiencing for performance.
    Anyway, what's happening is this - my program runs a cursor on Table A to get all records. As it gets each record in Table A, it retrieves the referenced records in Table B. So, the cursor on table A represents sequential key access. The subsequent retrievals from Table B represent "random" retrievals - i.e. the key may be anywhere in the index, and is not at all related to the previous retrieval.
    Cruising the cursor on Table A, I am seeing performance of about 100,000 records per 2 seconds. However, when I add in the retrievals from Table B, performance drops all the way down to 100,000 records per 1,000 seconds, or put another way, 100 per second. At this rate, it will take nearly 70 hours to traverse my entire data set.
    My question is, am I simply running into a fundamental hardware issue in that I am doing random retrievals from Table B, and I cannot expect to see better performance than 100 retrievals per second due to all of the disk reads? Being that the DB is 20 GB in size, I cannot cache the entire table in memory, so does that mean that reading the data in this fashion is simply not feasible?
    If it isn't feasible, does anyone have a suggestion on a different way to read the data, without changing the table relationship as it currently stands? Considering Table B has a reverse reference to table A, I've considered putting a secondary index on table B so that instead of doing random seeks into table B, I can run a cursor on the secondary index of table B at the same time I run the cursor on table A. Then, for each record in table A that has a reference to table B, the first record in the cursor for table B should be the one I need. However, reading about secondary indexes, it looks like all a secondary index does is give a reference to the key to the table. Thus, my concern is that running a cursor on the secondary index of table B will really be no different than randomly selecting the records from table B, as it will be doing just that in the background anyway. Am I understanding this correctly, or would a secondary index really help in my case?
    Thank you for your time.
    -Brett

    Hi Brett,
    Is the sort order the same between the two databases, A and B; that is, are the keys ordered in the same way? For example, does key N in database A refer to key N in database B?
    I would guess not, because you mention the "randomness" in retrieving from B when doing the cursor sequential traversal of A, and the 34-36 digit keys in B are probably randomly generated.
    With B as a secondary database associated with A as the primary database, it would make sense to iterate with a cursor on B, if you expect the same ordering of keys as in A (as mentioned at the beginning of this post). For example, you would use DBcursor->get to iterate in the secondary database B, or DBcursor->pget if you also want to retrieve the key from the primary database A: DBcursor->get(), DBcursor->pget()
    Basically secondary indexes allow for accessing records in a database (primary db) based on a different piece of information other than the primary key:
    Secondary indexes
    So, when you iterate with a cursor in B you would retrieve the data from A (and in addition the key from A) in the order given by the keys (secondary keys) in B.
    However, a secondary database does not seem feasible to me in your case. You seem to have about 1 million records in primary db A for which you would not have to generate a secondary key, so you would have to return DB_DONOTINDEX from the secondary callback (a sketch of such a callback follows below): DB->associate()
    (it may be difficult to account exactly for the records in A for which you do not want to generate secondary keys)
    Also, the secondary key, the 34-36 digit string, would somehow have to be derived from the primary key and data in A.
    If the ordering is not similar (in the sense explained at the beginning of the post) between A and B, then the secondary index does not add much value, other than simplifying retrieval from A in queries whose criteria involve the 34-36 digit string.
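    For what it's worth, a sketch of such a callback might look like the following; the record layout is invented purely for illustration (it assumes the primary datum in A begins with a NUL-terminated reference string that is empty when there is no related record in B):

    #include <string.h>
    #include <db.h>

    /*
     * Secondary-key callback sketch. Returning DB_DONOTINDEX tells
     * Berkeley DB not to create a secondary entry for this record.
     */
    static int
    get_b_ref(DB *secondary, const DBT *pkey, const DBT *pdata, DBT *skey)
    {
            const char *ref = (const char *)pdata->data;

            if (pdata->size == 0 || ref[0] == '\0')
                    return (DB_DONOTINDEX);

            memset(skey, 0, sizeof(*skey));
            skey->data = (void *)ref;
            skey->size = (u_int32_t)strlen(ref);
            return (0);
    }

    /* Association, done once after both handles are open:
     *     primary->associate(primary, NULL, secondary, get_b_ref, 0);
     */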
    Back to your current way of structuring data, here are some suggestions that could improve retrieval times (a sketch combining a couple of them follows the list):
    - try using the latest Berkeley DB release, 5.1.19: Berkeley DB Release History
    - try configuring a page size for the databases A and B equal to that of the filesystem's block size: Selecting a page size
    - try to avoid the creation of overflow items and pages by properly sizing the page size -- you can inspect database statistics using db_stat -d: db_stat
    - try increasing the cache size to a larger value: Selecting a cache size
    - if there's a single process accessing the environment, try to back the environment shared region files in per-process private memory (using DB_PRIVATE flag in the DB_ENV->open() call);
    - try performing read operations outside of transactions, that is, do not use transactional cursors.
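    As a rough sketch of how a couple of those settings fit together (the values and names are illustrative, not recommendations): back the environment regions with per-process private memory via DB_PRIVATE and match the database page size to a 4KB filesystem block, both before the respective open calls:

    #include <db.h>

    int
    open_private_db(DB_ENV **envp, DB **dbp, const char *home, const char *file)
    {
            DB_ENV *env;
            DB *db;
            int ret;

            if ((ret = db_env_create(&env, 0)) != 0)
                    return (ret);
            if ((ret = env->open(env, home,
                DB_CREATE | DB_INIT_MPOOL | DB_PRIVATE, 0)) != 0)
                    goto err;

            if ((ret = db_create(&db, env, 0)) != 0)
                    goto err;
            if ((ret = db->set_pagesize(db, 4096)) != 0 ||      /* match FS block */
                (ret = db->open(db, NULL, file, NULL, DB_BTREE, DB_CREATE, 0)) != 0) {
                    (void)db->close(db, 0);
                    goto err;
            }

            *envp = env;
            *dbp = db;
            return (0);

    err:    (void)env->close(env, 0);
            return (ret);
    }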
    For reference, review these sections in the Berkeley DB Reference Guide:
    Access method tuning
    Transaction tuning
    Regards,
    Andrei

  • OPEN CURSOR approach- Performance

    Hi All,
    We have a requirement wherein we are using OPEN CURSOR and FETCH CURSOR in an ABAP Program (Function Module).
    Any sample code related to CURSORs would be helpful.
    Is the OPEN CURSOR method better in terms of performance compared to simple SELECT statements?
    Any help regarding this will be highly appreciated.
    Thanks in advance.

    Hi Shilpa,
       Yes, from a performance perspective, if you are reading a large chunk of data then OPEN CURSOR gives you better performance. You may place the cursor on the keyword CURSOR and read the help.
    Sample Below:
    DATA: BEGIN OF count_line,
            carrid TYPE spfli-carrid,
            count  TYPE i,
          END OF count_line,
          spfli_tab TYPE TABLE OF spfli.
    DATA: dbcur1 TYPE cursor,
          dbcur2 TYPE cursor.
    OPEN CURSOR dbcur1 FOR
      SELECT carrid count(*) AS count
             FROM spfli
             GROUP BY carrid
             ORDER BY carrid.
    OPEN CURSOR dbcur2 FOR
      SELECT *
             FROM spfli
             ORDER BY carrid.
    DO.
      FETCH NEXT CURSOR dbcur1 INTO count_line.
      IF sy-subrc <> 0.
        EXIT.
      ENDIF.
      FETCH NEXT CURSOR dbcur2
        INTO TABLE spfli_tab PACKAGE SIZE count_line-count.
    ENDDO.
    CLOSE CURSOR: dbcur1,
                  dbcur2.
    best regards,
    Kazmi

  • Optimizing built application performance in LV 2012 (and earlier)

    There's a new help topic in the LabVIEW 2012 help called Optimizing Execution Speed for Built Applications.  If you're interested in the runtime performance of your built applications (EXEs, DLLs, etc.), I'd recommend taking a look.
    Most of the steps apply to earlier versions of LabVIEW as well - just ignore the steps that mention the "compiler optimization threshold".
    Greg Stoll
    LabVIEW R&D

    If the application is running on a standalone computer with no access to the internet, then the application may be taking a really long time to load. This is because Windows tries to check the application's signature over the internet when there is no internet connection.
    The solution to this problem is found here:
    http://digital.ni.com/public.nsf/allkb/9A7E2F34EC9DDEDE86257A09002A9E14

  • Adobe Reader Performance with Protect Mode On

    We are getting very poor performance opening PDF documents with Adobe Reader XI with Protected Mode on, particularly for users remote from our central server. With Protected Mode off, performance is much better (10-20X faster opening). Note that we are running Windows 7 with Application Data Roaming and Adobe Reader XI with Protected Mode on, Protected View off, and Advanced Security on. We have diagnosed the problem as follows: we are using the Windows 7 Application Roaming Data feature to house some profile data for our remote users, and when they try to open a PDF document with Protected Mode on, Adobe sends numerous I/O packets (approximately 6,000!) across the "wire" for security checking against the Application Roaming data file on the central server, thus greatly slowing PDF opening. We would like to know the following: 1) Is there a way to turn off Protected Mode for documents stored on company servers, while keeping it active for documents received from external sources (e.g. from the internet, e-mail, etc.)? 2) Is there a way to have Protected Mode security verification done only against the local machine profile data and not against the centrally stored Application Roaming Data file (thus greatly reducing the "across the wire" I/O traffic)? Thanks for any help, CCB.

    I don't think any of the scenarios you are envisaging are possible.  Protected Mode is enabled/disabled on a user basis (in HKEY_USERS).

  • Optimizing the query performance?

    When I run this query it takes 2-3 hours to bring the data. It pulls the data starting 2008 december. How can I optimize the performance for the below query?
    SELECT e1.Eventdate,
    TO_CHAR( TO_DATE (e1.eventdate, 'mmddyyyy' ), 'MM/DD/YYYY HH24:MI:SS' ),
    e1.USERID,
    Vu.display_name_FMLS as displayname,
         e1.eventdetail1,
    e1.eventtype
    FROM event e1,
    View_User VU
    WHERE ((Vu.USERid = e1.USERID) or (Vu.UserLanid = e1.Userid))
    AND Vu.Useractive ='1' and
    e1.eventtype = 'SRCH'
    AND e1.eventresults = '1'
    AND NOT EXISTS (
    SELECT 1
    FROM event e2
    WHERE e2.userid = e1.userid
    AND e2.eventtype IN ('OPENTPC', 'OPENFAQ', 'KEYFACT')
    AND e2.eventrecordid =
    (SELECT MIN (eventrecordid)
    FROM event e3
    WHERE e3.userid = e2.userid
    AND e3.eventrecordid > e1.eventrecordid))

    Not quite sure what your requirements are, but just looking at this, you are selecting from the same table (event) 3 times. Couldn't you just get everything you need from the event table up front and then use case statements or aggregate functions or something to pull just the stuff you want?
    Maybe if you gave a bit of sample data and what the output should look like, someone might come up with an approach that doesn't involve multiple selects from the same table.

  • Optimizing iTunes Sluggish Performance/Large Library...

    Is there any way to get better performance out of sluggish iTunes when it has a large library of 100+gigs? Everything is moving as slow as molasses. Even when updating/correcting text on song files. Could one use or create two libraries to optimize iTunes performance?


  • Passing Credentials When Reading Performance Counters on Remote Computers; Systems.Diagnostics namespace

    Hello,
    I am working on a website where I need to read a few performance counters on remote computers.
    I'm using the System.Diagnostics namespace and the following is a snippet of my code:
    ****************  CODE  *********************************************************
     Try
                With perf_process
                    .MachineName = Hostname
                    .CategoryName = "Process"
                    .CounterName = "Private Bytes"
                    'Write entry to log here
                    tmp_working_set = perf_process.NextValue()
                    txtWorkingSet.Text = tmp_working_set
                    Select Case tmp_working_set
                        Case Is > 80000000
                            working_set_status = "Red"
                        Case 40000000 To 80000000
                            working_set_status = "Green"
                        Case 1000000 To 40000000
                            working_set_status = "Yellow"
                        Case Else
                            working_set_status = "Error"
                    End Select
                    If working_set_status = "Error" Then
                        txtWorkingSet.BackColor = Drawing.Color.Red
                        txtWorkingSet.Text = String.Format(CultureInfo.InvariantCulture, "{0:0,0.0}", working_set_status)
                    Else
                        txtWorkingSet.Text = String.Format(CultureInfo.InvariantCulture, "{0:0,0.0}", tmp_working_set)
                        txtWorkingSet.Text = tmp_working_set
                    End If
                End With
            Catch
    ErrMsg = ("Error reading the Working Set (memory) counter on " & Hostname & "." & vbCrLf & "Error number is " & Err.Number & vbCrLf & "Error description: " & Err.Description)
                MsgBox(ErrMsg)
                Write_Log_Entry(Now(), ErrMsg)            
                ErrMsg = ""
            End Try
    ****************  CODE  *********************************************************
    I usually end up with an "Access Denied" error because the account I'm running under does not have the proper permissions on the remote computer to read the counters.
    How can I pass and connect to the remote computer with a different set of credentials that have access to the counters?
    Exactly what permissions do I need to access the remote counters?  I can read them on some of my test computers and on others, I get the "Access Denied" error.
    Thanks in Advance,
    DetRich

    http://forums.asp.net/
    The ASP.NET forum is probably where you need to post.

  • 7.5.2 firmware update causes samba read performance to be slow.

    Anybody has a resolution for following problem. With 7.4.1 I had an acceptable wireless speed to my macbook, from 2 different samba mounts directly connected to the airport extreme: one nas over gigabit, one linux machine on 100mbit.
    After updating to 7.5.2, the write performance is still good, but reading from either the NAS or the Linux machine is a factor of 100-1000 slower than the write performance.
    Does anyone have a link to download the old firmware (7.4.1)? It seems not to be downloadable from the support page.
    Any tips or tricks?

    In Airport Utility, click to highlight your Airport Extreme. In the right hand side, click on "Manual Setup". In the summary tab, place your mouse pointer above "Version" in the version number area. You will see the word turn into a link (gray highlight with a little arrow inside it). Click on this and a drop down menu will appear which will let you revert to a prior firmware version.
