About data compression

The latest version of Berkeley DB supports data compression with its set_bt_compress method.
I created a database using the default data compression method provided by Berkeley DB, like this:
DB *dbp;
db_create(&dbp, inenv, 0);
dbp->set_flags( dbp, DB_DUPSORT );
dbp->set_bt_compress(dbp, NULL, NULL);
Then I insert keys and data.
The keys are random char arrays;
the data items are char arrays with the same content.
Now the problem is: the compressed database file is the same size as the one created without the compression method.
Can someone tell me why? Thanks.

Hi,
This is likely because the default compression function does not have much to work with.
Specifying NULL for both compression and decompression functions in the DB->set_bt_compress method call implies using the default compression/decompression functions in BDB. Berkeley DB's default compression function performs prefix compression on all keys and prefix compression on data values for duplicate keys.
You haven't specified a prefix or key comparison function (DB->set_bt_prefix, DB->set_bt_compare), hence a default lexical comparison function is used as the prefix function. Given that your keys are random char arrays, the default lexical comparison function may not perform very well in identifying efficient (large-sized) prefixes for the keys.
Also, as the keys are truly random, it's unlikely that you'll have duplicates, so there's likely nothing to compress on data values for duplicate keys.
Even if the compression function does shrink some key prefixes or the prefixes of duplicates' data items, you will not see any difference in database file size unless the compressed records end up fitting on fewer database pages than the uncompressed ones did.
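For illustration, here is a minimal sketch (not your program: the file name, key format and payload are placeholders I made up) of the kind of data the default compression can actually shrink, namely keys that share a long common prefix, inserted with DB_DUPSORT and DB->set_bt_compress(dbp, NULL, NULL):
#include <stdio.h>
#include <string.h>
#include <db.h>

int main(void)
{
    DB *dbp;
    DBT key, data;
    char keybuf[64];
    const char *payload = "same content for every record";
    unsigned int i;
    int ret;

    if ((ret = db_create(&dbp, NULL, 0)) != 0)
        return 1;

    /* Sorted duplicates plus the built-in compression/decompression pair.
       Both calls must precede DB->open(). */
    dbp->set_flags(dbp, DB_DUPSORT);
    dbp->set_bt_compress(dbp, NULL, NULL);

    if ((ret = dbp->open(dbp, NULL, "test.db", NULL, DB_BTREE, DB_CREATE, 0664)) != 0) {
        dbp->err(dbp, ret, "DB->open");
        dbp->close(dbp, 0);
        return 1;
    }

    for (i = 0; i < 100000; i++) {
        /* Keys sharing a long common prefix give prefix compression something
           to strip; truly random keys share almost nothing. */
        snprintf(keybuf, sizeof(keybuf), "customer:2011-06:%08u", i);

        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        key.data = keybuf;
        key.size = (u_int32_t)strlen(keybuf) + 1;
        data.data = (void *)payload;
        data.size = (u_int32_t)strlen(payload) + 1;

        if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) != 0) {
            dbp->err(dbp, ret, "DB->put");
            break;
        }
    }

    dbp->close(dbp, 0);
    return ret == 0 ? 0 : 1;
}
Even with data like this, remember the point above: the file only shrinks if the compressed records end up occupying fewer pages.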
Regards,
Andrei

Similar Messages

  • Using Data Compression on Microsoft SQL 2008 R2

    We have a very large database which keeps growing and growing. This has made our upgrade process extremely troublesome because the upgrade wizard seems to require close to 3 times the database size of free space to even start.
    As such, we are considering activating DATA COMPRESSION on the PAGE level in Microsoft SQL Server 2008 R2. This technology is native to the SQL Server and compresses the rows and the pages so that they do not take up more space than necessary.
    Traditionally, each row takes up the maximum space of all its fields even when only part of a field is filled with data.
    [Blog about Data Compression|http://blogs.msdn.com/b/sqlserverstorageengine/archive/2007/11/12/types-of-data-compression-in-sql-server-2008.aspx]
    Our idea is to use this initially on the Axxx tables (historic data) to minimize the space they take by using for example:
    ALTER TABLE [dbo].[ADO1] REBUILD PARTITION = ALL
    WITH (DATA_COMPRESSION = PAGE)
    On a test database we have seen tables go from 6 GB of space to around 1.5 GB, which is a significant saving.
    MY QUESTION: Is this allowed from an SAP point of view? The technology is completely transparent, but it does involve a rebuild of the table as demonstrated above.
    Thanks.
    Best regards,
    Mike

    We are using Simple recovery model, so our log files are pretty small.
    Our database itself is about 140GB now and it keeps growing.
    We've also reduced the history size to about 10 versions.
    Still, some of our tables are 6-10GB.
    One of the advantages of data compression is that it also improves disk I/O at the cost of slightly higher CPU usage, which we are pretty sure our server can handle.
    Mike

  • I have a question about Data Rates.

    Hello All.
    This is a bit of a noob question I'm sure. I don't think I really understand Data Rates and how it applies to Motion... therefore I'm not even sure what kind of questions to ask. I've been reading up online and thought I would ask some questions here. Thanks to all in advance.
    I've never really worried about Data Rates until now. I am creating an Apple Motion piece with about 15 different video clips in it. And 1/2 of them have alpha channels.
    What exactly is Data Rate? Is it the rate in which video clip data is read (in bits/second) from the Disc and placed into my screen? In Motion- is the Data Rate for video only? What if the clip has audio? If a HDD is simply a plastic disc with a dye read by "1" laser... how come my computer can pull "2" files off the disc at the same time? Is that what data transfer is all about? Is that were RAM comes into play?
    I have crunched my clips as much as I can. They are short clips (10-15seconds each). I've compressed them with the Animation codec to preserve the Alpha channel and sized them proportionally smaller (320x240). This dropped their data rate significantly. I've also taken out any audio that was associated with them.
    Is data rate what is slowing my system down?
    The data rates are all under 2MBs. Some are as low as 230Kbs. They were MUCH higher. However, my animation still plays VERY slowly.
    I'm running a 3GigRam Powerbook Pro 2.33GHz.
    I store all my media on a 1TB GRaid Firewire 800 drive. However for portability I'm using a USB 2 smartdisk external drive. I think the speed is 5200rpm.
    I'm guessing this all plays into the speed at which motion can function.
    If I total my data rate transfer I get somewhere in the vicinity of 11MBs/second. Is that what motion needs for it to play smoothly a 11MBs/second data connection? USB 2.0 is like what 480Mbs/second. So there is no way it's going to play quickly. What if I played it from my hard drive? What is the data rate of my internal HDD?
    I guess my overall question is.
    #1. Is my thinking correct on all of these topics? Do my bits, bytes and megs make sense. Is my thought process correct?
    #2. Barring getting a new machine or buying new hardware. What can I do to speed up this workflow? Working with 15 different video clips is bogging Motion down and becoming frustrating to work with. Even if only 3-4 of the clips are up at a time it bogs things down. Especially if I throw on a glow effect or something.
    Any help is greatly appreciated.
    -Fraky

    Data rate DOES make a difference, but I'd say your real problem has more to do with the fact that you're working on a Powerbook. Motion's real time capabilities derive from the capability of the video card. Not the processor. Some cards do better than others, but laptops are not even recommended for running Motion.
    Your options for improving the workflow on a laptop are limited, but there are a few things that you can try.
    Make sure that thumbnails and previews are turned off.
    Make sure that you are operating in Draft Mode.
    Lower the display resolution to half, or quarter.
    Don't expect to be getting real time playback. Treat it more like After Effects.
    Compressing your clips into smaller Animations does help because it lowers the data rate, but you're still dealing with the animation codec which is a high data rate codec. Unfortunately, it sounds necessary in your case because you're dealing with alpha channels.
    The data rate comes into play with your setup trying to play through your USB drive. USB drives are never recommended for editing or Motion work; their throughput is not consistent enough for video work. A small FireWire drive would be better, though your real problem, as I said, is the Powerbook.
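    One note on the arithmetic, since you asked about bits and bytes: USB 2.0's 480 figure is megabits per second, which works out to at most 480 / 8 = 60 megabytes per second in theory; sustained real-world throughput is usually well below that, and it is bursty rather than steady. So an aggregate data rate of 11 MB/second is not out of reach on paper, but the consistency of the drive and bus matters far more than the raw number suggests.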
    If you must work on the powerbook, then don't expect real-time playback. Instead, build your animation, step through it, and do RAM previews to view sections in real time.
    I hope this helps.
    Andy

  • Question about advance compression in Oracle 11gR2

    Hi,
    I am on Oracle 11gR2 on Solaris 10. I want to run the oracle advance compression advisor for my database and get compression ratios for the tables, how can I do it? I am looking for a sample command to run this advisor package (dbms_compression) from sqlplus.
    Can someone please suggest.
    Thanks,
    Nirav

    Thanks SriniChavali and Stefan. I can't mark more answers "Helpful", so I couldn't do that for your answers! Here is my point. In Jonathan's blog I see these remarks:
    "Sadly it seems that “compress for OLTP” (formerly “compress for all operations”) doesn’t compress for all operations, it compresses only for inserts, and the benefits it has over basic compression are that (a) it leaves 10% of the block free for updates, and (b) it doesn’t require direct path inserts to trigger compression. Given the limitations on how it works you may find that the problems it brings might make it something you want to avoid...
    To date I’ve only heard complaints about OLTP compression (there’s an element of self-selection there as no-one ever calls me to look at their system because it’s running so well and has no problems). A common thread in the complaints I have heard, though, is about the significant amount of row migration (once it has been noticed), the extra CPU, and “buffer busy waits”.
    Compression for OLTP is (according to the manuals) supposed to be able to compress during updates – but it doesn’t (at least, as far as I can tell); this means that you can easily end up suffering a large number of row migrations on updates, which can result in extra random I/Os, buffer busy waits, and increased CPU and latch activity.
    If you can work out a good strategy for using OLTP compression, though, think carefully about making a choice between freelist management and ASSM – there seem to be some undesirable side effects that appear when you mix OLTP compression with ASSM."
    This is from http://allthingsoracle.com/compression-in-oracle-part-3-oltp-compression/
    Note that I am not that technical and I hope I have not misquoted. I am trying to figure out whether this is a good option or not; some findings suggest it may not be so good, while other sources say it is indeed very good.
    Best regards

  • How to find data compression and speed

    1. What's the command/way for viewing how much space the data has taken in its HANA tables as opposed to the same data on disk? I mean, how do people measure that there has been a 10:1 data compression?
    2. The time taken for execution, as seen from executing the same SQL on HANA, varies (I see that when I press F8 on the same query repeatedly), so it's not given in terms of pure CPU cycles, which would have been more absolute.
    I always thought that there must be a better way of checking the speed of execution, like checking a log that gives all data regarding executions, rather than just looking at the query executions in the output window.

    Rajarshi Muhuri wrote:
    > 1. What's the command/way for viewing how much space the data has taken in its HANA tables as opposed to the same data on disk? I mean, how do people measure that there has been a 10:1 data compression?
    The data is stored the same way in memory as it is on disk. In fact, scans, joins etc. are performed on compressed data.
    To calculate the compression factor, we check the required storage after compression and compare it to what would be required to store the same amount of data uncompressed (you know, length of data × number of occurrences for each distinct value of a column).
    One thing to note here: compression factors must always be seen for one column at a time. There is no such measure as a "table compression factor".
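    As a purely made-up illustration of that per-column calculation: a column with 10 million rows but only 1,000 distinct 20-byte values would need roughly 10,000,000 × 20 bytes ≈ 200 MB stored as plain values, while a dictionary-encoded column needs about 1,000 × 20 bytes for the dictionary plus a small value ID per row (10 bits are enough for 1,000 distinct values, roughly 12.5 MB), giving a compression factor of about 16:1 for that one column.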
    > 2. The time taken for execution, as seen from executing the same SQL on HANA, varies (I see that when I press F8 on the same query repeatedly), so it's not given in terms of pure CPU cycles, which would have been more absolute.
    >
    > I always thought that there must be a better way of checking the speed of execution, like checking a log that gives all data regarding executions, rather than just looking at the query executions in the output window.
    Well, CPU cycles wouldn't be an absolute measure either.
    Think about the time that is not spent on the CPU.
    Wait time for locks for example.
    Or time lost because other processes used the CPU.
    In reality you're usually not interested so much in the perfect execution of one query that has all the resources of the system bound to it; instead you strive to get the best performance when the system is under its typical workload.
    In the end, the actual response time is what means money to business processes.
    So that's what we're looking at.
    And there are some tools available for that. The performance trace for example.
    And yes, query runtimes will always differ and will never be totally stable.
    That is why performance benchmarks take averages for multiple runs.
    regards,
    Lars

  • Data compression in xi

    Hi ,
    How do you do data compression in xi?
    thanks in advance,
    Ramya Shenoy

    Hi Ramya,
    Are you talking about archiving the messages in the XI server, or about compressing an individual XI message as Parteek has explained in his reply using the PayloadZipBean?
    Thanks
    Ajay

  • Oracle Spatial data compression (using Advanced Compression)

    What are the best practices for Oracle Spatial to compress data using Advanced Compression?
    ver. 11.2.0.3

    Details about Advanced Compression can be found in:
    Oracle E-Business Suite Release 12.1 with Oracle Database 11g Advanced Compression (Doc ID 1110648.1)
    Is Advanced Compression Supported In The E-business Suite ? (Doc ID 1368152.1)
    https://blogs.oracle.com/stevenChan/entry/using_advanced_compression_with_e-business_suite
    Thanks,
    Hussein

  • Re: Data Compression

    -----Original Message-----
    From: Jose Suriol <[email protected]>
    To: 'Forte mail list' <[email protected]>
    Date: Friday, February 27, 1998 1:00 PM
    Subject: Data Compression

    >
    Thanks to all who replied to my post about Forte compressing
    data before sending them to the network. It appears Forte tries to
    minimize the size of certain data types but does not do compression
    across the board. As I understand Forte Version 4 will probably
    support the Secure Sockets Layer (SSL) which has a data compression
    option, but unfortunately SSL is, first and foremost, a secure protocol,
    and while compression is optional, encryption (a CPU-intensive process)
    is not.
    Encryption, integrity and compression are all optional in SSL.
    It's possible to request a connection that only has compression, assuming
    that the other side agrees.
    Derek

  • Message Data Compression

    I'm not exactly new to java, I've just been away from it for a few years.
    I'm trying to create an XML-based messaging system for communication between a suite of applications. XML being what it is (verbose text) I want to apply data compression. (In the short term, during development, the messages will be between components on a single machine. Ultimately, the applications will likely run on many machines and use the internet to communicate with each other.)
    I was looking at the java.util.zip tools, but I'm not actually creating files. I thought I could use just the ZipEntry part, but it's not coming together well. I was also thinking some flavor of SOAP might serve my needs, but SOAP has come onto the scene during my absence from development activities. I've got to familiarize myself with it a bit more before I can assess whether or not it fits my needs.
    I'm open to suggestions as to how I should approach this. All ideas anyone cares to share are greatly appreciated.
    - Patrick

    The system will probably use a combination of RMI and JMS, but that's not anything I want to bring into the question at hand.
    The only problem I'm concerned about right now is, "How do I compress a packet of data?" What I do with that packet of compressed data is a problem for a different level of abstraction. I've got a fairly large buffer of XML that I want to compress before passing off to another entity to act upon. What's the best way to do that?
    - Patrick

  • Http Data Compression

    Is there any Http Data Compression support in WLS 6.1 or 7.0 ?
    There are tools for the IIS and Apache server. This helps the network
    performance and downloading time.
    www.ehyperspace.com
    http://www.innermedia.com/Products/SqueezePlay_IIS_Real-Time_Web_/squeezeplay_iis_real-time_web_.htm
    thanks
    /selvan

    There are no generic solutions for Weblogic 5.1.
    We support filter-like functionality for Weblogic 5.1 with our EnGarde
    software, but we only provide it through OEM contracts (no direct sales).
    Sorry.
    You can use a "front component" to route all requests to other servlets/JSPs
    yourself, but if you do substitution with a "front component", you'll have
    to extend the WL classes themselves (request, response), which gets tricky.
    Peace,
    Cameron Purdy
    Tangosol, Inc.
    http://www.tangosol.com/coherence.jsp
    Tangosol Coherence: Clustered Replicated Cache for Weblogic
    "Selvan Ramasamy" <[email protected]> wrote in message
    news:[email protected]..
    Yes, I totally forgot about the filters ... thank you.
    What would be your suggestion for the Weblogic 5.1 server? Most of my
    customers are using Weblogic 5.1.
    thanks
    "Cameron Purdy" <[email protected]> wrote in message
    news:[email protected]..
    Cameron, how can I do this so that I don't have to change all of my JSPs and
    servlets? Should I plug in a custom ServletResponse to do this?
    In 6.1 (maybe) or 7.0 you can use a filter, which is like a Servlet that
    substitutes its own Request and/or Response object.
    Peace,
    Cameron Purdy
    Tangosol, Inc.
    http://www.tangosol.com/coherence.jsp
    Tangosol Coherence: Clustered Replicated Cache for Weblogic
    "Selvan Ramasamy" <[email protected]> wrote in message
    news:[email protected]..

  • I was updating software and suddenly my iPhone started asking for iTunes on its screen. How can I get my screen back, or how can I restore without losing data? I'm most worried about the data. Please help with a resolution

    I was updating software and suddenly my iPhone started asking for iTunes on its screen. How can I get my screen back, or how can I restore without losing data? I'm most worried about the data. Please help with a resolution.

    What exactly are you seeing on the phone's screen? If the iTunes icon and a cable, then the phone is in recovery mode, in which case it's too late to take a new backup or to copy any content off the phone - you will have to connect the phone to your computer's iTunes and reset it back to factory defaults, after which you can restore from the last backup that you took or just resync your content to it.

  • Can anyone explain data conversion for the material master in SAP MM?

    Can anyone explain data conversion for the material master and vendor master in SAP MM?
    Thanks

    Hi,
    Refer following link;
    [Data Migration Methodology|http://christian.bergeron.voila.net/DC_Guide/Data_Migration_Methodology_for_SAP_V01a.doc]

  • A problem about the compression of directed graphs

    I have recently encountered a problem in my research concerning graph compression (or contraction, etc.). I've searched in various ways for existing techniques for this problem and found nothing, and I've been trying to figure it out by myself. However, I would also like to seek some advice from you.
    The description of this problem:
    Given a directed graph G = (V, E) where V is the set of vertices and E
    is the set of directed edges. These vertices and directed edges
    represent the events and the directional relationships between pairs of
    events. Each edge is associated with a weight (or confidence score)
    which indicates the degree of the relationship. Now I want to compress
    the graph by merging some of the vertices into one superior vertex,
    which implies that several lower-level events are merged into one
    high-level event (this is mainly because the extent of news events
    defined are usually flexible). After that we can reorganize the
    vertices and edges and repeat this process until the size of the graph
    reaches a certain limit. Purely looking from the point of graph theory,
    is there any existing graph algorithm that solves this problem? As far as I have searched, the answer seems to be negative. This seems to be an interesting novel problem which falls in the area of graph algorithms. Could you suggest anything? Attached is a sample directed graph of this kind which may be interesting.
    Check this URL to find out more about this kind of DAG:
    http://ihome.cuhk.edu.hk/~s042162/2005-05-24.jpg
    Thank you very much for your time and help.
    Regards,
    Daniel

    Sounds like an interesting problem. The temporal aspect presents an interesting wrinkle. Graph models have been becoming popular for the standard clustering problem recently, but they are typically undirected formulations. The idea of compressing a graph reminded me of work done by G. Karypis and V. Kumar on graph partitioning, some of their papers are available here:
    http://www-users.cs.umn.edu/~karypis/publications/partitioning.html
    Similar to the reference given by matfud, with the additional restriction that there may be a limit on the size of the partitions or there may be multiple partitions (both restrictions make the problem NP-complete, IIRC).
    There's also the area of spectral graph partitioning, which may be of interest. It's a way of finding relatively dense areas in a graph by using the eigenvalues of the adjacency matrix. Most of the results in this area depend on the fact that adjacency matrices for undirected graphs are symmetric and semi-definite, which wouldn't be the case for a directed graph, but it could be worth some experimentation if you have MATLAB or something similar.
    There's something else this problem reminds me of, but I can't think of it right now. Maybe later something will come to me.
    Good luck.

  • File Adapter Data Compression

    I'd like to extend the file adapter behavior to add data compression features, like unzipping after reading a file and zipping before writing a file. I read Oracle's file adapter documentation but I didn't find any extension point.

    If it's a Java mapping, just create a DT with any structure you wish.
    ex.
    DT_Dummy
    |__ Dummy_field
    A Java mapping does not validate the XML against the DT you created.

  • TS3591 Updated iTunes - Windows quickly shuts it down and says something about "Data Execution Prevention". I followed the instructions to a dead end. I have re-downloaded it twice and used the repair files once but get the same result. It worked fine before I updated it.

    Updated iTunes - Windows quickly shuts it down and says something about "Data Execution Prevention". I followed the instructions to a dead end. I have re-downloaded it twice and used the repair files once but get the same result. It worked fine before I updated it to death.

    Try updating your QuickTime to the most recent version. Does that clear up the DEP errors in iTunes?
