Best method for encrypting/decrypting large XML files (100MB+)

I need to encrypt XML part files that can get upwards of 100MB.
I found some articles and code, but the only example I was able to get working used XMLCipher, which takes a Document, parses it, and then encrypts it.
Obviously, 100MB files do not cooperate well with DOM, so I want to find a better method for encrypting/decrypting these files.
I found some articles using CipherInputStream and CipherOutputStream, but am not clear whether this is the way to go and whether it will avoid memory errors.
import java.io.*;
import java.security.spec.AlgorithmParameterSpec;
import javax.crypto.*;
import javax.crypto.spec.IvParameterSpec;

public class DesEncrypter {
    Cipher ecipher;
    Cipher dcipher;

    public DesEncrypter(SecretKey key) {
        // Create an 8-byte initialization vector (the DES block size)
        byte[] iv = new byte[]{
            (byte)0x8E, 0x12, 0x39, (byte)0x9C,
            0x07, 0x72, 0x6F, 0x5A
        };
        AlgorithmParameterSpec paramSpec = new IvParameterSpec(iv);
        try {
            ecipher = Cipher.getInstance("DES/CBC/PKCS5Padding");
            dcipher = Cipher.getInstance("DES/CBC/PKCS5Padding");
            // CBC requires an initialization vector
            ecipher.init(Cipher.ENCRYPT_MODE, key, paramSpec);
            dcipher.init(Cipher.DECRYPT_MODE, key, paramSpec);
        } catch (java.security.InvalidAlgorithmParameterException e) {
            e.printStackTrace();
        } catch (javax.crypto.NoSuchPaddingException e) {
            e.printStackTrace();
        } catch (java.security.NoSuchAlgorithmException e) {
            e.printStackTrace();
        } catch (java.security.InvalidKeyException e) {
            e.printStackTrace();
        }
    }

    // Buffer used to transport the bytes from one stream to another
    byte[] buf = new byte[1024];

    public void encrypt(InputStream in, OutputStream out) {
        try {
            // Bytes written to out will be encrypted
            out = new CipherOutputStream(out, ecipher);
            // Read in the cleartext bytes and write to out to encrypt
            int numRead = 0;
            while ((numRead = in.read(buf)) >= 0) {
                out.write(buf, 0, numRead);
            }
            out.close();
        } catch (java.io.IOException e) {
            e.printStackTrace();
        }
    }

    public void decrypt(InputStream in, OutputStream out) {
        try {
            // Bytes read from in will be decrypted
            in = new CipherInputStream(in, dcipher);
            // Read in the decrypted bytes and write the cleartext to out
            int numRead = 0;
            while ((numRead = in.read(buf)) >= 0) {
                out.write(buf, 0, numRead);
            }
            out.close();
        } catch (java.io.IOException e) {
            e.printStackTrace();
        }
    }
}
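For what it's worth, a class like this can be driven with plain file streams, so nothing bigger than the 1 KB buffer is ever held in memory at once. A minimal sketch of how it could be called (the key generation and file names are illustrative only; note too that DES itself is considered weak today, and AES would be the usual choice now):

import java.io.*;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class DesEncrypterDemo {
    public static void main(String[] args) throws Exception {
        // Throwaway DES key for the demo; a real app would load a stored key
        SecretKey key = KeyGenerator.getInstance("DES").generateKey();
        DesEncrypter encrypter = new DesEncrypter(key);

        // Stream the plaintext XML file into an encrypted copy
        InputStream in = new FileInputStream("part.xml");
        OutputStream out = new FileOutputStream("part.xml.enc");
        encrypter.encrypt(in, out);   // encrypt() closes out when done
        in.close();
    }
}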
This looks like it might fit, but there is one more twist: I am using a persistence manager and XML encoding to accomplish that, so I am not sure how (or where) to implement this method without affecting persistence.
Any guidance on what would work best in this situation would be appreciated.
Regards,
vbplayr2000

I can give some general guidelines that might help, having done much similar work:
You have 2 different issues, at least from my reading of your problem:
1) How to deal with large XML docs that most parsers will not handle without memory issues
2) Where to hide or "black box" the encrypt/decrypt routines
#1: Check into XPP3/XMLPull. Yes, it's different from the other XML parsers you are used to, and more work is involved, but it is blazing fast and can parse a stream as it is being read. You can populate beans and process as needed, since there is really not much "inversion of control" involved compared to parsers that insist on finishing the entire document or loading it all into memory.
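To make that concrete, a minimal XmlPull read loop looks roughly like this (the "part" element name and file name are invented for illustration):

import java.io.FileReader;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;

public class PullParseSketch {
    public static void main(String[] args) throws Exception {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        // The document is streamed; only the current event is held in memory
        parser.setInput(new FileReader("part.xml"));
        for (int event = parser.getEventType();
                event != XmlPullParser.END_DOCUMENT;
                event = parser.next()) {
            if (event == XmlPullParser.START_TAG && "part".equals(parser.getName())) {
                // populate a bean from attributes/child elements here
            }
        }
    }
}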
#2: Implement Serializable and write your own private readObject/writeObject methods. Place the encrypt/decrypt in there as appropriate. That will "hide" the implementation and should be something any persistence manager can deal with.
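A rough sketch of that idea, reusing cipher setup like the DesEncrypter above (the PartDocument class, its field, and the Ciphers helper are invented for illustration; for truly huge payloads you would stream to a temp file rather than hold a byte array):

import java.io.*;

public class PartDocument implements Serializable {
    private transient String xml;   // hypothetical payload; not serialized directly

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        try {
            // Ciphers.encryptCipher()/decryptCipher() are assumed helpers that
            // return ciphers initialized like the ones in DesEncrypter
            byte[] sealed = Ciphers.encryptCipher().doFinal(xml.getBytes("UTF-8"));
            out.writeInt(sealed.length);
            out.write(sealed);
        } catch (java.security.GeneralSecurityException e) {
            throw new IOException(e.toString());
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        try {
            byte[] sealed = new byte[in.readInt()];
            in.readFully(sealed);
            xml = new String(Ciphers.decryptCipher().doFinal(sealed), "UTF-8");
        } catch (java.security.GeneralSecurityException e) {
            throw new IOException(e.toString());
        }
    }
}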
Regards,
antarti

Similar Messages

  • What are the best tools for opening very large XML files and examining the tree and confirming they are valid?

    I am generating some very large XML files (600,000+ lines, 50MB+ characters). I finally have them all being valid XML and valid UTF-8.
But the files are so large that Safari and Chrome will often not open them. Firefox will, though.
Instead of these browsers, I was wondering if there are any other recommended apps for the Mac for opening and viewing the XML, getting an error message if it is not valid for some reason, and examining the XML tree?
    I opened the file in the default app for XML which is Xcode, but that is just like opening it in a plain text editor. You can't expand/collapse the XML tree like you can with a browser, and it doesn't report errors.
    Thanks,
    Doug

    Hi Tom,
    I had not seen that list. I'll look it over.
    I'm also in touch with the developer of BBEdit (they are quite responsive) and they are willing to look at the file in question and see why it is not reporting UTF-8 errors while Chrome is.
    For now I have all the invalid characters quashed and things are working. But it would be useful in the future.
    By the by, some of those editors are quite pricey!
    doug

  • Best parser for handling very large XML documents

    Which is the best parser for reading and extracting information from a very large XML document?

    Any SAX parser, since a DOM would use about six times the file size in primary memory.
    Xerces' SAX parser is, in my experience, the fastest.
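    For illustration (the file name is a placeholder), the minimal SAX skeleton is just a handler with callbacks:

    import java.io.File;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    public class SaxSketch extends DefaultHandler {
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            // react to each element as it streams past; nothing is retained
        }

        public static void main(String[] args) throws Exception {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new File("big.xml"), new SaxSketch());
        }
    }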
    Gil

  • What is the best method for saving the client sequence file revision in the database \ report?

    I'm trying to figure out the best way to store the sequence file Revision in the database. That is, if I have the Revision (SequenceFile.AsPropertyObjectFile.Version), where (e.g. what table / field) should I put it if I'm using the SQL Server schema that ships with TestStand?  How do I get it there?
    Certified LabVIEW Architect
    Wait for Flag / Set Flag
    Separate Views from Implementation for Strict Type Defs

    Ok LabBEAN,
    Here is my tutorial.  It is actually easier than I thought:
    Step 1: Configure>>Database Options and uncheck the Disable Database Logging.
    Step 2: Click the Data Link tab
    Step 3: Make sure the Connection String Expression is pointing to the right location.  Should be a public directory with a .mdb file if you are using TestStand defaults.  Click the View Data button to verify.  It should open the Database Viewer so you can look at the tables.  Leave the Database Viewer open.
    Step 4: Click the Schemas tab
    Step 5: With the Generic Recordset (NI) selected click the Duplicate button.
    Step 6: You should now see a copy of that schema.  In the Name box name it MyRecordset and make sure it is checked
    Step 7: Click the Statements tab and highlight STEP_SEQCALL.  NOTE: you must always do this before clicking on the Columns/Parameters tab
    Step 8: Click the Columns/Parameters tab
    Step 9: Highlight the SEQUENCE_FILE_PATH item and click the Copy Button
    Step 10:  Highlight the new entry and change the Name to SEQUENCE_FILE_VERSION
    Step 11: MOST CRITICAL STEP: Change the Expression to RunState.Engine.GetSequenceFileEx(Logging.StepResult.TS.SequenceCall.SequenceFile, 107, ConflictHandler_Error).AsPropertyObjectFile.Version. It is better to do it this way because you never know where a sequence file call will be made, and not all sequence calls are made to the model client sequence.
    Step 12: Hopefully you left the Database Viewer open from Step 3. Go to it.
    Step 13: Right click the STEP_SEQCALL and select Add Column..  Name the new column: SEQUENCE_FILE_VERSION.  Basically you need a column that matches the one you created back in the Columns/Parameters tab.  Set it up with the same type and size.  NOTE: there is an alternate way to do this using the Execute SQL View in the Database Options but you need to create it.  You can create it from the Schemas tab back in TS by clicking the Build .sql File.. button.
    Step 14: Back in TestStand click OK to save and close the Database Options.
    Now run your sequence and you will see the new data in your database.
    jigg
    CTA, CLA
    teststandhelp.com
    ~Will work for kudos and/or BBQ~

  • Best method for silent-auto DBCA : response file OR dbca scripts ?

    As far as I have discovered, there are two alternatives for creating a database silently:
    1. Generate scripts from DBCA itself at the end and run the batch, even if you have to make some modifications to pass the DB name or other parameters dynamically each time
    2. Execute a dbca response file
    What are the differences?
    Which is the best (more reliable)?
    Windows 7 11.2g EE
    Thank you

    Those scripts are definitely much slower than directly running the dbca binary. dbca has its own way of specifying all the required parameters as command line arguments directly without any response files.
    If you're relying on response files, you need extra programming effort to fill in all the required entries based on the environment and the user's input, and the chance of introducing a vulnerability is a bit higher, as commands like sed/awk that you use to fill the response files may behave weirdly sometimes (I'm not saying this always happens, but there's a chance).
    If you're giving everything inline as arguments, you can use environment variables etc. as argument values, so there is less chance of missing something.
    You can check the dbca documentation to see if it supports more arguments; I listed whatever I used. And I'm not sure whether response files support more arguments than this (I think this covers pretty much all the important ones).
    If you want to make instance-level parameter changes, you can always do them after creating the database (and of course you can automate that by putting in a sample init file with all the required changes, like SGA size etc., and restarting the instance with that file - everything automated and executed in a flow after the initial DB creation).
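    As a rough illustration only (flag names follow the 11.2 dbca documentation; the database name, SID, passwords, and paths are placeholders), the two styles look like:

    dbca -silent -createDatabase -templateName General_Purpose.dbc -gdbname mydb -sid mydb -sysPassword change_me -systemPassword change_me

    dbca -silent -createDatabase -responseFile C:\dbca\mydb.rsp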
    CSM

  • Bouncy Castle Encryption for Large XML Files?

    Hi,
    I am trying to encrypt files > 8 KB using the KeyBasedLargeFileProcessor utility class of Bouncy Castle.
    It encrypts the file, but I am unable to decrypt the same encrypted file, so it seems the encryption is going wrong.
    While trying to decrypt the encrypted file, it says "It was encrypted with a key (4E616D65) that does not exist".
    Please suggest what change needs to be made in the 'encryptFile' method where it says -
    "OutputStream cOut = cPk.open(out, new byte[1 << 16]);"
    I tried changing the value, but it doesn't work. The maximum file size that we may encrypt is around 3 MB.
    The files that are being encrypted / decrypted are XML files.
    Any inputs are highly appreciated.
    Thanks,
    Tan

    While trying to decrypt the encrypted file, it says "It was encrypted with a key (4E616D65) that does not exist"
    So use the same key you encrypted with.
    Please suggest what change needs to be made in the 'encryptFile' method where it says "OutputStream cOut = cPk.open(out, new byte[1 << 16]);"
    Do you have some evidence that that's where the problem is?

  • Best technology to navigate through a very large XML file in a web page

    Hi!
    I have a very large XML file that needs to be displayed in my web page, maybe as a tree structure. Visitors should be able to go to nodes at any depth and access the child elements or text of those nodes.
    I thought about using a DOM parser with Java but dropped that idea, as the DOM would be stored in memory and hence is space-consuming. Neither does SAX work for me, since every time there is a click on any of the nodes, my SAX parser parses the whole document looking for the node, and that is time-consuming.
    Could anyone please tell me the best technology and best parser to be used for very large XML files?

    Thank you for your suggestion. I have a question,
    though. If I use a relational database and try to
    access it for EACH and EVERY click the user makes,
    wouldn't that take much time to populate the page with
    data?
    Isn't an XML store more efficient here? Please reply.

    You have the choice of reading a small number of records (10 children per element?) from a database, or parsing multiple megabytes. Reading 10 records from a database should take maybe 100 milliseconds (1/10 of a second). I have written a web application that reads several hundred records and returns them with acceptable response time, and I am no expert. To parse an XML file of many megabytes... you have already tried this, so you know how long it takes, right? If you haven't tried it then you should. It's possible to waste a lot of time considering alternatives - the term is "analysis paralysis". Speculating on how fast something might be doesn't get you very far.

  • I want to load a large raw XML file in Firefox and parse it by DOM, but for large XML files Firefox is very slow and sometimes crashes. Is there any option to increase DOM handling memory in Firefox?

    Actually, I am using an offline form to load a very large XML file, and Firefox to load that form. But it takes a long time to load, and sometimes the browser crashes while DOM-parsing this XML file into my form. Is there any option to increase the DOM handler size in Firefox?


  • Efficient searching in a large XML file for specific elements

    Hi
    How can I search a large XML file for a specific element efficiently (fast and memory-friendly)? I have a large XML file (approximately 32MB, with about 140,000 main elements) and I have to search through it for specific elements. What stable and production-ready open source tools are available for such tasks? I think a PDOM is a solution, but I can't find any well-known and stable implementations on the web.
    Thanks in advance,
    Behrang Saeedzadeh.

    The problem with DOM parsers is that the whole document needs to be parsed!
    So with large documents this uses up a lot of memory.
    I suggest you look at something like a pull parser (Piccolo or MPX1), which is fast and program-driven rather than event-driven like SAX. This has the advantage that you don't need to remember your state between events.
    I have used Piccolo to extract events from large xml based log files.
    Carl.

  • When bouncing- what's best method for smallest file size/highest quality?

    I am in the process of embedding 3 MP3s into a PDF to submit as a portfolio. The PDF also has text and two scores included, and with the 3 embedded MP3s it can't be more than 10 MB.
    So my question is: When bouncing a project out of Logic, what is the best method for getting the smallest file size, but retaining the best audio quality? And once it's out of Logic and it is an mp3 or other type of audio file, is there a best format for compressing it further, and still maintaining the relative quality?
    I bounced the three projects out as WAVs. Now I am using Switch for Mac to compress them down to smaller MP3s. I basically need them to be about 3 MB each. Two of the recordings sound OK at that size, but they are just MIDI (one project is piano and string quartet, the other is just piano - all software instruments). The recording that combines MIDI and audio and has more tracks (three audio tracks and 10 MIDI/software instrument tracks) sounds completely horrible if I get it under 5 MB as an MP3. The problem is that I need all three to total around 9 MB, but still sound good enough to submit as a portfolio for consideration into a Master's program.
    If anyone can help I would really appreciate it. Please be detailed in your response, because I am new to logic and I really need the step by step.
    Thank you...

    MUYconfundido wrote:
    I am in the process of embedding 3 mp3's into a PDF to submit as a portfolio. The PDF also has text, and two scores included, and with the 3 embedded mp3's it can't be more than 10mb.
    So my question is: When bouncing a project out of Logic, what is the best method for getting the smallest file size, but retaining the best audio quality?
    The highest bitrate that falls within your limits. You'll have to calculate how big your MP3s can be, then choose the bitrate that keeps the size within your limit. The formula is simple: bitrate is the number of kilobits per second, so a 46-second stereo file at 96 kbps would be 96 kbps x 46 s = 4416 kbits, and 4416 / 8 = 552 kBytes, or 0.552 MB (8 bits = 1 byte).
    So if you know the length of your tracks you can calculate what bitrate you need to keep it within 10 MB total.
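    In code form (a throwaway sketch, not from the original reply), the size-to-bitrate arithmetic is just:

    public class BitrateBudget {
        // Highest MP3 bitrate (kbps) that keeps `seconds` of audio under `maxMB`
        static int maxBitrateKbps(double seconds, double maxMB) {
            double kbits = maxMB * 1024 * 8;   // size budget in kilobits
            return (int) (kbits / seconds);
        }

        public static void main(String[] args) {
            // e.g. a 4-minute track with a 3 MB budget
            System.out.println(maxBitrateKbps(240, 3.0) + " kbps");  // ~102 kbps
        }
    }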
    I consider 128 kbps the lowest bearable bitrate for popsongs and other modern drumkit based music. Deterioration of sound quality is often directly related to the quality of the initial mix and the type of instruments used in it. Piano(-like) tones tend to sound watery pretty quickly at lower bitrates, as do crash and ride cymbals. But don't take my word for it, try it out.
    And once it's out of Logic and it is an mp3 or other type of audio file, is there a best format for compressing it further, and still maintaining the relative quality?
    You can only ZIP the whole thing after that, but that is just for transport. You'll have to unzip it again to use it. And no, you cannot compress an MP3 any further and still play it.
    I bounced out the three projects into wav's. Now I am using Switch for Mac to compress them down to smaller Mp3's.
    That is silly - you could have done that in Logic, which has one of the best MP3 encoders built in. How good an encoder is especially shows at bitrates around or below 128 kbps, which is where you might be looking.
    I basically need them to be about 3 mb each.
    So, one more scrap of info we need here: how long are those three pieces, exactly? I'll calculate the bitrate for you - but please bounce 'm directly out of Logic as MP3's. They will very probably sound better than your WAV-conversions made with Switch.
    Two of the recordings sound OK at that size, but they are just MIDI(one project is piano and string quartet, the other is just piano- all software instruments. The recording that combines MIDI and Audio and has more tracks (three audio tracks and 10 Midi/software instrument tracks)and sounds completely horrible if I get it under 5 mb as an mp3. The problem is that I need all three to equal around 9mb, but still sound good enough to submit as a portfolio for consideration into a Master's program.
    Length of the piece? And does the .Wav bounce you have sound OK?

  • What is the best method for writing Multicolum​n List data to a text file?

    I am trying to find the best method for writing the data from a multicolumn list to a text file. Say the list has 7 rows and 6 columns of data. I would like the final file to resemble the multicolumn list as closely as possible, with or without column headers. A sample VI showing how to accomplish this would be greatly appreciated. I realize this is pretty basic stuff; I can get the output to the file, but it comes out with duplicate data, and I am on a time crunch, hence my request for help.
    Thank You,
    Charlie
    Everything is Free! Until you have to pay for it.

    Hello,
    I think the answer to your question is in the example that I've just made.
    See the attached files....
    Software developer
    www.mcm-electronics.com
    PS: Don't forget to rate a good answer ; )
    Currently using Labview 2011
    PORTUGAL
    Attachments:
    Multi.vi 12 KB
    Multi.PNG 6 KB

  • What is the best method for saving files off of the hard drive?

    I would like to save music files and photos somewhere other than my hard drive (eg. CDs, DVDs, zip disks). What is the best method for doing this?

    It rather depends on how paranoid you are
    External hard drives protect you from hard disk trouble on your machine.
    Optical media (CDs, DVDs) protect you from trouble with magnetic disks - such as external hard drives and the HD in your machine. Optical disks are thought to be more reliable than magnetic disks for long-term storage.
    Off-site backups will protect you from fire or theft at your home or office.
    How paranoid are you? Personally, everything is backed up across three different external HDs. And maybe twice a year I burn a copy of my photos onto DVDs and they go to a relative's house across town.
    Regards
    TD

  • Best Encoding method for decent quality but low file size

    Been trying to encode some mountain bike helmet cam footage, which is fairly fast moving, and looking for the best method for encoding at a decent quality without too much pixelation occurring. The videos will be on my website, so I want the file size kept quite low. The website is www.extremesportsfilms.com if anyone is interested.
    Cheers

    Thank you, Harm !
    This is the first time in my career that I've seen a quality setting that doesn't affect file size.
    I went in to set the bitrate differently, but the field is greyed out - it's not possible for me to change it.
    It seems to be locked no matter what QuickTime format I choose. If I choose the h.264 format instead, I CAN change the bitrate, but I need to deliver in XDCAM for this client.
    Is there no way for me to make the XDCAM files smaller, then? They end up at 5-7 GB for each show, and with my upload speed it takes me 6-7 hours to upload; I would like to speed this up by decreasing the file sizes a little, if possible.
    Thanks

  • Is JAXB suitable for large XML files ?

    Hi,
    I have a very large XML file (~700 MB) (schema available). I need to unmarshal this into Java objects and carry out some business validation rules on it. These business rules may involve validating data from content objects that correspond to different sections of this large XML file.
    I am uncertain whether JAXB will help me here (just started on it). Does JAXB build the entire content tree for the XML document during Unmarshaller.unmarshal? Is there any way of asking it to build content objects on demand, as opposed to building the whole content tree immediately?
    All help/suggestions appreciated.

    Forgot to add:
    after carrying out validation the data is put into some RDBMS tables.
    One approach would be to convert the XML files into SQL Loader compatible flat files (using a tool). Load these flat files into staging tables. Perform business validations on staging table data and then finally move the data into the main tables. All the validating logic could either be in stored procedures or java code.
    The above is very long-winded. It would be great if JAXB can handle very large XML files (without loading the whole XML file into memory) so that business validations can be done by java, without any intermediate format conversion.
    I hope the above is somewhat clear.
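    For what it's worth, one common pattern (JAXB 2.x combined with StAX, so newer than the JDK era of this thread; the Record class and "record" element name are invented for illustration) is to cursor through the document and unmarshal one repeating element at a time, so only one object is ever in memory:

    import java.io.FileInputStream;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    public class PartialUnmarshal {
        public static void main(String[] args) throws Exception {
            XMLStreamReader xsr = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new FileInputStream("big.xml"));
            Unmarshaller um = JAXBContext.newInstance(Record.class).createUnmarshaller();
            while (xsr.hasNext()) {
                if (xsr.getEventType() == XMLStreamConstants.START_ELEMENT
                        && "record".equals(xsr.getLocalName())) {
                    // builds one Record; the rest of the file stays on disk
                    Record r = um.unmarshal(xsr, Record.class).getValue();
                    // validate r, write it to a staging table, then discard it
                } else {
                    xsr.next();
                }
            }
        }
    }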

  • Loading, processing and transforming Large XML Files

    Hi all,
    I realize this may have been asked before, but searching the history of the forum isn't easy, considering it's not always a safe bet which words to use on the search.
    Here's the situation. We're trying to load and manipulate large XML files of up to 100MB in size.
    The difference between what we have in our hands and other related issues posted is that the XML isn't big because it has a largely branched tree of data, but rather because it includes large base64-encoded files in the XML itself. The size of the 'clean' XML is relatively small (a few hundred bytes to some kilobytes).
    We had to deal with transferring the xml to our application using a webservice, loading the xml to memory in order to read values from it, and now we also need to transform the xml to a different format.
    We solved the webservice issue using XFire.
    We solved the loading of the XML using JAXB. Nevertheless, we use string manipulation to 'cut' the XML before we load it into memory - otherwise we get OutOfMemory errors. We don't need to load the whole XML into memory, but I really hate this solution because of the 'unorthodox' manipulation of the XML (i.e. the cutting of it).
    Now we need to deal with the transformation of those XMLs, but obviously we can't cut them down this time. We have little experience writing XSL, and no experience applying XSL files from Java. We're looking for suggestions on how to do it most efficiently.
    The biggest problem we encounter is the OutOfMemory errors.
    So I ask several questions in one post:
    1. Is there a better way to transfer the large files using a webservice?
    2. Is there a better way to load and manipulate the large XML files?
    3. What's the best way for us to transform those large XMLs?
    4. Are we missing something in terms of memory management? Is there a better way to control it? We really are struggling there.
    I assume this is an important piece of information: We currently use JDK 1.4.2, and cannot upgrade to 1.5.
    Thanks for the help.

    I think there may be a way to do it.
    First, for low RAM needs, nothing beats SAX as the first processor of the data. With SAX, you control the memory use, since SAX only processes one "chunk" of the file at a time. You supply a class with methods named startElement, endElement, and characters. It calls the startElement method when it finds a new element. It calls the characters method when it wants to pass you some or all of the text between the start and end tags. It calls endElement to signal that passing characters is over, and to let you get ready for the next element. So, if your characters method did nothing with the base-64 data, you could see the XML go by with low memory needs.
    Since we know that in your case the characters will come in large chunks, you can expect many calls as SAX passes the data to your code. The only workable solution is to use a StringBuffer to accumulate the data. When endElement is called, you can decode the base-64 data and keep it somewhere. The most efficient way to do this is to have one StringBuffer for the class handling the SAX calls. Instantiate it with a big enough size to hold the largest of your binary data streams. In startElement, you can set the length of the StringBuffer to zero and reuse it over and over.
    You did not say what you wanted to do with the XML data once you have processed it. SAX is nice from a memory perspective, but it makes you do all the work of storing the data. Unless you build a structured set of classes "on the fly" nothing is kept. There is a way to pass the output of one SAX pass into a DOM processor (without the binary data, in this case) and then you would wind up with a nice tree object with the rest of your data and a group of binary data objects. I've never done the SAX/DOM combo, but it is called a SAXFilter, and you should be able to google an example.
    So, the bottom line is that is is very possible to do what you want, but it will take some careful design on your part.
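    For illustration, a compact version of that handler (the "attachment" element name and the java.util.Base64 decoder are assumptions; Base64 arrived in Java 8, so on an older JDK another decoder would be substituted):

    import java.io.File;
    import java.util.Base64;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    public class Base64ElementHandler extends DefaultHandler {
        private final StringBuffer text = new StringBuffer(1 << 20); // reused buffer
        private boolean inBlob;

        public void startElement(String uri, String local, String qName, Attributes a) {
            inBlob = "attachment".equals(qName);   // hypothetical element name
            text.setLength(0);                     // reset the buffer, keep capacity
        }

        public void characters(char[] ch, int start, int length) {
            if (inBlob) text.append(ch, start, length);  // may be called many times
        }

        public void endElement(String uri, String local, String qName) {
            if (inBlob) {
                // MIME decoder tolerates the line breaks typical in XML text
                byte[] data = Base64.getMimeDecoder().decode(text.toString());
                // hand `data` off (e.g. write it to disk); nothing else is retained
                inBlob = false;
            }
        }

        public static void main(String[] args) throws Exception {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new File("big.xml"), new Base64ElementHandler());
        }
    }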
    Dave Patterson
