Fastest way to read a 5MB textfile (200,000 lines)?

My current code:
String file = "";
BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(filename)));
while(in.ready()) file = file.concat(in.readLine());

Unfortunately it is awfully slow (starting at 2,000 lines per second but dropping to about 100 lines per second over time). What is the fastest way to load the content of that file into a string, and out of curiosity, why are my lines per second dropping?
Thank you for your help!
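
A likely explanation for the slowdown, offered as a sketch rather than a measured result: `String` is immutable, so every `concat` copies everything read so far, and the total work grows roughly quadratically with the file size. Appending to a `StringBuilder` avoids the repeated copying, and on Java 11 or later `Files.readString` reads the whole file in one call. The class and method names below (`FastFileRead`, `readWithBuilder`, `readAtOnce`) are made up for illustration, and UTF-8 is an assumed encoding. Note that `readLine()` strips line terminators, so this version puts a `'\n'` back after each line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FastFileRead {
    // Line-by-line read that appends to a StringBuilder instead of
    // concatenating Strings, so appends are amortized constant time.
    static String readWithBuilder(Path path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null at end of file; ready() is not an EOF test.
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Whole-file read in one call (Java 11+); keeps the original line terminators.
    static String readAtOnce(Path path) throws IOException {
        return Files.readString(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FastFileRead <file>");
            return;
        }
        String text = readWithBuilder(Path.of(args[0]));
        System.out.println(text.length() + " chars read");
    }
}
```

For a 5MB file either method should finish quickly; which one is faster on a given machine is something to measure rather than assume.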

tjacobs01 wrote:
This is exactly what I've done - it's the fastest thing out there. One thing to comment on though: Using a BufferedInputStream is not necessary; BufferedInputStream only improves performance if you're doing inefficient reading (such as reading blocks by newlines). If you doubt what I'm saying set up a quick performance test bed and test it... I did this a while back which is why I know the truth :)

I believe you. Thanks for the tip.
In my case I needed the BIS because I only wanted to load the 40 MB audio file in 1 MB chunks at a time to process it and store it in my own representation.

Similar Messages

  • Haskell: fastest way to read and print a file? & hGetBuf example?

    I was playing around with simple IO in Haskell just for the sake of learning and decided to implement a very basic version of "cat" for a rough speed comparison. Being a Haskell noob, I tried various combinations of readFile and hGetContents from System.IO before I realized that they read a single Char at a time, which explained why they were slower than even Python and Perl. I found the hGetBuf function but I couldn't figure out how to get my data back from the Ptr. Eventually I ended up with this:
    import System.Environment (getArgs)
    import qualified Data.ByteString as B
    main = getArgs >>= (B.readFile.head) >>= B.putStr
    time cat /var/log/pacman.log | wc
    297783 1945570 32536814
    real 0m0.624s
    user 0m0.620s
    sys 0m0.017s
    time ./test /var/log/pacman.log | wc
    297783 1945570 32536814
    real 0m0.671s
    user 0m0.640s
    sys 0m0.040s
    where "test" is the program above compiled with "ghc --make test.hs -o test -O".
    So, my questions:
    1) Is there any way to shave off the last few milliseconds to make it as fast as cat for that operation?
    2) Can someone give a clear example of how to use hGetBuf, specifically how to "marshall" a Ptr into a String or ByteString?
    I've tried to find examples but the very few that I've found haven't been that helpful.
    Slightly off-topic yet tangential questions:
    *) Does anyone else find it irksome that you have to dig through Haskell's abstraction to deal with the underlying system? I'm only just beginning to learn it but I've already come across a few things which feel like square blocks hammered into round holes.

    @CBM80
    Thanks for the link. I've bookmarked it for now.
    @brisbin33
    I've been coding a bit in Haskell lately and it feels like I'm getting comfortable with it, including monads (e.g. state transformers, system IO). I think I'm over the first big conceptual stumbling block and it's starting to feel much more intuitive. Nevertheless I still generally agree with that statement, although now I would have worded it differently.
    It feels like they've pulled a sheet over the underlying system. You can still see the basic shape of it and where different bits protrude, but you're supposed to pretend that it's all nice and smooth. If you want to do anything serious with it, you have to cut a hole through the sheet (i.e. use the foreign function interface and another language which can get to it) to gain full access.
    When I wrote that I think I actually had Ints and Integers in mind. The difference between them is only how the underlying system represents them. From a purely abstract|functional perspective, there should be no distinction between them. I understand that the language is ultimately bound by the hardware and I actually appreciate that the distinction can enable the programmer to optimize his code. It just annoys me that there is a layer of abstraction that prevents me from exerting more control over the types. For example, it would be nice to be able to declare my own "PositiveInt" type and have it represented by an unsigned int in the underlying system. I can "see" from within Haskell how "Int" is represented, but I can't create a similar representation (without resorting to the FFI, afaik). There has been talk about this before and how to make all types user-definable. Some tutorials even give the dummy declaration of Int as "data Int = ...|-3|-2|-1|0|1|2|...". Obviously you would need some sort of meta-language to handle it though (which seems to be in the works).
    For the record, I think the FFI is great and I see great potential in it and can't wait to start using it.
    At some point I solved some of the problems on Project Euler (and got sidetracked with Perl and Python solutions). My quest to learn Haskell has also been periodic and so far I haven't gone back to PE but I intend to. Btw, I found it insightful to compare my solutions to the solutions posted in the wiki. They're full of little epiphanies.
    *edit*
    I realize that many apparent limitations are likely due to my own ignorance. I'm reading through the pages posted by CBM80 and already see some ways to get around the abstraction.
    Last edited by Xyne (2010-05-06 19:03:17)

  • Fastest way to read and write to/from multiple DAQ channels

    I'm developing an application in which multiple DAQ channels will be monitored, for example, displacement, load, strain, temperature and so on. My question is what is the fastest and most efficient way to check the value of these inputs, compare them to user inputs and then decide whether or not to write a different voltage to an output. I'm using LV 7.1 on Win 2000.

    I guess my first question would be are all these channels going to be on one card? Are you going to just make single measurements (are you interested in instantaneous values vs. "waveforms")? Doing a "Sample channels" for instance will make a single measurement of several channels (on the same board) and present you with an array of the resulting data, one element per channel. How fast do you need to sample, are all of the signals going to be on the same board? The latest NI-Daq allows configuring tasks that mix channel types (T'couples, straight voltage measurements, etc.) which makes configuring the DAQ easier.
    If you can give us a little more information regarding what you are trying to accomplish it will help.
    P.M.
    Putnam
    Certified LabVIEW Developer
    Senior Test Engineer
    Currently using LV 6.1-LabVIEW 2012, RT8.5
    LabVIEW Champion

  • What is the fastest way to detect on to off tag transitions and then read 500 analog tags?

    what would be the fastest way to read 500 analog tags from the tag engine when a boolean tag transitions from on to off?? Right now I have a boolean indicator setup with the HMI wizard in a while loop with 20ms timer. The indicator feeds the boolean crossing ptbypt vi. When the output is true, I use one read multiple tags vi to get all 500 at once. I am reading data into the tag engine through an opc server and have around 2500+ tags. I need to read all of the data in less than 100ms. My plc logic is setup to zero out all of the 500 analog tags when the boolean indicator turns on again. Would I be better off using the trend tags vi to monitor the boolean indicator??

    Unclebump,
    You might try using read tag.vi

  • Fast way to read and write console

    hi guys
    what's the fastest way to read and write from console?
    For writing I'm using (I think that's the fastest way)
    System.out.println(foo)
    and I have to read the following - all numbers are ints (that means m + n + 1 lines):
    1 * "<n> <m>"
    n * "<a> <b> <c>"
    m * "<a> <b>"
    Currently I'm reading (the second line for example) with
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String[] items = br.readLine().split("[^\\-\\.0-9]+");
    Target mainTarget = new Target(Integer.parseInt(items[0]), Integer.parseInt(items[1]), Integer.parseInt(items[2]));
    for (int i = 1; i < m; i++) {
        items = br.readLine().split("[^\\-\\.0-9]+");
        mainTarget.addTarget(Integer.parseInt(items[0]), Integer.parseInt(items[1]), Integer.parseInt(items[2]));
    }
    But that isn't really fast...
    Do you have any idea?
    grz faetzminator

    faetzminator wrote:
    right, I have 1 + 1 + 1 =3 up to 10^6-1 + 10^6-1 + 1 =2*10^6-1 input lines with max 4999997 integers (and I have to store 2999999 of them)
    ok, if that is the fastest way... (maybe the reading of the input file is so slow; I run eg "java Foo < input > output")

    All those br.readLine calls where you never check if the return is null are dangerous.
    I really don't understand what you are doing. You mentioned System.in at first and now reading from files.
    I also am pretty sure that you are in fact running out of memory.
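
    One common way to speed up this kind of input, sketched here under assumptions rather than taken from the thread: split each line with `StringTokenizer` instead of a regular expression, check `readLine()` for `null`, and send output through a buffered `PrintWriter` that is flushed once at the end. The class name `FastConsoleIO` and its `nextInt` method are hypothetical, and the `main` method only sums the triples as a stand-in for the real processing.

    ```java
    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import java.io.Reader;
    import java.util.StringTokenizer;

    class FastConsoleIO {
        private final BufferedReader in;
        private StringTokenizer tokens = new StringTokenizer("");

        FastConsoleIO(Reader source) {
            this.in = new BufferedReader(source);
        }

        // Returns the next whitespace-separated int, reading more lines as needed.
        // Throws EOFException if the input ends early instead of failing on null.
        int nextInt() throws IOException {
            while (!tokens.hasMoreTokens()) {
                String line = in.readLine();
                if (line == null) {
                    throw new EOFException("unexpected end of input");
                }
                tokens = new StringTokenizer(line);
            }
            return Integer.parseInt(tokens.nextToken());
        }

        public static void main(String[] args) throws IOException {
            FastConsoleIO io = new FastConsoleIO(new InputStreamReader(System.in));
            // Buffered output: one flush at the end instead of one per println.
            PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(System.out)));
            int n = io.nextInt();
            int m = io.nextInt();
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += (long) io.nextInt() + io.nextInt() + io.nextInt(); // "<a> <b> <c>"
            }
            for (int i = 0; i < m; i++) {
                sum += (long) io.nextInt() + io.nextInt();                // "<a> <b>"
            }
            out.println(sum);
            out.flush();
        }
    }
    ```

    Run it the same way as in the thread, for example `java FastConsoleIO < input > output`. If memory really is the limit, as suggested above, the parsing change alone will not help.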

  • Fastest way to grant cube permissions per AMO (250 roles, 30 cubes)?

    Hi there,
    can anybody tell me the fastest way to grant cube permissions in a scenario, where for example 250 roles have to be granted for 30 cubes?
    Right now I do it with AMO, iterating through the roles and setting cube permissions.
    My method for granting access looks like this:
    public void GrantCubePermission(Role pRole, Database pDatabase, string pCubeName, ReadAccess pReadAccess, WriteAccess pWriteAccess, ReadSourceDataAccess pReadSourceDataAccess, bool pProcess, ReadDefinitionAccess pReadDefinitionAccess)
    {
        try
        {
            if (pRole == null) return;
            Cube cube = pDatabase.Cubes.FindByName(pCubeName);
            if (cube == null) return;
            CubePermission cubePermission = cube.CubePermissions.FindByRole(pRole.ID);
            if (cubePermission == null)
                cubePermission = cube.CubePermissions.Add(pRole.ID);
            cubePermission.Read = pReadAccess;
            cubePermission.Write = pWriteAccess;
            cubePermission.ReadSourceData = pReadSourceDataAccess;
            cubePermission.Process = pProcess;
            cubePermission.ReadDefinition = pReadDefinitionAccess;
            cubePermission.Update(UpdateOptions.AlterDependents, UpdateMode.UpdateOrCreate);
        }
        catch (Exception ex)
        {
            Msg(ex.ToString(), "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
        }
    }
    Doing it this way, the operation takes about 4 seconds per role (the given method is executed 30 times per role, once for each cube to be granted).
    In total, for 250 roles, the operation takes about 16 minutes.
    Is there a way to do it faster?

    Did you consider XMLA?
    <Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
        <ParentObject>
            <DatabaseID>DAtabasename</DatabaseID>
        </ParentObject>
        <ObjectDefinition>
            <Role xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2"
                  xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2"
                  xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100"
                  xmlns:ddl200="http://schemas.microsoft.com/analysisservices/2010/engine/200"
                  xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200"
                  xmlns:ddl300="http://schemas.microsoft.com/analysisservices/2011/engine/300"
                  xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300"
                  xmlns:ddl400="http://schemas.microsoft.com/analysisservices/2012/engine/400"
                  xmlns:ddl400_400="http://schemas.microsoft.com/analysisservices/2012/engine/400/400">
                <ID>Role</ID>
                <Name>ReadRole</Name>
                <Members>
                    <Member>
                        <Name>domain\user</Name>
                    </Member>
                </Members>
            </Role>
        </ObjectDefinition>
    </Create>

  • Fastest way to write array of million longs to file

    Hi,
    I have an array of 200 million longs. What is the fastest way to write this array to a file, so that I can read it back later. I tried looping, but that is too slow.
    Thanks,
    Taran

    Maxideon wrote:
    Can't he store a portion of those longs into a very large byte array, write the array, and then rinse and repeat? This would increase size of the file writes and should speed up the time...

    It's still looping. Simply using buffering will decrease the I/O overhead significantly. No need to munge pre-I/O buffering ourselves.
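
    The "simply use buffering" advice could look roughly like the sketch below. It is one possible reading of the reply, not the poster's code: the class name `LongArrayFile`, the 64 KB buffer size, and the file layout (an `int` length followed by the values) are all assumptions made for illustration.

    ```java
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    class LongArrayFile {
        // Writes the length, then every value, through a 64 KB buffer so the
        // loop does not trigger a system call per long.
        static void write(Path path, long[] values) throws IOException {
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(path), 1 << 16))) {
                out.writeInt(values.length);
                for (long v : values) {
                    out.writeLong(v);
                }
            }
        }

        // Reads back a file produced by write().
        static long[] read(Path path) throws IOException {
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(Files.newInputStream(path), 1 << 16))) {
                long[] values = new long[in.readInt()];
                for (int i = 0; i < values.length; i++) {
                    values[i] = in.readLong();
                }
                return values;
            }
        }
    }
    ```

    The code still loops over the array, as the reply points out; the buffer is what keeps each `writeLong` from turning into its own disk write. Holding 200 million longs takes about 1.6 GB of heap on its own, so the JVM needs to be started with enough memory.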

  • What is the best way to read and manipulate large data in excel files and show them in Sharepoint

    Hi ,
    I have a large excel file that has 700,000 records in it. The excel file has a few columns that change every day.
    What is the fastest and most efficient way to read the data from the Excel file?
    Second problem:
    I have one Excel file with many rows; each row contains some data that has certain keywords.
    What I want is to segregate the rows into respective sheets (tabs) in the workbook.
    For example, the rows have the following data:
    1. Alfa
    2. beta
    3. gama
    4. beta
    5. gama
    6. gama
    7. alfa
    In Excel, I want there to be 3 tabs now, one for each of the keywords alfa, beta and gamma.

    Hi,
    I don't really see any better options for SharePoint. SharePoint uses another product called 'Office Web App' to allow users to view/edit Microsoft Office documents (Word, Excel, etc.). But the web version of Excel doesn't support that many records, and there are size limitations as well (the default max size is probably 10MB).
    Regarding the second problem, I think you need some custom solution (like a SharePoint timer job/web part) to read and present the data.
    However, if you can reduce the Excel file to something near 16k records (which is the number of rows supported in the web version of Excel), then you can use the SharePoint Excel service to refresh data automatically in the Excel file in SharePoint from some external sources.
    Thanks,
    Sohel Rana
    http://ranaictiu-technicalblog.blogspot.com

  • What is the fastest way of getting data?

    With a scanning electron microscope, I need to scan a 512*512 pixel area with a pixel repetition of 15000 (two channels), meaning averaging over 15000 measurements. Simultaneously I have to adjust the voltage output for every pixel.
    I am using a 6111E Multifunction I/O board in an 800MHz P3. The whole task has to be done as fast as possible (not more than 20 minutes altogether).
    What is the fastest way to get this huge amount of data with averaging and output in between? (E.g. do I use buffered read with hardware triggering or is there a faster way?)

    Using the NI-DAQ API (not LabView) will give you significantly more control over what happens to the data stream and when, which translates to a more efficient program. But you need to program in C/C++ or Delphi then. The Measurement Studio provides ActiveX controls that are like the LabView ones for C and C++ (they're slow like the LabView ones though; there's not a lot you can do about the Windows GDI).
    What are you trying to sample 15000 times? The 512*512 pixel field?
    That's almost 15Gigs of data! And it means you need to process data at 12.8MB/s to finish it in 20 minutes. I hope you know C, x86 assembly and MMX.
    I would set up a huge circular buffer (NI-DAQ calls them "double buffers"), about 30 seconds' worth or so, to use with SCAN_Start. Then I would process the actual buffer the card is DMA'ing the data into with a high priority thread. Progressively sum the scan values from the 16bit buffer (the samples are only 12 bit, but the buffer should still be 16bits wide) into a secondary buffer of DWORDs the size of the screen (512*512), and you'll need two of those, one for each channel. Once the 15000 scans are complete, convert each entry into a float, divide by 15000.0f, and store it in a third buffer of floats.
    If you wish to contract this out, send me an email at [email protected]
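
    The summing and averaging step described above can be sketched as follows. This is an illustration only: it is written in Java rather than the C the reply has in mind, it does not touch NI-DAQ at all, the class name `ScanAverager` is hypothetical, and it assumes the samples arrive as signed 16-bit values with one sample per pixel per scan.

    ```java
    class ScanAverager {
        private final int[] sums;   // 32-bit accumulators, one per pixel
        private int scans;          // number of scans added so far

        ScanAverager(int pixels) {
            this.sums = new int[pixels];
        }

        // Adds one full scan (one 16-bit sample per pixel) to the running sums.
        // 15000 scans of 16-bit samples stay well inside the int range.
        void addScan(short[] scan) {
            if (scan.length != sums.length) {
                throw new IllegalArgumentException("scan size does not match pixel count");
            }
            for (int i = 0; i < scan.length; i++) {
                sums[i] += scan[i];
            }
            scans++;
        }

        // Converts the sums to per-pixel averages once all scans are in.
        float[] averages() {
            if (scans == 0) {
                throw new IllegalStateException("no scans added");
            }
            float[] avg = new float[sums.length];
            for (int i = 0; i < sums.length; i++) {
                avg[i] = sums[i] / (float) scans;
            }
            return avg;
        }
    }
    ```

    Following the reply, there would be one `ScanAverager` of size 512*512 per channel, each fed 15000 scans before `averages()` is called.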

  • Fastest way to transfer information from my old MacBook Pro to a new one?

    What is the fastest way to transfer data between MacBook Pros? It takes forever on Wi-Fi.

    Sometimes the fastest way is to transfer only your files via an external hard drive, dragging and dropping them into the same named accounts on the new machine. This is especially useful if you want to avoid carrying over corruption from the old machine and want a chance to clean house of older files.
    By the time one fiddles around finding a Firewire cable and adapters, a transfer with a standard USB 3 external hard drive, which can be bought at any local computer or office store, could already be done.
    Also, Retina models have SSDs, and any data on them is not scrubbable the way hard drives can be. Scrubbing SSDs would involve filling the entire drive, and this would wear them out prematurely as they have limited writes to each sector.
    If you want to use Migration or Setup Assistant, there is some element of risk involved because you're copying over any corruption that exists on the previous machine. If you know the software on the older machine is fine, then you can use Carbon Copy Cloner to clone the whole OS X boot partition to the external USB drive, then hook this up to the new machine and run Migration/Setup Assistant against it, just like you would in Firewire Target Disk Mode, if you don't want to wait to get a Thunderbolt to Firewire adapter from Apple.
    If you already have the Thunderbolt to Firewire adapter and a cable/adapter to the old Mac, then that's the fastest way.

  • Fastest way to load a BufferedImage on a JPanel

    I am looking for the fastest way to display a BufferedImage on a JPanel.
    I am using JAI to take in photo files (JPG, BMP, GIF, TIFF, PNG) and create thumbnails (BufferedImage).
    I was reading through the forums and saw you can either
    1) override the painting method (the one that takes a Graphics) or
    2)imageicon->JLabel->JPanel
    Currently, I am doing number 2, but I was wondering what the best way truly is.

    As you aren't doing any animation or that kind of thing, using Swing components will work just fine.
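
    For comparison, the two options from the question might look like this. It is a minimal sketch: `ThumbnailPanel` and `asLabel` are made-up names, and the image is assumed to be already loaded and scaled (the JAI part is left out).

    ```java
    import java.awt.Dimension;
    import java.awt.Graphics;
    import java.awt.image.BufferedImage;
    import javax.swing.ImageIcon;
    import javax.swing.JLabel;
    import javax.swing.JPanel;

    class ThumbnailPanel extends JPanel {
        private final BufferedImage image;

        ThumbnailPanel(BufferedImage image) {
            this.image = image;
            setPreferredSize(new Dimension(image.getWidth(), image.getHeight()));
        }

        // Option 1: custom painting. The panel draws the image itself.
        @Override
        protected void paintComponent(Graphics g) {
            super.paintComponent(g);          // let the panel paint its background first
            g.drawImage(image, 0, 0, null);   // draw at the top-left corner, unscaled
        }

        // Option 2: wrap the image in an ImageIcon on a JLabel and add
        // that label to any ordinary JPanel.
        static JLabel asLabel(BufferedImage image) {
            return new JLabel(new ImageIcon(image));
        }
    }
    ```

    For static thumbnails both routes end up drawing the same image, so the label version is usually the simpler choice; custom painting becomes more interesting once scaling, overlays or animation are involved.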

  • Fastest way to create child class from parent?

    As the subject states, what do you folks find is fastest when creating child classes directly from the parent? (esp. when the parent is in a lvlib) I thought I'd post up and ask because the fastest way I've found to get working takes a few steps.
    Any suggestions are appreciated!
    -pat

    Thanks for the quick response Ben!
    Yea, I apologize, reading your response I realize my OP was more than vague haha (it happens when you get used to your own way of doing things I guess huh). I'm trying to create a child from a parent so that it has all of the methods that the parent has.
    In order to do so I currently have to open and close LV a few times so that VIs in memory don't get mixed up. Currently I save a copy of the parent class in a sub dir of where it is saved, close out of LV, open the new 'copy of parent.lvclass', save as>>rename 'child class.lvclass', close LV, and open up the project to 'add file', then right click>>properties>>inheritance.
    Is this the only way to do this?
    Thanks again!
    -pat
    p.s. I'm tempted to steal your cell phone sig, hope you dont mind haha good stuff!

  • Fastest Way To Create Ultrabeat Instrument From Wav Loop?

    Hello, can anyone share the fastest way to create an instrument from a WAV file?
    I know you can convert a WAV to an EXS24 instrument but I use Ultrabeat, not EXS.
    I'm finding myself dividing each shot into a region and then exporting that to its own file.
    Is there a faster/simpler way?
    -Thanks

    You're right! It does work, however, if you follow these steps:
    1. Right-click on wave file in Arrange and choose "Slice at Transient Markers" or use the corresponding key command.
    2. Select all the split regions right-click on them and choose: Convert > Convert to new audio file(s)
    3. Again, with all the regions selected choose: Convert to new sampler track.
    4. Now launch Ultrabeat. Click on the Import button and choose the .exs file you've just created (will be located in your project folder). Then drag the audio files into UB as desired.
    You could skip Part 3 altogether and simply slice at transients, convert to new audio file(s) and then in a new kit of UB drag them into the sample window of OSC for each part...

  • Fastest way to create buttons from TOC entries?

    I am trying to make a TOC interactive by applying rollovers and links to each entry. What is the fastest way to create buttons out of each entry? Would the "create outlines" command be appropriate? As it stands now, I am copying an entry, pasting it into a new frame, aligning it over the original entry, then converting to button and applying behaviors. There's got to be a more efficient way to do this!
    Using CS3 Design Standard.
    Thanks for any help you can provide.

    Matt,
    Sorry about the attachment. Not sure what happened
    See
    Announcement: File Attachments temporarily disabled
    above. The "temporarily" was an understatement at the time -- this has been in effect for, what? the past 2 years? Now it's just a poignant Daily Reminder of Jive's many shortcomings.

  • Fastest way to create a track of Whole notes beginning to end

    I am trying to create a tapper track consisting of a series of low G (G2) whole notes.
    What is the fastest way of doing this? (They really should be quantized.) And they should also be on beat one only, even through an occasional meter change.
    In Logic you could drag blocks. I realize I could simply loop and then convert the loop to real MIDI...
    But would this run amuck with meter changes? (Remember I just want notes beginning on beat one, and ONLY beat one.)
    -Thanks

    Where you have sig changes, copy the region, and adjust its length to fit the meter … you only need 1 region, rather than try to anticipate the myriad of time sigs that may crop up.
    I understand exactly what you mean, and it makes a lot of sense. I think it's strictly a question of the pattern of usage. If his style when writing is to frequently throw in some odd time sig that he hasn't used much before and is not likely to use again anytime soon, then what you say seems like the best approach. On the other hand, I'm assuming he's more in a situation where he tends to throw in, say, a measure of 2/4, 3/4 or 5/4 every now and then. And it's just a few sigs like that which he tends to use frequently. Therefore it might be slightly easier to have a few regions like that already prepared, and in a place where I can grab them easily. If I have them marked clearly by name and color, I can feel comfortable that I'm avoiding an error. Whereas if I'm creating them over and over again each time I need one, the snap settings and zoom level are relevant, and it's a little too easy to make a careless mistake, if the phone rings at the wrong time.
    Also, he sometimes wants whole notes, and sometimes quarter notes. And maybe there are some other variations like this. This would also be a reason to have some material prepared that I can use over and over again.
    But of course if one day I decide to emulate Stockhausen and throw in a bar in 142/8 time, it makes sense to create that only when I need it, because if I create that region in advance it probably won't get used much.
    So in practice I could picture him doing it a combination of both ways, because chances are certain sigs are used often and others are used rarely. I think it really depends on the writing style, and the pattern of how the unusual sigs are distributed. And also if he sees an advantage in the marking by color. That could be considered helpful, or it could be considered ugly and distracting.
