Java library for large list sorts in small amount of memory

Hi All
I'm wondering whether anybody knows of a ready-made Java library for sorting large lists of objects. We will write it ourselves if necessary, but I'm hoping this is the sort of problem that has been solved a million times and has been wrapped behind some nice interface.
What we need to do is something along the lines of:
Load a large list with objects (duh!)
If that list fits into an allowed amount of memory great. Sort it and end of story.
If it won't, then have some mechanism for spilling over onto disk and perform the best sort algorithm possible given that spill-over.
These sorts need to be used in scenarios such as taking a random selection of 10% of the records in a 100 GB unsorted file and delivering it sorted, with the jobs doing the sorts being limited to as little as 64 MB of memory (these are extremes, but ones that happen quite regularly).
Some of you might think "just use a database". Been there. Done that. And for reasons beyond the scope of this post, it's a no-goer.
Thanking you for all your help in advance.

Of course, this kind of sorting was common back in the days of yore when we had mainframes with less than 1 MB of RAM.
The classic algorithm uses several files.
1) Load as much as will fit
2) Sort it
3) Write it to a temporary file
4) Repeat from 1, alternating output between two or three temp files.
Your temporary files then each contain a series of sorted sequences (try saying that three times fast).
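Steps 1 to 3 can be sketched in Java roughly as follows. This is only a minimal sketch under some assumptions that are not in the post: records are lines of text, the memory budget is approximated by a fixed line count (`maxLinesPerRun`), and each sorted sequence goes into its own temp file rather than alternating between two or three files. The names `SortedRunWriter`, `writeRuns` and `flush` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: splits the input into sorted runs on disk. */
final class SortedRunWriter {

    /**
     * Reads lines from the input, sorts at most maxLinesPerRun of them at a
     * time in memory, and writes each sorted chunk to its own temp file.
     * Assumption: a line count stands in for the real memory budget.
     */
    static List<Path> writeRuns(BufferedReader input, int maxLinesPerRun, Path tempDir)
            throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> chunk = new ArrayList<>(maxLinesPerRun);
        String line;
        while ((line = input.readLine()) != null) {
            chunk.add(line);
            if (chunk.size() == maxLinesPerRun) {
                runs.add(flush(chunk, tempDir));
            }
        }
        if (!chunk.isEmpty()) {
            runs.add(flush(chunk, tempDir)); // last, possibly shorter, run
        }
        return runs;
    }

    /** Sorts the chunk, writes it to a new temp file, and empties the chunk. */
    private static Path flush(List<String> chunk, Path tempDir) throws IOException {
        Collections.sort(chunk);
        Path run = Files.createTempFile(tempDir, "run-", ".txt");
        Files.write(run, chunk);
        chunk.clear();
        return run;
    }

    public static void main(String[] args) throws IOException {
        Path tempDir = Files.createTempDirectory("extsort");
        BufferedReader input = new BufferedReader(new StringReader("d\nb\na\nc\ne\n"));
        // With two lines per run, the five input lines end up in three run files.
        for (Path run : writeRuns(input, 2, tempDir)) {
            System.out.println(run.getFileName() + " -> " + Files.readAllLines(run));
        }
    }
}
```

In a real job the chunk size would more likely be derived from an estimate of record size against the allowed heap, and the temp files would be deleted once the merge is done.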
You then merge these sequences into longer sequences, creating more temp files. Then merge the longer sequences into even longer ones, and so on until you're left with just the one sequence.
Merging doesn't require any significant memory use however long the sequences. You only need to hold one record from each temp file being merged.
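
The merge step might look like the sketch below. It assumes line-oriented records and merges all runs in one pass with a `java.util.PriorityQueue` (a k-way merge) instead of the repeated two- or three-file merges described above; `RunMerger` and `Cursor` are hypothetical names. Only the current line of each run is held, plus whatever each `BufferedReader` buffers internally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper: merges already-sorted runs into one sorted output. */
final class RunMerger {

    /** One open run plus the single record currently buffered from it. */
    private static final class Cursor implements Comparable<Cursor> {
        final BufferedReader reader;
        String current;

        Cursor(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.current = reader.readLine();
        }

        /** Moves to the next record; returns false when the run is exhausted. */
        boolean advance() throws IOException {
            current = reader.readLine();
            return current != null;
        }

        @Override
        public int compareTo(Cursor other) {
            return current.compareTo(other.current);
        }
    }

    /** Assumption: every run is sorted in natural String order, one record per line. */
    static void merge(List<BufferedReader> sortedRuns, Writer out) throws IOException {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (BufferedReader run : sortedRuns) {
            Cursor cursor = new Cursor(run);
            if (cursor.current != null) {
                heap.add(cursor); // skip empty runs
            }
        }
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            out.write(smallest.current);
            out.write('\n');
            if (smallest.advance()) {
                heap.add(smallest); // re-insert with its next record
            }
        }
    }

    public static void main(String[] args) throws IOException {
        List<BufferedReader> runs = Arrays.asList(
                new BufferedReader(new StringReader("b\nd\n")),
                new BufferedReader(new StringReader("a\nc\n")),
                new BufferedReader(new StringReader("e\n")));
        StringWriter out = new StringWriter();
        merge(runs, out);
        System.out.print(out); // prints a, b, c, d, e on separate lines
    }
}
```

With many runs the number of simultaneously open files can become the limiting factor, in which case merging in several passes, as in the classic scheme, is the usual fallback.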

Similar Messages

  • HELP Java library for interpolation

    Hi, for a text recognition project we need a Java library for improving pixel resolution. Right now we get a low pixel resolution from our device, and the results of the text recognition software are not very good. Our idea is to interpolate the recognized pixels. Does anyone know a suitable Java library?

    rene1000 wrote:
    | Hi, for a text recognition project we need a Java library for improving pixel resolution. Right now we get a low pixel resolution from our device, and the results of the text recognition software are not very good.
    This is not possible in any language or any platform anywhere in the universe.
    | Our idea is to interpolate the recognized pixels. Does anyone know a suitable Java library?
    I'm confused, now it sounds to me like you just want to scale the image so it's larger?
    Please try to clearly describe what you need to do.

  • Java library for JPG Compression

    I have to compress the JPG images with good quality. Please suggest a good java library for image compression.
    Thanks

    Demo:
    import java.awt.*;
    import java.net.*;
    import java.io.*;
    import java.util.*;
    import javax.imageio.*;
    import java.awt.image.*;
    import javax.swing.*;
    import javax.swing.event.*;
    public class ImageCompressExample implements Runnable, ChangeListener {
        private BufferedImage original;
        private JLabel compressedLabel;
        private JSlider slider;
        private int byteCount;
        public ImageCompressExample(String url) throws IOException {
            original = ImageIO.read(new URL(url));
        }
        // Builds the UI: original image on top, recompressed image below, quality slider at the bottom.
        public void run() {
            slider = new JSlider();
            slider.setMajorTickSpacing(10);
            slider.setPaintTicks(true);
            slider.setPaintLabels(true);
            slider.addChangeListener(this);
            compressedLabel = new JLabel();
            updateCompressedLabel();
            JPanel labelPanel = new JPanel(new GridLayout(2, 1));
            JLabel originalLabel = new JLabel(new ImageIcon(original));
            originalLabel.setBorder(BorderFactory.createTitledBorder("original image"));
            labelPanel.add(originalLabel);
            labelPanel.add(compressedLabel);
            JFrame f = new JFrame();
            f.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            f.getContentPane().add(labelPanel, BorderLayout.CENTER);
            f.getContentPane().add(slider, BorderLayout.SOUTH);
            f.pack();
            f.setResizable(false);
            f.setLocationRelativeTo(null);
            f.setVisible(true);
        }
        // Recompress only once the user has let go of the slider.
        public void stateChanged(ChangeEvent e) {
            if (!slider.getValueIsAdjusting())
                updateCompressedLabel();
        }
        private void updateCompressedLabel() {
            int value = slider.getValue();
            compressedLabel.setIcon(new ImageIcon(compress(original, value / 100f)));
            String title = String.format("compression quality = %d%%, bytes = %,d", value, byteCount);
            compressedLabel.setBorder(BorderFactory.createTitledBorder(title));
        }
        // Encodes the image as JPEG at the given quality (0.0 to 1.0), records the
        // encoded size in byteCount, and decodes the result again for display.
        public BufferedImage compress(BufferedImage image, float quality) {
            try {
                Iterator<ImageWriter> writers = ImageIO.getImageWritersBySuffix("jpeg");
                ImageWriter writer = writers.next();
                ImageWriteParam param = writer.getDefaultWriteParam();
                param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                param.setCompressionQuality(quality);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                writer.setOutput(ImageIO.createImageOutputStream(out));
                writer.write(null, new IIOImage(image, null, null), param);
                byte[] data = out.toByteArray();
                this.byteCount = data.length;
                ByteArrayInputStream in = new ByteArrayInputStream(data);
                return ImageIO.read(in);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
        public static void main(String[] args) {
            try {
                String url = "http://www.rsportscars.com/foto/03/carreragt06.jpg";
                EventQueue.invokeLater(new ImageCompressExample(url));
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

  • SharePoint Library for Large Amounts of Engineering Data

    We are currently using traditional project directory folders for large projects with sometimes tens of thousands of documents. 
    We are planning on migrating the data to SharePoint and the path forward is unclear.
    Initially it was recommended to use a library, not numerous folders, to contain the data so that searching of data is improved.
    That sounded great. The 1st project, used as a pilot for other projects, is divided into 20 different modification packages.
    A library category was created for MODS with selectable options of the 20 mod package names and “No Defined” (default value). 
    Some data items are shared between more than one MOD so this category can have more than one assignment.
    When we looked at the directory structure in place we found no consistency in folder names, no consistency in directory structure. 
    Many folders have 5 or 6 (or more) levels of subdirectories. 
    Ideally we want no more than 4 or 5 categories of meta data to define all data. 
    Mapping from chaos into a comparatively small number of categories is daunting.
    When searching this forum I find that libraries should be limited to 2,000 items. 
    There are tens of thousands of items in our pilot project. 
    Surely someone somewhere has encountered this organizational problem. 
    I could use some advice from someone who has been there before.

    John,
    The limit of 2000 is not a hard limit; the actual number of items you can store in a list is 30,000,000. However, more items will affect rendering performance and locking on the SQL table.
    Also, the limit you mentioned (2000) is the list view threshold, and it is actually 5000.
    One important distinction: boundaries are hard limits that you cannot exceed, while supported limits are based on testing and can be exceeded but may cause issues.
    That being said, I suggest you check out this link on
    SharePoint Server 2010 capacity management: Software boundaries and limits
    http://technet.microsoft.com/en-us/library/cc262787(v=office.14).aspx
    and explore other ways of optimizing your list
    here are some references that would help you to optimize -
    http://office.microsoft.com/en-us/sharepoint-foundation-help/manage-lists-and-libraries-with-many-items-HA010377496.aspx
    http://technet.microsoft.com/en-us/library/cc262813(v=office.14).aspx
    http://office.microsoft.com/en-us/sharepoint-server-help/sharepoint-lists-v-techniques-for-managing-large-lists-RZ101874361.aspx
    Hope this helps!
    Ram - SharePoint Architect
    Blog - http://www.SharePointDeveloper.in

  • Java library for local search algorithms?

    Hi everybody,
    Could anyone please help me with the following?
    I am looking for a Java library with already implemented local search algorithms. Does such a thing exist? Would anyone recommend one?
    I am testing several AI programs and need to select the best parameters for each one (in a relatively large number of experiments). I could write my own search algorithm, of course, but it would be so much easier to use several already implemented algorithms and choose the best one.
    Thanks in advance for any pointers.
    Anna

    Here is an interesting question. Say you build the central control system of a rocket ship using Java, should you then be able to ask questions about rocket ships in Java forums?
    Well you can try, but I wouldn't hold your breath to find another rocket scientist in such a place. You may want to go to a forum where rocket scientists congregate instead. Just a friendly tip.

  • CAML query performance for large lists

    I have a list with more than 10000 items. I am retrieving the items with a CAML query and displaying them in a RAD Grid on my page. Around 1000 records are returned after filtering. I have enabled paging in my grid and PageSize is set to 25. I have noticed that the page loads very slowly because it retrieves all 1000 records at once.
    Is it possible to retrieve just 25 records for the first page on load? Clicking the Next button or a page number should then retrieve the next set of 25 records for that particular page.
    I want to know if there is any way to link CAML query paging with RAD Grid paging.
    Any code example would be greatly helpful.

    Hi,
    For pagination in SPListItem use the SPQuery.ListItemCollectionPosition property. 
    http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.spquery.listitemcollectionposition(v=office.15).aspx
    Check these useful URLs:
    http://omourad.blogspot.in/2009/07/paging-with-listitemcollectionposition.html
    http://www.anmolrehan-sharepointconsultant.com/2011/10/client-object-model-access-large-lists.html
    Anil

  • Global java-library for graphical mapping?

    We have a global .jar Java library that we want to use in all software components (without importing it into every software component).
    In XI 2.0 there was a procedure to update library.txt and reference.txt:
    reference library:XILookup library:jco
    (The XILookup class needs access to JCO)
    reference IntegrationServices library:XILookup
    (make XILookup visible to mapping runtime)
    reference ExchangeRepository library:XILookup
    (make XILookup visible to design-time)
    Now in XI 3.0 we deployed the library as a J2EE library using NWDS,
    but how can I create the reference to the ExchangeRepository and the mapping runtime?
    thanks for any help
    joerg

    Hello
    I have the same problem. I also want to deploy a jar library on the XI system and then use this library in a message mapping.
    I don't want to use external archives, because I must use the library from different namespaces.
    For deployment I use a library module in NetWeaver Developer Studio. The deployment works, but the mapping never finds the jar library.
    Thanks for help.
    Regards Tom

  • [Java]Java Library for CVS and SVN

    Hi!
    I need Java libraries for my application. I want to manage CVS and SVN repositories directly from Java code and not through a graphical interface.
    I've found some SVN libraries (for example SVNKit), but I haven't found any CVS library.
    Can you help me?
    Thanks,
    Stefano.

    Keep googlin'! They're out there! It's worth grabbing the source code for Eclipse or NetBeans or something, and digging about in there. If they're not using a third-party library to handle CVS, they've probably at least written their own. Nothing to stop you, for instance, grabbing part of the Eclipse Team bundle and using it yourself.
    Question is: Why do you want to do this? If it's some wacky "self-checking in code", forget it.

  • Java library for Translating text

    I need to translate texts.. english <-> german.. english <-> french and so on.
    I would like to know if you know of any offline software (commercial is fine too) that offers a Java library to translate text strings.
    Thank you!

    duffymo wrote:
    I just heard from a reliable source that Vegas is a pit. I've never had an urge to go, but this kills any spark there might have been. Even Jessica Simpson wearing nothing but a pink Romo jersey in a Palms luxury suite would be enough to persuade me.
    As for TO, I think he's not even fit to be an Orville Redenbacker spokesclown.
    I've mentioned it to my wife as a vacation spot. I haven't been to Las Vegas since I was two (many years ago) but I heard it is more "family-friendly." Amusement parks for the kids, the Hoover Dam, and a jumping-off spot for a Grand Canyon visit.
    The other option was the Dunkin Donuts in Willimantic but Mr. Patel who owns it, is a bit of a jerk. ;-)

  • Java Library for MS Word (.doc) to PDF Conversion

    Hi,
    My customer would like to use BI Publisher's PDF Binding and Merging features to combine BIP PDF outputs with other documents on the Solaris platform. However, those documents are currently all in Word .doc format, and the customer does not want to consider converting them into other formats such as RTF.
    Does BI Publisher provide a library for converting .doc format directly into PDF? If not, does anyone know a Java library on the market that can best do the job?
    Geoffrey

    From: <[email protected]>

    | @graffiti, Even if a printer does work, it doesn't solve my on-screen appearance. I
    | want the document to look good both on-screen and printed. At presented, the on-screen
    | doesn't look good.

    | @David, can you please elaborate a bit further? The signature should be a WMF file? I
    | looked at the various file types from the Photoshop Save-As dropdown menu, and WMF was
    | not among them. One of my critical elements is to be able to save the signature with a
    | transparent background, not a white background.

    | Can you elaborate more on the "low compression ratio"? I have got no clue what that is
    | about or where to change it.

    I don't what to tell 'ya about WMF so here is a good Wiki on it...
    http://en.wikipedia.org/wiki/Windows_Metafile

    As for the compression ratio...

    { the following is based upon my installation of Acrobat 9 but most versions are
    relatively the same }

    In "Printers and faxes"
    Right-Click on "Adobe PDF"
    Choose; "Print preferences"
    Under "default settings" choose "Edit"
    Now choose "Images"

    You will find settings for "compression" and "Image quality"

    The objective is "high quality" and "low" or no compression.

    --
    Dave
    http://www.claymania.com/removal-trojan-adware.html
    Multi-AV - http://www.pctipp.ch/downloads/dl/35905.asp

  • Java Library for dynamic PDF form creation similar to LiveCycle Designer

    Hi
    I have a requirement as below :
    Requirement: I need to create a dynamic PDF form with a barcode of type PDF417. A user should be able to fill in the form offline and, after clicking a button, save the form offline and generate a barcode in the same PDF. Later on the user can print it or send the saved PDF as it is.
    Currently I am able to create such a PDF form using LiveCycle Designer, but I need to create it manually in Designer and then apply Reader Extensions to it using the LiveCycle server.
    I want to do this programmatically. I would like to create a similar form using some Java library.
    Is it possible to create it dynamically (from a program)? How?
    Does anyone know how to achieve this?
    Can anyone help me please ?
    Thank you very much in advance.

    I heard about the LiveCycle ES3 server and was wondering if it could be of any use in my scenario. Can someone explain how to use its jar files in a standalone application? I explored the LiveCycle Forms API but could not figure out how it might be used.

  • Java library for function graphing

    Hello.
    I have searched Google up and down, but have not found a library for drawing functions in a coordinate system.
    Does anyone have any experiences with libraries, or know of an easier way of doing this?
    What I need is basically to draw functions (eg. y = 2x) in a coordinate system.
    Thanks in advance
    Dennis Johnsen

    Ulverbeast wrote:
    Hey
    Yes, like that one, except that one can only draw 3 simple kinds of functions.
    Do you know of any others that might be able to draw more different kinds of functions?
    JFreeChart can draw anything if you provide it with a set of coordinates to draw at.
    It's up to you to provide the coordinates though.
    So what you're really looking for is a function parser I guess.
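
    To illustrate the point about providing the coordinates yourself, here is a minimal sketch that samples a function into (x, y) pairs. It does not parse a formula and does not call JFreeChart; the function is passed in as a lambda, and `FunctionSampler` and `sample` are hypothetical names chosen for this example.

    ```java
    import java.util.function.DoubleUnaryOperator;

    /** Hypothetical helper: turns a function into coordinates a charting library could plot. */
    final class FunctionSampler {

        /** Returns {x, y} pairs for f at evenly spaced x values in [xMin, xMax]. */
        static double[][] sample(DoubleUnaryOperator f, double xMin, double xMax, int samples) {
            if (samples < 2) {
                throw new IllegalArgumentException("need at least 2 samples");
            }
            double[][] points = new double[samples][2];
            double step = (xMax - xMin) / (samples - 1);
            for (int i = 0; i < samples; i++) {
                double x = xMin + i * step;
                points[i][0] = x;
                points[i][1] = f.applyAsDouble(x);
            }
            return points;
        }

        public static void main(String[] args) {
            // y = 2x sampled at five points between -2 and 2
            for (double[] p : sample(x -> 2 * x, -2, 2, 5)) {
                System.out.println(p[0] + ", " + p[1]);
            }
        }
    }
    ```

    Accepting a typed-in formula such as "y = 2x" would still need an expression parser in front of this, which is the part the reply says is really missing.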

  • JAVA library for authen. RADIUS or Novell NDS?

    This is probably not the right place for my question, but...
    The question is the following:
    What Java libraries (if any) are available for authentication through RADIUS and/or Novell NDS?
    If such libraries exist, please write to me by e-mail: [email protected]
    Thanks for your attention.
    K$V

    Does that mean that you are not going to come back here and check for answers, and expect someone to email you the answer? I doubt that will happen.
    I think there is a product called Radiator for Radius and Novell provides a Java toolkit for its directory servers. You might be able to use these.
    Anyways, I did not have a right answer.
    Sorry, could not help being arrogant.

  • CAML Query for Large List.

    Hi All,
    I have a list with more than 60000 items in it. Can you please help me with the best practice for executing a query on such a large list?
    Is it a good idea to increase the threshold limit? I don't think so. What else can I do?
    Any idea ? Please suggest.
    Thanks in advance.

    Hi Jaydeep,
    I've updated my code as described in your link, but it is still showing an error.
    Below is my code:
    SPList list = spWeb.Lists.TryGetList("ListName");
    SPQuery query = new SPQuery();
    //query.QueryThrottleMode = SPQueryThrottleOption.Override;
    query.Query = @"<Where>
    <Eq>
    <FieldRef Name='TCode' />
    <Value Type='Text'>" + tcode + @"</Value>
    </Eq>
    </Where>";
    query.ViewAttributes = "Scope=\"Recursive\"";
    query.RowLimit = 200;
    do
    {
        SPListItemCollection itemCollection = list.GetItems(query);
        foreach (SPListItem item in itemCollection)
        {
            if (!tRoles.Contains(item["Title"].ToString()))
                tRoles.Add(item["Title"].ToString());
        }
        // Continue from where the previous page ended; null means there are no more pages.
        query.ListItemCollectionPosition = itemCollection.ListItemCollectionPosition;
    } while (query.ListItemCollectionPosition != null);
    The limit is set to 10000 records and my list has 13000 records. The above code runs with elevated privileges.
    Any suggestions ?
    Thanks in advance.

  • Java Library for SFTP Access

    Hi,
    Currently I am uploading files via SFTP Software on a specific server.
    I have to automate this step. How can I do this step with Java.
    -Aykut

    Sorry, but the advice to search for this on Google is not that helpful.
    If I had asked "how to create a JTable",
    then you would all be right to say "search on Google!",
    because most of the examples are usable.
    Of course I already searched on Google.
    Searching on Google is faster than writing this post, that is beyond question.
    But I was looking for an open source library that comes recommended,
    meaning it's easy to use and works reliably.
    Of course you get hundreds of links on Google by searching for "Java sftp".
    But how do you know which one is a good one?
    None of you could answer that question unless you already use one
    or have already read or heard about it.
    And what was my intention?
    => To get a recommendation, and not a lesson about searching on Google (by the way, why not Yahoo :))))
    For those who may also be interested in using Java and SFTP: http://www.networkworld.com/columnists/2005/050205internet.html
    or directly http://www.jcraft.com/jsch/index.html
    Thanks
    -Aykut
