A replacement for the Quicksort function in the C++ library

Hi everyone,
I'd like to introduce and share a new Triple State Quicksort algorithm, the result of my research into sorting algorithms over the last few years. The new algorithm reduces the number of swaps to about two thirds (2/3) of classical Quicksort, and a multitude of other improvements are implemented. Test results against the std::sort() function show an average speed improvement of 43% across various input array types. It does this by trading space for performance, at the price of n/2 temporary extra spaces. The extra space is allocated automatically and efficiently in a way that reduces memory fragmentation and optimizes performance.
Triple State Algorithm
The classical way of doing Quicksort is as follows:
- Choose one element p, called the pivot. Try to make it close to the median.
- Divide the array into two parts: a lower (left) part that is all less than p, and a higher (right) part that is all greater than p.
- Recursively sort the left and right parts using the same method above.
- Stop recursion when a part reaches a size that can be trivially sorted.
The various implementations differ in how they choose the pivot p and where they place elements equal to the pivot. There are several schemes, as follows:
[ <=p | ? | >=p ]
[ <p | >=p | ? ]
[ <=p | =p | ? | >p ]
[ =p | <p | ? | >p ]  Then swap = part to middle at the end
[ =p | <p | ? | >p | =p ]  Then swap = parts to middle at the end
Where the goal (or the ideal goal) of the above schemes (at the end of a recursive stage) is to reach the following:
[ <p | =p | >p ]
The above would allow exclusion of the =p part from further recursive calls, thus reducing the number of comparisons. However, it is difficult to reach this layout with minimal swaps. No previous implementation of Quicksort could immediately put =p elements in the middle using minimal swaps: first, because p might not be in the perfect middle (i.e. the median); second, because we don’t know how many elements are in the =p part until we finish the current recursive stage.
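For reference, the [ <p | =p | >p ] goal can be reached with the well-known textbook three-way ("Dutch national flag") partition. The sketch below is that classic scheme, not the Triple State method; it works through the [ <p | =p | ? | >p ] layout, and the name partition3 is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Classic three-way partition around the pivot value p.
// Returns {lt, gt} such that ar[0, lt) < p, ar[lt, gt) == p, ar[gt, n) > p.
// While running, the layout is [ <p | =p | ? | >p ].
std::pair<std::size_t, std::size_t> partition3(std::vector<int>& ar, int p)
{
    std::size_t lt = 0;          // end of the <p part
    std::size_t i  = 0;          // first unexamined element
    std::size_t gt = ar.size();  // start of the >p part
    while (i < gt)
    {
        if (ar[i] < p)
            std::swap(ar[lt++], ar[i++]);  // grow the <p part
        else if (ar[i] > p)
            std::swap(ar[i], ar[--gt]);    // grow the >p part, re-examine ar[i]
        else
            ++i;                           // equal to p: already in the middle
    }
    assert(lt <= gt);
    return {lt, gt};
}
```

Only ar[0, lt) and ar[gt, n) need further recursion. Note that this scheme swaps on every element that is not equal to p, which is the kind of cost the post is trying to reduce.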
The new Triple State method first enters a monitoring state (state 1) while comparing and swapping. Elements equal to p are immediately copied to the middle if they are not already there, following this scheme:
[ <p | ? | =p | ? | >p ]
Then, when either the left (<p) part or the right (>p) part meets the middle (=p) part, the algorithm jumps to one of two specialized states: one handles a relatively small =p part, and the other handles a relatively large =p part. This method adapts to the nature of the input array better than classical Quicksort.
Further reducing number of swaps
A typical Quicksort loop scans from the left, then scans from the right, then swaps, as follows:
while (l <= r)
{
    while (ar[l] < p)       // scan from the left
        l++;
    while (ar[r] > p)       // scan from the right
        r--;
    if (l < r)
    {
        Swap(ar[l], ar[r]);
        l++; r--;
    }
    else if (l == r)
    {
        l++; r--; break;
    }
}
The Swap macro above does three copy operations:
temp=ar[l]; ar[l]=ar[r]; ar[r]=temp;
There is another method that almost eliminates the need for that third copy through the temporary variable. By copying only the first ar[r] that is less than or equal to p to the temp variable, we create an empty space in the array. Then we scan from the left to find the first ar[l] that is greater than or equal to p, and copy ar[r]=ar[l]. Now the empty space is at ar[l]. We scan from the right again, copy ar[l]=ar[r], and continue in the same way. As long as the temp variable hasn’t been copied back to the array, the empty space remains there, juggling left and right. The following code snippet explains.
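// Pre-scan from the right
while (ar[r] > p)
    r--;
temp = ar[r];                       // the empty space is now at ar[r]
// Main loop
while (l < r)
{
    while (l < r && ar[l] < p)
        l++;
    if (l < r) ar[r--] = ar[l];     // the empty space moves to ar[l]
    while (l < r && ar[r] > p)
        r--;
    if (l < r) ar[l++] = ar[r];     // the empty space moves to ar[r]
}
// After the loop finishes (l == r), copy temp back into the empty space
ar[r] = temp; l++;
if (temp == p) r--;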
(For simplicity, the code above does not handle equal values efficiently. Refer to the complete code for the elaborate version).
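To make the "empty space" idea easier to try out, here is a minimal self-contained sketch of a plain Quicksort built on it. It is not the Triple State code: the function names are hypothetical, the pivot is simply the middle element, and equal values get no special handling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Partition ar[lo..hi] (inclusive) by moving a hole instead of swapping.
// Returns the final index of the pivot.
std::size_t holePartition(std::vector<int>& ar, std::size_t lo, std::size_t hi)
{
    assert(lo <= hi && hi < ar.size());
    const std::size_t mid = lo + (hi - lo) / 2;
    const int p = ar[mid];   // lift the pivot out of the array
    ar[mid] = ar[lo];        // the hole is now at lo
    std::size_t l = lo, r = hi;
    while (l < r)
    {
        while (l < r && ar[r] > p) --r;
        if (l < r) ar[l++] = ar[r];   // fill the hole; it moves to r
        while (l < r && ar[l] < p) ++l;
        if (l < r) ar[r--] = ar[l];   // fill the hole; it moves to l
    }
    ar[l] = p;               // l == r is the hole: drop the pivot in
    return l;
}

void holeQuicksort(std::vector<int>& ar, std::size_t lo, std::size_t hi)
{
    if (lo >= hi) return;
    const std::size_t m = holePartition(ar, lo, hi);
    if (m > lo) holeQuicksort(ar, lo, m - 1);   // guard against underflow
    holeQuicksort(ar, m + 1, hi);
}

void holeQuicksort(std::vector<int>& ar)
{
    if (!ar.empty()) holeQuicksort(ar, 0, ar.size() - 1);
}
```

Each element that moves costs one copy, compared with three copies per pair for a temp-based swap, and the pivot itself is copied out once and back once per partition.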
This method is not new; a similar method has been used before (read: http://www.azillionmonkeys.com/qed/sort.html)
It does have a negative side effect in some common cases, such as nearly sorted or nearly reversed arrays, where it causes undesirable shifting that makes it less efficient. However, when combined with the Triple State algorithm and further handling of common cases, it eventually proves more efficient than the classical swapping approach.
Run time tests
Here are some test results, measured on an i5 at 2.9 GHz with 6 GB of RAM, sorting a random array of integers. Each test is repeated 5000 times. Times are in milliseconds.
size     std::sort()   Triple State QuickSort
5000     2039          1609
6000     2412          1900
7000     2733          2220
8000     2993          2484
9000     3361          2778
10000    3591          3093
It gets even faster with other types of input or when the size of each element is large. The following test uses large random arrays of up to 1000000 elements, where each element is 56 bytes. Each test is repeated 25 times. Times are in milliseconds.
size      std::sort()   Triple State QuickSort
100000    1607          424
200000    3165          845
300000    4534          1287
400000    6461          1700
500000    7668          2123
600000    9794          2548
700000    10745         3001
800000    12343         3425
900000    13790         3865
1000000   15663         4348
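For anyone who wants to reproduce this kind of measurement, here is a minimal timing sketch. It is not the test program used for the tables above; the helper names (timeSortMs, randomInts) and the choice of std::mt19937 and std::chrono::steady_clock are my assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Sort a fresh copy of `input` `repeats` times with `sortFn`.
// Returns the total time spent inside sortFn, in milliseconds.
template <typename SortFn>
double timeSortMs(const std::vector<int>& input, int repeats, SortFn sortFn)
{
    using Clock = std::chrono::steady_clock;
    std::chrono::duration<double, std::milli> total{0.0};
    for (int i = 0; i < repeats; ++i)
    {
        std::vector<int> work = input;   // the copy is not timed
        const auto t0 = Clock::now();
        sortFn(work);
        const auto t1 = Clock::now();
        total += t1 - t0;
        assert(std::is_sorted(work.begin(), work.end()));
    }
    return total.count();
}

// Random test array with a fixed seed, so every sort sees the same input.
std::vector<int> randomInts(std::size_t n, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 1000000);
    std::vector<int> v(n);
    for (int& x : v) x = dist(gen);
    return v;
}
```

A call such as timeSortMs(randomInts(5000, 1), 5000, [](std::vector<int>& v) { std::sort(v.begin(), v.end()); }) would give the std::sort() figure for the first row's size; the function under test is passed in the same way, so both are measured on identical inputs.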
Further extensive tests have been done following Jon Bentley’s framework of tests, for the following input array types:
sawtooth: ar[i] = i % arange
random: ar[i] = GenRand() % arange + 1
stagger: ar[i] = (i* arange + i) % n
plateau: ar[i] = min(i, arange)
shuffle: ar[i] = rand()%arange? (j+=2): (k+=2)
I also added the following two input types, just to add a little torture:
Hill: ar[i] = min(i<(size>>1)? i:size-i,arange);
Organ Pipes: (see full code for details)
Each case above is sorted, then reordered in 6 different ways, and sorted again after each reorder. The six orderings are:
sorted, reversed, front half reversed, back half reversed, dithered, fort.
Note: GenRand() above is a certified random number generator based on the Park-Miller method. This avoids any non-uniform behavior in C++ rand().
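A few of the input types above can be sketched directly from their formulas. This is not the generator code from the test project: the function names are made up, and std::minstd_rand0 (the standard library's Park-Miller "minimal standard" engine) is used as a stand-in for GenRand(), which is an assumption on my part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// sawtooth: ar[i] = i % arange
std::vector<int> makeSawtooth(std::size_t n, int arange)
{
    assert(arange > 0);
    std::vector<int> ar(n);
    for (std::size_t i = 0; i < n; ++i)
        ar[i] = static_cast<int>(i % static_cast<std::size_t>(arange));
    return ar;
}

// stagger: ar[i] = (i * arange + i) % n
std::vector<int> makeStagger(std::size_t n, int arange)
{
    assert(arange > 0);
    std::vector<int> ar(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        const unsigned long long v =
            static_cast<unsigned long long>(i) * static_cast<unsigned long long>(arange) + i;
        ar[i] = static_cast<int>(v % n);
    }
    return ar;
}

// plateau: ar[i] = min(i, arange)
std::vector<int> makePlateau(std::size_t n, int arange)
{
    assert(arange > 0);
    std::vector<int> ar(n);
    for (std::size_t i = 0; i < n; ++i)
        ar[i] = static_cast<int>(std::min(i, static_cast<std::size_t>(arange)));
    return ar;
}

// random: ar[i] = GenRand() % arange + 1, with minstd_rand0 standing in for GenRand()
std::vector<int> makeRandom(std::size_t n, int arange, unsigned seed)
{
    assert(arange > 0);
    std::minstd_rand0 gen(seed);
    std::vector<int> ar(n);
    for (int& x : ar)
        x = static_cast<int>(gen() % static_cast<unsigned>(arange)) + 1;
    return ar;
}
```

The reorderings (reversed, front half reversed, and so on) would then be applied to a sorted copy of each generated array before sorting it again.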
The complete test results can be found here:
http://solostuff.net/tsqsort/Tests_Percentage_Improvement_VC++.xls
or:
https://docs.google.com/spreadsheets/d/1wxNOAcuWT8CgFfaZzvjoX8x_WpusYQAlg0bXGWlLbzk/edit?usp=sharing
Theoretical Analysis
A classical Quicksort algorithm performs fewer than 2n*ln(n) comparisons on average (see Jacek Cichon’s paper) and fewer than 0.333n*ln(n) swaps on average (see Wild and Nebel’s paper). Triple State will perform about the same number of comparisons but fewer swaps, about 0.222n*ln(n) in theory. In practice, however, Triple State Quicksort will perform even fewer comparisons on large arrays because of the new 5-stage pivot selection algorithm it uses. Here is the detailed theoretical analysis:
http://solostuff.net/tsqsort/Asymptotic_analysis_of_Triple_State_Quicksort.pdf
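To get a feel for what these bounds mean at a concrete size, the three formulas can simply be evaluated. The sketch below only plugs n into the figures quoted above (2n*ln(n), 0.333n*ln(n), 0.222n*ln(n)); the struct and function names are made up, and the results are estimates, not measurements.

```cpp
#include <cassert>
#include <cmath>

struct QuicksortEstimate
{
    double comparisons;       // 2 * n * ln(n)
    double classicalSwaps;    // 0.333 * n * ln(n)
    double tripleStateSwaps;  // 0.222 * n * ln(n)
};

// Evaluate the average-case bounds quoted in the post for an array of n elements.
QuicksortEstimate estimateCosts(double n)
{
    assert(n >= 1.0);
    const double nLogN = n * std::log(n);   // std::log is the natural logarithm
    return {2.0 * nLogN, 0.333 * nLogN, 0.222 * nLogN};
}
```

For n = 1000000 this gives roughly 27.6 million comparisons, about 4.6 million swaps for the classical bound and about 3.1 million for Triple State, which is the two-thirds ratio mentioned at the top of the post.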
Using SSE2 instruction set
SSE2 uses the 128-bit XMM registers, which can perform memory copy operations in parallel since there are 8 of them (16 in 64-bit mode). SSE2 is primarily used to speed up copying large memory blocks in real-time, graphics-demanding applications.
To use SSE2's aligned move instructions, copied memory blocks have to be 16-byte aligned. Triple State Quicksort automatically detects whether the element size and the array starting address are 16-byte aligned and, if so, switches to SSE2 instructions for extra speedup. This decision is made only once when the function is called, so it has minor overhead.
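The alignment test described above can be sketched in a few lines. This is a guess at the shape of such a check, not the code from the project; the helper name canUseAligned16 is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when every element of an array that starts at `base` and has
// elements of `elemSize` bytes begins on a 16-byte boundary.
inline bool canUseAligned16(const void* base, std::size_t elemSize)
{
    assert(base != nullptr);
    const auto addr = reinterpret_cast<std::uintptr_t>(base);
    return elemSize != 0 && addr % 16 == 0 && elemSize % 16 == 0;
}
```

Both conditions are needed: if the start address is aligned but the element size is not a multiple of 16 (56 bytes, for example), the second element already starts off a 16-byte boundary.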
A few other notes
- The standard C++ sorting function on almost all platforms religiously takes a “call back pointer” to a comparison function that the user/programmer provides. This is obviously for flexibility and to allow closed-source libraries. Triple State defaults to using a call back function. However, call back functions have bad overhead when called millions of times. Using inline/operator or macro based comparisons will greatly improve performance; an improvement of about 30% to 40% can be expected. Thus, I seriously advise against using a call back function whenever possible. You can disable the call back function in my code by #undefining the CALL_BACK preprocessor directive.
- Like most other efficient implementations, Triple State switches to insertion sort for tiny arrays, whenever the size of a sub-part of the array is less than the TINY_THRESH directive. This threshold is empirically chosen; I set it to 15. Increasing this threshold will improve the speed when sorting nearly sorted and reversed arrays, or arrays that are concatenations of both cases (which are common), but it will slow down sorting random or other types of arrays. To remedy this, I provide a dual threshold method that can be enabled by #defining the DUAL_THRESH directive. Once enabled, another threshold, TINY_THRESH2, will be used, which should be set lower than TINY_THRESH; I set it to 9. The algorithm is able to “guess” whether the array or sub-part of the array is already sorted or reversed, and if so will use TINY_THRESH as its threshold; otherwise it will use the smaller threshold TINY_THRESH2. Notice that the “guessing” here is NOT foolproof; it can miss. So set both thresholds wisely.
- You can #define the RANDOM_SAMPLES preprocessor directive to add randomness to the pivoting system, lowering the chances of the worst case happening at a minor performance hit.
- When the element size is very large (320 bytes or more), the function uses a new “late swapping” method. This automatically creates an internal array of pointers, sorts the pointer array, then swaps the original array elements into sorted order using minimal swaps, for a maximum of n/2 swaps. You can change the 320-byte threshold with the LATE_SWAP_THRESH directive.
- The function provided here is optimized to the bone for performance. It is one monolithic piece of complex code that is ugly and almost unreadable. Sorry about that, but in order to achieve improved speed, I had to ignore common and good coding standards a little. I don’t advise anyone to code like this, and I myself don’t. This is really a special case for sorting only. So please don’t trip if you see weird code; most of it has a good reason.
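The “late swapping” note can be illustrated with a generic sketch: sort cheap indices first, then move each large element into its final place by following permutation cycles. This is not the project's implementation; it uses indices rather than pointers, std::sort for the index sort, and a hypothetical function name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

template <typename T>
void sortViaIndices(std::vector<T>& ar)
{
    const std::size_t n = ar.size();
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // Sort the small indices, comparing the large elements they refer to.
    std::sort(idx.begin(), idx.end(),
              [&ar](std::size_t a, std::size_t b) { return ar[a] < ar[b]; });
    // idx[k] is the original position of the element that belongs at k.
    // Follow each cycle of that permutation, moving every element once.
    for (std::size_t start = 0; start < n; ++start)
    {
        if (idx[start] == start) continue;   // already in place
        T held = std::move(ar[start]);       // open a hole at `start`
        std::size_t k = start;
        while (idx[k] != start)
        {
            const std::size_t src = idx[k];
            ar[k] = std::move(ar[src]);      // fill the hole from its source
            idx[k] = k;                      // mark position k as done
            k = src;                         // the hole is now at src
        }
        ar[k] = std::move(held);             // close the cycle
        idx[k] = k;
    }
    assert(std::is_sorted(ar.begin(), ar.end()));
}
```

Because the elements end up physically sorted, later sequential access to the array stays cache friendly, unlike leaving the data in place and only keeping a sorted pointer array.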
Finally, I would like to present the new function to Microsoft and the community for further investigation and possibly inclusion in VC++ or any C++ library as a replacement for the sorting function.
You can find the complete VC++ project/code along with a minimal test program here:
http://solostuff.net/tsqsort/
Important: To fairly compare two sorting functions, both should either use or NOT use a call back function. If one uses a call back and the other doesn’t, you will get unfair results: the one that doesn’t use a call back function will most likely win, no matter how bad it is!!
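To make the two comparison styles concrete, here is a small sketch using only standard library calls: std::qsort, which calls the comparison through a function pointer, and std::sort with a lambda, which the compiler can inline. The wrapper names are hypothetical, and this does not show how the CALL_BACK directive is implemented in the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Call back style: invoked through a function pointer for every comparison.
int compareInts(const void* a, const void* b)
{
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);   // avoids the overflow risk of x - y
}

void sortWithCallback(std::vector<int>& v)
{
    if (v.empty()) return;
    std::qsort(v.data(), v.size(), sizeof(int), compareInts);
}

// Inline style: the comparison is part of the instantiated sort itself.
void sortWithInlineCompare(std::vector<int>& v)
{
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}

// Both styles must agree on the result; only their speed differs.
void checkBothAgree(std::vector<int> v)
{
    std::vector<int> w = v;
    sortWithCallback(v);
    sortWithInlineCompare(w);
    assert(v == w);
}
```

When timing two sort functions against each other, pair call back with call back or inline with inline; mixing the two mostly measures the cost of the indirect call.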
Ammar Muqaddas

Thanks for your interest.
Excuse my ignorance, but I'm not sure what you meant by the "1 of 5" optimization. Did you mean median of 5?
Regarding swapping pointers: yes, it is common sense, and rather common among programmers, to swap pointers instead of swapping large data types, at the small price of indirect access to the actual data through the pointers.
However, this trick has a rather unobvious and quite terrible side effect. After the pointer array is sorted, sequential (sorted) access to the actual data throughout the rest of the program will suffer heavily from cache misses. Memory is accessed randomly because the pointers still point to the unsorted data, causing many cache misses, which makes the program itself slow even though the sort was fast!
Multi-threaded qsort is a good idea in principle and obviously easy to implement, because qsort itself is recursive. The thing is, multi-threaded qsort is actually just stealing CPU time from other cores that might be busy running other apps. This might slow down other apps, which might not be ideal for servers. What researchers usually try to do is improve the algorithm itself.
I will try to look at your sorting code; let's see if I can compile it.
