What is the "EventLoggingEnumLogEntry" function in the run-time engine?

I have some code which isn't utilizing all of the processors on my system, but it should be, so I profiled the execution of the compiled executable using Intel VTune Amplifier.  That program told me that my code is spending a lot of time in synchronization or threading overhead, and that the function it is constantly executing (>50% of the time) is called "EventLoggingEnumLogEntry", contained in lvrt.dll.  It is also spending a fair amount of time executing a function called "LvVariantCStrSetUI8Attr".
Can anyone tell me what these functions are and what in my LV code would activate them?  I can't even start to find where the bottleneck is in my code with the information I currently have...
Thanks!

Joey,
Thanks for the reply!
I've already parallelized the loops, and the problem is that each core is only using about 25-40% of its capacity.  I can't figure out what is limiting the speed of execution on each core.  One of the papers I read suggested that I could find the bottleneck using the Intel VTune program.  I used that instead of the LabVIEW native profiler because the LV profiler will tell me where my code is spending most of its time, but it doesn't tell me if there is some other bottleneck (like perhaps one thread waiting on another).  I had already taken the information in the profiler as far as I know how to, by finding and optimizing the most commonly executed code in my application.
There is an event structure in my application, but once I press the "GO" button, which kicks off the intensive computing, I have the event structure disabled by placing it within a case structure and using a boolean to keep LV from executing it.  Under certain conditions (like completion of the computing), the boolean will be flipped and the event structure re-enabled, but that only happens AFTER the computation is complete.  I've confirmed with breakpoints that LV is not executing the event structure.
Does the mere presence of the event structure, even without execution, cause LV to spend extra resources checking for events?  I also tried separating the application into two while loops - one containing the event structure, and one containing the processing code.  When the first while loop terminates (again, when pressing the "GO" button), the second executes with no case structure in the loop.  That did not seem to relieve the bottleneck.
 This thread is the first thread that I posted regarding this topic.  One of the last replies includes some of my code, if you are curious to look at it.
Thanks!

Similar Messages

  • Upgraded from LabVIEW 8 to 2013 and now VI asks to find the installation package for Run-Time Engine 7.0

    I recently installed LabVIEW 2013 on a computer (running Windows XP 32-bit).  The machine also has LabVIEW 8 installed, which is what I was using prior to upgrading.  I opened a VI that was created in LV8 and then saved it and all its subVIs in LV2013.  Now when I open the VI in LV2013 and try to run it, a window pops up asking me to find the "lvruntimeeng.msi" installation package for LabVIEW Run-Time Engine 7.0.  If I cancel that dialog box and the subsequent message stating that the installation files were not found, the VI appears (at least from initial inspection) to run normally.
    I closed LabVIEW, downloaded Run-Time Engine 7.0 from the NI website, and tried to install it, but I received a message saying that it is already installed (as I had suspected).  How can I determine what part of the VI and/or its subVIs is trying to make use of Run-Time Engine 7.0?  Alternatively, how can I get LabVIEW to instead use the Run-Time Engine 2013 that was installed when I upgraded to LabVIEW 2013?
    Solved!

    Bob_Schor wrote:
    Are you running your VI from a Project?  If so, you can look at Dependencies and get an idea what "dependent" VIs you might have.  There may be "something old" in your LabVIEW 8 code that has been superseded in 2013, but still "hangs around" -- if you can identify it, you can probably replace it with its "more modern" equivalent.
    If you do not have the VI in a Project, you can simply open LabVIEW, create a new blank project, and add your top level VI to it.  If all of your relevant VIs are in a single folder, add the entire folder.  Now look in Dependencies.
    BS
    Yes I am running the VI from within a LV Project.  After some more searching in the NI Knowledgebase I was able to fix the problem by using the following procedure:
    1.  Use the Measurement and Automation Explorer to uninstall Run-Time Engine 7.0
    2.  Restart the PC
    3.  Open the project, close the project choosing to "save all"
    4.  Restart the PC
    5.  Re-install Run-Time Engine 7.0 using a file downloaded from ni.com
    6.  Restart the PC
    7.  Open the project, close the project choosing to "save all"
    8.  Open the project and run the VI.  No more messages about LabVIEW trying to find Run-Time Engine 7.0.

  • Why does my LabVIEW 2010 Program ask for the LabVIEW 8.5.1 run-time engine every time it is launched after being installed?

    It is a simple timer program that shows the elapsed time (see attached pics). I can't think of a single thing that should be causing this... It started happening right after I selected "scale objects with front panel resizes", recompiled, and re-installed the app. It was not happening before, when the app was installed without that resizing option selected, but I have since removed the selection and it still happens. Any ideas would be appreciated.
    Solved!
    Attachments:
    Timer Block Diagram.JPG 100 KB
    Timer Front Panel.JPG 34 KB

    A bit more info:
    If I cancel through all of the dialog boxes (see attached) when trying to launch my little app, it runs normally. Looks like it really doesn't need the 8.5.1 engine after all.
    Attachments:
    Run-time Engine 8.5.1 dialog box.JPG 25 KB
    Run-time Engine 8.5.1 dialog box - 1.JPG 15 KB

  • Are LabVIEW VISA functions supported by LabVIEW Run Time Engine?

    I have an executable file that I created using LabVIEW's Application Builder. When I run the application on another computer that does not have LabVIEW installed, it crashes with a Windows message stating that my application has generated errors and will be shut down by Windows. I have isolated the problem as having something to do with VISA Resource Names.
    Does anyone know if there have been any issues regarding the use of VISA functions in the LabVIEW Run Time Engine? The version of LabVIEW that I have is 6.0.2. The version of the Run Time Engine is 6.0.
    Dan

    Yes, there are issues with the RTE 6.0 and VISA. Use the RTE 6.0.2.
    LabVIEW, C'est LabVIEW

  • InstallShield Merge Module Rather Than Run-Time Engine

    I create installations for our company using InstallShield 11 Professional.  At this point, the only way to install the necessary files to run a LabVIEW application is to add the LabVIEW 7.1 run-time engine installer into the setup.
    What I am looking for is an alternative.  I am hoping there is some sort of merge module out there which can be added to InstallShield so I don't have to call the LabVIEW run-time engine installer from my installer.
    Does such a thing exist?
    Or does anyone have any advice on how I can make a LabVIEW installer using InstallShield and keep the installation unified?
    I don't like the current setup, where our software installation begins, then the run-time engine installer takes over, and then our setup continues.
    Thanks for any help,
    Adam

    jacko wrote:
    Hi Rolf,
    It seems to me that you are explaining the problem I encountered after finding the merge modules which Chris led me to.  That problem was: which merge modules to use?
    I have in fact been trying to identify which modules I'd need, and was going to do it by trial and error.  If there is an easier way then I'll give your method a shot.
    Is this what you are referring to in your message above?
    Well, I'm not sure I understand you correctly, but yes, I think that is what my message meant. Basically, if you look at the two VIs and try them out, you will more or less see how they work.
    You can then try to create a small tool that, given a *.msm file as input, will tell you which other msm files you need to include in your InstallShield installer. Whether that is easier than trial and error is of course something you can debate. Also note that the order of inclusion of msm files in an installer seems to be important too; the lowest one in a dependency chain should come first, I believe.
    For your particular problem you would look at the LVRunTime.msm file to get a list of all other modules you need to include in your installer. I actually would suspect that InstallShield has some functionality to list dependencies of merge modules, too.
    Rolf Kalbermatter
    CIT Engineering Netherlands
    a division of Test & Measurement Solutions
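
    For anyone who wants to script that lookup instead of doing it by hand, here is a rough sketch along the lines Rolf describes. It reads the ModuleDependency table of an .msm with the Windows Installer API (link against msi.lib). The table name, the columns used, and the helper itself are assumptions based on the standard merge-module schema, not code from this thread, and error handling is minimal.

    // Hypothetical helper (not from this thread): list the merge modules that a
    // given .msm depends on, by reading its ModuleDependency table with the
    // Windows Installer API. Compile as a console app and link against msi.lib.
    #include <windows.h>
    #include <msi.h>
    #include <msiquery.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        if (argc < 2) { printf("usage: msmdeps <module.msm>\n"); return 1; }

        MSIHANDLE hDb = 0, hView = 0, hRec = 0;
        if (MsiOpenDatabaseA(argv[1], (LPCSTR)MSIDBOPEN_READONLY, &hDb) != ERROR_SUCCESS)
        { printf("cannot open %s\n", argv[1]); return 1; }

        // Each row of ModuleDependency names another module (RequiredID) that
        // must also be merged into the installer.
        if (MsiDatabaseOpenViewA(hDb,
                "SELECT `RequiredID`, `RequiredVersion` FROM `ModuleDependency`", &hView) == ERROR_SUCCESS
            && MsiViewExecute(hView, 0) == ERROR_SUCCESS)
        {
            while (MsiViewFetch(hView, &hRec) == ERROR_SUCCESS)
            {
                char id[256], ver[64];
                DWORD cchId = sizeof(id), cchVer = sizeof(ver);
                MsiRecordGetStringA(hRec, 1, id, &cchId);
                MsiRecordGetStringA(hRec, 2, ver, &cchVer);
                printf("requires: %s %s\n", id, ver);
                MsiCloseHandle(hRec);
            }
        }
        MsiCloseHandle(hView);
        MsiCloseHandle(hDb);
        return 0;
    }

    Running it against LVRunTime.msm (and then recursively against each module it reports) would give the inclusion list, in roughly dependency order.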

  • Unable to locate LabVIEW Run-Time Engine (LabVIEW 8.5)

    Hello, I have a problem with the LabVIEW 8.5 installer. I have created a project which includes a .vi file (a simple panel which controls a serial port, with parameters such as baud rate, stop bits, ...), then I built an application (.exe) for this file and then the installer (I have included the LabVIEW Run-Time Engine 8.5). When I run the setup.exe I get the following error message:
    Unable to locate labview run time engine, the application require a labview run time engine 8.5 or compatible...
    What am I missing?
    Thanks
    Maurizio

    Are there other reasons why it would not work?  I have a similar problem.
    I installed version 8 of the run-time engine and activated the software.  Now I am trying to run a compiled application provided to me by a third party.
    When I click on the application it says:
    Unable to locate LV run time engine
    Test server requires a version 6.0 or compatible LV engine.
    How do I know whether the engine I installed and activated is running, and is it supposed to be able to execute a program which is expecting rev 6 of the run-time engine?

  • Why is VI not executable using LabVIEW 7.1 Run-Time Engine?

    Hi,
    In our TestStand sequence, we have a LabVIEW Action step which is returning the error "The VI is not executable", along with error code -18002.
    This error occurs on a production PC using the LabVIEW 7.1 Run-Time Engine.
    I'm aware of mass compiling to ensure all the VIs are the correct version, but this has not resolved the problem.
    The software was installed on the production PC using a TestStand-created installer, with the LabVIEW Adapter set
    correctly to the Run-Time Engine, i.e. not trying to open the LabVIEW Developer Server to execute the step at run time.
    Because it's the first LabVIEW module to be executed, I added another similar step to run beforehand, i.e. the new simple
    LabVIEW code ran first. This executed correctly, but then the next step caused the error above.
    The small number of associated sub-VIs are on the target PC and have been mass compiled.
    I've had -18002 errors before, but this is one I cannot cure.
    All suggestions gratefully received.
    Gary.

    Hi Guys,
    Regarding this -18002 error when using the Run Time Engine instead of the Development Environment, what is the VI within the step that is failing (in any of the cases you've mentioned) trying to do? Are they built around any toolkits?
    In my experiments with a simple VI (with just a user dialogue) using the LabVIEW 7.1 RTE in both TestStand 3.1 and 3.5 this morning, I haven't found any issues.
    There is a possibility, if any of the sub-VIs or toolkits you use contain a Merge Errors.vi, that the Run-Time Engine picks up a copy of the VI built for a previous version of the Run-Time Engine (due to the order in which the sub-VI directories are scanned), and it cannot convert it.
    There are a few solutions for this, the easiest one looks to be to copy the error.llb\Merge Errors.vi for LabVIEW 7.1 into the same directory as the VI you're calling (so TestStand can pick it up easily).
    Can you let me know if this solution helps at all? If not, would it be possible to post up any of the code or sequence you're having problems with?
    Best wishes,
    Mark
    Applications Engineer
    National Instruments

  • Advanced Signal Processing 7.5 Run-Time Engine in LabVIEW 8.2 Installer

    Hello,
    I am trying to create an installer using LabVIEW 8.2 and I need to include the Advanced Signal Processing Toolkit run-time engine, but this is not listed in the 'Additional installers'.
    I have run and installed the Advanced Signal Processing Toolkit (7.5) and the run time engine on my machine.
    How can I include this run time engine in my installer?

    LabVIEW's Application Builder does not let you select the Advanced Signal Processing Run-Time Engine the way you select the LabVIEW run-time engine and other driver engines.  You have to do it yourself.
    What the Advanced Signal Processing Run-Time Engine installer does is install the DLLs to the system32 folder.  So what you need to do is add the DLLs to your project and configure your installer to install them to the system32 folder.
    See the attachment for more hints.
    Attachments:
    ASPTRuntime.doc 87 KB

  • Run-Time Engine error

    I have created an automated test and had it running as an executable on a stand-alone test computer. I updated my LabVIEW software to 8.6.1. I just made some changes to the test and recompiled it. When I tried to run it on the stand-alone computer it gave me an error about the LabVIEW 8.6.1 run-time engine missing. I thought that when the project was built, all the files needed to run it were also built into the folder. Has anyone run into this who could offer some advice?
    Thanks
    Chirs
    Solved!

    Did you create an installer or simply build the exe? If you only built the exe you will need to install the run-time engine on the target machine. If you created an installer you can include the installation of the run-time engine as part of the installer.
    Mark Yedinak
    "Does anyone know where the love of God goes when the waves turn the minutes to hours?"
    Wreck of the Edmund Fitzgerald - Gordon Lightfoot

  • NI 5660 Driver DLL Errors when using Teststand 2010 and LabVIEW Run-Time Engine 2010

    This problem seems similar to the post "Resource not found error in executable on development machine." but I didn't want to repost under that thread because I only happened upon it by chance and none of my searches brought me there... so I made a more descriptive Subject.
    I am working on a system that uses a PXI Chassis with a NI 5600 Downconverter and a NI 5620 high speed digitizer, among other PXI Cards. 
    I inherited working code written in LabVIEW 2010, running with the LabVIEW Run-Time Engine 2010.  The code was using a custom executive, and my task was to rewrite the test using TestStand 2010.  I reused the majority of the old code.  The old code used NI-5660 to control the 5600 and 5620.  When I run my sequence using the LV Development System and TestStand, it runs without any issues.  When I change the adapter over to LabVIEW Run-Time Engine 2010, all of my NI 5660 VIs become broken due to DLL issues.  It warns that nipxi5600u.dll was not initialized correctly.  Many of the errors are associated with NI Tuner and NI Scope.  After this, LabVIEW will crash randomly, and the sequence will not work in TestStand even when switched back to the LV Development adapter.  The only way to recover after this is to restart the computer - TestStand automatically reverts back to the development system, the VIs are no longer broken, and the sequence works again.
    I have all of my VIs associated with a project.  After reading a little bit about DLLs and TestStand, I found all of the DLLs in the dependencies section of my project and added them to my TestStand workspace.  I also used Dependency Walker to track down the problems with nipxi5600u.dll; the 2 DLL files that it said were not found already existed in the same folder as the original DLL (C:\Windows\System32).  I have also performed a Mass Compile to make sure everything was running in LV 2010.  If I skip the steps involving the 5660, my entire sequence runs fine.
    The previous code was running with the LabVIEW Run-Time Engine without any issues.  Is there just a step I am missing?  Has anyone seen anything like this before?  I can send screenshots of errors to provide more detail if necessary. 

    I have tried some more things and still can't get it to work.  I have added the VIs mentioned in the Notes On Creating Modulation Executables KB both to the TestStand workspace and the LabVIEW project holding all of my VIs.  This did not change the results. 
    When I try to run my sequence, the first error I get is shown in Error 1445.bmp.  This happens when I try to use the NI 5660 initialize.vi.  If I click ignore, the next error I see is shown in Error -20551.bmp.  When I try to open the VI to look at it, I get the 2 DLL errors shown in Error loading nipxi5600u.bmp and Error loading nidaq32.bmp.  When I close TestStand, I get the error LabVIEW Fatal Error.bmp.
    Attachments:
    Error1445.JPG 164 KB
    Error -20551.JPG 174 KB
    Error loading nipxi5600u.JPG 9 KB

  • Announcement: LabWindows/CVI 2010 SP1 Run-Time Engine Updated

    A new version of the LabWindows/CVI 2010 SP1 Run-Time Engine (10.0.1.434) is now available for download. The new version includes Security Update 5Q5FJ4QW which resolves security vulnerabilities in components installed with LabWindows/CVI 2010 SP1 and earlier and LabVIEW 2011 and earlier. Further details can be found at KnowledgeBase Article 5Q5FJ4QW: How Does National Instruments Security Update 5Q5FJ4QW Affect Me? Installing the security update will have the same effect as installing the new version of the Run-Time Engine.
    The update can be downloaded from the Drivers and Updates page. The LabWindows/CVI Run-Time Engine is a free download.
    National Instruments
    Product Support Engineer

    The correct link should be this one
    Proud to use LW/CVI from 3.1 on.
    My contributions to the Developer Zone Community
    If I have helped you, why not give me kudos?

  • What's the cost of the LabWindows run-time engine?

    Dear Sir,
    I have some questions about CVI and hope you can help me figure it out as usual.
    1st, what's the advantage of LabWindows over LabVIEW? I know one is a C programming environment and the other a graphical environment. But LabVIEW seems more user-friendly.
    2nd, what is a run-time engine, according to your definition? I know MATLAB has a similar component and it is free of charge. Does the run-time engine in LabWindows cost anything? If so, how much?
    3rd, previously I thought LabWindows is used to convert a LabVIEW program into C source code, which is faster (how much faster, compared with LabVIEW?) and very popular in industry, so companies do not have to buy LabVIEW and run it for just a simple application in the field. But if the run-time engine costs a lot, I can find no reason to use LabWindows.
    Thank you very much!

    1) LabVIEW is a graphical programming environment and LabWindows/CVI is a C-based programming environment.  Both can be used for similar tasks.  Users familiar with C may prefer CVI.  LabVIEW is typically easier to learn if you are unfamiliar with both.
    2) The Run-Time Engine is a separate component that can be installed to execute LabWindows/CVI programs and LabVIEW programs.  It is free of charge for both LabVIEW and LabWindows/CVI.
    3) LabWindows/CVI does not convert any LabVIEW programs into C code.  LabVIEW programs are already compiled as you write them, so you won't need to convert them to C to use them.  You should see similar performance in similar LabVIEW and CVI code.
    Allen P.
    NI

  • When creating an application installer in LV, what run-time engine or driver must be installed to install the VISA interactive control?

    I've created an application installer using LabVIEW's application builder and use it to install NI MAX.  However, after running the installer, the VISA interactive control is disabled in NI MAX.
    The installer installs the following NI components:  NI LabVIEW Run-Time 2014 SP-1(64-bit), NI LabWindows/CVI Shared Run-Time Engine 2013 SP2, NI Measurement & Automation Explorer 14.5, NI-488.2 Application Development Support (includes run-time), NI-VISA Configuration Support 14.0.1, NI-VISA Runtime 14.0.1, NI-VISA Server 14.0, NI Systems Configuration Runtime 14.5.0, vision run-time, dc-power run-time.
    If I download and run the 488.2 installer, the VISA interactive control is enabled in NI MAX.  But, the installer created with the application builder does not seem to install the necessary components.
    What needs to be added to the installer to enable  the VISA interactive control (VISAIC)?
    Thanks.
    Solved!

    From Pedro Munoz, Applications Engineer, National Instruments
    Sorry for the confusion with the forum post that Jon sent you. I did some research on our internal database and I found out that the component will not be installed by any of the components added from the Additional Installers section of the installer configuration. As you have already found out, you need to run the full installer in order to enable this feature.
    I know that this might be an inconvenience for you because you wanted to have one installer to run. In this case, may I suggest using the NI Batch Installer Builder.
    The NI Batch Installer Builder allows building installers that contain National Instruments software from several products. That way you can create an installer for your application in LabVIEW (and not include the drivers in the Additional Installers section), then use NI Batch Installer Builder to combine the installer for your application and the full versions of the drivers that you mentioned.
    Here is the download link:
    http://www.ni.com/download/ni-batch-installer-builder-14.5/5193/en/
    And in here you can find instructions on how to get started with it:
    http://zone.ni.com/reference/en-XX/help/374206A-01/
    Let me know if you have any question.
    Regards
    Pedro Munoz
    Applications Engineer
    National Instruments
    http://www.ni.com/support

  • What happened to the browse function in the iTunes store?

    What happened to the browse function in the iTunes store?

    What do you mean what happened to it ? If you are having problems with the store then what are they ?

  • A replacement for the Quicksort function in the C++ library

    Hi every one,
    I'd like to introduce and share a new Triple State Quicksort algorithm, which was the result of my research in sorting algorithms during the last few years. The new algorithm reduces the number of swaps to about two thirds (2/3) of classical Quicksort. A number of other improvements are implemented as well. Test results against the std::sort() function show an average of 43% improvement in speed across various input array types. It does this by trading space for performance, at the price of n/2 temporary extra spaces.
    The extra space is allocated automatically and efficiently in a way that reduces memory fragmentation and optimizes performance.
    Triple State Algorithm
    The classical way of doing Quicksort is as follows:
    - Choose one element p, called the pivot. Try to make it close to the median.
    - Divide the array into two parts. A lower (left) part that is all less than p. And a higher (right) part that is all greater than p.
    - Recursively sort the left and right parts using the same method above.
    - Stop recursion when a part reaches a size that can be trivially sorted.
     The difference between the various implementations is in how they choose the pivot p, and where equal elements to the pivot are placed. There are several schemes as follows:
    [ <=p | ? | >=p ]
    [ <p | >=p | ? ]
    [ <=p | =p | ? | >p ]
    [ =p | <p | ? | >p ]  Then swap = part to middle at the end
    [ =p | <p | ? | >p | =p ]  Then swap = parts to middle at the end
    Where the goal (or the ideal goal) of the above schemes (at the end of a recursive stage) is to reach the following:
    [ <p | =p | >p ]
    The above would allow exclusion of the =p part from further recursive calls, thus reducing the number of comparisons. However, there is a difficulty in reaching the above scheme with minimal swaps. No previous implementation of Quicksort could immediately
    put =p elements in the middle using minimal swaps, first because p might not be in the perfect middle (i.e. the median), and second because we don't know how many elements are in the =p part until we finish the current recursive stage.
    The new Triple State method first enters a monitoring state 1 while comparing and swapping. Elements equal to p are immediately copied to the middle if they are not already there, following this scheme:
    [ <p | ? | =p | ? | >p ]
    Then when either the left (<p) part or the right (>p) part meet the middle (=p) part, the algorithm will jump to one of two specialized states. One state handles the case for a relatively small =p part. And the other state handles the case for a relatively
    large =p part. This method adapts to the nature of the input array better than the ordinary classical Quicksort.
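    To make the target [ <p | =p | >p ] layout concrete, here is a minimal sketch of the classical three-way ("Dutch national flag") partition. To be clear, this is NOT the Triple State method described above; it is only the well-known baseline that reaches the same final layout in one pass, without the copy-equal-elements-to-the-middle technique.

    // Illustrative only: classical three-way partition (Dijkstra / Dutch national
    // flag), not the Triple State method described in this post.
    #include <algorithm>

    // On return: ar[lo..lt-1] < p, ar[lt..gt] == p, ar[gt+1..hi] > p.
    template <typename T>
    void ThreeWayPartition(T *ar, int lo, int hi, T p, int &lt, int &gt)
    {
        lt = lo; gt = hi;
        int i = lo;
        while (i <= gt)
        {
            if (ar[i] < p)      std::swap(ar[lt++], ar[i++]);   // belongs to the left part
            else if (p < ar[i]) std::swap(ar[i], ar[gt--]);     // belongs to the right part
            else                i++;                            // equal to p, leave in the middle
        }
    }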
    Further reducing number of swaps
    A typical quicksort loop scans from the left, then scans from the right, then swaps, as follows:
    while (l<=r)
    {
        while (ar[l]<p)
            l++;
        while (ar[r]>p)
            r--;
        if (l<r)
        {
            Swap(ar[l], ar[r]);
            l++; r--;
        }
        else if (l==r)
        {
            l++; r--; break;
        }
    }
    The Swap macro above does three copy operations:
    temp=ar[l]; ar[l]=ar[r]; ar[r]=temp;
    There exists another method that will almost eliminate the need for that third temporary variable copy operation. By copying only the first ar[r] that is less than or equal to p, to the temp variable, we create an empty space in the array. Then we proceed scanning
    from left to find the first ar[l] that is greater than or equal to p. Then copy ar[r]=ar[l]. Now the empty space is at ar[l]. We scan from right again then copy ar[l]=ar[r] and continue as such. As long as the temp variable hasn’t been copied back to the array,
    the empty space will remain there juggling left and right. The following code snippet explains.
    // Pre-scan from the right
    while (ar[r]>p)
        r--;
    temp = ar[r];
    // Main loop
    while (l<r)
    {
        while (l<r && ar[l]<p)
            l++;
        if (l<r) ar[r--] = ar[l];
        while (l<r && ar[r]>p)
            r--;
        if (l<r) ar[l++] = ar[r];
    }
    // After the loop finishes (l == r), copy temp back into the remaining hole
    ar[r] = temp; l++;
    if (temp==p) r--;
    (For simplicity, the code above does not handle equal values efficiently. Refer to the complete code for the elaborate version).
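    For readers who want something compilable, below is a minimal self-contained sketch built around the hole/juggling partition above. It is not the author's Triple State implementation: pivot selection is deliberately naive and equal values get no special handling; it only wraps the technique from the snippet in a plain recursive quicksort for int arrays.

    // Minimal sketch only: the "hole" partition from the snippet above inside a
    // plain recursive quicksort for ints. Not the author's Triple State code.
    static void HoleQuickSort(int *ar, int lo, int hi)
    {
        if (lo >= hi) return;

        int p = ar[(lo + hi) / 2];          // naive middle pivot, for the sketch only
        int l = lo, r = hi;

        while (ar[r] > p) r--;              // pre-scan from the right
        int temp = ar[r];                   // hole is now at ar[r]

        while (l < r)
        {
            while (l < r && ar[l] < p) l++; // find an element that belongs on the right
            if (l < r) ar[r--] = ar[l];     // move it into the hole; hole is now at l
            while (l < r && ar[r] > p) r--; // find an element that belongs on the left
            if (l < r) ar[l++] = ar[r];     // move it into the hole; hole is now at r
        }
        ar[r] = temp;                       // l == r here: put the saved value back
        l++;
        if (temp == p) r--;                 // exclude the hole position only if it holds the pivot value

        HoleQuickSort(ar, lo, r);           // left part: values <= p
        HoleQuickSort(ar, l, hi);           // right part: values >= p
    }

    Call it as HoleQuickSort(data, 0, n - 1).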
    This method is not new; a similar method has been used before (see: http://www.azillionmonkeys.com/qed/sort.html).
    However, it has a negative side effect on some common cases, like nearly sorted or nearly reversed arrays, causing undesirable shifting that renders it less efficient in those cases. When used with the Triple State algorithm combined with further common-case
    handling, it eventually proves more efficient than the classical swapping approach.
    Run time tests
    Here are some test results, done on an i5 at 2.9 GHz with 6 GB of RAM, sorting a random array of integers. Each test is repeated 5000 times. Times are shown in milliseconds.
    size std::sort() Triple State QuickSort
    5000 2039 1609
    6000 2412 1900
    7000 2733 2220
    8000 2993 2484
    9000 3361 2778
    10000 3591 3093
    It gets even faster when used with other types of input or when the size of each element is large. The following test is done for random large arrays of up to 1000000 elements where each element size is 56 bytes. Test is repeated 25 times.
    size std::sort() Triple State QuickSort
    100000 1607 424
    200000 3165 845
    300000 4534 1287
    400000 6461 1700
    500000 7668 2123
    600000 9794 2548
    700000 10745 3001
    800000 12343 3425
    900000 13790 3865
    1000000 15663 4348
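    For context, here is a hedged sketch of the kind of repeated-run timing harness such numbers might come from; it is not the author's actual test code, and CandidateSort below is just a stand-in (it calls std::sort again) to be replaced with whatever implementation is being measured.

    // Sketch of a timing harness: repeats each sort on copies of the same random
    // input and reports accumulated wall-clock time. Not the author's test code.
    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <random>
    #include <vector>

    // Stand-in for the sort under test; replace with the candidate implementation.
    static void CandidateSort(int *ar, int n) { std::sort(ar, ar + n); }

    int main()
    {
        const int size = 10000, repeats = 5000;
        std::mt19937 gen(12345);
        std::uniform_int_distribution<int> dist(0, 1000000);

        std::vector<int> master(size);
        for (int &v : master) v = dist(gen);

        std::chrono::nanoseconds tStd(0), tCand(0);
        std::vector<int> work;

        for (int rep = 0; rep < repeats; ++rep)
        {
            work = master;                                   // identical input every run
            auto t0 = std::chrono::steady_clock::now();
            std::sort(work.begin(), work.end());
            auto t1 = std::chrono::steady_clock::now();
            tStd += std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0);

            work = master;
            t0 = std::chrono::steady_clock::now();
            CandidateSort(work.data(), size);
            t1 = std::chrono::steady_clock::now();
            tCand += std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0);
        }

        printf("size %d: std::sort %lld ms, candidate %lld ms\n", size,
               (long long)std::chrono::duration_cast<std::chrono::milliseconds>(tStd).count(),
               (long long)std::chrono::duration_cast<std::chrono::milliseconds>(tCand).count());
        return 0;
    }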
    Further extensive tests have been done following Jon Bentley's framework of tests for the following input array types:
    sawtooth: ar[i] = i % arange
    random: ar[i] = GenRand() % arange + 1
    stagger: ar[i] = (i* arange + i) % n
    plateau: ar[i] = min(i, arange)
    shuffle: ar[i] = rand()%arange? (j+=2): (k+=2)
    I also add the following two input types, just to add a little torture:
    Hill: ar[i] = min(i<(size>>1)? i:size-i,arange);
    Organ Pipes: (see full code for details)
    Where each case above is sorted, then reordered in 6 different ways, then sorted again after each reorder, as follows:
    Sorted, reversed, front half reversed, back half reversed, dithered, fort.
    Note: GenRand() above is a certified random number generator based on the Park-Miller method. This is to avoid any non-uniform behavior in C++ rand().
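    The generator formulas above translate almost line-for-line into code. The sketch below is an assumption-laden paraphrase, not the author's test harness: it uses plain std::rand() instead of the Park-Miller GenRand(), and the starting values j = 0, k = 1 for "shuffle" are assumed.

    // Sketch of the Bentley-style input generators listed above. Uses std::rand()
    // rather than the Park-Miller GenRand() the author mentions.
    #include <algorithm>
    #include <cstdlib>
    #include <cstring>
    #include <vector>

    std::vector<int> MakeInput(const char *kind, int n, int arange)
    {
        std::vector<int> ar(n);
        int j = 0, k = 1;   // assumed initial values for the "shuffle" pattern
        for (int i = 0; i < n; ++i)
        {
            if      (!std::strcmp(kind, "sawtooth")) ar[i] = i % arange;
            else if (!std::strcmp(kind, "random"))   ar[i] = std::rand() % arange + 1;
            else if (!std::strcmp(kind, "stagger"))  ar[i] = (i * arange + i) % n;
            else if (!std::strcmp(kind, "plateau"))  ar[i] = std::min(i, arange);
            else if (!std::strcmp(kind, "shuffle"))  ar[i] = std::rand() % arange ? (j += 2) : (k += 2);
            else if (!std::strcmp(kind, "hill"))     ar[i] = std::min(i < (n >> 1) ? i : n - i, arange);
        }
        return ar;
    }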
    The complete test results can be found here:
    http://solostuff.net/tsqsort/Tests_Percentage_Improvement_VC++.xls
    or:
    https://docs.google.com/spreadsheets/d/1wxNOAcuWT8CgFfaZzvjoX8x_WpusYQAlg0bXGWlLbzk/edit?usp=sharing
    Theoretical Analysis
    A classical Quicksort algorithm performs less than 2n*ln(n) comparisons on average (see Jacek Cichon's paper) and less than 0.333n*ln(n) swaps on average (see Wild and Nebel's paper). Triple State will perform about the same number of comparisons
    but with fewer swaps, about 0.222n*ln(n) in theory. In practice, however, Triple State Quicksort will perform even fewer comparisons on large arrays because of a new 5-stage pivot selection algorithm that is used. Here is the detailed theoretical analysis:
    http://solostuff.net/tsqsort/Asymptotic_analysis_of_Triple_State_Quicksort.pdf
    Using SSE2 instruction set
    SSE2 uses the 128-bit XMM registers, which can do memory copy operations in parallel since there are 8 of them. SSE2 is primarily used to speed up copying large memory blocks in real-time, graphics-demanding applications.
    In order to use SSE2, copied memory blocks have to be 16-byte aligned. Triple State Quicksort will automatically detect whether the element size and the array starting address are 16-byte aligned and, if so, will switch to using SSE2 instructions for extra speedup. This
    decision is made only once, when the function is called, so it has minor overhead.
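    As a rough illustration of the shape of that check (an assumption, not the actual code from the library):

    // Sketch of the kind of 16-byte alignment test described above.
    #include <cstddef>
    #include <cstdint>

    static bool CanUseSse2Copies(const void *base, std::size_t elemSize)
    {
        // Aligned 128-bit (XMM) loads/stores require the start address and the
        // element stride to both be multiples of 16 bytes.
        return (reinterpret_cast<std::uintptr_t>(base) % 16 == 0) && (elemSize % 16 == 0);
    }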
    Few other notes
    - The standard C++ sorting function on almost all platforms religiously takes a "call back pointer" to a comparison function that the user/programmer provides. This is obviously for flexibility and to allow closed-source libraries. Triple State
    defaults to using a call back function. However, call back functions have bad overhead when called millions of times. Using inline/operator- or macro-based comparisons will greatly improve performance; an improvement of about 30% to 40% can be expected. Thus,
    I seriously advise against using a call back function whenever possible. You can disable the call back function in my code by #undefining the CALL_BACK precompiler directive.
    - Like most other efficient implementations, Triple State switches to insertion sort for tiny arrays, whenever the size of a sub-part of the array is less than the TINY_THRESH directive. This threshold is empirically chosen; I set it to 15. Increasing this
    threshold will improve the speed when sorting nearly sorted and reversed arrays, or arrays that are concatenations of both cases (which are common), but will slow down sorting random or other types of arrays. To remedy this, I provide a dual-threshold method
    that can be enabled by #defining the DUAL_THRESH directive. Once enabled, another threshold, TINY_THRESH2, will be used, which should be set lower than TINY_THRESH; I set it to 9. The algorithm is able to "guess" whether the array or sub-part of the array is already sorted
    or reversed, and if so will use TINY_THRESH as its threshold; otherwise it will use the smaller threshold TINY_THRESH2. Notice that the "guessing" here is NOT foolproof, it can miss, so set both thresholds wisely.
    - You can #define the RANDOM_SAMPLES precompiler directive to add randomness to the pivoting system to lower the chances of the worst case happening at a minor performance hit.
    - When the element size is very large (320 bytes or more), the function/algorithm uses a new "late swapping" method. This will automatically create an internal array of pointers, sort the pointer array, then swap the original array elements into sorted order using minimal
    swaps, for a maximum of n/2 swaps (see the sketch after this list). You can change the 320-byte threshold with the LATE_SWAP_THRESH directive.
    - The function provided here is optimized to the bone for performance. It is one monolithic piece of complex code that is ugly and almost unreadable. Sorry about that, but in order to achieve improved speed I had to ignore common and good coding standards
    a little. I don't advise anyone to code like this, and I myself don't. This is really a special case for sorting only. So please don't trip if you see weird code; most of it has a good reason.
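    As referenced in the late-swapping note above, here is a minimal sketch of that general idea: sort cheap indices, then permute the large elements by following cycles so each element is moved at most once. The BigElem struct, the 316-byte payload, and the cycle-following details are illustrative assumptions rather than the author's code, and no claim is made about matching the exact n/2 swap bound mentioned in the post.

    // Sketch of a "late swapping" approach: sort an index array by key, then
    // permute the large elements into place by cycle-following. Illustrative only.
    #include <algorithm>
    #include <vector>

    struct BigElem { int key; char payload[316]; };   // hypothetical large element

    void LateSwapSort(BigElem *ar, int n)
    {
        std::vector<int> idx(n);
        for (int i = 0; i < n; ++i) idx[i] = i;

        // Sort only the cheap indices, comparing the elements' keys.
        std::sort(idx.begin(), idx.end(),
                  [ar](int a, int b) { return ar[a].key < ar[b].key; });

        // idx[i] is the source position of the element that belongs at position i.
        // Walk each permutation cycle once, using one temporary element per cycle.
        std::vector<bool> done(n, false);
        for (int i = 0; i < n; ++i)
        {
            if (done[i] || idx[i] == i) { done[i] = true; continue; }
            BigElem tmp = ar[i];                       // open the hole at position i
            int cur = i;
            for (;;)
            {
                int src = idx[cur];
                done[cur] = true;
                if (src == i) { ar[cur] = tmp; break; }   // cycle closes back at i
                ar[cur] = ar[src];
                cur = src;
            }
        }
    }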
    Finally, I would like to present the new function to Microsoft and the community for further investigation and possibly, inclusion in VC++ or any C++ library as a replacement for the sorting function.
    You can find the complete VC++ project/code along with a minimal test program here:
    http://solostuff.net/tsqsort/
    Important: To fairly compare two sorting functions, both should either use or NOT use a call back function. If one uses it and the other doesn't, then you will get unfair results; the one that doesn't use a call back function will most likely win no matter how bad
    it is!
    Ammar Muqaddas

    Thanks for your interest.
    Excuse my ignorance, as I'm not sure what you meant by "1 of 5" optimization. Did you mean median of 5?
    Regarding swapping pointers: yes, it is common sense and rather common among programmers to swap pointers instead of swapping large data types, at the small price of indirect access to the actual data through the pointers.
    However, there is a rather unobvious and quite terrible side effect of using this trick. After the pointer array is sorted, sequential (sorted) access to the actual data throughout the remainder of the program will suffer heavily because of cache misses.
    Memory is being accessed randomly, because the pointers still point to the unsorted data, causing many cache misses, which will render the program itself slow, although the sort was fast!
    Multi-threaded qsort is a good idea in principle and easy to implement, obviously, because qsort itself is recursive. The thing is, multi-threaded qsort is actually just stealing CPU time from other cores that might be busy running other apps; this might slow
    down other apps, which might not be ideal for servers. What researchers usually try to do is make the improvement in the algorithm itself.
    I will try to look at your sorting code; let's see if I can compile it.
