Runtime Compilation to Increase Performance

Hi everyone,
I've been struggling with this problem for a few days now and would be glad if you could help. I'm working on a neural network project, and I've written a GUI to edit and train neural networks for a certain task. To be able to play with the network architecture flexibly, I've used OOP concepts generously. But once the network architecture is set, I no longer need this flexibility, so I want to compile a specific class that has the same functionality as that network, where all the computation is carried out over local variables (I even made them static) instead of array elements and object references. I've successfully written code that generates a string in which the loops are unrolled, all the if and switch statements are resolved, and the variables that are not supposed to vary at this point are inserted as constants at (runtime) compile time. This should speed things up, right? Especially considering all the array bounds checking and other work Java is doing.
When I compile this string, everything is fine and the mathematical functionality is reproduced perfectly. The problem is that for small networks the code runs twice as fast, as I would expect, but for large networks it becomes something like 4 times slower! For a typical large network the resulting binary is 80k, so I wouldn't expect it to fill the cache (or should I?). Anyway, even if it did fill the cache, that shouldn't slow it down by a factor of 8. The slowdown looks more like the difference between JITted and interpreted code. I suspect that, as that single method is 80k long, it might not be JITted (sorry if I annoy you with that crappy term). If you think this is the case, is there any way to bypass that? Or what do you think the reason could be? I already tried lowering the number of method calls before compilation to 1 (with the command line argument -XX:CompileThreshold=1), just to try, but it didn't help.
I also converted the resulting string to C++, and it runs 4 times faster than the unrolled original method (even for large networks). (It also serves as a nice test showing that Java is about half as fast as C++ on such an arithmetic- and trigonometry-intensive task.)
I would be so glad if you could help me attain, in the general case, the 2x performance I get with the small networks. And I would be super duper glad if you could suggest a way to compile it as C++ at runtime and attach it to my program through JNI. Runtime compiling C++ code sounds messy, but I keep some hope since the operations are purely mathematical and don't need anything platform specific.
Thanks a lot
Edited by: enobayram on Mar 5, 2009 5:51 AM

Thanks for your answer. I was thinking about the same thing, but I am not exactly trying to outsmart it. During that runtime compilation I know much more than the VM does. I know that the neuron axon function family is constant, so I bypass a switch statement. I know that the function flatness will not change after that point, so I insert it as a constant. I also avoid using arrays, since I know how many elements there should be and can use individual local variables instead. This also explains the speed increase in small networks and in the C++ experiment.
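To make the specialization described above concrete, here is a minimal sketch of such a source generator. It is not the poster's actual code: the class name `UnrolledSourceSketch`, the method `generate`, the emitted `eval` signature, and the choice of tanh as the axon function are all assumptions made for illustration. It inlines the weights and the flatness as literals and uses one parameter per input instead of an array.

```java
public class UnrolledSourceSketch {

    /**
     * Emits Java source for a class with a single static method that computes
     * tanh(flatness * (w0*x0 + w1*x1 + ...)) with every constant inlined.
     * Assumes all weights and the flatness are finite (NaN or Infinity would
     * not print as valid Java literals).
     */
    public static String generate(String className, double[] weights, double flatness) {
        StringBuilder sb = new StringBuilder();
        sb.append("public class ").append(className).append(" {\n");
        sb.append("    public static double eval(");
        for (int i = 0; i < weights.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("double x").append(i); // one parameter per input, no array
        }
        sb.append(") {\n");
        sb.append("        double sum = 0.0;\n");
        for (int i = 0; i < weights.length; i++) {
            // The loop runs here, in the generator; the emitted code is straight-line.
            sb.append("        sum += ").append(weights[i])
              .append(" * x").append(i).append(";\n");
        }
        // The axon function is fixed at generation time, so no switch is emitted.
        sb.append("        return Math.tanh(").append(flatness).append(" * sum);\n");
        sb.append("    }\n");
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate("GeneratedNet", new double[] {0.5, -1.25}, 2.0));
    }
}
```

On the suspicion about large methods: a single Java method is limited to 64 KB of bytecode, and HotSpot by default declines to JIT-compile very large methods. Two things that might be worth trying, both untested here, are emitting several smaller methods instead of one, or running with -XX:-DontCompileHugeMethods.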
I've checked those options, and the only relevant one I could find is that -XX:CompileThreshold. By the way, why do you think runtime compiling C++ is less messy? With Java, everything needed is included in the standard library: the JavaCompiler class and the ClassLoader class are sufficient. With C++, I guess I would have to invoke some C++ compiler through a system call so that it generates a .dll (or .so), then have it loaded into memory and interface it to my program. It could get even messier if I tried to recompile it when the network changes, as then I would have to unload the DLL and rebuild it. If you had an easier way in mind, I would be so glad to hear it.
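The Java route mentioned above (JavaCompiler plus a class loader) can be sketched roughly as follows. This is a minimal sketch, assuming a JDK (not a bare JRE) is available at runtime and that the generated source declares a public class with a static `eval(double, double)` method. `RuntimeCompileSketch`, `compileAndLoad`, `callEval`, `SAMPLE_SOURCE`, and `GeneratedNet` are hypothetical names. It writes the source to a temporary directory, compiles it with the system compiler, and loads the result through a URLClassLoader.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RuntimeCompileSketch {

    /** Hypothetical stand-in for a generated network: 0.5*x0 + 2.0*x1. */
    public static final String SAMPLE_SOURCE =
        "public class GeneratedNet {\n"
        + "    public static double eval(double x0, double x1) {\n"
        + "        return 0.5 * x0 + 2.0 * x1;\n"
        + "    }\n"
        + "}\n";

    /** Compiles the source of a public class named className and loads it. */
    public static Class<?> compileAndLoad(String className, String source) {
        try {
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("No system Java compiler; run on a JDK");
            }
            Path dir = Files.createTempDirectory("generated-net");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // Passing null for the streams uses System.in, System.out and System.err.
            int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("Compilation failed with status " + status);
            }

            // A fresh loader per generated class; the directory URI ends with a slash.
            URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()});
            return Class.forName(className, true, loader);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Calls the static eval(double, double) method of the loaded class via reflection. */
    public static double callEval(Class<?> cls, double x0, double x1) {
        try {
            Method eval = cls.getMethod("eval", double.class, double.class);
            return (Double) eval.invoke(null, x0, x1);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> cls = compileAndLoad("GeneratedNet", SAMPLE_SOURCE);
        System.out.println(callEval(cls, 2.0, 3.0));
    }
}
```

Using a fresh class loader per generated class is one way to handle recompilation when the network changes: a new version is loaded through a new loader, and the old class becomes eligible for unloading once nothing references it or its loader. In a hot loop, calling through an interface implemented by the generated class would likely be preferable to reflection.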

Similar Messages

  • How can I increase the performance of an interface

    When I run the interface in ODI, it takes 5 days. How can I increase the performance of the interface?
    The source contains 30 crore (300 million) records.
    I want to copy all 30 crore to the target.
    Source: Oracle
    Target: Oracle
    LKM: LKM SQL to SQL
    IKM: IKM Control Append
    Edited by: 967609 on 25 Oct, 2012 2:55 AM
    Edited by: 967609 on 25-Oct-2012 10:13

    A view and a synonym are created.
    My server name is REPA;
    the other server name is MISREPL.
    create or replace view REPA.C$_0XX_TR
    (
         C1_TJD,
         C2_CID,
         C3_BOO,
         C4_TYPE,
         C5_GRP,
         C6_POAM,
         C7_BALINT,
         C8_DUIN2,
         C9_CRLMT,
         C10_IRN,
         C11_TDUE,
         C12_CHKHLD,
         C13_WDLMT,
         C14_ZSBU,
         C15_BAL,
         C16_MCHG,
         C17_LCHG,
         C18_ACR,
         C19_CR,
         C20_DR,
         C21_CRCD
    ) as
    select * from (
    select     
         XX.TJD     C1_TJD,
         XX.CID     C2_CID,
         XX.BOO     C3_BOO,
         XX.TYPE     C4_TYPE,
         XX.GRP     C5_GRP,
         XX.POAM     C6_POAM,
         XX.BALINT     C7_BALINT,
         XX.DUIN2     C8_DUIN2,
         XX.CRLMT     C9_CRLMT,
         XX.IRN     C10_IRN,
         XX.TDUE     C11_TDUE,
         XX.CHKHLD     C12_CHKHLD,
         XX.WDLMT     C13_WDLMT,
         XX.ZSBU     C14_ZSBU,
         XX.BAL     C15_BAL,
         XX.MCHG     C16_MCHG,
         XX.LCHG     C17_LCHG,
         XX.ACR     C18_ACR,
         XX.CR     C19_CR,
         XX.DR     C20_DR,
         XX.CRCD     C21_CRCD
    from     REPA.XX@REMOTE XX
    where     (1=1)
    )
    create synonym     STG.C$_0XX_TR
    for           REPA.C$_0XX_TR@remote
    insert into     STG.XX_TR
    (
         TJD,
         CID,
         BOO,
         TYPE,
         GRP,
         POAM,
         BALINT,
         DUIN2,
         CRLMT,
         IRN,
         TDUE,
         CHKHLD,
         WDLMT,
         ZSBU,
         BAL,
         MCHG,
         LCHG,
         ACR,
         CR,
         DR,
         CRCD
    )
    select
         TJD,
         CID,
         BOO,
         TYPE,
         GRP,
         POAM,
         BALINT,
         DUIN2,
         CRLMT,
         IRN,
         TDUE,
         CHKHLD,
         WDLMT,
         ZSBU,
         BAL,
         MCHG,
         LCHG,
         ACR,
         CR,
         DR,
         CRCD
    FROM (
    select      
         C1_TJD TJD,
         C2_CID CID,
         C3_BOO BOO,
         C4_TYPE TYPE,
         C5_GRP GRP,
         C6_POAM POAM,
         C7_BALINT BALINT,
         C8_DUIN2 DUIN2,
         C9_CRLMT CRLMT,
         C10_IRN IRN,
         C11_TDUE TDUE,
         C12_CHKHLD CHKHLD,
         C13_WDLMT WDLMT,
         C14_ZSBU ZSBU,
         C15_BAL BAL,
         C16_MCHG MCHG,
         C17_LCHG LCHG,
         C18_ACR ACR,
         C19_CR CR,
         C20_DR DR,
         C21_CRCD CRCD
    from     STG.C$_0XX_TR
    where          (1=1)     
    ) ODI_GET_FROM
    I am getting the following error:
    ODI-1228: Task INT_DBLINK (Integration) fails on the target ORACLE connection STG.
    Caused By: java.sql.SQLException: ORA-12154: TNS:could not resolve the connect identifier specified
         at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:462)
         at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:405)
         at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:931)
         at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:481)
         at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:205)
         at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:548)
         at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:217)
         at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1115)
         at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1488)
         at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3769)
         at oracle.jdbc.driver.OraclePreparedStatement.execute(OraclePreparedStatement.java:3954)
         at oracle.jdbc.driver.OraclePreparedStatementWrapper.execute(OraclePreparedStatementWrapper.java:1539)
         at oracle.odi.runtime.agent.execution.sql.SQLCommand.execute(SQLCommand.java:163)
         at oracle.odi.runtime.agent.execution.sql.SQLExecutor.execute(SQLExecutor.java:102)
         at oracle.odi.runtime.agent.execution.sql.SQLExecutor.execute(SQLExecutor.java:1)
         at oracle.odi.runtime.agent.execution.TaskExecutionHandler.handleTask(TaskExecutionHandler.java:50)
         at com.sunopsis.dwg.dbobj.SnpSessTaskSql.processTask(SnpSessTaskSql.java:2913)
         at com.sunopsis.dwg.dbobj.SnpSessTaskSql.treatTask(SnpSessTaskSql.java:2625)
         at com.sunopsis.dwg.dbobj.SnpSessStep.treatAttachedTasks(SnpSessStep.java:558)
         at com.sunopsis.dwg.dbobj.SnpSessStep.treatSessStep(SnpSessStep.java:464)
         at com.sunopsis.dwg.dbobj.SnpSession.treatSession(SnpSession.java:2093)
         at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor$2.doAction(StartSessRequestProcessor.java:366)
         at oracle.odi.core.persistence.dwgobject.DwgObjectTemplate.execute(DwgObjectTemplate.java:216)
         at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor.doProcessStartSessTask(StartSessRequestProcessor.java:300)
         at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor.access$0(StartSessRequestProcessor.java:292)
         at oracle.odi.runtime.agent.processor.impl.StartSessRequestProcessor$StartSessTask.doExecute(StartSessRequestProcessor.java:855)
         at oracle.odi.runtime.agent.processor.task.AgentTask.execute(AgentTask.java:126)
         at oracle.odi.runtime.agent.support.DefaultAgentTaskExecutor$2.run(DefaultAgentTaskExecutor.java:82)
         at java.lang.Thread.run(Thread.java:662)

  • Best way in using models & Increasing performance

    Hi all,
    I have some doubts about the creation of model objects.
    1. How many RFCs can a model object contain?
    2. I have a business scenario where I have to use 4 function modules to perform a task. Will it perform better if I create a single model object for these 4 function modules, or if I create a model object for each function module?
    3. Are there any good docs on SDN about best practices or performance improvements when creating and using model objects? Please paste the links, or if anyone has any docs, please send them to me.
    Thanks & Rgards,
    Lokesh

    Hi,
    1. How many RFCs can a model object contain?
    SAP recommends the following.
    RFC connection pools are specific to a JCO destination.
    Therefore, all deployed applications using the same model object pointing to the same JCO destination will share the SAME CONNECTION POOL.
    This fact defines the scope of the connection management and determines the number of concurrent applications that may use the JCO destination.
    A MODEL OBJECT SHOULD CONTAIN THOSE RFMS THAT SUPPLY THE FUNCTIONALITY OF EITHER A DISCRETE BUSINESS TASK OR SOME ATOMIC SUBSET OF THE BUSINESS TASK
    -> HAVING ONE RFM PER MODEL IS INEFFICIENT FROM A CONNECTION MANAGEMENT POINT OF VIEW.
    -> HAVING ALL YOUR RFMS IN ONE BIG MODEL OBJECT IS INEFFICIENT FROM A REUSE POINT OF VIEW
    2. I have a business scenario where I have to use 4 function modules to perform a task. Will it perform better if I create a single model object for these 4 function modules, or if I create a model object for each function module?
    As described above, if the RFMs supply the functionality for a single task, then put them in one model.
    3. Are there any good docs on SDN about best practices or performance improvements when creating and using model objects? Please paste the links, or if anyone has any docs, please send them to me.
    This is described in the JA310 (Web Dynpro Java) book. You can download it from the marketplace.
    PradeeP

  • How to increase performance of adobe forms of MSS Business package

    Hi
    We have implemented the MSS business package with PCR Adobe forms.
    Portal NW04 SP18, ERP 2004, ADS on NW04 SP16 and Adobe Reader 7.0.7.
    We have developed our own PCR using the existing ISR framework.
    Everything is working fine, but users are facing performance problems: sometimes the browser hangs while opening a PCR form.
    Is there any way to increase the performance of the PCR Adobe forms?
    Thanks in advance
    Gopal

    Hi!
    Interactive Forms need a lot of performance on the client side. If the client hangs, I think this is related to client issues.
    I would also update the forms server to the same version as the other NW components (Portal).
    Sigi

  • Intel "Save Power / Increase Performance" popup

    Currently starting to replace the older X series laptops with the X220's here at work.  The problem we are running into is that annoying green popup window from Intel HD graphics asking to save power or increase performance.  The window can't be moved and the users can't do much of anything for the couple minutes it stays on the screen.  It's becoming a real hassle for us helpdesk guys.
    How do we get rid of this?
    Removing the drivers doesn't work, they just reinstall after reboot.

    From an unrelated thread out there on the web:
    When unplugging the mains ac adapter in Speed mode I had an annoying green Icon in the middle of my screen for 2 minutes that said "Save power" -> "Increased performance" stays for some time on top of other windows. I renamed the file C:\Windows\System32\nvvsvc.exe This however unelegant appears to have eliminated the annoying popup, if anyone has a better solution for this do let me know.
    Untested, unlikely, and a little scary.  It's the only thing I could find that even suggested a solution.  Otherwise, there are just a few other people complaining about the same thing on a variety of platforms.
    Z.

  • Increase performance query more than 10 millions records significantly

    The story is:
    Every day, more than 10 million records arrive as text files (.csv, comma-separated values, or another extension).
    An example text file is transaction.csv:
    Phone_Number
    6281381789999
    658889999888
    618887897
    etc. (more than 10 million rows)
    The data from transaction.csv is then split into 3 RAM (in-memory) tables:
    1. table nation (nation_id, nation_desc)
    2. table operator (operator_id, operator_desc)
    3. table area (area_id, area_desc)
    These 3 RAM tables are then queried to produce the physical table EXT_TRANSACTION (on hard disk).
    The physical external Oracle table EXT_TRANSACTION has the following result columns:
    Phone_Number   Nation_Desc  Operator_Desc  Area_Desc
    =====================================================
    6281381789999  INA          SMP            SBY
    So: text file (transaction.csv) --> RAM tables --> Oracle table (EXT_TRANSACTION)
    The first 2 digits are the nation_id, the next 4 digits are the operator_id, and the next 2 digits are the area_id.
    I once heard that, to increase performance significantly, there is a technique for creating a table in memory (RAM) rather than on hard disk.
    Any advice would be much appreciated.
    Thanks.
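The digit layout described in the question (2 digits of nation_id, then 4 of operator_id, then 2 of area_id) can be sketched as a simple prefix split. This is only an illustration under that stated layout: `PhonePrefixSketch` and `splitPrefix` are hypothetical names, and the lookup from ids to descriptions (for example INA, SMP, SBY) is left out because the lookup tables are not shown in the thread.

```java
public class PhonePrefixSketch {

    /**
     * Splits a phone number into {nation_id, operator_id, area_id}
     * using the fixed 2 + 4 + 2 digit layout from the question.
     */
    public static String[] splitPrefix(String phoneNumber) {
        if (phoneNumber == null || phoneNumber.length() < 8) {
            throw new IllegalArgumentException("Need at least 8 digits: " + phoneNumber);
        }
        return new String[] {
            phoneNumber.substring(0, 2), // nation_id
            phoneNumber.substring(2, 6), // operator_id
            phoneNumber.substring(6, 8)  // area_id
        };
    }

    public static void main(String[] args) {
        String[] parts = splitPrefix("6281381789999");
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

The same split could equally be done inside the database with substring functions during the load, which would avoid a separate pass over the file.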

    Oracle uses sophisticated algorithms for various memory caches, including buffering data in memory. It is described in Oracle® Database Concepts.
    You can tell Oracle via the CACHE table clause to keep blocks for that table in the buffer cache (refer to the URL for the technical details of how this is done).
    However, this means there is now less of the buffer cache available to cache other frequently used data. So this approach could make accessing one table a bit faster at the expense of making access to other tables slower.
    This is a balancing act - how much can one "interfere" with the cache before affecting and degrading performance? Oracle also recommends that this type of "forced" caching be used for small lookup tables. It is not a good idea to use it on large tables.
    As for your problem - why do you assume that keeping data in memory will make processing faster? That is a very limited approach. Memory is a resource that is in high demand. It is a very finite resource. It needs to be carefully spent to get the best performance.
    The buffer cache is designed to cache "hot" (often accessed) data blocks. So in all likelihood, telling Oracle to cache a table you use a lot is not going to make it faster. Oracle is already caching the hot data blocks as best possible.
    You also need to consider what the actual performance problem is. If your process needs to crunch tons of data, it is going to be slow. Throwing more memory will be treating the symptom - not the actual problem that tons of data are being processed.
    So you need to define the actual problem. Perhaps it is not slow I/O - there could be a user-defined PL/SQL function used as part of the ELT process that causes the problem. Parallel processing could be used to do more I/O at the same time (assuming the I/O subsystem has the capacity). The process can perhaps be designed better - instead of making multiple passes through a data set, crunching the same data (but different columns) again and again, do it in a single pass.
    10 million rows are nothing in terms of what Oracle can process on even a small server today. I have dual-CPU AMD servers doing over 2,000 inserts per second in a single process, and a Perl program making up to 1,000 PL/SQL procedure calls per second. Oracle is extremely capable - as is today's hardware and software. But that needs a sound software engineering approach. And that approach says that we first need to fully understand the problem before we can solve it, treating the cause and not the symptom.

  • How to use hints in Obiee to increase performance

    Can anybody please tell me how we can use hints to increase the performance of ad hoc and dashboard reports in OBIEE?

    Hi,
    Check this,
    http://www.howtoexam.com/index.php?option=com_content&view=article&id=75%3Ausing-hints-in-obiee-rpd-and-answers&catid=790%3Acomputers-and-software&Itemid=166
    Rgds,
    Dpka

  • Runtime compilation

    Hi Forte-users,
    I have a question about runtime compilation. Under what conditions does Forte
    compile a method at runtime? My logical conclusion would be that Forte compiles
    whatever has been changed or is directly affected by the change, but this
    is not true: I found out the hard way that it even compiles methods or objects
    that reside in other projects which are not even suppliers to the
    project that one has changed.
    Please let me know what criteria Forte uses to decide whether a
    method needs to be compiled at runtime or not. Is there any way to stop runtime
    compilation?
    Thanks
    Mihir Chitre

    Example?
    var myFunc:Function;
    myFunc = function() {
        var i:int = 5;
        trace(i);
    };
    //works fine
    myFunc = "function(){ var i:int = 5; trace(i);}";
    //TypeError: Error #1034: Type Coercion failed: cannot convert "function(){ var i:int = 5; trace(i);}" to Function.
    // at Sandbox_fla::MainTimeline/Sandbox_fla::frame1()
    myFunc = (Function)("function(){ var i:int = 5; trace(i);}");
    //EvalError: Error #1066: The form function('function body') is not supported.
    // at Sandbox_fla::MainTimeline/Sandbox_fla::frame1()
    myFunc();

  • How can you increase performance ?

    Hello,
    I am building a site, but its frame rate is very slow.
    I read in some articles that Sprites are much faster than MovieClips.
    For example, I have 5 long MovieClips (4300 pixels wide). These clips scroll horizontally to create a parallax effect.
    One of these MovieClips contains other MovieClips. (None of the MovieClips are animated; they are just graphics.)
    So, to increase performance, I thought I would convert all MovieClips to Sprites, as I read that Sprites are better for performance.
    I came up with this method:
    function castMovieClipToSprite(source:MovieClip, recursive:Boolean = true):void {
         for (var i:int = 0; i < source.numChildren; i++) {
              var child:DisplayObject = source.getChildAt(i) as DisplayObject;
              if (child is MovieClip && recursive) {
                   castMovieClipToSprite(MovieClip(child), recursive);
                   // A cast only changes the reference type; the object itself stays a MovieClip.
                   child = Sprite(child);
              }
         }
    }
    The performance has improved slightly, but when I run the debugger I see that the child objects are still of type MovieClip.
    Does anyone know another way to increase performance?
    Thanks,
    Chris.

    I also ran some more tests and noticed some strange behaviour.
    Testlinks:
    - Without the gradient ( http://www.rhbmprogress.nl/temp/cs/performanceTest1/ )
    - With the gradient ( http://www.rhbmprogress.nl/temp/cs/performanceTest2/ );
    PS: I did not see any difference when I disabled the SWFProfiler, so I left it on for test purposes.
    Desktop test,
    Firefox version:  4.0.1.
    Flash version: 10.2.152.32  (debugger version)
    Test1 - > 60 fps
    Test2 -> 55 fps
    Internet Explorer version: 9.0811.1642
    Flash version: 10.3.181.23 (debugger version)
    Test1 - >60 fps
    Test2 ->55 fps
    Other desktop computer test:
    Firefox version 3.6.13
    Flash version: 10.1.52.14 (no debugger)
    Test 1-> 58 fps
    Test 2-> 35 fps
    Internet Explorer version 9.0.8112.1642
    Flash version: 10.3.181.23 (no debugger)
    Test1-> 60 fps
    Test2-> 37 fps
    As you can see, there are some major differences between Flash versions and between browser types and versions.
    The solid fill seems to give a stable frame rate, but the alpha gradient fill does not.
    I wonder which Flash version you used, and which browser type and version.

  • How to increase the performance of a delete operation

    Hi,
    How can I increase the performance of a delete operation? The delete is done on a table which has millions of records and is reloaded every day.
    The statement is in a procedure and is as follows.
    #$%%$#$;
    commit;
    delete from TVRBC_SITE_ROLLUP_T;
    commit;

    Hi,
    execute immediate 'truncate table TVRBC_SITE_ROLLUP_T';
    Regards,
    Oleg
    Message was edited by:
    tsiboleg

  • Can I make a cluster of two Mac minis to increase performance

    I would like to know whether it is possible to make a cluster of two Mac minis to increase performance
    or to do load balancing, and if it is possible, how it is done.

    I also need to know this. I would like to know whether such a thing would be possible via Thunderbolt.
    Thanks!

  • Is there any alternative for this code to increase performance

    Hi, I want an alternative to this code that performs better.
    DATA : BEGIN OF itab OCCURS 0,
                  matnr LIKE zcst-zmatnr,
                 checked TYPE i,
                 defected TYPE i,
               end of itab.
    SELECT DISTINCT zmatnr FROM zcst INTO TABLE itab WHERE
       zmatnr IN s_matnr AND
          zwerks EQ p_plant AND
          zcastpd IN s_castpd AND
          zcatg IN s_categ.
    LOOP AT itab.
        ind = sy-tabix.
    SELECT COUNT( DISTINCT zcst~zcastn )
           FROM zcst INNER JOIN zvtrans
           ON ( zcst~zcastn = zvtrans~zcastn AND
                zcst~zmatnr = zvtrans~zmatnr AND
                zcst~zwerks = zvtrans~zwerks AND
                zcst~gjahr  = zvtrans~gjahr )
           INTO itab-checked
           WHERE
               zcst~zmatnr = itab-matnr AND
               zcst~zwerks EQ p_plant AND
               zcastpd IN s_castpd AND
               zcatg IN s_categ.
    SELECT COUNT( DISTINCT zcst~zcastn )
          FROM zcst INNER JOIN zvtrans
          ON ( zcst~zcastn = zvtrans~zcastn AND
               zcst~zmatnr = zvtrans~zmatnr AND
               zcst~zwerks = zvtrans~zwerks AND
               zcst~gjahr  = zvtrans~gjahr )
          INTO itab-defected
          WHERE
              zcst~zmatnr = itab-matnr AND
              zcst~zwerks EQ p_plant AND
              zcastpd IN s_castpd AND
              zcatg IN s_categ AND
              zvtrans~zdcode <> '   '.
      MODIFY itab INDEX ind.
      ENDLOOP.
    I think the SELECT inside the loop is what reduces the performance.
    Please reply.

    Hi,
    types : BEGIN OF t_itab ,
        matnr LIKE zcst-zmatnr,
       checked TYPE i,
       defected TYPE i,
    end of t_itab.
    data : itab type table of t_itab,
             wa_itab type t_itab.
    Instead of looping as in your code, try to use FOR ALL ENTRIES and
    a nested loop.

  • When will Runtime.totalMemory() be increased?

    When will Runtime.totalMemory() be increased?
    Is it when there is not enough memory even after GC, or when some threshold is reached?
    When it increases, by how much does it increase? E.g. by 50%, by a fixed amount, or by the amount needed?

    A single JVM will only use as much memory as you tell it to. This is a parameter that you give to the JVM on the command line when it starts.
    To see all the options, run java -X, which also prints help for each of them.
    The one that you are interested in is -Xmx,
    which sets the maximum amount of heap that Java will use.
    Good luck
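To watch how the reported heap size relates to the -Xmx cap mentioned above, a small probe like the following may help. It is a minimal sketch: `HeapGrowthSketch` and `snapshot` are hypothetical names, and the amount allocated (16 MB) is arbitrary. It only reads the standard `Runtime` counters before and after allocating. Whether and by how much `totalMemory()` grows depends on the JVM, the garbage collector, and the initial heap size, so the code prints the values rather than assuming a particular policy.

```java
public class HeapGrowthSketch {

    /** Returns {freeMemory, totalMemory, maxMemory} in bytes at this instant. */
    public static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] {rt.freeMemory(), rt.totalMemory(), rt.maxMemory()};
    }

    private static void print(String label, long[] s) {
        System.out.println(label + ": free=" + s[0] + " total=" + s[1] + " max=" + s[2]);
    }

    public static void main(String[] args) {
        print("before", snapshot());

        // Hold on to 16 blocks of 1 MB each so they cannot be collected yet.
        byte[][] blocks = new byte[16][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1 << 20];
        }

        print("after", snapshot());
        System.out.println("blocks held: " + blocks.length);
    }
}
```

Running it with different settings, for example `java -Xms8m -Xmx64m HeapGrowthSketch`, shows `totalMemory()` as the currently reserved heap, which stays at or below `maxMemory()`.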

  • Does the Cisco ASA 5500 have a module to increase VPN performance?

    Dear All,
    Do the Cisco ASA 5510 and 5505 have a module for increasing VPN performance?
    Best Regards,
    Rechard

    Rechard,
    There is one built into every ASA. If you need better performance because you're limited by engine performance, you most likely need to move up to a bigger model.
    Here is the datasheet for reference:
    http://www.cisco.com/en/US/products/ps6120/prod_models_comparison.html
    M.

  • Mail has 16k  messages, and performance is very slow, with loading times taking up to 5 seconds every time I open Mail. How can I increase performance?

    Mail has 16k  messages, and performance is very slow, with loading times taking up to 5 seconds every time I open Mail.
    How can I increase performance?
    I'm running a MacBook Air 4GB 1.7GHz  10.7.2.
    Graham

    One possible solution would be to organise your inbox into folders.
    It's never really good on any system to have one folder that has everything in it.
    Try going to the web GUI for that mail account, organise your folders, and move mails from your inbox into the corresponding folders.
    Several folders holding the same amount of mail as one big folder will usually load a little quicker, as a folder may not be accessed to download its content unless it is viewed.
    So having 10 folders with organised content, and your inbox as an area that holds only new emails, would work much quicker with IMAP.
    Most IMAP servers will only update the contents of a folder when it's viewed.
