Calculating load averages

Generic_137111-02 sun4v sparc SUNW,SPARC-Enterprise-T5220 Solaris 10:
I have a couple of questions:
1. mpstat shows 63 CPUs, but the spec says the T5220 has 8 cores.
2. I don't understand what threading technology in CPUs is. How do hardware threads differ from cores?
3. How do I calculate load averages? When I run top, I don't know how to relate the numbers to this CPU threading technology. How do I know whether the machine is overloaded?

Niagara-family processors are designed so that multiple hardware threads can run in parallel on a single core.
The T2 processor (Niagara 2) has up to 8 cores, each capable of 8 parallel threads, so one T2 can effectively run 64 threads simultaneously. Solaris presents each hardware thread as a CPU, which is why mpstat lists far more CPUs than the chip has cores.
I'd recommend reading these two links. They will save you a lot of grief as you move from administering single-threaded systems to multi-threaded ones:
http://blogs.sun.com/glennf/entry/getting_past_go_with_sparc
http://blogs.sun.com/glennf/tags/cmt
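
One way to relate a load average to this hardware is to divide it by the number of hardware threads the OS schedules on. A minimal Python sketch, assuming the 8 cores with 8 threads each described above; the function name and the sample load of 48.0 are hypothetical:

```python
def load_per_hw_thread(load_avg: float, cores: int, threads_per_core: int) -> float:
    """Return the load average divided by the number of hardware threads."""
    hw_threads = cores * threads_per_core  # each one shows up as a CPU in mpstat
    return load_avg / hw_threads


if __name__ == "__main__":
    # Hypothetical load average of 48.0 on a T2 with 8 cores x 8 threads.
    print(load_per_hw_thread(48.0, cores=8, threads_per_core=8))  # 0.75
```

A result below 1.0 suggests there are fewer runnable threads than hardware threads. Hardware threads on one core share execution resources, so treat this as a rough indicator rather than a measure of spare capacity.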

Similar Messages

  • Load average Calculation

    Hi,
    How is the load average on a Solaris system calculated? What load average threshold could cause a server to panic?
    Regards,
    Siva

    We had a large system with over 100 CPUs running Solaris 10, and the highest load point average (LPA) that I saw was over 1000. The system was slow but did not panic.
    I believe that the LPA divided by the number of CPUs gives the number of runnable jobs per CPU. If the LPA is larger than the number of CPUs, the system is time slicing between the available jobs, and each job gets less than a full slice.
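
The time-slicing point can be put into numbers. This is a rough sketch that assumes every runnable job gets an equal share and ignores priorities; the function name is hypothetical, and the figures of 1000 and 100 are rounded from the "over 1000" and "over 100" quoted above:

```python
def cpu_share_per_job(load_avg: float, n_cpus: int) -> float:
    """Approximate fraction of one CPU that each runnable job receives."""
    if load_avg <= n_cpus:
        return 1.0  # enough CPUs for every runnable job
    return n_cpus / load_avg  # jobs outnumber CPUs, so they share them


if __name__ == "__main__":
    print(cpu_share_per_job(1000.0, 100))  # 0.1
```

Under these assumptions each job on that system would have received about a tenth of a CPU, which fits the description of a slow system that kept running.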

  • ST06: Load Average

    Hi
    I've consulted two different Basis consultants about this question and got two different answers, so I'm trying this forum to find out which one is right:
    I have a Web AS server with 12 CPUs (running Business Warehouse) where the load average is between 4 and 10. In the detailed analysis I can see that most of the CPUs are idle even with a load average of, say, 8. My understanding is that the load average is the number of work processes waiting for a CPU, averaged over a certain period (1 min, 5 min, etc.). I have also heard that, as a rule of thumb, the load average should not be higher than 2.
    Answer from Basis-Consultant1:
    The load average should be judged against the number of CPUs, so in ST06 a load average of up to 24 (12 CPUs * 2 work processes) is acceptable. That means 10 is not much, and is probably due to a lack of parallelism in the processes.
    Answer from Basis-Consultant2:
    According to SAP's rule of thumb, an average load of around 1 percent is OK, while 3 percent could indicate a serious bottleneck. But there are more things to consider (CPU utilization per hour, memory consumption, etc.), so the whole picture has to be evaluated.
    Can anybody help me here? Which one is right?
    Thanks in advance.
    Best regards,
    Keld Pilegaard

    Hi,
    Check out the PDF "Best Practices for Performance Tuning SAP R3 and Oracle, Part I" on SDN. It will give you a clear idea about load average:
    https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/5d0db4c9-0e01-0010-b68f-9b1408d5f234
    Kind Regards
    Umesh K

  • High cpu load average

    Hi Experts,
    I have SOA deployed on AS 10.1.3.2, which is integrated with BI EE 10.1.3.2 on OHEL 4.
    With this setup, I am seeing a very high CPU load average. When I stop the SOA OC4J, the load average returns to a normal level of under 1. With the SOA process started, it goes as high as 15, which is pretty abnormal.
    Any pointers on how to debug the issue would be helpful.
    Thanks,
    Rishi

  • Question about  Load Average in the AWR report

    Hi,
    I have some 11.2 RAC databases on AIX.
    I was analyzing the root causes of an eviction.
    Looking at the AWR report from before the reboot, I see:
    DB1
    Host CPU (CPUs:    6 Cores:    3 Sockets: )
    ~~~~~~~~         Load Average
                   Begin       End     %User   %System      %WIO     %Idle
                    4.18     12.33     60.9      12.6       1.6      26.5
    Instance CPU
    ~~~~~~~~~~~~
                  % of total CPU for Instance:      27.4
                  % of busy  CPU for Instance:      37.3
    %DB time waiting for CPU - Resource Mgr:      10.6
    DB2
    Host CPU (CPUs:    6 Cores:    3 Sockets: )
    ~~~~~~~~         Load Average
                   Begin       End     %User   %System      %WIO     %Idle
                    3.77    13.93     60.7      12.5       1.6      26.7
    Instance CPU
    ~~~~~~~~~~~~
                  % of total CPU for Instance:       6.9
                  % of busy  CPU for Instance:       9.5
      %DB time waiting for CPU - Resource Mgr:       0.0
    Do you think these values are high?
    This is the vmstat output at the time of the reboot:
    DATA                 RUN  BCK        AVM      FRE  PRE  PPI  PPO  PFR  PSR  PCY    FIN      FSY     FCS  CUS  CSY  CID  CWA
    07/21/2013 00:08:17   31    0  7.400.345  579.923    0   81    0    0    0    0  3.292  187.010  19.560   84   16    0    0
    07/21/2013 00:08:17   17    1  7.390.187  589.884    0  176    0    0    0    0  3.681  169.994  21.482   81   19    0    0
    07/21/2013 00:08:17   27    1  7.402.121  577.816    0  115    0    0    0    0  3.150  157.210  18.503   84   16    0    0
    07/21/2013 00:08:48   19    1  7.422.966  564.179    0  211    0    0    0    0  2.396  152.667  19.368   84   16    0    0
    07/21/2013 00:08:48   19    1  7.427.693  559.268    0  162    0    0    0    0  2.990  154.733  19.843   85   15    0    0
    07/21/2013 00:08:48   23    1  7.441.204  545.530    0  204    0    0    0    0  2.137  171.501  18.151   84   16    0    0
    This is mpstat:
    DATA                 CPU    MIN  MAJ  MPC   INT    CS   ICS  RQ  MIG  LPA   SYSC  US  SY  WT  ID    PC
    07/21/2013 00:08:48    0  12896   44    0  1279  3030  1362   2  367  100  27313  86  14   0   0  0.49
    07/21/2013 00:08:48    1  11055   93    0  1123  3137  1315   1  222  100  31860  85  15   0   0  0.51
    07/21/2013 00:08:48    2   5938   51    0  1465  3840  1294   2  532  100  29992  85  15   0   0  0.49
    07/21/2013 00:08:48    3   6266   57    0  1247  3177  1046   2  511  100  22793  85  15   0   0  0.51
    07/21/2013 00:08:48    4   2661   18    0  1729  4087  1707   4  264  100  24647  85  15   0   0  0.49
    07/21/2013 00:08:48    5   4211   10    0  1395  2709  1101   2  209  100  21019  86  14   0   0  0.51
    07/21/2013 00:08:49    0   9372   27    0  1150  2583  1219   0  245  100  47745  82  18   0   0  0.47
    07/21/2013 00:08:49    1  11327   13    0   726  1803   794   1  130  100  25239  87  13   0   0  0.52
    07/21/2013 00:08:49    2   8970  118    0  1459  4396  1517   0  602  100  24833  81  19   0   0  0.49
    07/21/2013 00:08:49    3   7328  267    0  1329  4136  1273   2  586  100  25385  81  19   0   0  0.51
    07/21/2013 00:08:49    4   8793   19    0  1133  2583  1036   1  235  100  24327  86  14   0   0  0.50
    07/21/2013 00:08:49    5   8239   12    0  1309  2846  1165   1  277  100  18513  86  14   0   0  0.50
    Thank you
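
The two "Instance CPU" percentages in the AWR excerpts above appear to be related through the host idle figure: the instance's share of busy CPU is its share of total CPU divided by the non-idle fraction. A small Python sketch of that arithmetic, assuming this reading of the report is right; the function name is hypothetical:

```python
def busy_cpu_share(pct_total_for_instance: float, pct_idle: float) -> float:
    """Instance CPU as a percentage of the host's busy (non-idle) CPU."""
    pct_busy = 100.0 - pct_idle
    return 100.0 * pct_total_for_instance / pct_busy


if __name__ == "__main__":
    print(round(busy_cpu_share(27.4, 26.5), 1))  # DB1 figures from the report
    print(round(busy_cpu_share(6.9, 26.7), 1))   # DB2 figures from the report
```

With the posted inputs this gives about 37.3 for DB1 and 9.4 for DB2, against the reported 37.3 and 9.5. The small gap for DB2 is plausibly due to the inputs already being rounded.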

    Thank you Jonathan,
    I'm looking at ASH for the 15 minutes before the crash.
    I see 13% buffer busy waits and 13% resmgr:cpu quantum:
                                                                   Avg Active
    Event                               Event Class        % Event   Sessions
    CPU + Wait for CPU                  CPU                  59.09       0.15
    buffer busy waits                   Concurrency          13.64       0.04
    resmgr:cpu quantum                  Scheduler            13.64       0.04
    The buffer busy waits were caused by an update of a table.
    There are ETL jobs that run every night.
    Looking at the I/O stats, I notice a change in the use of the swap:
    before the crash:
    hdisk66        xfer:  %tm_act      bps      tps      bread      bwrtn   
                             1.0      8.2K     2.0        8.2K       0.0
                   read:      rps  avgserv  minserv  maxserv   timeouts      fails
                             2.0      6.7      3.8      9.6           0          0
                  write:      wps  avgserv  minserv  maxserv   timeouts      fails
                             0.0      0.0      0.0      0.0           0          0
                  queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                             0.0      0.0      0.0      0.0        0.0         0.0
    near the crash:
    hdisk66        xfer:  %tm_act      bps      tps      bread      bwrtn   
                            71.0    241.7K    59.0      241.7K       0.0
                   read:      rps  avgserv  minserv  maxserv   timeouts      fails
                            59.0     12.1      0.2    183.5           0          0
                  write:      wps  avgserv  minserv  maxserv   timeouts      fails
                             0.0      0.0      0.0      0.0           0          0
                  queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                             0.0      0.0      0.0      0.0        0.0         0.0

  • Problem in calculating the Average Daily Requirement

    Hello all,
    I don't understand how the system calculates the average daily requirement in the Dynamic Safety Stock process. The SAP notes give the following process flow for how the system calculates the average daily requirement:
    1. The system uses the defined parameters to determine the number of days used for calculating the average daily requirements. If the period is defined as a week, the period length as standard days (5 days) and the number of periods as 2, the system divides the total of the requirements by 10 days.
    2. The system then calculates the total of the requirements for this period.
    The system takes into account all requirements in the current period, even requirements that lie in the past but are still in the current period. For example, if the planning run is carried out in the middle of the month, then those requirements that were planned at the beginning of the month are also included in the calculation of the average daily requirements.
    3. The average daily requirement is calculated using the formula:
    Requirements in the specified number of periods / Number of days within the total period length
    I have run MRP on 02/23/2009 and the following results are generated in stock requirement list of the component part:
    Date         Dependent Requirement    MMSA Schedule Lines Quantity
    3/3/2009     10                       31
    3/11/2009    20                       20
    3/31/2009    30                       30
    4/14/2009    40                       49
    4/22/2009    50                       50
    4/29/2009    60                       60
    5/11/2009    70                       55
    5/21/2009    80                       80
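
The formula in step 3 can be sketched in a few lines of Python. This uses the example from step 1 (2 weekly periods of 5 standard days each, so 10 days); the requirement quantities and the function name are hypothetical and are not taken from the stock requirement list above:

```python
def average_daily_requirement(requirements, periods, days_per_period):
    """Total requirements in the periods divided by the total number of days."""
    total_days = periods * days_per_period
    return sum(requirements) / total_days


if __name__ == "__main__":
    # Hypothetical requirements of 10 and 20 falling within two 5-day weeks.
    print(average_daily_requirement([10, 20], periods=2, days_per_period=5))  # 3.0
```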

    Hi,
    In addition to my previous reply,
    If you did following setting -
    Range of coverage in first period -
    min - blank
    tgt - 7
    max - blank
    number of periods - blank
    The system will calculate the safety stock for 7 days for each period; i.e., 7*3=21 and it will generate plnd orders as
    week1 = 51
    week2 = 14+21 = 35
    week3 = 10+21 = 31
    week4 = 30+21 = 51
    If you want to restrict your calculation till 2 periods then make following settings -
    Range of coverage in first period -
    min - blank
    tgt - 7
    max - blank
    number of periods - 2
    Range of coverage in second period -
    Make all blank
    Range of coverage in the rest of the horizon -
    min - blank
    tgt - 3
    max - blank
    It means for first two weeks the safety stock will be 21 (equivalent to 7 days) and for rest of the horizon it will be 3*3 = 9 (equivalent to 3 days)
    The Plnd orders will be -
    week 1 = 51
    week 2 = 35
    week 3 = 14+9 = 23
    week 4 = 30+9 = 39
    and so on.
    Regards,
    Amol

  • Why is my load average always above 1 regardless of cpu usage?

    My load average is often ridiculous. For example, when I wake from sleep it's usually 45 or so. Even when I'm doing nothing with the machine (CPU is about 10% in each core) I still see load averages of 1.2 to 1.6 or so. Why would this happen?
    Is there a way to figure out what is causing the load?

    Activity Monitor can show what is happening on your MacBook.
    Resetting the SMC could solve the problem:
    Intel-based Macs: Resetting the System Management Controller (SMC)

  • MacBook Pro Retina 2013 load average constantly above 1

    I have a recently purchased (3 month old) MacBook Pro Retina - Late 2013.
    I've noticed that the load averages appear to be rather consistently high.
    After a fresh reboot, with nothing other than background applications and the Dashboard running, the load average stays above 1.0.
    The CPU itself is nearly 100% idle almost all of the time, which certainly doesn't correspond to 1 unit of load on this system.
    I suspect something is amiss but have been unable to figure anything out.
    I have tried the instructions for resetting the SMC, and this does not appear to have fixed anything.
    The in-built diagnostics suggest nothing is wrong.
    Any thoughts?

    Pre-Mavericks
    Open Activity Monitor in the Utilities folder.  Select All Processes from the Processes dropdown menu.  Click twice on the CPU% column header to display in descending order.  If you find a process using a large amount of CPU time (>= 70%), then select the process and click on the Quit icon in the toolbar.  Click on the Force Quit button to kill the process.  See if that helps.  Be sure to note the name of the runaway process so you can track down the cause of the problem.
    Mavericks and later
    Open Activity Monitor in the Utilities folder.  Select All Processes from the View menu.  Click on the CPU tab in the toolbar.  Click twice on the CPU% column header to display in descending order.  If you find a process using a large amount of CPU time (>= 70%), then select the process and click on the Quit icon in the toolbar.  Click on the Force Quit button to kill the process.  See if that helps.  Be sure to note the name of the runaway process so you can track down the cause of the problem.

  • One of 4 node RAC always have higher load averages and higher than others

    Hello,
    We have a 4-node RAC, 9208 on Linux 4. When viewing top, we noticed that the same node always has a higher load average than the other 3 nodes. Is this normal? Load balancing is working fine, but this one node always has a higher load average. This is the node where we did the RAC installation. Thank you.

    I do not remember the default for clb_goal (connection load balancing goal) in 9i, but in 10g it is LONG.
    Check it:
    select clb_goal from dba_services where name = '<service name>';
    You may have to change it from LONG to SHORT or from SHORT to LONG, depending on your connection types:
    dbms_service.MODIFY_SERVICE('<service>', clb_goal => dbms_service.CLB_GOAL_LONG);
    Read the following article.
    http://www.databasejournal.com/features/oracle/article.php/3659411/Oracle-RAC-Administration---Part-15-Connection-Load-Balancing-and-FAN.htm

  • Load average on services.

    Hi,
    i have 2 node with ASM file system,
    Node 1 -> i have 7 services
    Node 2 -> i have 6 services,
    how to find out load average on each service on each node?
    Thanks

    SQL> desc V_$SERVICE_STATS
    Name                 Null?    Type
    SERVICE_NAME_HASH             NUMBER
    SERVICE_NAME                  VARCHAR2(64)
    STAT_ID                       NUMBER
    STAT_NAME                     VARCHAR2(64)
    VALUE                         NUMBER

  • System load average over 1.5 while cpu idle

    Running 10.9.4.  My load averages are continuously around 1.5 or higher while my cpu is around 95% idle, no apps running, I've just rebooted and logged in.  This is very common now for my machine.  Any suggestions?  Thanks

    On Unix systems that are idle I generally see near-zero load averages.  I would expect OS X to be in that general vicinity and seem to recall seeing near-zero numbers in the past when the system is idle.

  • Auto check calculating the Average

    I do not know how to approach an issue with my form, so any help or guidance will be appreciated very much!
    For a Performance Form I have 5 sections for the managers to fill in.
    Every section includes an Assessment drop down list with 5 items to select from.
    Items for DDList (Named:Assessment):
    Outstanding
    Exceeds Expectations
    Meets Expectations
    Needs Improvement
    Does Not Meet Expectations
    Finally, at the end of the form I have a section named
    OVERALL SUMMARY OF PERFORMANCE with 5 check boxes named:
    Outstanding
    Exceeds Expectations
    Meets Expectations
    Needs Improvement
    Does Not Meet Expectations
    Is it possible, with a script (calculating the average?), to automatically check one of the check boxes in the overall summary of performance?
    THANK YOU
       

    Hi Niall, I took your advice from the last sample you kindly sent me. I am close, but I still have a problem at the end of the line!
    Here is what I have so far.
    On the change event for Area1, the script below:
    switch (xfa.event.newText) {
        case "Outstanding":
            NumericField1.rawValue = "5";
            break;
        case "Exceeds expectations":
            NumericField2.rawValue = "4";
            break;
        case "Meets expectations":
            NumericField3.rawValue = "3";
            break;
        case "Needs improvement":
            NumericField4.rawValue = "2";
            break;
        case "Does not meet expectations":
            NumericField5.rawValue = "1";
            break;
    }
    For NumericField1 (score for Outstanding), on the calculate event, the script:
    var vScore = 0;
    for (var i = 0; i < 5; i++) {
        if (xfa.resolveNode("optionA[" + i + "]").rawValue == "5")
            vScore += xfa.resolveNode("optionA[" + i + "]").rawValue;
    }
    NumericField1.rawValue = vScore;
    What I get for NumericField1, for example when I select Outstanding in all the drop-down lists, is 55555 rather than 25, which is the desired result.
    How can I make it work? Where is my mistake?
    Thanks Niall
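
A result of 55555 instead of 25 is consistent with the scores being handled as strings, so that adding them joins the digits rather than summing them. The Python sketch below illustrates that difference and one possible way to map an average to a check box. It is not the Adobe form API: the names are hypothetical and the round-half-up rule is an assumption.

```python
# Rating labels taken from the drop-down list in the question.
RATINGS = {
    5: "Outstanding",
    4: "Exceeds Expectations",
    3: "Meets Expectations",
    2: "Needs Improvement",
    1: "Does Not Meet Expectations",
}


def overall_rating(raw_scores):
    """Convert string scores to numbers, average them, and pick a rating."""
    scores = [int(s) for s in raw_scores]  # convert before adding
    average = sum(scores) / len(scores)
    return RATINGS[int(average + 0.5)]  # assumed rule: round half up


if __name__ == "__main__":
    raw = ["5"] * 5
    print("".join(raw))               # strings concatenate: 55555
    print(sum(int(s) for s in raw))   # numbers add: 25
    print(overall_rating(raw))        # Outstanding
```

The same idea would apply in the form script: convert each field value to a number before adding it to the running total.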

  • Load averages over 1.00 after upgrading to Mavericks

    Hi,
    I have recently upgraded to OS X Mavericks on my old Macbook Pro (Late-2007, 4GB RAM) and I noticed that the load averages are always above 1.00, even when CPU is almost 100% idle. I know load averages don't reflect only CPU but after a reboot I see the same. Could this be because it's an old Macbook and it takes more resources from it to run the new OS X 10.9? This is so strange.
    Thanks for reading!
    L.

    I'm having the same "issue" since I first upgraded to Mavericks from Lion. At first I thought it had something to do with the way processes are scheduled/handled in Mavericks but now I realize that I DO get load averages below 1.0 (~0.7-0.8, never below this), but only when completely idle.
    I have an early 2011 MacBook with a 2.7 GHz i7 and 16 GB RAM, so I find this kind of weird.
    Are you using any particular applications/extensions?
    (See also http://apple.stackexchange.com/questions/106828/avg-load-goes-up-after-upgrading-to-mavericks, although there is no solution there.)

  • ODSEE 11g - DPS Directory proxy server suddenly increase load average

    Hi all
    We recently upgraded from Directory Server 5.2 to ODSEE 11g, with one directory proxy configured in front of one master directory server and one consumer directory server.
    All three instances are on the same SPARC T3 machine.
    We get alerts that the server load average on the directory proxy machine is above 6.00; normally it is 0.66. I'm not sure what is causing the sudden burst in load. The traffic is normal and there are no abnormal requests coming to the server. Proxy performance degrades over a span of 24 hours, and once I restart the proxy service (dpsadm restart) the load averages return to normal and the directory proxy runs normally for the next two to three weeks. Then the same cycle repeats.
    I increased the JVM heap size from 1 GB to 2 GB but still have the problem. Has anyone else experienced a similar problem? How did you fix it?
    Any input or advice in the right direction is much appreciated.
    Thank you.

    By server load I'm referring to the load average shown by the prstat command (i.e., the CPU usage), which suddenly shoots up from 0.66 to 6.00. The alert comes from our server monitoring tool and is not related to the directory proxy.
    Clients report connection timeouts (etime goes from etime=0 to 2, 4, ...). Over 24 hours I can see the etime increase, and eventually the proxy server hangs and becomes unresponsive. Once I restart it, performance is back to normal for at least another two weeks.
    I suspect there might be a memory leak or a JVM garbage collection issue. Any expert input on how to figure this out will help.
    Here are the JVM args of the proxy server: "-Xms2g -Xmx2g -Xmn1g -XX:SurvivorRatio=4 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
    Here is the jstat output during the problem:
    ./jstat -gcutil -t 25365 2s 30
    Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
    982106.4 0.00 26.17 4.26 92.25 59.52 523 60.979 689 1002.587 1063.566
    982108.4 0.00 26.17 4.40 92.25 59.52 523 60.979 689 1002.587 1063.566
    982110.4 0.00 26.17 4.80 92.25 59.52 523 60.979 689 1002.587 1063.566
    982112.4 0.00 26.17 5.10 92.25 59.52 523 60.979 690 1002.719 1063.698
    982114.4 0.00 26.17 5.15 92.25 59.52 523 60.979 690 1002.719 1063.698
    982116.4 0.00 26.17 5.32 92.25 59.52 523 60.979 691 1003.009 1063.988
    982118.4 0.00 26.17 5.72 92.25 59.52 523 60.979 691 1003.009 1063.988
    982120.4 0.00 26.17 5.80 92.25 59.52 523 60.979 691 1003.009 1063.988
    982122.4 0.00 26.17 5.93 92.25 59.52 523 60.979 692 1003.168 1064.146
    982124.4 0.00 26.17 6.03 92.25 59.52 523 60.979 692 1003.168 1064.146
    982126.4 0.00 26.17 6.15 92.25 59.52 523 60.979 693 1003.481 1064.460
    982128.5 0.00 26.17 6.18 92.25 59.52 523 60.979 693 1003.481 1064.460
    982130.5 0.00 26.17 6.25 92.25 59.52 523 60.979 693 1003.481 1064.460
    982132.5 0.00 26.17 6.29 92.25 59.52 523 60.979 694 1003.656 1064.635
    982134.5 0.00 26.17 6.31 92.25 59.52 523 60.979 694 1003.656 1064.635
    982136.5 0.00 26.17 6.36 92.25 59.52 523 60.979 695 1003.988 1064.967
    982138.5 0.00 26.17 6.89 92.25 59.52 523 60.979 695 1003.988 1064.967
    982140.5 0.00 26.17 6.99 92.25 59.52 523 60.979 695 1003.988 1064.967
    982142.5 0.00 26.17 7.08 92.25 59.52 523 60.979 696 1004.187 1065.165
    982144.5 0.00 26.17 7.31 92.25 59.52 523 60.979 696 1004.187 1065.165
    982146.5 0.00 26.17 7.82 92.25 59.52 523 60.979 697 1004.553 1065.531
    982148.5 0.00 26.17 7.92 92.25 59.52 523 60.979 697 1004.553 1065.531
    982150.5 0.00 26.17 8.01 92.25 59.52 523 60.979 697 1004.553 1065.531
    982152.5 0.00 26.17 8.17 92.25 59.52 523 60.979 698 1004.786 1065.764
    982154.5 0.00 26.17 8.26 92.25 59.52 523 60.979 698 1004.786 1065.764
    982156.5 0.00 26.17 8.38 92.25 59.52 523 60.979 699 1005.174 1066.153
    982158.5 0.00 26.17 8.74 92.25 59.52 523 60.979 699 1005.174 1066.153
    982160.5 0.00 26.17 8.88 92.25 59.52 523 60.979 699 1005.174 1066.153
    982162.5 0.00 26.17 8.96 92.25 59.52 523 60.979 700 1005.433 1066.412
    982164.5 0.00 26.17 9.09 92.25 59.52 523 60.979 700 1005.433 1066.412
    jstat after the restart
    ./jstat -gcutil -t 10084 2s 30
    Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
    40312.6 0.00 25.13 88.49 1.98 63.68 21 2.366 0 0.000 2.366
    40314.6 0.00 25.13 88.58 1.98 63.68 21 2.366 0 0.000 2.366
    40316.6 0.00 25.13 88.71 1.98 63.68 21 2.366 0 0.000 2.366
    40318.6 0.00 25.13 88.99 1.98 63.68 21 2.366 0 0.000 2.366
    40320.6 0.00 25.13 89.31 1.98 63.68 21 2.366 0 0.000 2.366
    40322.6 0.00 25.13 89.36 1.98 63.68 21 2.366 0 0.000 2.366
    40324.6 0.00 25.13 89.42 1.98 63.68 21 2.366 0 0.000 2.366
    40326.6 0.00 25.13 89.53 1.98 63.68 21 2.366 0 0.000 2.366
    40328.6 0.00 25.13 89.60 1.98 63.68 21 2.366 0 0.000 2.366
    40330.6 0.00 25.13 89.72 1.98 63.68 21 2.366 0 0.000 2.366
    40332.6 0.00 25.13 90.11 1.98 63.68 21 2.366 0 0.000 2.366
    40334.6 0.00 25.13 90.56 1.98 63.68 21 2.366 0 0.000 2.366
    40336.6 0.00 25.13 90.67 1.98 63.68 21 2.366 0 0.000 2.366
    40338.6 0.00 25.13 90.75 1.98 63.68 21 2.366 0 0.000 2.366
    40340.6 0.00 25.13 91.09 1.98 63.68 21 2.366 0 0.000 2.366
    40342.6 0.00 25.13 91.36 1.98 63.68 21 2.366 0 0.000 2.366
    40344.6 0.00 25.13 91.47 1.98 63.68 21 2.366 0 0.000 2.366
    40346.6 0.00 25.13 91.53 1.98 63.68 21 2.366 0 0.000 2.366
    40348.7 0.00 25.13 91.64 1.98 63.68 21 2.366 0 0.000 2.366
    40350.7 0.00 25.13 91.77 1.98 63.68 21 2.366 0 0.000 2.366
    40352.7 0.00 25.13 91.87 1.98 63.68 21 2.366 0 0.000 2.366
    40354.7 0.00 25.13 91.95 1.98 63.68 21 2.366 0 0.000 2.366
    40356.7 0.00 25.13 92.11 1.98 63.68 21 2.366 0 0.000 2.366
    40358.7 0.00 25.13 92.19 1.98 63.68 21 2.366 0 0.000 2.366
    40360.7 0.00 25.13 92.24 1.98 63.68 21 2.366 0 0.000 2.366
    40362.7 0.00 25.13 92.85 1.98 63.68 21 2.366 0 0.000 2.366
    40364.7 0.00 25.13 93.19 1.98 63.68 21 2.366 0 0.000 2.366
    40366.7 0.00 25.13 93.40 1.98 63.68 21 2.366 0 0.000 2.366
    40368.7 0.00 25.13 93.44 1.98 63.68 21 2.366 0 0.000 2.366
    40370.7 0.00 25.13 93.47 1.98 63.68 21 2.366 0 0.000 2.366
    Has anyone else seen similar behavior? Any pointer in the right direction is highly appreciated.
    Thanks.
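
One way to quantify the two jstat samples above is to compare the first and last rows of each: how many full collections (FGC) happened in the window, and what fraction of wall-clock time went into them (FGCT). A minimal Python sketch, assuming the column layout shown in the posted output; the function name is hypothetical and the sample holds only the first and last rows of the "during the problem" output:

```python
def full_gc_overhead(jstat_lines):
    """Return (number of full GCs, fraction of wall time spent in full GC)."""
    header = jstat_lines[0].split()
    rows = [dict(zip(header, (float(v) for v in line.split())))
            for line in jstat_lines[1:]]
    first, last = rows[0], rows[-1]
    elapsed = last["Timestamp"] - first["Timestamp"]  # seconds of wall time
    full_gcs = int(last["FGC"] - first["FGC"])
    fraction = (last["FGCT"] - first["FGCT"]) / elapsed
    return full_gcs, fraction


# First and last rows of the jstat output taken during the problem.
SAMPLE = [
    "Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT",
    "982106.4 0.00 26.17 4.26 92.25 59.52 523 60.979 689 1002.587 1063.566",
    "982164.5 0.00 26.17 9.09 92.25 59.52 523 60.979 700 1005.433 1066.412",
]

if __name__ == "__main__":
    count, fraction = full_gc_overhead(SAMPLE)
    print(f"{count} full GCs, {fraction:.1%} of wall time")
```

For the problem sample this reports 11 full collections in about 58 seconds while the old generation (O) stays at 92.25%, whereas the sample after the restart shows no full collections at all. That pattern fits the suspicion of a leak or GC problem, though it does not prove it.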

  • Very high "load average" in top

    Hi,
    our OES11SP1 two-server-cluster (fully patched) shows a very high "load
    average" (>50, up to 110) in top in some circumstances. There are no
    problems in normal operation, but administrator actions like shutdown or
    cluster migrate might trigger the problem.
    For example when I enter 'halt', then there is the following line in
    /var/log/messages:
    Sep 12 20:27:18 srv1 shutdown[14675]: shutting down for system halt
    more than 20 minutes later:
    Sep 12 20:51:19 srv1 init: Switching to runlevel: 0
    Within these 20 minutes nothing happens, but "average load" goes up to at
    least 50, with ndsd at top. Access to storage related tools and commands is
    not possible, for example 'nss /pool' hangs without any output.
    This happens on nearly every shutdown, but from time to time it doesn't. The
    same will sometimes be triggered by a cluster migrate.
    This only happens with our OES11SP1 cluster, it does not happen with OES11
    and OES2SP3; the only other difference I'm aware of: Novell CIFS is only
    running on the OES11SP1 cluster.
    Any ideas?
    Thanks,
    Mirko

    Sorry for the delay, it seems it's a bad habit of me to ask questions
    immediately before holidays...
    Yes, these servers have replicas, all of them... Cache size is set to 195328
    KB, which is about twice the DIB size. IIRC this was a recommendation I read
    somewhere at Novell. But I'll check that information again.
    Thanks,
    Mirko
    kjhurni wrote:
    >
    > Mirko Guldner;2283539 Wrote:
    >> top shows ndsd on top - but it's there in normal operation too, so I don't
    >> know if this means something.. (?) And it's not always the CPU which is at
    >> 100% - I have an example screenshot with: load average 50.20, 51.61, 41.0
    >> 3.2%us, 1.0%sy, 0.0%ni, 77.0%id 18%wa 0.0%hi 0.3%si 0.0%st. But this is only
    >> an example - this differs.
    >>
    >> Thanks,
    >> Mirko
    >>
    >> kjhurni wrote:
    >>
    >> >
    >> >
    >> > Which process(es) does top show as being the culprit?
    >> >
    >> > In the past (on OES2 SP3) we had issues with CIFS causing ncp to cause
    >> > high utilization, but that was fixed a while ago.
    >> >
    >> > --Kevin
    >> >
    >> >
    >
    > I have seen ncp issues cause high ndsd utilization, but we've not yet
    > upgraded our cluster or DS servers to OES11 yet (waiting for new
    > hardware to go in place first).
    >
    > Out of curiosity, are the servers with high utilization also replica
    > servers? For some reason, during one of our upgrades on a replica
    > server (we have a server that contains all R/W copies of everything),
    > the cache size got set down really low and that caused all sorts of
    > issues.
    >
    > Maybe one of my colleagues will wander by and offer additional insight,
    > as this may be eDir related and/or NCP related. Not sure if triggering
    > a core manually would help (but you'd have to send that to Novell and
    > open an SR to get it read).
    >
    > IF you suspect CIFS, do you have the ability to temporarily shut off
    > CIFS for like a few days to see if that's the culprit?
    >
    >
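
The top line quoted above (load average around 50 while the CPU is 77% idle with 18% I/O wait) is the pattern Linux shows when many tasks sit in uninterruptible sleep (state D), because the Linux load average counts those tasks along with runnable ones. A Python sketch for listing such tasks, assuming the standard `ps -eo stat,pid,comm` output format; the function name and the sample text are hypothetical:

```python
def running_or_blocked(ps_output):
    """Return (state, pid, command) for tasks in state R or D."""
    tasks = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0][0] in ("R", "D"):
            tasks.append((parts[0][0], int(parts[1]), parts[2]))
    return tasks


# Hypothetical sample; on a live system, feed in the output of
# `ps -eo stat,pid,comm` instead.
SAMPLE = """STAT   PID COMMAND
Ss       1 init
D     4321 ndsd
R+    5000 ps
S     6000 sshd
"""

if __name__ == "__main__":
    for state, pid, command in running_or_blocked(SAMPLE):
        print(state, pid, command)
```

If most of the listed tasks are in state D during the 20-minute hang, that would point toward blocked I/O or storage access rather than CPU demand.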


    Hi, I've installed the Inbound Refinery. During the installation, I asked to have the Inbound Refinery as a proxied server in order to have the administrator of my Content Server as administrator of the Inbound Refinery. Both applications are on the