Performance Degradation of new Servers

Hi All,
We are experiencing massive performance degradation in production when the system is under heavy use. It seems to be at its worst around month end. The worst affected transactions are the Cost / Profit centre 'line item' reports using RCOPCA02 (and other similar programs).
We have the message server running on the database instance, plus 2x app servers. The message server and one of the app servers are very similar builds:
4x AMD Opteron 875
16 GB RAM (10 GB pagefile)
Both running Win Server 2003
We're using MS SQL 2000
The other app server is a bit weaker but has been around for some time (a few years) and hasn't caused any issues.
We recently moved the message server from an old (much weaker) server to the new build, and since then we seem to have had performance issues. Initially after the move we had issues with the number of Page Table Entries (they were down at about 6,000-8,000). Using the /3GB /USERVA=2900 switches we have this up to about 49,000.
If anyone has had a similar experience or could offer some assistance it would be much appreciated!!!
Cheers,
Kye

While investigating a different issue we found that 3 servers share the same set of disks on the SAN. Two of these were message servers (R/3 and BW); when both were running, the 'Average Disk Queue Length' at times reached 400! (It should be around 1-3.)
We've moved one of these instances back to DR which has relieved some of the pressure from the disks.
We have also added another index to the GLPCA table (which was causing most of the problems).

Similar Messages

Performance degradation factor 1000 on failover???

          Hi,
          we are gaining first experience with WLS 5.1 EBF 8 clustering on
          NT4 SP 6 workstation.
          We have two servers in the cluster, both on the same machine but with
          different IP addresses (as it has to be)!
          In general it seems to work: we have a test client connecting to
          one of the servers and
          uses a stateless test EJB which does nothing but writing into weblogic.log.
          When this server fails, the other server takes over the client
          requests, BUT VERY VERY VERY SLOWLY!!!
          - I should repeat VERY a thousand times, because a normal client
          request takes about 10-30 ms
          and after failure/failover it takes 10-15 SECONDS!!!
          As naive as I am I want to know: IS THIS NORMAL?
          After the server is back, the performance is also back to normal,
          but we were expecting a much smaller
          performance degradation.
          So I think we are doing something totally wrong!
          Do we need some Network solution to make failover performance better?
          Or is there a chance to look closer at deployment descriptors or
          weblogic.system.executeThreadCount
          or weblogic.system.percentSocketReaders settings?
          Thanks in advance for any help!
          Fleming


See http://www.weblogic.com/docs51/cluster/setup.html#680201
          Basically, the rule of thumb is to set the number of execute threads ON
          THE CLIENT to 2 times the number of servers in the cluster and the
          percent socket readers to 50%. In your case with 8 WLS instances in the
          cluster, add the following to the java command line used to start your
          client:
          -Dweblogic.system.executeThreadCount=16
          -Dweblogic.system.percentSocketReaders=50
          Hope this helps,
          Robert
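Robert's rule of thumb can be written down as a small helper. This is only a sketch: the function name client_cluster_flags is made up for illustration, and the two property names are the ones quoted in the reply above.

```python
def client_cluster_flags(server_count):
    """Build the client-side JVM flags suggested by the rule of thumb above.

    server_count is the number of WLS instances in the cluster.
    """
    if server_count < 1:
        raise ValueError("a cluster needs at least one server")
    # Rule of thumb from the reply: execute threads = 2 x number of servers.
    execute_threads = 2 * server_count
    # Rule of thumb from the reply: percent socket readers set to 50.
    percent_socket_readers = 50
    return [
        f"-Dweblogic.system.executeThreadCount={execute_threads}",
        f"-Dweblogic.system.percentSocketReaders={percent_socket_readers}",
    ]


print(" ".join(client_cluster_flags(8)))
```

For the 8-instance case in the reply this reproduces the two flags shown there; for a two-server cluster the same rule would give an execute thread count of 4.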
          Fleming Frese wrote:
          > Hi Mike,
          >
          > thanks for your reply.
          >
          > We do not have HTTP clients or Servlets, just EJBs and clients
          > in the same LAN,
          > and the failover should be handled by the replica-aware stubs.
          > So we thought we need no Proxy solution for failover. Maybe we
          > need a DNS to serve failover if this
          > increases our performance?
          >
          > The timeout clue sounds reasonable, but I would expect that the
          > stub times out once and then switches
          > to the other server for subsequent requests. There should be a
          > refresh (after 3 minutes?) when the stub
          > gets new information about the servers in the cluster, so it could
          > check then if the server is back.
          > This works perfectly with load balancing: if a new server joins
          > the cluster, it automatically receives
          > requests after a while.
          >
          > Fleming
          >
          > "Mike Reiche" wrote:
          > >
          > >It sounds like every request is first timing out its
          > >connection
          > >attempt (10 seconds, perhaps?) on the 'down' instance
          > >before
          > >trying the second instance. How do requests 'failover'?
          > >Do you
          > >have Netscape, Apache, or IIS with a wlproxy module? Or
          > >do
          > >you simply have a DNS that takes care of that?
          > >
          > >Mike
          > >
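Mike's timeout hypothesis and Fleming's expectation can be compared with a toy model. This is a sketch under stated assumptions, not how the WebLogic replica-aware stub actually works: the class FailoverStub, the server names, the 10-second connect timeout (Mike's guess) and the 20 ms service time (inside the 10-30 ms range reported) are all illustrative.

```python
class FailoverStub:
    """Toy model of a client stub that tries servers in a fixed order."""

    def __init__(self, servers, connect_timeout_ms, service_ms, remember_failures):
        self.servers = list(servers)
        self.connect_timeout_ms = connect_timeout_ms
        self.service_ms = service_ms
        self.remember_failures = remember_failures
        self.known_dead = set()

    def call(self, alive):
        """Return the simulated latency of one request in milliseconds."""
        elapsed = 0
        for server in self.servers:
            if self.remember_failures and server in self.known_dead:
                continue  # skip servers that already timed out once
            if server in alive:
                return elapsed + self.service_ms
            # Connecting to a dead server costs a full connect timeout.
            elapsed += self.connect_timeout_ms
            self.known_dead.add(server)
        raise RuntimeError("no server in the cluster is reachable")


alive = {"server2"}  # server1 has failed
naive = FailoverStub(["server1", "server2"], 10_000, 20, remember_failures=False)
sticky = FailoverStub(["server1", "server2"], 10_000, 20, remember_failures=True)

naive_latencies = [naive.call(alive) for _ in range(3)]
sticky_latencies = [sticky.call(alive) for _ in range(3)]
print(naive_latencies)
print(sticky_latencies)
```

In this model the stub that retries the dead server every time pays the timeout on every request, which matches the 10-15 second symptom, while the stub that remembers the failure pays it only once.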

Performance degradation: unfetched field [PublishingPageContent] caused extra roundtrip

Hi All,
   I am facing serious application pool crashes on one of my customer's production SharePoint servers. The Application error log in Event Viewer says:
Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7afa2
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7c8f9
Exception code: 0xc0000374
Fault offset: 0x00000000000c40f2
Faulting process id: 0x1414
Faulting application start time: 0x01ce5edada76109d
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll
Report Id: 5a69ec1e-cace-11e2-9be2-441ea13bf8be
At the same time the SharePoint ULS logs say:
1)
06/13/2013 03:44:29.53    w3wp.exe (0x0808)    0x2DF0    SharePoint Foundation    General    8e2s    Medium    Unknown SPRequest error occurred. More information: 0x80070005    8b343224-4aa6-490c-8a2a-ce06ac160773
06/13/2013 03:44:35.03    w3wp.exe (0x0808)    0x2DF0    SharePoint Foundation    General    8e25    Medium    Failed to look up string with key "FSAdmin_SiteSettings_UserContextManagement_ToolTip", keyfile Microsoft.Office.Server.Search.    8b343224-4aa6-490c-8a2a-ce06ac160773
06/13/2013 03:44:35.03    w3wp.exe (0x0808)    0x2DF0    SharePoint Foundation    General    8l3c    Medium    Localized resource for token 'FSAdmin_SiteSettings_UserContextManagement_ToolTip' could not be found for file with path: "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\Template\Features\SearchExtensions\ExtendedSearchAdminLinks.xml".    8b343224-4aa6-490c-8a2a-ce06ac160773
2)
06/13/2013 03:44:29.01    w3wp.exe (0x0808)    0x2DF0    SharePoint Foundation    Web Parts    emt4    High    Error initializing Safe control - Assembly:Microsoft.Office.SharePoint.ClientExtensions, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c TypeName: Microsoft.Office.SharePoint.ClientExtensions.Publishing.TakeListOfflineRibbonControl Error: Could not load type 'Microsoft.Office.SharePoint.ClientExtensions.Publishing.TakeListOfflineRibbonControl' from assembly 'Microsoft.Office.SharePoint.ClientExtensions, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c'.    8b343224-4aa6-490c-8a2a-ce06ac160773
06/13/2013 03:44:29.50    w3wp.exe (0x0808)    0x2DF0    SharePoint Foundation    Logging Correlation Data    xmnv    Medium    Site=/    8b343224-4aa6-490c-8a2a-ce06ac160773
3)
06/13/2013 03:43:59.67    w3wp.exe (0x263C)    0x24D8    SharePoint Foundation    Performance    9fx9    Medium    Performance degradation: unfetched field [PublishingPageContent] caused extra roundtrip.    at Microsoft.SharePoint.SPListItem.GetValue(SPField fld, Int32 columnNumber, Boolean bRaw, Boolean bThrowException)    at Microsoft.SharePoint.SPListItem.GetValue(String strName, Boolean bThrowException)    at Microsoft.SharePoint.SPListItem.get_Item(String fieldName)    at Microsoft.SharePoint.WebControls.BaseFieldControl.get_ItemFieldValue()    at Microsoft.SharePoint.Publishing.WebControls.RichHtmlField.RenderFieldForDisplay(HtmlTextWriter output)    at Microsoft.SharePoint.WebControls.BaseFieldControl.Render(HtmlTextWriter output)    at Microsoft.SharePoint.Publishing.WebControls.BaseRichField.Render(HtmlTextWriter output)    at Microsoft.SharePoint.Publishing.WebControls.RichHtmlField.R...    b8d0b8ca-8386-441f-8fce-d79fe72556e1
06/13/2013 03:43:59.67*    w3wp.exe (0x263C)    0x24D8    SharePoint Foundation    Performance    9fx9    Medium    ...ender(HtmlTextWriter output)    at System.Web.UI.Control.RenderChildrenInternal(HtmlTextWriter writer, ICollection children)    at System.Web.UI.Control.RenderChildrenInternal(HtmlTextWriter writer, ICollection children)    at System.Web.UI.Control.RenderChildrenInternal(HtmlTextWriter writer, ICollection children)    at System.Web.UI.HtmlControls.HtmlContainerControl.Render(HtmlTextWriter writer)    at System.Web.UI.Control.RenderChildrenInternal(HtmlTextWriter writer, ICollection children)    at System.Web.UI.HtmlControls.HtmlForm.RenderChildren(HtmlTextWriter writer)    at System.Web.UI.HtmlControls.HtmlForm.Render(HtmlTextWriter output)    at System.Web.UI.HtmlControls.HtmlForm.RenderControl(HtmlTextWriter writer)    at System.Web.UI.Control.RenderChildrenInternal(HtmlTextWrit...    b8d0b8ca-8386-441f-8fce-d79fe72556e1
06/13/2013 03:43:59.67*    w3wp.exe (0x263C)    0x24D8    SharePoint Foundation    Performance    9fx9    Medium    ...er writer, ICollection children)    at System.Web.UI.Control.RenderChildrenInternal(HtmlTextWriter writer, ICollection children)    at System.Web.UI.Page.Render(HtmlTextWriter writer)    at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)    at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)    at System.Web.UI.Page.ProcessRequest()    at System.Web.UI.Page.ProcessRequest(HttpContext context)    at Microsoft.SharePoint.Publishing.TemplateRedirectionPage.ProcessRequest(HttpContext context)    at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()    at System.Web.HttpApplication.ExecuteStep(IExecutionSte...    b8d0b8ca-8386-441f-8fce-d79fe72556e1
06/13/2013 03:43:59.67*    w3wp.exe (0x263C)    0x24D8    SharePoint Foundation    Performance    9fx9    Medium    ...p step, Boolean& completedSynchronously)    at System.Web.HttpApplication.PipelineStepManager.ResumeSteps(Exception error)    at System.Web.HttpApplication.BeginProcessRequestNotification(HttpContext context, AsyncCallback cb)    at System.Web.HttpRuntime.ProcessRequestNotificationPrivate(IIS7WorkerRequest wr, HttpContext context)    at System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr managedHttpContext, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)    at System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr managedHttpContext, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)    at System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr managedHttpContext, IntPtr nativeRequestContext, IntPtr module...    b8d0b8ca-8386-441f-8fce-d79fe72556e1
06/13/2013 03:43:59.67*    w3wp.exe (0x263C)    0x24D8    SharePoint Foundation    Performance    9fx9    Medium    ...Data, Int32 flags)    at System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr managedHttpContext, IntPtr nativeRequestContext, IntPtr moduleData, Int32 flags)    b8d0b8ca-8386-441f-8fce-d79fe72556e1
06/13/2013 03:43:59.67    w3wp.exe (0x263C)    0x24D8    SharePoint Foundation    Performance    g4zd    High    Performance degradation: note field [PublishingPageContent] was not in demoted fields.    b8d0b8ca-8386-441f-8fce-d79fe72556e1
Does anybody have any idea what's going on? I need to fix this ASAP as we are supposed to go live in the next few days.
Soumalya
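As the wording of the 9fx9 message suggests, the "extra roundtrip" warning appears when a field is read from an item that was loaded without that field, so the value has to be fetched separately. The sketch below illustrates that general pattern only. It is not SharePoint code: FakeContentList, LazyItem and roundtrips_to_render are invented names, and the three sample pages are made up.

```python
class FakeContentList:
    """Stand-in for a content store that counts round trips to its backend."""

    def __init__(self, rows):
        self.rows = rows
        self.roundtrips = 0

    def query(self, fields):
        # One round trip returns every row, but only the requested fields.
        self.roundtrips += 1
        return [
            LazyItem(self, index, {name: row[name] for name in fields})
            for index, row in enumerate(self.rows)
        ]

    def fetch_field(self, index, field):
        # Each late fetch of a single field costs another round trip.
        self.roundtrips += 1
        return self.rows[index][field]


class LazyItem:
    """Item that fetches any field missing from the original query on demand."""

    def __init__(self, source, index, fetched):
        self.source = source
        self.index = index
        self.fetched = fetched

    def __getitem__(self, field):
        if field not in self.fetched:
            self.fetched[field] = self.source.fetch_field(self.index, field)
        return self.fetched[field]


def roundtrips_to_render(fields):
    """Query with the given fields, then read the page body of every item."""
    rows = [
        {"Title": f"Page {n}", "PublishingPageContent": f"<p>Body {n}</p>"}
        for n in range(3)
    ]
    pages = FakeContentList(rows)
    for item in pages.query(fields):
        _ = item["PublishingPageContent"]
    return pages.roundtrips


print(roundtrips_to_render(["Title"]))
print(roundtrips_to_render(["Title", "PublishingPageContent"]))
```

In this toy model, leaving the body field out of the query costs one extra round trip per item, while requesting it up front keeps the whole render to a single round trip.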

Hello Soumalya,
Do you have an update on your issue? We are actually experiencing a similar issue at a new customer.
- Dennis | Netherlands

Performance Degradation from SSL

I have read articles which are showing that performance could go down to
1/10 with certain servers (Reference
http://isglabs.rainbow.com/isglabs/shperformance/SHPerformance.html) when
using SSL.
I am currently using WebLogic 4.5. Can anybody tell me what kind of
performance degradation I would see if I switch all my transactions from
normal unsecured HTTP transactions to secure ones (SSL v3)?
Any help appreciated.
best regards, Andreas

Andreas,
Internal benchmarks (unofficial) have shown SSL to be 65-80% slower than
typical connections. So, anywhere between 3 and 5 times slower. This is the
same across all HTTP servers.
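The step from "65-80% slower" to "3 to 5 times slower" works out if "X% slower" is read as throughput dropping by X%. That reading is an assumption (the post does not say how the benchmark was measured), and slowdown_factor is just an illustrative helper name.

```python
def slowdown_factor(percent_slower):
    """Convert 'throughput drops by N percent' into 'N times slower'."""
    if not 0 <= percent_slower < 100:
        raise ValueError("percent_slower must be in the range [0, 100)")
    # Fraction of the original throughput that is left.
    remaining_throughput = 1.0 - percent_slower / 100.0
    # The same work takes 1 / remaining_throughput times as long.
    return 1.0 / remaining_throughput


for percent in (65, 80):
    print(f"{percent}% slower -> about {slowdown_factor(percent):.1f}x")
```

Under that reading, 65% gives a factor of roughly 2.9 and 80% gives 5, which is where the "3 to 5 times" range comes from.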
In Denali (the next release), we're adding a performance pack enhancement
that includes native-code implementations of some of our crypto code. This
should show large speedups when it's released in March.
Thanks!
Michael Girdley
Sr. Product Manager
WebLogic Server
BEA Systems
ph. 415.364.4556

Performance degradation with -g compiler option

Hello
Our measurements of a simple program compiled with and without the -g option show a big performance difference.
Machine:
SunOS xxxxx 5.10 Generic_137137-09 sun4u sparc SUNW,Sun-Fire-V250
Compiler:
CC: Sun C++ 5.9 SunOS_sparc Patch 124863-08 2008/10/16
#include "time.h"
#include <iostream>
int main(int argc, char ** argv)
{
   // Repeatedly allocate, partially fill, and free a heap array.
   for (int i = 0 ; i < 60000; i++)
   {
       int *mass = new int[60000];
       for (int j=0; j < 10000; j++) {
           mass[j] = j;
       }
       delete []mass;
   }
   return 0;
}
Compilation and execution with -g:
CC -g -o test_malloc_deb.x test_malloc.c
ptime test_malloc_deb.x
real 10.682
user 10.388
sys 0.023
Without -g:
CC -o test_malloc.x test_malloc.c
ptime test_malloc.x
real 2.446
user 2.378
sys 0.018
As you can see, the -g build is about 4 times slower.
Our product is compiled with the -g option and is stripped with the 'strip' utility before shipment.
This lets us open customer core files using the non-stripped executable.
But our tests show that stripping does not restore the performance of an executable compiled without -g.
So we are losing performance by using this compilation method.
Is this the expected behavior of the compiler?
Is there any way to keep the -g option on and not lose performance?

In your original compile you don't use any optimisation flags, which tells the compiler to do minimal optimisation - you're basically telling the compiler that you are not interested in performance. Adding -g to this requests that you want maximal debug. So the compiler does even less optimisation, in order that the generated code more closely resembles the original source.
If you are interested in debug, then -g with no optimisation flags gives you the most debuggable code.
If you are interested in optimised code with debug, then try -O -g (or some other level of optimisation). The code will still be debuggable - you'll be able to map disassembly to lines of source, but some things may not be accessible.
If you are using C++, then in SS12 -g switches off front-end inlining, so again you'll get some performance hit. Use -g0 instead to get both inlining and debug information.
HTH,
Darryl.

Performance degradation encountered while running BOE in clustered set up

Problem Statement:
We have a clustered BOE setup in production with 2 CMS servers (named boe01 and boe02). The Mantenix application (a standard J2EE application in a clustered setup) points to these BOE services, hosted on virtual machines, to generate reports. As soon as the BOE services on both boe01 and boe02 are up and running, performance degrades (response times vary from 7 s to 30 s).
The same setup works fine when the BOE services on boe02 are turned off, i.e. only boe01 is up and running. No drastic variation is noticed.
BOE details: SAP BusinessObjects XI R2 SP3 running on Windows 2003 servers (virtual machines).
Possible Problem Areas as per our analysis
1) Node 2 Virtual Machine Issue:
Because this VM is part of the production infrastructure, problem assessment testing is not possible.
2) BOE Configuration Issue
A comparison report checking the build of BOE 01 against BOE 02: the support team has confirmed no major installation differences apart from a minor operating system setting difference. The question is whether there is some configuration or setting that we are missing.
3) Possible BOE Cluster Issue:
Tests in staging environment ( with a similar clustered BOE setup ) have proved inconclusive.
We require your help in
- Root cause Analysis for this problem.
- Any troubleshooting action henceforth.
Another observation from our Weblogic support engineers for the above set up which may or may not be related to the problem is mentioned below.
When the services on BOE_2 are shutdown and we try to fetch a particular report from BOE_1 (Which is running), the following WARNING/ERROR comes up:-
07/09/2011 10:22:26 AM EST> <WARN> <com.crystaldecisions.celib.trace.d.if(Unknown Source)> - getUnmanagedService(): svc=BlockingReportSourceRepository,spec=aps<BOE_1> ,cluster:@BOE_OLTP, kind:cacheserver, name:<BOE_2>.cacheserver.cacheserver, queryString:null, m_replaceable:true,uri=osca:iiop://<BOE_1>;SI_SESSIONID=299466JqxiPSPUTef8huXO
com.crystaldecisions.thirdparty.org.omg.CORBA.TRANSIENT: attempt to establish connection failed: java.net.ConnectException: Connection timed out: connect minor code: 0x4f4f0001 completed: No
     at com.crystaldecisions.thirdparty.com.ooc.OCI.IIOP.Connector_impl.connect(Connector_impl.java:150)
     at com.crystaldecisions.thirdparty.com.ooc.OB.GIOPClient.createTransport(GIOPClient.java:233)
     at com.crystaldecisions.thirdparty.com.ooc.OB.GIOPClientWorkersPool.next(GIOPClientWorkersPool.java:122)
     at com.crystaldecisions.thirdparty.com.ooc.OB.GIOPClient.getWorker(GIOPClient.java:105)
     at com.crystaldecisions.thirdparty.com.ooc.OB.GIOPClient.startDowncall(GIOPClient.java:409)
     at com.crystaldecisions.thirdparty.com.ooc.OB.Downcall.preMarshalBase(Downcall.java:181)
     at com.crystaldecisions.thirdparty.com.ooc.OB.Downcall.preMarshal(Downcall.java:298)
     at com.crystaldecisions.thirdparty.com.ooc.OB.DowncallStub.preMarshal(DowncallStub.java:250)
     at com.crystaldecisions.thirdparty.com.ooc.OB.DowncallStub.setupRequest(DowncallStub.java:530)
     at com.crystaldecisions.thirdparty.com.ooc.CORBA.Delegate.request(Delegate.java:556)
     at com.crystaldecisions.thirdparty.org.omg.CORBA.portable.ObjectImpl._request(ObjectImpl.java:118)
     at com.crystaldecisions.enterprise.ocaframework.idl.ImplServ._OSCAFactoryStub.getServices(_OSCAFactoryStub.java:806)
     at com.crystaldecisions.enterprise.ocaframework.ServiceMgr.do(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.ServiceMgr.a(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.ServiceMgr.getUnmanagedService(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.AbstractStubHelper.getService(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.e.do(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.o.try(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.o.a(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.o.a(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.p.a(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.ServiceMgr.getManagedService(Unknown Source)
     at com.crystaldecisions.sdk.occa.managedreports.ps.internal.a$a.getService(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.e.do(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.o.try(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.o.a(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.o.a(Unknown Source)
     at com.crystaldecisions.enterprise.ocaframework.p.a(Unknown Source)
We see the above warning 2 or 3 times before the request is processed, and then we see the report. We have checked our cluster configuration but didn't find anything concrete.
Is this normal behavior of the software, or can we optimize it?
Any assistance that you can provide would be great.

Rahul,
I have exactly the same problem running BO 3.1 SP3 in a 2-machine cluster on AIX, with the exact same full install on both machines. When I take down one of the machines, the performance is much better.
An example of the problem: when I run the command ./ccm.sh -display -username administrator -password xxx on either box while both are up and running, I sometimes receive a timeout error (after more than 15 minutes).
If I run SQL*Plus directly on the boxes against the CMS DB, the response is instant. tnsping of course shows no problems.
When I bring down one of the machines and run ./ccm.sh -display again, it brings back results in less than a minute...
I am baffled by the problem, so I was wondering if you found anything at your end.
Cheers
Chris

Performance degradation. Need advice on starting all over again.

I have never formatted my drive or reinstalled OS X the way I used to with XP during my Windows days. Now there is some performance degradation, and opening applications like Safari, iPhoto and others is really slow. I maintain Time Machine backups of my full Snow Leopard partition. What should I do? Format the HD and reinstall SL, simply restore from TM, or reinstall and then restore...?
I don't really want to carry my Windows attitude over to the Mac, reformatting, reinstalling and then setting up all apps from scratch. I want to make use of my TM backup. Please advise.
Neerav
MacBook 2.4GHz Unibody (Late 2008), 2GB RAM, SL

The hatter wrote:
Those steps, repair permissions? only checks the installed application receipts -- worthless.
Disk Utility doesn't check for bad blocks, and Apple First Aid misses and doesn't fix directory problems that are picked up by 3rd party tools like Disk Warrior.
The hatter's comments do not represent a consensus of opinion about this & are at least partially misleading.
Permissions repairs are indeed limited to comparing receipt info to actual permissions settings, but that is hardly worthless. It is well documented that mis-set permissions will cause a number of problems & resetting them to receipts values is an effective cure for those specific problems. Obviously, that won't cure problems with other causes, but since there is no magic cure-all it would be foolish to expect it to behave like one.
Regarding Disk Utility, it is true that it can't repair certain problems that some 3rd party utilities can; however, it is very effective at identifying file system problems, including those for some file systems the 3rd party apps do not support. It is also the most conservative disk utility available, designed not to attempt any repair that could result in loss of data. This is one reason it isn't as powerful as the 3rd party ones -- it is best to use it first if you suspect you have file system problems & use the more powerful ones only when necessary.
To be fair, Disk Warrior includes a directory optimization function that Disk Utility doesn't. However, an "unoptimized" directory isn't a problem in & of itself, & it is debatable how much real world benefit there is to optimizing the directory, at least with the current OS & modern high performance drives. I used to see noticeable improvements by periodically using Disk Warrior with OS 9 & the drives of that era, but these days my Macs & Snow Leopard seem to do just fine without it.
Basically, it is simple: use the tool that best does what you need to do. There is no benefit from using a sledge hammer when a tack hammer will do; in fact, the sledge hammer may do more harm than good, or just wear you out for no good reason. Also consider the wisdom of the old saying that to a hammer everything looks like a nail. Sometimes, you don't need a tool at all, just the wisdom to know that you don't.
Regarding bad sectors, every drive has them. That is not a concern by itself but the drive suddenly developing new ones is a sure sign of serious problems. Drives keep track of this themselves. Utilities provide a way to query the drives about this & may provide early warning of impending failure, but since the drive is providing the info this is not 100% reliable. For this reason, whether you use one or not, it is extremely important to back up your important data to other devices regularly & often.

Performance degradation using Jolt ASP Connectivity for TUXEDO

We have a customer that uses Jolt ASP Connectivity for TUXEDO and is suffering
from a severe performance degradation over time.
Initial response times are fine (1 s.), but they tend to increase to 3 minutes
after some time (well, eh, a day or so).
Data:
- TUXEDO 7.1
- Jolt 1.2.1
- Relatively recent rolling patch installed (so there are probably no JSH performance
issues or memory leaks of the kind fixed in earlier patches)
The ULOG shows that during the night the JSH instances notice a timeout on behalf
of the client connection and do a forced shutdown of the client:
040911.csu013.cs.kadaster.nl!JSH.234333.1.-2: JOLT_CAT:1185: "INFO: Userid:
[ZZ_Webpol], Clientid: [AP_WEBSRV3] timed out due to inactivity"
040911.csu013.cs.kadaster.nl!JSH.234333.1.-2: JOLT_CAT:1198: "WARN: Forced
shutdown of client; user name 'ZZ_Webpol'; client name 'AP_WEBSRV3'"
This happens every 10 minutes as per configuration of the JSL (-T flag).
The customer "solved" the problem for the time being by increasing the connection
pool size on the IIS web server.
However, they didn't find a "smoking gun" - no definite cause for the problem.
So, it is debatable whether their "solution" suffices.
It is my suspicion the problem might be located in the Jolt ASP classes running
on the IIS.
Maybe the connection pool somehow loses connections over time, causing subsequent
users having to queue before they get served (although an exception should be
raised if no connections are available).
However, there's no documentation on the functioning of the connection pool for
Jolt ASP.
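The suspicion above (a pool that slowly loses connections, so later users queue) can be put into a back-of-the-envelope model. Everything here is hypothetical: the function names, the pool size, the number of connections closed per timeout cycle and the 1-second service time are illustrative values, and nothing is known about how the Jolt ASP pool really behaves.

```python
def usable_connections(pool_size, closed_per_cycle, cycles, reconnect):
    """Connections still usable after several server-side timeout cycles."""
    usable = pool_size
    for _ in range(cycles):
        # The server forcibly closes some idle connections each cycle.
        usable = max(0, usable - closed_per_cycle)
        if reconnect:
            # A pool that notices the shutdown opens replacements.
            usable = pool_size
    return usable


def estimated_wait_s(concurrent_users, usable, service_s):
    """Queueing delay seen by the last user when requests run in waves."""
    if usable == 0:
        return float("inf")
    waves = -(-concurrent_users // usable)  # ceiling division
    return (waves - 1) * service_s


leaky = usable_connections(pool_size=10, closed_per_cycle=3, cycles=3, reconnect=False)
healthy = usable_connections(pool_size=10, closed_per_cycle=3, cycles=3, reconnect=True)
print(leaky, estimated_wait_s(20, leaky, 1.0))
print(healthy, estimated_wait_s(20, healthy, 1.0))
```

If the pool behaved like the leaky case, a larger pool would only postpone the slowdown, which would fit the doubt about whether the customer's workaround is enough.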
My questions:
1) What's the algorithm used for managing connections with Jolt ASP for TUXEDO?
2) If connections are terminated by a JSH, will a new connection be established
from the web server automatically? (this is especially interesting, because the
connection policy can be configured in the JSL CLOPT, but there's no info on how
this should be handled/configured by Jolt ASP connectivity for TUXEDO)
Regards,
Winfried Scheulderman

Hi,
For ASP connectivity I would suggest looking at the .Net client facility provided in Tuxedo 9.1 and later.
Regards,
Todd Little
Oracle Tuxedo Chief Architect

Performance degradation after setting filesystemio_options=setall from none.

Hi All,
We are facing performance degradation after changing filesystemio_options from none to setall on my two servers, listed below.
Red Hat Enterprise Linux AS release 4 (Nahant Update 7) 2.6.9-55.ELhugemem (32-bit)
Red Hat Enterprise Linux Server release 5.2 (Tikanga) 2.6.18-92.1.10.el5 (64-bit)
We are seeing lots of disk I/O. We expected filesystemio_options=setall to improve performance, but it is degrading it, and we are getting complaints about slowness.
Please let me know whether we need to set something else along with this, such as an optimizer parameter (e.g. optimizer_index_cost_adj, optimizer_index_caching).
Please help.

Hi Suraj,
<speculation>
You switched filesystemio_options to setall from none, so the most likely reason for performance degradation after switching to setall is the implementation of direct I/O. Direct I/O will skip the filesystem buffer cache and allow Oracle to read directly from disk to the database buffer cache. However, on a system where direct I/O is not implemented, which is what you had until you recently messed with that parameter, it's likely that you had an undersized database buffer cache, but that was ok, because many (most) of the physical I/Os your database was doing were actually being serviced by the O/S filesystem buffer cache. But you introduced direct I/O, and wiped out the ability of the O/S to service any physical I/Os from the filesystem buffer cache. This means that every cache miss on the database buffer cache turns into a real, physical, spin-the-disk, move-the-drive-head, physical I/O. And you are suffering the performance consequences.
</speculation>
Ok, end of speculation. Now, assuming that what I've outlined above is actually going on, what to do? Why is direct I/O lower performing than buffered, non-direct I/O? Shouldn't its performance be superior?
Well, when you have an established system that's using buffered I/O, and you switch to direct I/O, you almost always will have to increase the size of the database buffer cache. The problem is that you took a huge chunk of memory away from the O/S, that it was using to buffer your I/Os and avoid physical I/O. So, now, you need to make up for it by increasing the size of the database buffer cache. You can do this without buying more memory for the box, because the O/S is no longer going to need to use so much memory for filesystem buffers.
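The effect described above can be reproduced with a toy two-level cache. This is a sketch under loud assumptions: both caches are modelled as plain LRU (neither Oracle nor the Linux kernel works exactly like that), the names LRUCache and physical_reads are invented, and the cache sizes and the cyclic 6-block workload are chosen only to make the effect visible.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache that only tracks which blocks are resident."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        """Return True on a hit; on a miss, load the block and evict the oldest."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        if self.capacity > 0:
            self.blocks[block] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)
        return False


def physical_reads(accesses, db_cache_blocks, os_cache_blocks):
    """Count reads that miss both the database cache and the O/S cache.

    os_cache_blocks=0 models direct I/O (no filesystem cache behind the DB).
    """
    db_cache = LRUCache(db_cache_blocks)
    os_cache = LRUCache(os_cache_blocks)
    reads = 0
    for block in accesses:
        if db_cache.access(block):
            continue
        if not os_cache.access(block):
            reads += 1
    return reads


workload = [i % 6 for i in range(60)]  # cycle over 6 blocks, 10 times

buffered = physical_reads(workload, db_cache_blocks=4, os_cache_blocks=8)
direct_small = physical_reads(workload, db_cache_blocks=4, os_cache_blocks=0)
direct_large = physical_reads(workload, db_cache_blocks=12, os_cache_blocks=0)
print(buffered, direct_small, direct_large)
```

In this model the undersized database cache is rescued by the O/S cache under buffered I/O, suffers a physical read on every access under direct I/O, and recovers once the memory formerly used by the O/S cache is given to the database cache.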
So, what to do? Is it worth switching? Well, on balance, it makes sense to use direct I/O, and give Oracle a larger database buffer cache, for the simple fact that (particularly on a server that's dedicated to being an Oracle database server), Oracle has far more sophisticated caching algorithms, and a better understanding of the various types of data being cached, and so should be able to make more efficient use of the memory, than the (relatively) brain dead caching algorithms of the kernel and filesystem mechanisms.
But, once again, it all comes down to this:
What problem are you trying to solve? Did you have any I/O related issues? Do you have any compelling reason to implement direct I/O? Rule #1 is "if it ain't broke, don't fix it." Did you just violate rule #1? :-)
Finally, since you're on Linux, you can use the 'free' command to see how much memory is on the box, how much is free, and how much is dedicated to filesystem cache buffers. This response is already pretty long, so, I'm not going to get into details, however, if you're not familiar with the command, the results could be misleading. Read the man page, and try to be clear about understanding it before you make any assumptions about the output.
Hope that helps,
-Mark
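As a rough illustration of the `free` advice above, here is a minimal sketch that splits memory into filesystem cache, free, and everything else. It assumes Linux `/proc/meminfo`-style text; the sample values are made up and are not from the poster's server. Note that on many kernels "Cached" also counts shared memory, which is one way the numbers can mislead on a database host.

```python
def parse_meminfo(text):
    """Return a dict of field name -> size in kB from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Split total memory into filesystem cache, free, and everything else (kB)."""
    cache = fields.get("Buffers", 0) + fields.get("Cached", 0)
    free = fields.get("MemFree", 0)
    total = fields["MemTotal"]
    return {
        "total_kb": total,
        "fs_cache_kb": cache,
        "free_kb": free,
        "other_kb": total - cache - free,
    }


# Hypothetical sample values for illustration only.
SAMPLE = """MemTotal:       16436000 kB
MemFree:          212000 kB
Buffers:          310000 kB
Cached:         11200000 kB
"""

summary = cache_summary(parse_meminfo(SAMPLE))
print(summary)
print("filesystem cache share: %.0f%%"
      % (100.0 * summary["fs_cache_kb"] / summary["total_kb"]))
```

If most of the box turns out to be filesystem cache, that is roughly the memory that could be handed to the database buffer cache once direct I/O is in use.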

Performance degradation after upgrading to yosemite

I'm experiencing performance degradation on my MacBook Pro 15 after upgrading to Yosemite, including:
a number of spinning wheels,
instances of a dark screen,
overheating,
diminished battery life.
Is Yosemite the cause of these and other issues?

I can't believe Apple have made this software available to the masses when it's clearly not ready. Since "upgrading" my Mac mini has been pretty useless. Programs take a substantially increased time to open compared to Mavericks, and then performance is poor at best. Opening files in Photoshop, for example, is painfully slow and I get the colour wheel almost all of the time - something which definitely didn't happen with Mavericks.
Aside from this, I've experienced a number of frustrating bugs today. When I open a new program on my left screen the right screen is changed to a fresh desktop. Why? When I turn my Mac on I get the login screen on the left monitor (as it always used to) but then the primary desktop is set to the right and I've been unable to keep the correct setting so far (it forgets that I've changed it following a reboot). The background of the top bar keeps disappearing so all of the icons, time, etc. just sit on top of the desktop background.
I hope Apple can release a fix, and quickly.

Please help me figure out why this thread's performance degrades over time

Hello.
The code below is from a Runnable that I've tested inside a Thread operating on a TreePath array of size 1500; the array 'clonedArrayB' is this TreePath array. The code is designed to create a more stepped version of setSelectionPaths(TreePath[]) from class JTree.
The performance decrease of the thread is very rapid; this is discernible simply from viewing the println speed. When it gets to about 1400 TreePaths added to the JTree selection, it's running at roughly 1/10 the speed it started running at.
I know there's no problem with maintaining a set of selected paths of that size inside a JTree. I also know that the thread stops when it should. So it must be some operation I'm performing inside the brief piece of code shown below that is causing the performance degradation.
Does anyone have any idea what could be causing the slowdown?
Many thanks for your help. Apologies if you would have liked an SSCCE, but I very much doubt it's necessary for this. Either you can see the problem or you can't. And sadly I can't x:'o(
int indexA = 0;
public void run() {
     // Prevent use of / Pause scanner
     try {
          scannerLock.acquire();
     } catch (InterruptedException exc) {
          Gecko.logException("Scanner lock could not be acquired by expansion thread", exc);
     }
     while (!autoExpansionComplete) {
          while (indexA < clonedArrayA.length) {
               int markerA = indexA + 10;
               for (int a = indexA; a < markerA && a < clonedArrayA.length; a++) {
                    pluginTreeA.addSelectionPath(clonedArrayA[a]);
               }
               indexA = markerA;
               System.out.println(indexA + "," + clonedArrayA.length);
               if (autoExpansionComplete) {
                    break;
               }
          }
          stop();
     }
}
};

Well, since I've had no responses, I tried to think of other ways to speed the code up.
I'd already made nearly every tweak I know. The only additional thing I could think of was to use addSelectionPaths(TreePath[]) on a subarray of the cloned array, instead of addSelectionPath(TreePath) on the cloned array's elements, since obviously it would be fewer method calls. It has sped things up an awful lot (my new code is shown below - I've left in some things I chopped out above, so you can see exactly what I see). The problem is though, obviously an increase in initial velocity doesn't solve the problem of deceleration occurring, if you get me.
// Clone the selection arrays to non-volatile arrays for better access
// speeds
final TreePath[] clonedArrayA = selectionPathsArrayA.clone();
final TreePath[] clonedArrayB = selectionPathsArrayB.clone();
// Create a new runnable to perform the selection task
Runnable selectionExpander = new Runnable() {
     /** Position within cloned array A */
     int indexA = 0;
     /** Position within cloned array B */
     int indexB = 0;
     /** Length of subarray grabbed from cloned array A */
     int lengthA;
     /** Length of subarray grabbed from cloned array B */
     int lengthB;
     /** Subarray destination */
     private TreePath[] subarray = new TreePath[100];
     public void stop() {
          autoExpansionComplete = true;
          automatedSelection = false;
          scannerLock.release();
     }
     /**
      * Grabs 10 blocks of each selection paths array at a time, adding
      * these to the tree's current selection and then moving to the next
      * cycle
      */
     public void run() {
          // Prevent use of / Pause scanner
          try {
               scannerLock.acquire();
          } catch (InterruptedException exc) {
               Gecko.logException("Scanner lock could not be acquired by expansion thread", exc);
          }
          while (!autoExpansionComplete) {
               while (indexA < clonedArrayA.length || indexB < clonedArrayB.length) {
                    // Set subarray lengths
                    lengthA = subarray.length;
                    lengthB = subarray.length;
                    // If subarray length is greater than the number of
                    // remaining indices in the source array, set length to
                    // the number of remaining indices
                    lengthA = indexA + lengthA > clonedArrayA.length ? clonedArrayA.length - indexA : lengthA;
                    lengthB = indexB + lengthB > clonedArrayB.length ? clonedArrayB.length - indexB : lengthB;
                    // Create subarrays and add TreePath elements to trees'
                    // selections
                    System.arraycopy(clonedArrayA, indexA, subarray, 0, lengthA);
                    pluginTreeA.addSelectionPaths(subarray);
                    System.arraycopy(clonedArrayB, indexB, subarray, 0, lengthB);
                    pluginTreeB.addSelectionPaths(subarray);
                    // Remember the latest index reached in source arrays
                    indexA += lengthA;
                    indexB += lengthB;
                    System.out.println(indexA + "," + clonedArrayA.length);
                    System.out.println(indexB + "," + clonedArrayB.length);
                    if (autoExpansionComplete) {
                         break;
                    }
               }
               stop();
          }
     }
};
// Create and start new thread to manage the selection task runner
selector = new Thread(selectionExpander);
selector.start();
I really can't think what could be causing the slowdown. I've done everything I can think of to increase the velocity, such as cloning the source arrays since they're volatile and access could be slightly slower as a result.
Nothing I try gets rid of the slowdown effect though :(
- Dave
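One possible explanation, offered as an unverified assumption rather than a diagnosis: if every add call rebuilds the whole selection (copying the paths already selected into a new array before appending the new ones), then each call gets more expensive as the selection grows. That would also fit the observation that bigger batches start faster yet still decelerate. The sketch below is only a cost model of that assumption; it does not call Swing, and `copies_for_incremental_adds` is a hypothetical helper name.

```python
def copies_for_incremental_adds(total_paths, batch_size):
    """Count element copies if every add call rebuilds the whole selection array."""
    selected = 0
    copies = 0
    while selected < total_paths:
        batch = min(batch_size, total_paths - selected)
        # Assumed cost of one call: copy the existing selection, then append the batch.
        copies += selected + batch
        selected += batch
    return copies


for batch_size in (1, 10, 100, 1500):
    print("batch size %4d -> %7d element copies"
          % (batch_size, copies_for_incremental_adds(1500, batch_size)))
```

Under this model, the only way to remove the deceleration entirely is a single call with the full array. Separately, Swing components are not thread-safe, so selection changes made from a worker thread would normally be handed to the event dispatch thread.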

When a table with clustered columnstore indexes is partitioned, performance degrades if data is located in multiple partitions

Hello,
Below I provide complete code to reproduce the behavior I am observing. You can run it in tempdb or any other database; which one is not important. The test query provided at the top of the script is pretty silly, but I have observed the same performance degradation with about a dozen queries of varying complexity, so this is just the simplest one I am using as an example here. Note that I also included approximate run times in the script comments (obviously based on what I observed on my machine). Here are the steps, with numbers corresponding to the numbers in the script:
1. Run script from #1 to #7. This will create the two test tables, populate them with records (40 mln. and 10 mln.) and build regular clustered indexes.
2. Run test query (at the top of the script). Here are the execution statistics:
Table 'Main'. Scan count 5, logical reads 151435, physical reads 0, read-ahead reads 4, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Txns'. Scan count 5, logical reads 74155, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
   CPU time = 5514 ms, elapsed time = 1389 ms.
3. Run script from #8 to #9. This will replace regular clustered indexes with columnstore clustered indexes.
4. Run test query (at the top of the script). Here are the execution statistics:
Table 'Txns'. Scan count 4, logical reads 44563, physical reads 0, read-ahead reads 37186, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Main'. Scan count 4, logical reads 54850, physical reads 2, read-ahead reads 96862, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
   CPU time = 828 ms, elapsed time = 392 ms.
As you can see the query is clearly faster. Yay for columnstore indexes!.. But let's continue.
5. Run script from #10 to #12 (note that this might take some time to execute). This will move about 80% of the data in both tables to a different partition. You should be able to see that the data has been moved when running Step #11.
6. Run test query (at the top of the script). Here are the execution statistics:
Table 'Txns'. Scan count 4, logical reads 44563, physical reads 0, read-ahead reads 37186, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Main'. Scan count 4, logical reads 54817, physical reads 2, read-ahead reads 96862, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
   CPU time = 8172 ms, elapsed time = 3119 ms.
And now look, the I/O stats look the same as before, but the performance is the slowest of all our tries!
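To put numbers on the three runs, here is a small sketch that pulls the CPU and elapsed times out of the STATISTICS TIME text quoted above and expresses each run relative to the single-partition columnstore run. The run labels are just descriptive names chosen for this example.

```python
import re

# Matches the STATISTICS TIME summary whether it is on one line or wrapped.
TIME_RE = re.compile(r"CPU time = (\d+) ms,\s*elapsed time = (\d+) ms")


def parse_times(text):
    """Return (cpu_ms, elapsed_ms) from a STATISTICS TIME summary."""
    match = TIME_RE.search(text)
    if match is None:
        raise ValueError("no STATISTICS TIME line found")
    return int(match.group(1)), int(match.group(2))


# Timings copied from the three test runs in this post.
RUNS = {
    "rowstore clustered index": "CPU time = 5514 ms,\nelapsed time = 1389 ms.",
    "columnstore, one partition": "CPU time = 828 ms,\nelapsed time = 392 ms.",
    "columnstore, after update": "CPU time = 8172 ms,\nelapsed time = 3119 ms.",
}

base_cpu, base_elapsed = parse_times(RUNS["columnstore, one partition"])
for name, text in RUNS.items():
    cpu, elapsed = parse_times(text)
    print("%-28s cpu x%.1f  elapsed x%.1f"
          % (name, cpu / base_cpu, elapsed / base_elapsed))
```

The point of the comparison is that CPU time grows by about the same factor as elapsed time while the logical reads barely change, so the extra cost looks like CPU work rather than I/O.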
I am not going to paste the execution plans or the detailed properties of each operator here. They show up as expected: columnstore index scan, parallel/partitioned = true, and both the estimated and the actual number of rows are lower than during the second run (when all of the data resided in the same partition).
So the question is: why is it slower?
Thank you for any help!
Here is the code to re-produce this:
--==> Test Query - begin --<===
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT COUNT(1)
FROM Txns AS z WITH(NOLOCK)
LEFT JOIN Main AS mmm WITH(NOLOCK) ON mmm.ColBatchID = 70 AND z.TxnID = mmm.TxnID AND mmm.RecordStatus = 1
WHERE z.RecordStatus = 1
--==> Test Query - end --<===
--===========================================================
--1. Clean-up
IF OBJECT_ID('Txns') IS NOT NULL DROP TABLE Txns
IF OBJECT_ID('Main') IS NOT NULL DROP TABLE Main
IF EXISTS (SELECT 1 FROM sys.partition_schemes WHERE name = 'PS_Scheme') DROP PARTITION SCHEME PS_Scheme
IF EXISTS (SELECT 1 FROM sys.partition_functions WHERE name = 'PF_Func') DROP PARTITION FUNCTION PF_Func
--2. Create partition function
CREATE PARTITION FUNCTION PF_Func(tinyint) AS RANGE LEFT FOR VALUES (1, 2, 3)
--3. Partition scheme
CREATE PARTITION SCHEME PS_Scheme AS PARTITION PF_Func ALL TO ([PRIMARY])
--4. Create Main table
CREATE TABLE dbo.Main(
SetID int NOT NULL,
SubSetID int NOT NULL,
TxnID int NOT NULL,
ColBatchID int NOT NULL,
ColMadeId int NOT NULL,
RecordStatus tinyint NOT NULL DEFAULT ((1))
) ON PS_Scheme(RecordStatus)
--5. Create Txns table
CREATE TABLE dbo.Txns(
TxnID int IDENTITY(1,1) NOT NULL,
GroupID int NULL,
SiteID int NULL,
Period datetime NULL,
Amount money NULL,
CreateDate datetime NULL,
Descr varchar(50) NULL,
RecordStatus tinyint NOT NULL DEFAULT ((1))
) ON PS_Scheme(RecordStatus)
--6. Populate data (credit to Jeff Moden: http://www.sqlservercentral.com/articles/Data+Generation/87901/)
-- 40 mln. rows - approx. 4 min
--6.1 Populate Main table
DECLARE @NumberOfRows INT = 40000000
INSERT INTO Main (
SetID,
SubSetID,
TxnID,
ColBatchID,
ColMadeID,
RecordStatus)
SELECT TOP (@NumberOfRows)
SetID = ABS(CHECKSUM(NEWID())) % 500 + 1, -- ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SubSetID = ABS(CHECKSUM(NEWID())) % 3 + 1,
TxnID = ABS(CHECKSUM(NEWID())) % 1000000 + 1,
ColBatchId = ABS(CHECKSUM(NEWID())) % 100 + 1,
ColMadeID = ABS(CHECKSUM(NEWID())) % 500000 + 1,
RecordStatus = 1
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
--6.2 Populate Txns table
-- 10 mln. rows - approx. 1 min
SET @NumberOfRows = 10000000
INSERT INTO Txns (
GroupID,
SiteID,
Period,
Amount,
CreateDate,
Descr,
RecordStatus)
SELECT TOP (@NumberOfRows)
GroupID = ABS(CHECKSUM(NEWID())) % 5 + 1, -- ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SiteID = ABS(CHECKSUM(NEWID())) % 56 + 1,
Period = DATEADD(dd,ABS(CHECKSUM(NEWID())) % 365, '05-04-2012'), -- DATEADD(dd,ABS(CHECKSUM(NEWID())) % @Days, @StartDate)
Amount = CAST(RAND(CHECKSUM(NEWID())) * 250000 + 1 AS MONEY),
CreateDate = DATEADD(dd,ABS(CHECKSUM(NEWID())) % 365, '05-04-2012'),
Descr = REPLICATE(CHAR(65 + ABS(CHECKSUM(NEWID())) % 26), ABS(CHECKSUM(NEWID())) % 20),
RecordStatus = 1
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
--7. Add PK's
-- 1 min
ALTER TABLE Txns ADD CONSTRAINT PK_Txns PRIMARY KEY CLUSTERED (RecordStatus ASC, TxnID ASC) ON PS_Scheme(RecordStatus)
CREATE CLUSTERED INDEX CDX_Main ON Main(RecordStatus ASC, SetID ASC, SubSetId ASC, TxnID ASC) ON PS_Scheme(RecordStatus)
--==> Run test Query --<===
--===========================================================
-- Replace regular indexes with clustered columnstore indexes
--===========================================================
--8. Drop existing indexes
ALTER TABLE Txns DROP CONSTRAINT PK_Txns
DROP INDEX Main.CDX_Main
--9. Create clustered columnstore indexes (on partition scheme!)
-- 1 min
CREATE CLUSTERED COLUMNSTORE INDEX PK_Txns ON Txns ON PS_Scheme(RecordStatus)
CREATE CLUSTERED COLUMNSTORE INDEX CDX_Main ON Main ON PS_Scheme(RecordStatus)
--==> Run test Query --<===
--===========================================================
-- Move about 80% the data into a different partition
--===========================================================
--10. Update "RecordStatus", so that data is moved to a different partition
-- 14 min (32002557 row(s) affected)
UPDATE Main
SET RecordStatus = 2
WHERE TxnID < 800000 -- range of values is from 1 to 1 mln.
-- 4.5 min (7999999 row(s) affected)
UPDATE Txns
SET RecordStatus = 2
WHERE TxnID < 8000000 -- range of values is from 1 to 10 mln.
--11. Check data distribution
SELECT
OBJECT_NAME(SI.object_id) AS PartitionedTable
, DS.name AS PartitionScheme
, SI.name AS IdxName
, SI.index_id
, SP.partition_number
, SP.rows
FROM sys.indexes AS SI WITH (NOLOCK)
JOIN sys.data_spaces AS DS WITH (NOLOCK)
ON DS.data_space_id = SI.data_space_id
JOIN sys.partitions AS SP WITH (NOLOCK)
ON SP.object_id = SI.object_id
AND SP.index_id = SI.index_id
WHERE DS.type = 'PS'
AND OBJECT_NAME(SI.object_id) IN ('Main', 'Txns')
ORDER BY 1, 2, 3, 4, 5;
-- Result:
-- PartitionedTable  PartitionScheme  IdxName   index_id  partition_number  rows
-- Main              PS_Scheme        CDX_Main  1         1                 7997443
-- Main              PS_Scheme        CDX_Main  1         2                 32002557
-- Main              PS_Scheme        CDX_Main  1         3                 0
-- Main              PS_Scheme        CDX_Main  1         4                 0
-- Txns              PS_Scheme        PK_Txns   1         1                 2000001
-- Txns              PS_Scheme        PK_Txns   1         2                 7999999
-- Txns              PS_Scheme        PK_Txns   1         3                 0
-- Txns              PS_Scheme        PK_Txns   1         4                 0
--12. Update statistics
EXEC sys.sp_updatestats
--==> Run test Query --<===

Hello Michael,
I just simulated the situation and got the same results as in your description. However, I did one more test - I rebuilt the two columnstore indexes after the update (and test run). I got the following details:
Table 'Txns'. Scan count 8, logical reads 12922, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Main'. Scan count 8, logical reads 57042, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 251 ms, elapsed time = 128 ms.
As an explanation of the behavior: because an UPDATE on a clustered columnstore index (CCI) is executed as a DELETE plus an INSERT, all of the original row groups of the index were left with almost all of their data marked as deleted, alongside almost the same number of new row groups holding the new data (coming from the update). I suppose that scanning the deleted bitmap, or something related to that "fragmentation", caused the additional slowness at your end.
Ivan Donev MCITP SQL Server 2008 DBA, DB Developer, BI Developer
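To make the "fragmentation" concrete, here is a back-of-the-envelope sketch using the row counts reported in the original script comments. It assumes, as described above, that every updated row stays behind as a deleted row in the original row groups. The 20% rebuild threshold is purely a hypothetical figure for illustration, not a documented recommendation.

```python
def deleted_fraction(total_rows, deleted_rows):
    """Share of rows in the original row groups that are only marked as deleted."""
    return deleted_rows / total_rows


# (rows loaded, rows touched by the UPDATE), taken from the script comments.
TABLES = {
    "Main": (40000000, 32002557),
    "Txns": (10000000, 7999999),
}

REBUILD_THRESHOLD = 0.20  # hypothetical cut-off, for illustration only

for name, (total, deleted) in sorted(TABLES.items()):
    fraction = deleted_fraction(total, deleted)
    verdict = "rebuild candidate" if fraction > REBUILD_THRESHOLD else "ok"
    print("%-5s %5.1f%% of original rows deleted -> %s"
          % (name, 100 * fraction, verdict))
```

On a live system the same ratio could be computed per row group from the total and deleted row counts that SQL Server exposes in its columnstore row group metadata.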

Performance issue with new release

Dear all,
11.2 RAC with ASM on Solaris SPARC, Standard Edition.
The client was previously using 10.2.0.3 on Solaris 10, and it was working fine. After migrating the data to the new 11g instance, we have a performance degradation in the application (Forms 10.1.2.3). Since we are using Standard Edition, there are not many options available for tuning. The same application, when tested with 11gR2 on Windows (non-RAC), is faster than both the current prod and the newly installed environments.
I scheduled Statspack; here is the info from it:
Top 5 Timed Events                                                    Avg %Total
~~~~~~~~~~~~~~~~~~                                                   wait   Call
Event                                            Waits    Time (s)   (ms)   Time
CPU time                                                     2,468          87.0
Streams AQ: qmn coordinator waiting for s           18         103   5742    3.6
gc cr block busy                                   295          89    303    3.2
control file sequential read                   235,472          70      0    2.5
reliable message                                 8,701          24      3     .9
Host CPU (CPUs: 32 Cores: 4 Sockets: 1)
~~~~~~~~              Load Average
                      Begin     End      User System    Idle     WIO     WCPU
                       1.20    1.37      2.43    1.25   96.32    0.00    0.91
Instance CPU
~~~~~~~~~~~~                                       % Time (seconds)
                     Host: Total time (s):                441,564.6
                  Host: Busy CPU time (s):                 16,239.3
                   % of time Host is Busy:       3.7
             Instance: Total CPU time (s):                  2,682.5
          % of Busy CPU used for Instance:      16.5
        Instance: Total Database time (s):                  3,249.6
%DB time waiting for CPU (Resource Mgr):       0.0
Memory Statistics                       Begin          End
~~~~~~~~~~~~~~~~~                ------------ ------------
                  Host Mem (MB):     16,256.0     16,256.0
                   SGA use (MB):      4,078.9      4,078.9
                   PGA use (MB):        418.6        415.2
    % Host Mem used for SGA+PGA:         27.7         27.6
          -------------------------------------------------------------
Objects and indexes are the same across the environments. Stats are updated, but the process is time consuming in the new environment.
Please let me know how to proceed ?
Kai
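As a sanity check on the Statspack figures above, the derived percentages can be recomputed from the raw seconds. The snapshot length and the average number of active sessions below rest on an assumption that is not stated in the report: that "Host: Total time" equals the number of CPUs multiplied by the elapsed time.

```python
def pct(part, whole):
    """Percentage of whole represented by part."""
    return 100.0 * part / whole


# Values copied from the Statspack excerpt.
CPUS = 32
HOST_TOTAL_S = 441564.6
HOST_BUSY_S = 16239.3
INSTANCE_CPU_S = 2682.5
DB_TIME_S = 3249.6

pct_host_busy = pct(HOST_BUSY_S, HOST_TOTAL_S)
pct_busy_for_instance = pct(INSTANCE_CPU_S, HOST_BUSY_S)

# Assumption: host total time = CPUs x elapsed snapshot time.
snapshot_s = HOST_TOTAL_S / CPUS
avg_active_sessions = DB_TIME_S / snapshot_s

print("host busy:            %.1f%%" % pct_host_busy)
print("busy CPU for instance: %.1f%%" % pct_busy_for_instance)
print("snapshot length:      %.0f s (assumed)" % snapshot_s)
print("avg active sessions:  %.2f (assumed)" % avg_active_sessions)
```

Both recomputed percentages match the report, and they show a host that is almost entirely idle with database time dominated by CPU, which suggests looking at what individual sessions are doing rather than at overall capacity.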

Kai,
Please take a look in the sqlnet.ora file on the client computer, you should find it in network/admin directory. In that file, do you see either of these two lines?
sqlnet.authentication_services = (NONE)
sqlnet.authentication_services = (NTS)
The above might explain why the configuration is fast on a Windows server, but slow when a Solaris server is accessed.
To Mark's point, execution plans can change between different versions - in fact Oracle Database 11.2.0.2 will perform something called adaptive cursor sharing, where an execution plan (you cannot check this using EXPLAIN PLAN from SQL*Plus) will potentially change as different bind variable values are supplied. However, if the application parses once and then holds the cursor open, the adaptive cursor sharing likely will not happen on future executions within that application - an unfortunate initial parse with unusual bind variable values might trigger a hard parse on 11.2.0.2 (thus possibly changing the execution plan), even when that SQL statement had been previously hard parsed. Randolf Geist recently posted an article related to this topic:
http://oracle-randolf.blogspot.com/2011/01/adaptive-cursor-sharing.html
Even with the Standard Edition of Oracle Database you still have a lot of options for troubleshooting this problem. In a 10046 trace file for a session look at the amount of time spent on the SQL*Net events. Do not look at just the summary provided by TKPROF, but look at the raw trace file. Is each SQL*Net related wait for a significant amount of time? If you do see long times for many of the SQL*Net waits, then you should focus your attention to the client, the server's network configuration, and the network hardware.
On the Standard Edition you can also look at the session level time model statistics and wait events to get a better idea of what is happening in the session. For example, if you have access to a spare Windows client computer you can try a script like the one in this blog article to help you review the time model statistics and dig into the session level time model statistics and wait events for a specific session:
http://hoopercharles.wordpress.com/2010/02/09/working-with-oracle%e2%80%99s-time-model-data-3/
Charles Hooper
Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table"
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
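For the suggestion above about reading the raw 10046 trace rather than only the TKPROF summary, here is a minimal sketch that totals the wait time per event. The sample lines are invented for illustration and only imitate the usual `WAIT #n: nam='...' ela= ...` layout; `ela` is assumed to be in microseconds, so check this against your own trace files.

```python
import re
from collections import defaultdict

# Captures the event name and elapsed value from a raw trace WAIT line.
WAIT_RE = re.compile(r"^WAIT #\d+: nam='([^']+)' ela= *(\d+)")


def summarize_waits(lines):
    """Return {event name: (wait count, total ela)} for all WAIT lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = WAIT_RE.match(line)
        if match:
            entry = totals[match.group(1)]
            entry[0] += 1
            entry[1] += int(match.group(2))
    return dict((name, tuple(values)) for name, values in totals.items())


# Invented sample lines, not taken from a real trace.
SAMPLE = [
    "WAIT #3: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1000",
    "WAIT #3: nam='SQL*Net message from client' ela= 41250 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=2000",
    "FETCH #3:c=0,e=120,p=0,cr=3,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=3000",
    "WAIT #3: nam='SQL*Net message from client' ela= 38900 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=4000",
]

for event, (count, ela) in sorted(summarize_waits(SAMPLE).items()):
    print("%-32s waits=%d total=%.1f ms avg=%.1f ms"
          % (event, count, ela / 1000.0, ela / 1000.0 / count))
```

A high average on the client-facing SQL*Net events points toward the client or the network, while a large count of short waits points toward a chatty application.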

SCOM reports "A significant portion of the database buffer cache has been written out to the system paging file. This may result in severe performance degradation"

This was discussed here, with no resolution
http://social.technet.microsoft.com/Forums/en-US/exchange2010/thread/bb073c59-b88f-471b-a209-d7b5d9e5aa28?prof=required
I have the same issue. This is a single-purpose physical mailbox server with 320 users and 72GB of RAM. That should be plenty. I've checked and there are no manual settings for the database cache. There are no other problems with the server, nothing reported in the logs, except for the aforementioned error (see below).
The server is sluggish. A reboot will clear up the problem temporarily. The only processes using any significant amount of memory are store.exe (using 53GB), regsvc (using 5) and W3 and Monitoringhost.exe using 1 GB each. Does anyone have any ideas on this?
Warning ESE Event ID 906.
Information Store (1497076) A significant portion of the database buffer cache has been written out to the system paging file. This may result in severe performance degradation. See help link for complete details of possible causes. Resident cache has fallen by 213107 buffers (or 11%) in the last 207168 seconds. Current Total Percent Resident: 79% (1574197 of 1969409 buffers)
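To get a feel for the scale of the numbers in that event, here is a short sketch that redoes the arithmetic. The 32 KB buffer size is an assumption (it is the database page size commonly cited for Exchange 2010), so treat the gigabyte figures as estimates only.

```python
# Figures copied from the ESE 906 event text.
TOTAL_BUFFERS = 1969409
RESIDENT_BUFFERS = 1574197
DROPPED_BUFFERS = 213107
WINDOW_SECONDS = 207168

BUFFER_KB = 32  # assumption: one cache buffer = one 32 KB page


def buffers_to_gb(buffers, buffer_kb=BUFFER_KB):
    """Convert a buffer count to gigabytes (GiB) for a given buffer size in KB."""
    return buffers * buffer_kb / (1024.0 * 1024.0)


paged_out = TOTAL_BUFFERS - RESIDENT_BUFFERS
resident_pct = 100.0 * RESIDENT_BUFFERS / TOTAL_BUFFERS
dropped_pct = 100.0 * DROPPED_BUFFERS / TOTAL_BUFFERS

print("resident: %.1f%% of cache" % resident_pct)
print("drop over window: %.1f%% in %.1f days" % (dropped_pct, WINDOW_SECONDS / 86400.0))
print("total cache: about %.1f GB (assumed 32 KB buffers)" % buffers_to_gb(TOTAL_BUFFERS))
print("paged out: %d buffers, about %.1f GB" % (paged_out, buffers_to_gb(paged_out)))
```

Under that assumption the cache is larger than the 53GB that store.exe is reported to be using, which is consistent with part of it sitting in the paging file.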

Brian,
We had this event log entry as well which SCOM picked up on, and 10 seconds before it the Forefront Protection 2010 for Exchange updated all of its engines.
We are running Exchange 2010 SP2 RU3 with no file system antivirus (the boxes are restricted and have UAC turned on as mitigations). We are running the servers primarily as Hub Transport servers with 16GB of RAM, but they do have the mailbox role installed for the sole purpose of serving as our public folder servers.
So we theorized that the STORE process was just grabbing a ton of RAM, and that occasionally it was told to dump the memory so the other processes could grab some, thus generating the alert. Up until last night we thought nothing of it, but ~25 seconds after the cache flush to the paging file, we got the following alert:
Log Name:      Application
Source:        MSExchangeTransport
Date:          8/2/2012 2:08:14 AM
Event ID:      17012
Task Category: Storage
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      HTS1.company.com
Description:
Transport Mail Database: The database could not allocate memory. Please close some applications to make sure you have enough memory for Exchange Server. The exception is Microsoft.Exchange.Isam.IsamOutOfMemoryException: Out of Memory (-1011)
   at Microsoft.Exchange.Isam.JetInterop.CallW(Int32 errFn)
   at Microsoft.Exchange.Isam.JetInterop.MJetOpenDatabase(MJET_SESID sesid, String file, String connect, MJET_GRBIT grbit, MJET_WRN& wrn)
   at Microsoft.Exchange.Isam.JetInterop.MJetOpenDatabase(MJET_SESID sesid, String file, MJET_GRBIT grbit)
   at Microsoft.Exchange.Isam.JetInterop.MJetOpenDatabase(MJET_SESID sesid, String file)
   at Microsoft.Exchange.Isam.Interop.MJetOpenDatabase(MJET_SESID sesid, String file)
   at Microsoft.Exchange.Transport.Storage.DataConnection..ctor(MJET_INSTANCE instance, DataSource source).
Followed by:
Log Name:      Application
Source:        MSExchangeTransport
Date:          8/2/2012 2:08:15 AM
Event ID:      17106
Task Category: Storage
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      HTS1.company.com
Description:
Transport Mail Database: MSExchangeTransport has detected a critical storage error, updated the registry key (SOFTWARE\Microsoft\ExchangeServer\v14\Transport\QueueDatabase) and as a result, will attempt self-healing after process restart.
Log Name:      Application
Source:        MSExchangeTransport
Date:          8/2/2012 2:13:50 AM
Event ID:      17102
Task Category: Storage
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      HTS1.company.com
Description:
Transport Mail Database: MSExchangeTransport has detected a critical storage error and has taken an automated recovery action. This recovery action will not be repeated until the target folders are renamed or deleted. Directory path:E:\EXCHSRVR\TransportRoles\Data\Queue is moved to directory path:E:\EXCHSRVR\TransportRoles\Data\Queue\Queue.old.
So it seems as if the Forefront Protection 2010 for Exchange update inadvertently triggered the cache flush, which didn't appear to happen quickly or thoroughly enough for the transport service to do what it needed to do, so it freaked out and performed the subsequent actions.
Do you have any ideas on how to prevent this 906 warning, which cascaded into a transport service outage?
Thanks!

Capture performance metrics across multiple servers

Hello. I'm still very new to PowerShell. Does anyone know of a good PowerShell v3/v4 script that can capture performance metrics across multiple servers, with an emphasis on HPC (high performance computing), and generate a helpful report, perhaps in HTML or Excel format?
Closest thing I've found and used is this line of powershell:
http://www.microsoftpro.nl/2013/11/21/powershell-performance-monitor-on-multiple-remote-computers/
Maybe figure out a way to present that in better format, such as HTML or Excel.
Also, it would help if someone could suggest some performance metrics to look at from an HPC perspective. For example, if a CPU is running at 100% utilization, figure out which cores are running high, see how many threads are queued waiting for CPU time, etc.

As far as formatting is concerned,
ConvertTo-HTML is a basic HTML output format, but you can spice it up as much as you like:
http://technet.microsoft.com/en-us/library/ff730936.aspx
Out-GridView is very functional and pretty simple:
http://powertoe.wordpress.com/2011/09/19/out-gridview-now-has-a-passthru-parameter/
Here's an example with Excel: Excel Worksheets Example
This might be a good reference for HPC, I don't have access to an HPC environment so I can't offer much advice there.
http://technet.microsoft.com/en-us/library/ff950195.aspx
It might be better to keep unrelated questions separate, so a thread doesn't focus on one question and you lose time getting an answer to another.
I hope this post has helped!
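To show the shape of such a report independently of how the counters are collected, here is a minimal sketch that flags busy cores per server and renders a plain HTML table. It is written in Python rather than PowerShell, and the server names, readings, and the 90% threshold are all hypothetical.

```python
from html import escape


def hot_cores(per_core_pct, threshold=90.0):
    """Return the indexes of cores at or above the utilization threshold."""
    return [i for i, pct in enumerate(per_core_pct) if pct >= threshold]


def to_html_table(samples):
    """Render {server: [per-core CPU %]} as a small HTML table."""
    rows = []
    for server, cores in sorted(samples.items()):
        average = sum(cores) / len(cores)
        hot = ", ".join(str(i) for i in hot_cores(cores)) or "none"
        rows.append("<tr><td>%s</td><td>%.1f</td><td>%s</td></tr>"
                    % (escape(server), average, escape(hot)))
    header = "<tr><th>Server</th><th>Avg CPU %</th><th>Hot cores</th></tr>"
    return "<table>\n" + header + "\n" + "\n".join(rows) + "\n</table>"


# Hypothetical readings; in practice these would come from performance counters.
SAMPLES = {
    "node01": [99.0, 97.5, 12.0, 8.0],
    "node02": [35.0, 40.0, 38.0, 41.0],
}

print(to_html_table(SAMPLES))
```

Reporting per-core figures next to the average matters for HPC-style workloads, because a node can look half idle on average while a few cores are saturated.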
