Kernel patch 138888 breaks cluster
I am trying to set up a cluster between guest LDoms on two physical 5140 servers. Without kernel patch 138888-03 everything works fine. When I install the patch, whether before or after the cluster is created, the two LDoms can no longer communicate over the interconnects, reporting that one or the other node is unreachable.
Does anyone know why the patch causes that problem and if there is a fix? I'm leaving the patch off for now.
Thanks.
[PSARC 2009/069 802.1Q tag mode link property|http://mail.opensolaris.org/pipermail/opensolaris-arc/2009-February/013817.html]
Similar Messages
-
Steps to upgrade kernel patch in AIX cluster environment
Hello All,
We are going to perform a kernel upgrade in an AIX cluster environment.
Please let me know the other locations to copy the new kernel files to, besides the default location:
CI+DB server
APP1
Regards
Subbu
Hi Subbu,
Refer to the SAP link:
Executing the saproot.sh Script - Java Support Package Manager (OBSOLETE) - SAP Library
1. Extract the downloaded files to a new location using SAPCAR -xvf <file_name> as sidadm.
2. Copy the extracted files to sapmnt/<SID>/exe.
3. Start the DB & Application.
Regards
Sriram -
Kernel patches = total downtime?
We are preparing to install errata patches for RHEL4 and I am trying to find any documentation that might tell me whether or not we need to bring down a cluster to install the kernel patches. I remember hearing somewhere that most OS kernel patches require the cluster to come down because each node must be at the same level. How do I determine if this applies for the patches at hand?
kernel-largesmp-2.6.9-67.0.1.EL.x86_64.rpm
oracleasm-2.6.9-67.0.1.ELlargesmp-2.0.3-1.x86_64.rpm
ocfs2-2.6.9-67.0.1.ELlargesmp-1.2.7-1.el4.x86_64.rpm
TIA
Oracle Support.... or lack thereof. They said they do not test everything in house, so I would have to test on my own. Obviously, I wound up with one of those front-line people.
Anyway, I did test the rolling patch on my end and it appears to be fine. -
Apply one non-kernel Solaris10 patch at Sun Cluster ***Beginner Question***
Dear Sir/Madam,
Our two Solaris 10 servers are running Sun Cluster 3.3. One server "cluster-1" has one online running zone "classical". Another server
"cluster-2" has two online running zones, namely "romantic" and "modern". We are trying to install a regular non-kernel patch #145200-03 at cluster-1 LIVE; it has no prerequisites and needs no reboot afterwards. Our goal is to install this patch at the global zone and the
three local zones, i.e., classical, romantic and modern, at both cluster servers, cluster-1 and cluster-2.
Unfortunately, when we began our patching at cluster-1, it could patch the running zone "classical", but we got the following errors, which prevented it from continuing with the zones "romantic" and "modern" that are running on cluster-2. And when we try to patch cluster-2, we get a similar patching error about failing to boot the non-global zone "classical", which is on cluster-1.
Any idea how I could resolve this ? Do we have to shut down the cluster in order to apply this patch ? I would prefer to apply this
patch with the Sun Cluster running. If not, what's the preferred way to apply simple non-reboot patch at all the zones at both nodes in the Sun Cluster ?
Like to hear from folks who have experience in dealing with patching in Sun Cluster.
Thanks, Mr. Channey
p.s. Below are the outputs from the patch #145200-03 run, zoneadm and clrg
at cluster-1
root@cluster-1# patchadd 145200-03
Validating patches...
Loading patches installed on the system...
Done!
Loading patches requested to install.
Done!
Checking patches that you specified for installation.
Done!
Approved patches will be installed in this order:
145200-03
Preparing checklist for non-global zone check...
Checking non-global zones...
Failed to boot non-global zone romantic
exiting
root@cluster-1# zoneadm list -iv
ID NAME STATUS PATH BRAND IP
0 global running / native shared
15 classical running /zone-classical native shared
- romantic installed /zone-romantic native shared
- modern installed /zone-modern native shared
root@cluster-1# clrg status
=== Cluster Resource Groups ===
Group Name Node Name Suspended Status
classical cluster-1 No Online
cluster-2 No Offline
romantic cluster-1 No Offline
cluster-2 No Online
modern cluster-1 No Offline
cluster-2 No Online
Hi Hartmut,
I kind of got the idea. Just want to make sure. The zones 'romantic' and 'modern' show "installed" as the current status at cluster-1. These 2 zones are in fact running and online at cluster-2. So I will issue your commands below at cluster-2 to detach these zones to "configured" status :
cluster-2 # zoneadm -z romantic detach
cluster-2 # zoneadm -z modern detach
Afterwards, I apply the Solaris patch at cluster-2. Then, I go to cluster-1 and apply the same Solaris patch. Once I am done patching both cluster-1 and cluster-2, I will
go back to cluster-2 and run the following commands to force these zones back to "installed" status :
cluster-2 # zoneadm -z romantic attach -f
cluster-2 # zoneadm -z modern attach -f
CORRECT ?? Please let me know if I am wrong or if there's any step missing. Thanks much, Humphrey
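Written out end to end, the sequence being confirmed above could be sketched as a small dry-run script. It only prints the commands rather than executing them (swap echo for eval on a real node); zone names and the patch ID come from this thread, and which node each step runs on is noted in the comments:

```shell
# Dry-run sketch of the detach -> patch -> attach workflow from this thread.
run() { echo "$@"; }    # replace echo with eval to actually execute

# On cluster-2: detach the zones that show only "installed" on cluster-1
run zoneadm -z romantic detach
run zoneadm -z modern detach
# Patch cluster-2, then cluster-1, while the zones are detached
run patchadd 145200-03
# Back on cluster-2: force the zones back to "installed" status
run zoneadm -z romantic attach -f
run zoneadm -z modern attach -f
```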
root@cluster-1# zoneadm list -iv
ID NAME STATUS PATH BRAND IP
0 global running / native shared
15 classical running /zone-classical native shared
- romantic installed /zone-romantic native shared
- modern installed /zone-modern native shared
-
Steps for Kernel Patch Update on Solaris 10 X4100 with 2 Disks Mirrored
Hi all,
I have Solaris 10 10/06 (118855-19) installed on one of the X4100 servers. It is time for me to update to the latest kernel patch (118855-36). We have two disks mirrored. My questions are:
1) Do I need to detach one of the disks from the mirror before doing any patching?
2) Is it possible to install the patches without detaching any disks from the mirror (i.e. installing the patch on a mirrored root filesystem)?
3) How do I boot from the second disk in case the patch installation creates problems while booting up?
Any suggestions or steps which you have already implemented for the above scenario?
This isn't really a question for this forum; you may be better off looking at some of the sys-admin forums for a complete answer.
You should not need to break the mirror in order to apply the kernel patch, however doing so would allow for quicker recovery of the system should something go wrong during patching.
I would strongly advise that you read the special install instructions for the kernel patch prior to installing it.
http://sunsolve.sun.com/search/document.do?assetkey=1-21-118855-36-1
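The detach-first approach mentioned above could be sketched roughly as follows. This is a dry-run script that only prints the steps; the metadevice names (d10 root mirror, d11/d12 submirrors) and the second-disk boot alias are assumptions for illustration, and the kernel patch README takes precedence over any of it:

```shell
# Dry-run sketch: keep one SVM submirror as a fallback while patching.
# d10 = root mirror, d12 = submirror on the second disk (assumed names).
run() { echo "$@"; }                # swap echo for eval to execute for real

run metastat -p d10                 # record the current mirror layout
run metadetach d10 d12              # split off one submirror before patching
run patchadd 118855-36              # apply the kernel patch per its README
run init 6                          # reboot on the still-attached submirror
# After verifying the patched boot, resync the detached half:
run metattach d10 d12
# If the patched side will not boot, boot the preserved disk from OBP instead.
```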
You may also wish to use a patch cluster rather than smpatch/updatemanager, these can be downloaded from SunSolve:
http://sunsolve.sun.com/private-cgi/show.pl?target=patchpage -
Static library not accessed properly after Solaris kernel patch update!
Hi,
We are facing a severe issue in our application after our customer updated the Solaris 10 kernel patch from u9 to u10.
We have two static libraries libdlib.a and libDLIB.a, with exactly same code base, but these two libraries are scattered across the code base and linked by many shared objects in our application.
However, one of the shared objects that links to "libdlib.a" tries to access a function from "libDLIB.a". This causes a crash at a later point, since that shared object is supposed to access the function from "libdlib.a". We found this out through the use of dbx.
I am unable to understand why this problem surfaced after the kernel patch update, while the shared object still works fine on the Solaris 10 u9 patch level.
Flow is something like this :
1. syslogrecorder.so gets loaded by one of the processes.
2. syslogrecorder.so is linked to "libdlib.a" at compile time, so it uses "libdlib.a" function DLIB_LoadLibrary and gets a handle to all the function pointers of the loaded library ( The purpose of DLIB_LoadLibrary is to load a shared library dynamically using dlopen )
3. syslogrecorder.so tries to do a "dlsym", and to do that it needs the library handle obtained in the previous call to DLIB_LoadLibrary. So syslogrecorder.so calls another function, DLIB_ProcAddress, which actually gives back access to the loaded shared library.
Here is the catch in step 3: it is supposed to call DLIB_ProcAddress from libdlib.a, but as we observed from the dbx output it calls DLIB_ProcAddress from libDLIB.a, and hence fails to give back access to the loaded shared library, causing a crash at a later point in the code.
Can someone shed some light on why this could happen?
Thanks
Kuldeep
To clarify: you did not modify or rebuild any of your binaries, but after installing a kernel patch, the application stopped working. Most likely, something about your application depended on an accidental behavior of the runtime loader. That accidental behavior changed due to the patch, and your application failed.
For example, if there is a circular dependency among shared libraries, the loader will break the cycle at an arbitrary point to establish an initialization order. By accident, that order might work, in the sense of not causing a problem. A change to the loader could cause the cycle to be broken at a different point, and the resulting initialization order could cause a now-uninitialized object to be accessed. I'm not saying this is what is wrong, but this is an example of a dependency on accidental loader behavior.
Finding your actual problem will require tracing the sequence of operations leading up to the failure. You are more likely to find help in a Solaris linker forum. AFAIK, there are currently no Oracle forums for Solaris, and the old OpenSolaris forums have been converted to mailing lists. You can try the "tools-linking" list found on this page:
http://mail.opensolaris.org/mailman/listinfo
I also suggest you review the paper on best practices for using shared libraries written by Darryl Gove and myself:
http://www.oracle.com/technetwork/articles/servers-storage-admin/linkinglibraries-396782.html
If you have a service contract with Oracle, you can use your support channel to get more help.
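One way to start that tracing is the Solaris runtime linker's own debugging output, which logs every symbol binding. A minimal sketch, shown as a dry-run that only prints the commands; the binary name ./app is an assumption, and the symbol and library names are taken from this thread:

```shell
# Dry-run sketch: see which object a symbol actually binds to on Solaris.
run() { echo "$@"; }    # replace echo with eval on a real system

# LD_DEBUG=bindings logs one line per runtime symbol binding (to stderr):
run 'LD_DEBUG=bindings ./app 2> bindings.log'
run grep DLIB_ProcAddress bindings.log
# Check which shared objects ended up carrying the clashing symbol:
run 'elfdump -s syslogrecorder.so | grep -i dlib_procaddress'
```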
Edited by: Steve_Clamage on May 18, 2012 3:21 PM -
Kernel patch update for Solaris 10 x86
I have Solaris 10 06/06 installed on an x86 machine which is using SVM and clustered with another node. The kernel revision is 118855-19 from the uname -a output. I am looking to update the kernel patch, and I heard 118855-36 is the latest one. Shall I go ahead with this patch, and what are the dependency patches for it?
If anyone has done this, please suggest and guide me.
For Solaris 10 x86 the latest offered with smpatch is 125101-07, and yes, it may be recommended to patch. Then again, you said clustered with Sun Cluster? You may want to check the documentation, and if your machines aren't facing the internet you may wait for 7/07 to hit the street and do an upgrade.
-
Solaris 10 U4 and kernel patches
When I install a fresh U4 machine, I then (as I always do) apply the recommended patch cluster. U4 has the kernel patched to 120011-14. In the patch cluster, there are kernel patches 118833-36 and also 120011-14. When I run the patch cluster, it installs 118833-36! Isn't this older than the kernel already on there? Shouldn't both 118833-36 and 120011-14 be skipped, as the kernel is already at level 120011-14? The cluster gets to 118833-36, installs it, and then of course every patch after that one fails as the machine is waiting for a reboot.
KJP 137138-09 should be OK with the cpquary3 driver 1.9.1. KJP 137138 introduced a new feature which does not allow misaligned pointer mutexes to work and panics the system. With revision 07 Sun introduced a new environment variable as a workaround for applications which cannot be ported easily: "6729759 need to accommodate non-8-byte-aligned mutexes".
This is documented in alert 244606: "The resolution for OpenSolaris releases sets _THREAD_LOCKS_MISALIGNED to 0. This is to ensure that any faulty applications fail and are identified. To allow such applications to continue to work on OpenSolaris releases based upon snv_96 or later, the environment variable _THREAD_LOCKS_MISALIGNED must be set to 1." For this to work you need to have revision 09 of this KJP applied.
Can you post the stack trace so I can have a look at it? I guess you have another application which uses unaligned mutexes.
A pkginfo of the cpquary3 package would also be useful.
-Marco -
I have a T2000 installed with the Solaris 10 1/06 release with several zones created on it. 4 zones are "sparse" root, and one (zone-5) is a "whole root" zone.
In order to apply and certify (internally) the latest sendmail patch, Solaris 10 needs a later kernel patch than I had installed (this is a subject for another discussion...). So I downloaded the latest patch cluster (4/6 Recommended cluster) to apply it.
I shut down the non-global zones, and took the machine to single user mode, and installed the cluster. It seemed to go in fine, except for the following error:
Zone zone-5
Rejected patches:
122856-01
Patches that passed the dependency check:
None.
Fatal failure occurred - impossible to install any patches.
zone-5: For patch 122856-01, required patch 118822-30 does not exist.
Fatal failure occurred - impossible to install any patches.
Now, 118822-30 is a kernel patch that is a prerequisite for the latest kernel patch (118833-03). Zone-5 is my only whole-root zone. I then looked at the patch cluster log and discovered that a handful of patches (including 118822-30) had also failed:
titan15n> grep failed /var/sadm/install_data/Solaris_10_Recommended_Patch_Cluster_log
Pkgadd failed. See /var/tmp/119254-19.log.6615 for details
Pkgadd failed. See /var/tmp/118712-09.log.9307 for details
Pkgadd failed. See /var/tmp/119578-18.log.15160 for details
Pkgadd failed. See /var/tmp/121308-03.log.18339 for details
Pkgadd failed. See /var/tmp/119689-07.log.22068 for details
Pkgadd failed. See /var/tmp/118822-30.log.9404 for details
Pkgadd failed. See /var/tmp/119059-11.log.29911 for details
Pkgadd failed. See /var/tmp/119596-03.log.4724 for details
Pkgadd failed. See /var/tmp/119985-02.log.8349 for details
Pkgadd failed. See /var/tmp/122032-02.log.13334 for details
Pkgadd failed. See /var/tmp/118918-14.log.27743 for details
Looking at any of these logs (in the non-global zone-5's /var/tmp directory) shows failures like the following snippet:
pkgadd: ERROR: unable to create unique temporary file </usr/platform/sun4us/include/sys/cheetahregs.h6HaG8w>: (30) Read-only file system
pkgadd: ERROR: unable to create unique temporary file </usr/platform/sun4us/include/sys/clock.h7HaG8w>: (30) Read-only file system
pkgadd: ERROR: unable to create unique temporary file </usr/platform/sun4us/include/sys/dvma.h8HaG8w>: (30) Read-only file system
Question(s):
Why would there be read-only file systems where tmp files are getting written? Possibly a timing issue?
Is there a "best practice" on applying patch clusters, and specifically, the kernel patch? Did I make a mistake in taking the zones down first? It seems like the zones were being booted up as the patches were getting applied, but I may be misinterpreting the output.
Even though the patches failed to apply to zone-5, the uname -a output in the zone shows the latest kernel patch, but does NOT show 118822-30 (118822-25 is what showrev -p in the non-global zone-5 shows, which is the level I was at before attempting to patch).
Any solutions?
Thanks.
The kernel config and patch are irrelevant - I have tried to compile the stock Arch kernel just to make sure that it WASN'T the patch - I simply copied the folder from ABS, did makepkg and installed - no luck. The problem seems to be that all of the kernels I compile end up with the folder in /lib/modules having -dirty on the end of them. How do I stop this '-dirty'?
I notice in the build I get this message -
==> Building the kernel
fatal: cannot describe '604d205b49b9a478cbda542c65bacb9e1fa4c840'
CHK include/linux/version.h
-
MSCS and Kernel patches: Can someone please refresh my memory?
Greetings,
We run 4.7 Ent on a 2-node MSCS Cluster (SQL).
I need to do a kernel upgrade, but I have forgotten the precise distribution of kernel files outside of the "run" directory for the cluster.
I understand they are:
Main Kernel Files
shared_drive \usr\sap\<SID>\SYS\exe\run
MSCS Files (on each node)
local_drive \Windows\SapCluster\
local_drive \Windows\System32\
However, could someone please remind me which files go into
"SapCluster" and "System32", or if you know the location, point me to an SAP document that details the kernel patch process in an MSCS cluster (rather than a general MSCS config document, which is of no use in this situation).
Thanks if anyone is able to help.
Tim
Edited by: Tim McKenzie on May 28, 2008 4:52 PM
Hi Tim,
We have a productive SAP R/3 4.7 (Windows 2003 IA64, Oracle 9.2) on a 2-node MSCS cluster.
In C:\windows\SapCluster
We have the following files :
backint.exe
brarchive.exe
brbackup.exe
brconnect.exe
brgui
brrecover.exe
brrestore.exe
brspace.exe
brtools.exe
cpio.exe
cpqccms.dll
dd.exe
dev_rout
mkszip.exe
mt.exe
niping.exe
pstat.exe
rfcoscol.exe
routtab.txt
sapevents.dll
sapgw
sapntchk.exe
sapntwaitforhalt.exe
saposcol.exe
saprouter.exe
sapsrvkill.exe
sapstart.exe
sapstartsrv.exe
sapstartsrv.exe.new
sapxpg.exe
uncompress.exe
So basically the SAP cluster dlls and the brtools.
We have also C:\windows\SapCluster\brgui with the brgui files
and C:\windows\SapCluster\sapgw with the standalone gateway executables
In C:\windows\system32 we have these SAP files :
sapmmc.dll
sapmmcada.dll
sapmmcinf.dll
sapmmcms.dll
saprc.dll
saprcex.dll
sapstart.log
sapstartsrv.exe
To update the kernel, we do it one node after the other, switching the SAP resources between the 2 nodes.
We of course keep the same patch level in the different directories.
Hope this helps.
Olivier
-
Problem of SIGPOLL(SI_NOINFO) in latest Solaris9 kernel patch
Hi,
We are facing a rather strange problem with the latest kernel patch on Solaris 9 (Generic_112233-08). We had not faced this problem with any of the other kernel patches of Solaris 9.
Our application has a main thread and a single child thread (pthread). The main thread schedules aio_writes() on the raw disk interface and lets the child thread block on sigwaitinfo() to listen to the signal completion notification. This is communicated to it via the SI_ASYNCIO code of SIGPOLL. The child thread then informs the main thread by writing to a bi-directional pipe. Since the main thread has registered for read interest on the bi-directional pipe (via /dev/poll) it is informed of the completion of the aio_write() without having to block itself. Under normal circumstances, the child thread receives SIGPOLL with SI_ASYNCIO code.
This application has been running fine on all the previous builds of Solaris (Generic, Generic_112233-04, Generic_112233-06) on the SPARC platform, except with the latest kernel patch. The child thread now keeps receiving SIGPOLL with the SI_NOINFO code. There has been no change in our application, and we are perplexed as to the reason for this behaviour. Since it is SI_NOINFO, there is not much debugging information we can get.
We have been able to replicate this behaviour using a small stand-alone program. We are attaching it at the end of the email. We tried this program on a couple of different Sparc systems and were able to reproduce this behaviour on one of them but not on the other.
Has anybody faced problems with regard to SIGPOLL in the latest kernel patch of Solaris 9 for sparc systems ?
Thanks
Regards
Raj Pagaku
proxy-24:~ >uname -a
SunOS proxy-24 5.9 Generic_112233-08 sun4u sparc SUNW,Ultra-5_10
proxy-24:~ >gcc -v
Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.9/3.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.2
Compiled this program using the following command : gcc -g kernel_bug.c -lrt -lpthread
#include <stdio.h>
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>   /* gettimeofday */
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>
#define min(x,y) (((x)<=(y))?(x):(y))
#define DISPLAY_COUNT 10000
typedef struct DiskInfoCallOut {
void (*func_ptr)(void *);
void *data_ptr;
} DiskInfoCallOut;
typedef struct DiskInfo {
struct aiocb di_aiocb;
DiskInfoCallOut di_callout;
off_t di_currOffset;
int di_scheduled;
} DiskInfo;
typedef struct Disk {
int fd;
char *buffer;
int bufferLen;
} Disk;
static sigset_t aioSignalSet;
int aioSigFD[2];
int glob_scheduled = 1;
int glob_respond = 1;
Disk disk;
static void LaunchDiskOperation(DiskInfo *di);
char BUFDATA[4096] = {'a'};
char rawDeviceName[256] = "/dev/rdsk/";
static void
InitializeDisk()
{
    int fd;

    if ((fd = open(rawDeviceName, O_RDWR, 0)) == -1) {
        fprintf(stderr, "Unable to open raw device \n");
        exit(-1);
    }
    disk.fd = fd;
    disk.buffer = BUFDATA;
    disk.bufferLen = sizeof(BUFDATA);
}
static void
AIOSignalHandler(int sigNum, siginfo_t* si, void* context)
{
    fprintf(stderr, "WARN: got signal %d in AIOSignalHandler!\n", sigNum);
}
/* Function implementing the slave thread */
static void*
AIOSignalThread(void *arg)
{
    struct sigaction sa;
    siginfo_t info;
    sigset_t ss;
    int sig_num;
    int retVal;

    /* Initialize the signal set */
    sigemptyset(&ss);
    sigaddset(&ss, SIGPOLL);
    if ((retVal = pthread_sigmask(SIG_SETMASK, &ss, NULL))) {
        fprintf(stderr, "pthread_sigmask failed in AIOSignalThread \n");
        exit(-1);
    }
    sa.sa_handler = NULL;
    sa.sa_sigaction = AIOSignalHandler;
    sa.sa_mask = aioSignalSet;
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGPOLL, &sa, NULL)) {
        fprintf(stderr, "sigaction in AIOSignalThread \n");
        exit(-1);
    }
    /* Wait infinitely for the signals and respond to the main thread */
    while (1) {
        sig_num = sigwaitinfo(&aioSignalSet, &info);
        if (sig_num != SIGPOLL) {
            fprintf(stderr, "caught unexpected signal %d in AIOSignalThread \n",
                    sig_num);
            exit(-1);
        }
        if (info.si_code != SI_ASYNCIO) {
            fprintf(stderr, "ERROR: siginfo_t had si_code != SI_ASYNCIO, si_code = %d \n", info.si_code);
            continue;
        }
        /* Write the stored pointer value in the pipe so that the main thread can process it */
        if (write(aioSigFD[1], &(info.si_value.sival_ptr), sizeof(info.si_value.sival_ptr)) !=
            sizeof(info.si_value.sival_ptr)) {
            perror("Couldn't write the whole pointer");
            exit(-1);
        }
    }
    return (NULL);
}
static void
Init()
{
    pthread_attr_t aioAttr;
    pthread_t aioThread;
    int retVal = 0;

    /* Create a bidirectional pipe */
    if (pipe(aioSigFD)) {
        perror("pipe failed");
        exit(-1);
    }
    /* Initialize to prevent other threads from being interrupted by
       SIGPOLL */
    sigemptyset(&aioSignalSet);
    sigaddset(&aioSignalSet, SIGPOLL);
    if ((retVal = pthread_sigmask(SIG_BLOCK, &aioSignalSet, NULL))) {
        fprintf(stderr, "pthread_sigmask failed in Init\n");
        exit(-1);
    }
    InitializeDisk();
    if ((retVal = pthread_attr_init(&aioAttr)))
        fprintf(stderr, "pthread_attr_init failed \n");
    if ((retVal = pthread_attr_setdetachstate(&aioAttr, PTHREAD_CREATE_DETACHED)))
        fprintf(stderr, "pthread_attr_setdetachstate failed \n");
    if ((retVal = pthread_attr_setscope(&aioAttr, PTHREAD_SCOPE_SYSTEM)))
        fprintf(stderr, "pthread_attr_setscope failed in \n");
    if ((retVal = pthread_attr_setstacksize(&aioAttr, 2*1024*1024)))
        fprintf(stderr, "pthread_attr_setstacksize failed \n");
    if ((retVal = pthread_create(&aioThread, &aioAttr,
                                 AIOSignalThread, NULL)))
        fprintf(stderr, "pthread_create failed \n");
}
static void
UpdateDiskWriteInformation(DiskInfo *di)
{
    di->di_currOffset += disk.bufferLen;
    di->di_scheduled = 0;
}
static void
DiskOpCompleted(void *ptr)
{
    DiskInfo *di = (DiskInfo *)ptr;

    if (aio_error(&di->di_aiocb))
        perror("aio_error");
    if (aio_return(&di->di_aiocb) < 0)
        perror("aio_return ");
    UpdateDiskWriteInformation(di);
    glob_respond++;
}
static void
LaunchDiskOperation(DiskInfo *di)
{
    int res;

    di->di_callout.func_ptr = DiskOpCompleted;
    di->di_callout.data_ptr = di;
    memset(&di->di_aiocb, 0, sizeof(di->di_aiocb));
    di->di_aiocb.aio_fildes = disk.fd;
    di->di_aiocb.aio_buf = disk.buffer;
    di->di_aiocb.aio_nbytes = disk.bufferLen;
    di->di_aiocb.aio_offset = di->di_currOffset;
    di->di_scheduled = 1;
    di->di_aiocb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
    di->di_aiocb.aio_sigevent.sigev_signo = SIGPOLL;
    di->di_aiocb.aio_sigevent.sigev_value.sival_ptr = &di->di_callout;
    res = aio_write(&di->di_aiocb);
    if (res == -1) {
        perror("aio op error");
    }
}
static void
HandleSignalResponses()
{
    int fd;
#define DISKINFO_CALLOUT_MAX 64
    DiskInfoCallOut* callout[DISKINFO_CALLOUT_MAX];
    struct stat pipeStat;
    int numCompleted;
    int bytesToRead;
    int sz;
    int i;

    fd = aioSigFD[0];
    while (1) {
        /* Find whether there is any data in the pipe */
        if (-1 == fstat(fd, &pipeStat)) {
            perror("fstat");
            exit(-1);
        }
        if (pipeStat.st_size < sizeof(DiskInfoCallOut *))
            break;
        numCompleted = min((pipeStat.st_size/sizeof(DiskInfoCallOut *)), DISKINFO_CALLOUT_MAX);
        bytesToRead = numCompleted * sizeof(DiskInfoCallOut *);
        if ((sz = read(fd, callout, bytesToRead)) != bytesToRead) {
            perror("Error reading from pipe");
            exit(-1);
        }
        for (i = 0; i < numCompleted; i++)
            (*callout[i]->func_ptr)(callout[i]->data_ptr);
    }
}
int main(int argc, char *argv[])
{
    DiskInfo *di;
    FILE *logPtr1 = NULL;
    FILE *logPtr2 = NULL;
    FILE *logPtr3 = NULL;
    struct rusage ru;
    struct timeval t1, t2;
    long timeTaken = 0;
    int writeCount = 0;
    int i;
    char logFileName1[1024];
    char logFileName2[1024];
    char logFileName3[1024];

    if (argc < 2) {
        fprintf(stderr, "Usage : %s <partition_name> \n", argv[0]);
        exit(-1);
    }
    strcat(rawDeviceName, argv[1]);
    writeCount = 1;
    printf("Partition selected = %s \n", rawDeviceName);
    di = calloc(writeCount, sizeof(DiskInfo));
    sprintf(logFileName1, "%s.log1", argv[0]);
    if ((logPtr1 = fopen(logFileName1, "w+")) == NULL) {
        fprintf(stderr, "Unable to create file test_pgm \n");
        exit(-1);
    }
    sprintf(logFileName2, "%s.log2", argv[0]);
    if ((logPtr2 = fopen(logFileName2, "w+")) == NULL) {
        fprintf(stderr, "Unable to create file test_pgm \n");
        exit(-1);
    }
    sprintf(logFileName3, "%s.log3", argv[0]);
    if ((logPtr3 = fopen(logFileName3, "w+")) == NULL) {
        fprintf(stderr, "Unable to create file test_pgm \n");
        exit(-1);
    }
    Init();
    for (i = 0; i < writeCount; i++) {
        di[i].di_currOffset = (1 << 18) * (i + 1);
        di[i].di_scheduled = 0;
    }
    gettimeofday(&t1, NULL);
    while (1) {
        int curScheduled = 0;
        /* Schedule the disk operations */
        for (i = 0; i < writeCount; i++) {
            if (di[i].di_scheduled == 0) {
                LaunchDiskOperation(&di[i]);
                glob_scheduled++;
                curScheduled++;
            }
        }
        /* Handle the responses */
        HandleSignalResponses();
        if ((curScheduled) && (glob_respond % DISPLAY_COUNT == 0)) {
            gettimeofday(&t2, NULL);
            timeTaken = ((t2.tv_sec * 1000000 + t2.tv_usec) -
                         (t1.tv_sec * 1000000 + t1.tv_usec))/1000;
            printf("Scheduled = %d, Responded = %d, Time Taken = %ld ms \n",
                   glob_scheduled, glob_respond, timeTaken);
            fprintf(logPtr1, "Scheduled = %d, Responded = %d, Time Taken = %ld ms \n",
                    glob_scheduled, glob_respond, timeTaken);
            fprintf(stderr, "wrote to logPtr1 ..\n");
            fprintf(logPtr2, "Scheduled = %d, Responded = %d, Time Taken = %ld ms \n",
                    glob_scheduled, glob_respond, timeTaken);
            fprintf(stderr, "wrote to logPtr2 ..\n");
            fprintf(logPtr3, "Scheduled = %d, Responded = %d, Time Taken = %ld ms \n",
                    glob_scheduled, glob_respond, timeTaken);
            fprintf(stderr, "wrote to logPtr3 ..\n");
            t1 = t2;
        }
    }
}
Hi @cooldog,
I hit this same LVM2 snapshot kernel oops on several Oracle Linux 6.5 servers running UEK R3 kernel version 3.8.13-16.3.1. I have Linux Premier Support so I opened a Service Request. Oracle Support got back to me with the following notes.
Hello Matt,
Bug 17487738 : EXT4: STRESS TESTING WITH SUSPEND/RESUME FS ACCESS CAUSES FS ERRORS This bug is fixed in kernel version: 3.8.13-18. This kernel will be available quite soon for download.
You may upgrade the kernel once it's available. ~Siju
Update
Dear Matt, the latest available UEK3 kernel version 'kernel-uek-3.8.13-26.el6uek.x86_64' incorporates the required bugfix.
[root@server1 tmp]# rpm -q --changelog -p kernel-uek-3.8.13-26.el6uek.x86_64.rpm | grep -i 17487738
warning: kernel-uek-3.8.13-26.el6uek.x86_64.rpm: Header V3 RSA/SHA256 signature: NOKEY, key ID ec551f03
- fs: protect write with sb_start/end_write in generic_file_write_iter (Guangyu Sun) [Orabug: 17487738]
You can download the UEK3 kernel from ULN or from the public-yum repo.
http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/x86_64/getPackage/kernel-uek-firmware-3.8.13-26.el6uek.noarch.rpm
http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/x86_64/getPackage/kernel-uek-3.8.13-26.el6uek.x86_64.rpm
Hope this helps! ~Siju
Subscribe to the Oracle Linux el-errata mailing list .
The latest kernel-uek-3.8.13-26.el6uek.x86_64 version fixed the problem.
- Matt
-
I tried kernel patch 114 to 159 but I have a problem
The dispatcher shows yellow ("Running but Dialog Queue info unavailable") and the WP table is empty in SAP MMC, but I can log on to SAP via SAP GUI,
and I can see the work processes in Task Manager.
I already patched QAS and DEV; there were no problems there.
I think MSCS may be the cause of the problem.
PRD was configured with MSCS.
The Microsoft cluster library patch was '114' in sapstartsrv.log.
Must the Microsoft cluster library patch be the same as disp+work?
If so, how can I get the Microsoft cluster library patch?
And if this is not related to the problem, what should I do?
Help please,
thanks
For MSCS there is a different procedure to stop SAP for a kernel upgrade.
For Windows, open Cluster Administrator and find your SAP application instance; right-click it and select "Take offline".
Your SAP MMC may not show gray; sometimes it will be yellow. Then, on the stopped node's host, copy in your kernel files, and start the server using the "Bring online" option in Cluster Administrator.
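The rolling procedure above can be written down as a dry-run checklist script. It only prints each step; the group name "SAP <SID>" and the shared-drive path are placeholders, and the cluster group /offline and /online invocations are the classic cluster.exe style, stated here as an assumption rather than exact syntax:

```shell
# Dry-run of the rolling MSCS kernel swap described above (one node at a time).
run() { echo "$@"; }    # prints each step instead of executing it

run 'cluster group "SAP <SID>" /offline'                  # take the SAP group offline
run 'back up shared \usr\sap\<SID>\SYS\exe\run'           # keep the old kernel
run 'extract the new kernel SAR files into \SYS\exe\run'  # shared run directory
run 'update C:\Windows\SapCluster files on this node'     # same patch level locally
run 'cluster group "SAP <SID>" /online'                   # bring SAP back
run 'move the group to the other node and repeat the local-copy update'
```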
Do not forget to take a backup of your old kernel before switching the kernel. -
Upgrading the kernel patch in a clustered environment
Hi,
I have to upgrade the kernel patch from 144 to 205.
We have clustering on our production server:
ECC6.0
HP-UX
Oracle 10g
Can anyone guide me step by step on how to do this?
regards
Aditya Rathore
Hi,
See the below threads:
[Kernel Upgrade on Solaris Cluster Failed;
Thanks
Shambo -
Question about kernel patch revisions
Hello
Installing a recommended patch cluster for Solaris 9, I've noticed there are several patches for the kernel. Why is this? Also, when I type uname -a and it shows the "revision" of the system, what is this referring to?
Thanks in advance.
When the kernel patch starts getting too big and unwieldy, they create a new kernel patch and list the previous patch as a prereq.
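As a concrete illustration, the patch revision can be pulled straight out of uname. A dry-run sketch (the Generic_112233-08 string is borrowed from the Solaris 9 thread elsewhere on this page; run only prints each command):

```shell
# Dry-run: where the Solaris kernel patch revision shows up.
run() { echo "$@"; }    # swap echo for eval to run for real

run uname -v                          # prints something like Generic_112233-08
run 'uname -v | sed s/Generic_//'     # leaves just the kernel (KU) patch revision
run 'showrev -p | grep 112233'        # cross-check the installed patch revisions
```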
The revision referred to in uname -a is the kernel patch revision. -
Kernel Patch upgrade results into error
Recently, we upgraded our R/3 640 kernel from patch number 196 to 327, and the upgrade itself was successful.
But now the sales and order team is having a problem saving orders (VA01, VA02).
While saving a sales order, an error pops up saying "dialog step number missing".
Kindly suggest.
Hi ppl,
The problem has been resolved.
As I said, the problem wasn't with the upgrade itself but came afterwards: the functional team faced issues while switching between the windows.
The problem was at the GUI level; it needed to be upgraded too after the kernel patch upgrade.
Thanks a lot for putting in your thoughts.
Thanks