MOUNTING OCFS on NODE 2

Hi Gurus,
Here is my situation. I am implementing RAC on 2 nodes on RHEL AS 4.0 and have reached the OCFS2 configuration part. Node 1 has 2 disks, whereby the 2nd disk is the one containing the OCFS2 volume. I have configured and mounted it without any problems.
Now, on the second node, I only have 1 hard disk. My questions are these:
1) Am I supposed to have two hard disks like the ones in node 1, so that I can do the same kind of configuration as on node 1? Or am I supposed to mount the OCFS2 volume of node 1 on node 2?
2) Since I only have one hard disk on node 2, which has a lot of free space, is there a way of creating a new OCFS2 partition on that same hard disk that already contains other (ext3) Linux partitions? And how?
Thanks in advance.

I found the answer I was looking for, at www.ocfs.org/ocfs/RPMS/.
Thanks
_Rob

Similar Messages

  • Mount ocfs folder raises error

    Hi everyone,
    I am installing OEL4 on VMware in order to install Oracle 10g RAC. When I mount the ocfs2 folder it gives an error. I have posted this question three times and no one has given an answer; there are many Oracle gurus in the Oracle forums, but not a single person has given the correct solution. Please see the problem and give the correct answer.
    See the problem here:
    [root@rac1 ~]#mount -t ocfs2 -o datavolume,nointr /dev/sda1 /ocfs
    ocfs2_hb_ctl: Bad magic number in superblock while reading uuid
    mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted"
    Supporting output to help find the solution:
    [root@rac1 ~]# mkfs.ocfs2 -L "myVolume" /dev/sda1
    mkfs.ocfs2 1.2.2
    Filesystem label=myVolume
    Block size=1024 (bits=10)
    Cluster size=4096 (bits=12)
    Volume size=8224768 (2008 clusters) (8032 blocks)
    1 cluster groups (tail covers 2008 clusters, rest cover 2008 clusters)
    Journal size=4194304
    Initial number of node slots: 2
    Creating bitmaps: done
    Initializing superblock: done
    Writing system files: done
    Writing superblock: done
    Formatting Journals: mkfs.ocfs2: Unable to find available bit while formatting journal "journal:0001"
    [root@rac1 ~]# cat /etc/ocfs2/cluster.conf
    node:
        ip_port = 7777
        ip_address = 172.16.31.195
        number = 0
        name = rac1.local.com
        cluster = ocfs2
    node:
        ip_port = 7777
        ip_address = 172.16.31.197
        number = 1
        name = rac2.local.com
        cluster = ocfs2
    cluster:
        node_count = 2
        name = ocfs2

    Here is the output of the command from node 1:
    [root@rac1 ~]# /sbin/mounted.ocfs2 -d
    Device FS UUID Label
    [root@rac1 ~]#
    =========================
    [root@rac1 ~]# /sbin/fdisk -l
    Disk /dev/hda: 26.8 GB, 26843545600 bytes
    255 heads, 63 sectors/track, 3263 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/hda1 * 1 1307 10498446 83 Linux
    /dev/hda2 1308 2614 10498477+ 83 Linux
    /dev/hda3 2615 2888 2200905 82 Linux swap
    Disk /dev/sda: 3221 MB, 3221225472 bytes
    255 heads, 63 sectors/track, 391 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sda1 391 391 8032+ 83 Linux
    Disk /dev/sdb: 4294 MB, 4294967296 bytes
    255 heads, 63 sectors/track, 522 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sdb1 522 522 8032+ 83 Linux
    Disk /dev/sdc: 4294 MB, 4294967296 bytes
    255 heads, 63 sectors/track, 522 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sdc1 522 522 8032+ 83 Linux
    Disk /dev/sdd: 4294 MB, 4294967296 bytes
    255 heads, 63 sectors/track, 522 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sdd1 522 522 8032+ 83 Linux
    [root@rac1 ~]#
    --------------- output from node 2 ---------------
    [root@rac2 ~]# /sbin/fdisk -l
    Disk /dev/hda: 26.8 GB, 26843545600 bytes
    255 heads, 63 sectors/track, 3263 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/hda1 * 1 1307 10498446 83 Linux
    /dev/hda2 1308 2614 10498477+ 83 Linux
    /dev/hda3 2615 2907 2353522+ 82 Linux swap
    Disk /dev/sda: 3221 MB, 3221225472 bytes
    255 heads, 63 sectors/track, 391 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sda1 391 391 8032+ 83 Linux
    Disk /dev/sdb: 4294 MB, 4294967296 bytes
    255 heads, 63 sectors/track, 522 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sdb1 522 522 8032+ 83 Linux
    Disk /dev/sdc: 4294 MB, 4294967296 bytes
    255 heads, 63 sectors/track, 522 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sdc1 522 522 8032+ 83 Linux
    Disk /dev/sdd: 4294 MB, 4294967296 bytes
    255 heads, 63 sectors/track, 522 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sdd1 522 522 8032+ 83 Linux
    [root@rac2 ~]#
    [root@rac2 ~]# /sbin/mounted.ocfs2 -d
    Device FS UUID Label
    Where is the exact error? Please see and reply ASAP.
    Regards
    Ali
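    For reference, a minimal sketch of one likely fix, based on the listings above: /dev/sda1 spans only the last cylinder (8032 blocks, roughly 8 MB), which is too small for the two 4 MB node-slot journals that mkfs.ocfs2 is trying to create. Repartitioning the shared disk to use the whole 3 GB and reformatting (this destroys any data on /dev/sda) would look something like:
    fdisk /dev/sda                                        # delete sda1, recreate it spanning the full disk, then write the table
    mkfs.ocfs2 -L "myVolume" /dev/sda1                    # reformat the larger partition (run from one node only)
    mount -t ocfs2 -o datavolume,nointr /dev/sda1 /ocfs
    The other node should be rebooted (or have its view of the partition table refreshed) before it mounts the reformatted volume.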

  • Cluster /global/devices/node@X mount problem.

    Hi,
    I have just installed Sun Cluster 3.2. It is a simple installation, and the cluster contains only two servers and one Sun storage array.
    After installation, I realised that the name of the metadevice on both servers was the same, so only one of them was mounted under /global/devices/node@X, and the other wasn't.
    Then I re-created the metadevice on the second node with another name. After re-creation, the /global/devices/node@X filesystem wasn't mounted when the node was rebooted.
    There were various errors on the console, but I couldn't resolve the problem.
    I can mount /global/devices/node@X with the mount command, but it is not mounted automatically at reboot time.
    # mount /global/devices/node@X runs perfectly.
    The errors shown at reboot time are as follows:
    The /global/.devices/node@2 file system (/dev/rlofi/1) is being checked.
    WARNING - Unable to repair the /global/.devices/node@2 filesystem. Run fsck
    manually (fsck -F ufs /dev/rlofi/1).
    Mar 18 13:35:12 svc.startd[8]: svc:/system/cluster/scmountdev:default: Method "/usr/cluster/lib/svc/method/scmountdev start" failed with exit status 32.
    The /global/.devices/node@2 file system (/dev/rlofi/1) is being checked.
    WARNING - Unable to repair the /global/.devices/node@2 filesystem. Run fsck
    manually (fsck -F ufs /dev/rlofi/1).
    Mar 18 13:35:12 svc.startd[8]: svc:/system/cluster/scmountdev:default: Method "/usr/cluster/lib/svc/method/scmountdev start" failed with exit status 32.
    The /global/.devices/node@2 file system (/dev/rlofi/1) is being checked.
    WARNING - Unable to repair the /global/.devices/node@2 filesystem. Run fsck
    manually (fsck -F ufs /dev/rlofi/1).
    Mar 18 13:35:12 svc.startd[8]: svc:/system/cluster/scmountdev:default: Method "/usr/cluster/lib/svc/method/scmountdev start" failed with exit status 32.
    mount: I/O error
    mount: Cannot mount /dev/lofi/126
    lofiadm: could not unmap device /dev/lofi/126: No such device or address
    mount: I/O error
    mount: Cannot mount /dev/lofi/126
    lofiadm: could not unmap device /dev/lofi/126: No such device or address
    WARNING: Cluster startup could not remove the existing
    lofi device associated with /.globaldevices. Try manually
    removing the device association. Global devices will
    be unavailable until this error is fixed.
    Mar 18 13:36:10 tstclstr1 svc.startd[8]: system/cluster/globaldevices:default misconfigured: transitioned to maintenance (see 'svcs -xv' for details)
    Mar 18 13:36:10 tstclstr1 Cluster.CCR: /usr/cluster/bin/scgdevs: Filesystem /global/.devices/node@2 is not available in /etc/mnttab.
    Mar 18 13:36:10 tstclstr1 last message repeated 1 time
    Mar 18 13:38:50 tstclstr1 svc.startd[8]: platform/sun4u/dscp:default failed: transitioned to maintenance (see 'svcs -xv' for details)
    The output of svcs -xv:
    root@tstclstr1 # svcs -xv
    svc:/system/cluster/globaldevices:default (Suncluster globaldevices service)
    State: maintenance since 18 Mart 2010 Perşembe 13:36:10 EET
    Reason: Start method exited with $SMF_EXIT_ERR_CONFIG.
    See: http://sun.com/msg/SMF-8000-KS
    See: /var/svc/log/system-cluster-globaldevices:default.log
    Impact: 3 dependent services are not running:
    svc:/system/cluster/mountgfs:default
    svc:/system/cluster/clusterdata:default
    svc:/system/cluster/ql_rgm:default
    svc:/application/print/server:default (LP print server)
    State: disabled since 18 Mart 2010 Perşembe 13:34:46 EET
    Reason: Disabled by an administrator.
    See: http://sun.com/msg/SMF-8000-05
    See: man -M /usr/share/man -s 1M lpsched
    Impact: 2 dependent services are not running:
    svc:/application/print/rfc1179:default
    svc:/application/print/ipp-listener:default
    svc:/system/cluster/scmountdev:default (Sun Cluster scmountdev)
    State: maintenance since 18 Mart 2010 Perşembe 13:35:12 EET
    Reason: Start method failed repeatedly, last exited with status 32.
    See: http://sun.com/msg/SMF-8000-KS
    See: /etc/svc/volatile/system-cluster-scmountdev:default.log
    Impact: This service is not running.
    svc:/system/cluster/scsymon-srv:default (Sun Cluster SyMON Server Daemon)
    State: offline since 18 Mart 2010 Perşembe 13:34:49 EET
    Reason: Dependency svc:/application/management/sunmcagent:default is absent.
    See: http://sun.com/msg/SMF-8000-E2
    Impact: This service is not running.
    svc:/platform/sun4u/dcs:default (domain configuration server)
    State: maintenance since 18 Mart 2010 Perşembe 13:38:51 EET
    Reason: Restarting too quickly.
    See: http://sun.com/msg/SMF-8000-L5
    See: man -M /usr/share/man -s 1M dcs
    See: /var/svc/log/platform-sun4u-dcs:default.log
    Impact: This service is not running.
    svc:/platform/sun4u/dscp:default (DSCP Service)
    State: maintenance since 18 Mart 2010 Perşembe 13:38:50 EET
    Reason: Start method failed repeatedly, last died on Killed (9).
    See: http://sun.com/msg/SMF-8000-KS
    See: man -M /usr/share/man -s 1M prtdscp
    See: /var/svc/log/platform-sun4u-dscp:default.log
    Impact: This service is not running.
    root@tstclstr1 #

    Thanks for the reply,
    The metaset name on the second server used to be the same as on the first, which several folks have run into when installing SC on the 2nd node. So I renamed it using metarename while booted into non-cluster mode. I then updated /etc/vfstab with the new name; this is the entry in /etc/vfstab:
    /dev/md/dsk/d26 /dev/md/rdsk/d26 /global/.devices@2 ufs 2 no global
    And yes, if I change "global" to "-" then it mounts. When the 2nd node boots up, it cannot mount /global/.devices@2 with it set to global, so the boot halts and I press Ctrl-D to continue. Then I try a regular mount as root and I get the "no such file or directory" message.
    If I do a metastat d26, it shows OK on both submirrors.
    The name of the metadevice on the first node is d160, and the 2nd server is able to mount that via the global option from the first server.
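    For reference, a minimal sketch of the rename flow described above (the metadevice name d26 and the mount point are taken from this post; the old, clashing name is a placeholder, and the node should be booted in non-cluster mode first):
    metarename d<old> d26                      # <old> = whatever the clashing metadevice name was
    # edit the /global/.devices/node@2 line in /etc/vfstab to reference
    # /dev/md/dsk/d26 and /dev/md/rdsk/d26, keeping "global" as the mount option
    mount /global/.devices/node@2              # verify it mounts before rebooting into the cluster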
    Thanks,

  • Mount ocfs2 error in oracle 10g installation on vmware

    Dear all, this is very urgent; kindly help me.
    I am running this command to mount the ocfs folder, but it does not run and gives the error below:
    [root@rac1 ~]#mount -t ocfs2 -o datavolume,nointr /dev/sda1 /ocfs
    ocfs2_hb_ctl: Bad magic number in superblock while reading uuid
    mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted"
    -------- I have run these commands for checking ----------
    $ mounted.ocfs2 -d
    $ mounted.ocfs2 -f
    $ cat /etc/ocfs2/cluster.conf
    $ /etc/rc.d/init.d/ocfs2 status [ as root ]
    ---------the output is below--------------------------------
    --------From node 1----------------------------------------------
    [root@rac1 ~]# mounted.ocfs2 -d
    Device FS UUID Label
    [root@rac1 ~]# mounted.ocfs2 -f
    Device FS Nodes
    [root@rac1 ~]# cat /etc/ocfs2/cluster.conf
    node:
        ip_port = 7777
        ip_address = 172.16.31.195
        number = 0
        name = rac1
        cluster = ocfs2
    node:
        ip_port = 7777
        ip_address = 172.16.31.197
        number = 1
        name = rac2
        cluster = ocfs2
    cluster:
        node_count = 2
        name = ocfs2
    [root@rac1 ~]#
    [root@rac1 ~]# /etc/rc.d/init.d/ocfs2 status
    [root@rac1 ~]#
    ----------------------from node 2----------------------------------
    output from node2
    [root@rac2 ~]# su -
    [root@rac2 ~]# mounted.ocfs2 -d
    -bash: mounted.ocfs2: command not found
    [root@rac2 ~]# mounted.ocfs2 -f
    -bash: mounted.ocfs2: command not found
    [root@rac2 ~]# cat /etc/ocfs2/cluster.conf
    node:
        ip_port = 7777
        ip_address = 172.16.31.195
        number = 0
        name = rac1
        cluster = ocfs2
    node:
        ip_port = 7777
        ip_address = 172.16.31.197
        number = 1
        name = rac2
        cluster = ocfs2
    cluster:
        node_count = 2
        name = ocfs2
    [root@rac2 ~]# /etc/rc.d/init.d/ocfs2 status
    [root@rac2 ~]#
    Both nodes are installed on VMware, and the Linux is OEL4. The configuration of the vmx file is below.
    ------ vmx file configuration from the first node ----------------------
    .encoding = "windows-1252"
    config.version = "8"
    virtualHW.version = "7"
    scsi0.present = "TRUE"
    scsi0.virtualDev = "lsilogic"
    memsize = "1056"
    ide0:0.present = "TRUE"
    ide0:0.fileName = "D:\vmware\ractesting\rac1.vmdk"
    ide1:0.present = "TRUE"
    ide1:0.autodetect = "TRUE"
    ide1:0.deviceType = "cdrom-image"
    floppy0.startConnected = "FALSE"
    floppy0.fileName = ""
    floppy0.autodetect = "TRUE"
    ethernet0.present = "TRUE"
    ethernet0.wakeOnPcktRcv = "FALSE"
    ethernet0.addressType = "generated"
    usb.present = "TRUE"
    ehci.present = "TRUE"
    sound.present = "TRUE"
    sound.fileName = "-1"
    sound.autodetect = "TRUE"
    serial0.present = "TRUE"
    serial0.fileType = "thinprint"
    pciBridge0.present = "TRUE"
    pciBridge4.present = "TRUE"
    pciBridge4.virtualDev = "pcieRootPort"
    pciBridge4.functions = "8"
    pciBridge5.present = "TRUE"
    pciBridge5.virtualDev = "pcieRootPort"
    pciBridge5.functions = "8"
    pciBridge6.present = "TRUE"
    pciBridge6.virtualDev = "pcieRootPort"
    pciBridge6.functions = "8"
    pciBridge7.present = "TRUE"
    pciBridge7.virtualDev = "pcieRootPort"
    pciBridge7.functions = "8"
    vmci0.present = "TRUE"
    roamingVM.exitBehavior = "go"
    displayName = "rac1"
    guestOS = "rhel4"
    nvram = "rac1.nvram"
    virtualHW.productCompatibility = "hosted"
    printers.enabled = "TRUE"
    extendedConfigFile = "rac1.vmxf"
    disk.locking="FALSE"
    disklib.dataCacheMaxSize="0"
    scsi1.sharedBus="virtual"
    scsi0:0.present = "TRUE"
    scsi0:0.fileName = "D:\sharedstorage\ocfs2disk.vmdk"
    scsi0:0.mode = "independent-persistent"
    scsi0:0.deviceType="disk"
    scsi0:1.present = "TRUE"
    scsi0:1.fileName = "D:\sharedstorage\asmdisk1.vmdk"
    scsi0:1.mode = "independent-persistent"
    scsi0:1.deviceType="disk"
    scsi0:2.present = "TRUE"
    scsi0:2.fileName = "D:\sharedstorage\asmdisk2.vmdk"
    scsi0:2.mode = "independent-persistent"
    scsi0:2.deviceType="disk"
    scsi0:3.present = "TRUE"
    scsi0:3.fileName = "D:\sharedstorage\asmdisk3.vmdk"
    scsi0:3.mode = "independent-persistent"
    scsi0:3.deviceType="disk"
    ethernet1.present = "TRUE"
    ethernet1.connectionType = "hostonly"
    ethernet1.wakeOnPcktRcv = "FALSE"
    ethernet1.addressType = "generated"
    ide1:0.fileName = "E:\OEL4\Enterprise-R4-U4-i386-disc4.iso"
    ethernet0.generatedAddress = "00:0c:29:de:6c:7a"
    ethernet1.generatedAddress = "00:0c:29:de:6c:84"
    uuid.location = "56 4d a1 a0 2f 33 08 24-12 e5 7a 39 a7 de 6c 7a"
    uuid.bios = "56 4d a1 a0 2f 33 08 24-12 e5 7a 39 a7 de 6c 7a"
    cleanShutdown = "FALSE"
    replay.supported = "FALSE"
    replay.filename = ""
    ide0:0.redo = ""
    scsi0:0.redo = ""
    scsi0:1.redo = ""
    scsi0:2.redo = ""
    scsi0:3.redo = ""
    pciBridge0.pciSlotNumber = "17"
    pciBridge4.pciSlotNumber = "21"
    pciBridge5.pciSlotNumber = "22"
    pciBridge6.pciSlotNumber = "23"
    pciBridge7.pciSlotNumber = "24"
    scsi0.pciSlotNumber = "16"
    usb.pciSlotNumber = "32"
    ethernet0.pciSlotNumber = "33"
    ethernet1.pciSlotNumber = "34"
    sound.pciSlotNumber = "35"
    ehci.pciSlotNumber = "36"
    vmci0.pciSlotNumber = "37"
    vmotion.checkpointFBSize = "16777216"
    ethernet0.generatedAddressOffset = "0"
    ethernet1.generatedAddressOffset = "10"
    vmci0.id = "-1478595462"
    checkpoint.vmState = ""
    tools.syncTime = "TRUE"
    tools.remindInstall = "FALSE"
    ide1:0.startConnected = "FALSE"
    ------ and here is the vmx file configuration from node 2 ---------------------
    .encoding = "windows-1252"
    config.version = "8"
    virtualHW.version = "7"
    scsi0.present = "TRUE"
    scsi0.virtualDev = "lsilogic"
    memsize = "1056"
    ide0:0.present = "TRUE"
    ide0:0.fileName = "rac2.vmdk"
    ide1:0.present = "TRUE"
    ide1:0.autodetect = "TRUE"
    ide1:0.deviceType = "cdrom-image"
    floppy0.startConnected = "FALSE"
    floppy0.fileName = ""
    floppy0.autodetect = "TRUE"
    ethernet0.present = "TRUE"
    ethernet0.wakeOnPcktRcv = "FALSE"
    ethernet0.addressType = "generated"
    usb.present = "TRUE"
    ehci.present = "TRUE"
    sound.present = "TRUE"
    sound.fileName = "-1"
    sound.autodetect = "TRUE"
    serial0.present = "TRUE"
    serial0.fileType = "thinprint"
    pciBridge0.present = "TRUE"
    pciBridge4.present = "TRUE"
    pciBridge4.virtualDev = "pcieRootPort"
    pciBridge4.functions = "8"
    pciBridge5.present = "TRUE"
    pciBridge5.virtualDev = "pcieRootPort"
    pciBridge5.functions = "8"
    pciBridge6.present = "TRUE"
    pciBridge6.virtualDev = "pcieRootPort"
    pciBridge6.functions = "8"
    pciBridge7.present = "TRUE"
    pciBridge7.virtualDev = "pcieRootPort"
    pciBridge7.functions = "8"
    vmci0.present = "TRUE"
    roamingVM.exitBehavior = "go"
    displayName = "rac2"
    guestOS = "rhel4"
    nvram = "rac2.nvram"
    virtualHW.productCompatibility = "hosted"
    printers.enabled = "TRUE"
    extendedConfigFile = "rac2.vmxf"
    ethernet1.present = "TRUE"
    ethernet1.connectionType = "hostonly"
    ethernet1.wakeOnPcktRcv = "FALSE"
    ethernet1.addressType = "generated"
    disk.locking="FALSE"
    disklib.dataCacheMaxSize="0"
    scsi1.sharedBus="virtual"
    scsi0:0.present = "TRUE"
    scsi0:0.fileName = "D:\sharedstorage\ocfs2disk.vmdk"
    scsi0:0.mode = "independent-persistent"
    scsi0:0.deviceType="disk"
    scsi0:1.present = "TRUE"
    scsi0:1.fileName = "D:\sharedstorage\asmdisk1.vmdk"
    scsi0:1.mode = "independent-persistent"
    scsi0:1.deviceType="disk"
    scsi0:2.present = "TRUE"
    scsi0:2.fileName = "D:\sharedstorage\asmdisk2.vmdk"
    scsi0:2.mode = "independent-persistent"
    scsi0:2.deviceType="disk"
    scsi0:3.present = "TRUE"
    scsi0:3.fileName = "D:\sharedstorage\asmdisk3.vmdk"
    scsi0:3.mode = "independent-persistent"
    scsi0:3.deviceType="disk"
    ide1:0.fileName = "E:\OEL4\Enterprise-R4-U4-i386-disc4.iso"
    ide1:0.startConnected = "FALSE"
    ethernet0.generatedAddress = "00:0c:29:dc:e1:c9"
    ethernet1.generatedAddress = "00:0c:29:dc:e1:d3"
    tools.syncTime = "TRUE"
    uuid.location = "56 4d c6 2a 5b 77 ee e5-79 46 84 cd 2b dc e1 c9"
    uuid.bios = "56 4d c6 2a 5b 77 ee e5-79 46 84 cd 2b dc e1 c9"
    cleanShutdown = "FALSE"
    replay.supported = "FALSE"
    replay.filename = ""
    ide0:0.redo = ""
    scsi0:0.redo = ""
    scsi0:1.redo = ""
    scsi0:2.redo = ""
    scsi0:3.redo = ""
    pciBridge0.pciSlotNumber = "17"
    pciBridge4.pciSlotNumber = "21"
    pciBridge5.pciSlotNumber = "22"
    pciBridge6.pciSlotNumber = "23"
    pciBridge7.pciSlotNumber = "24"
    scsi0.pciSlotNumber = "16"
    usb.pciSlotNumber = "32"
    ethernet0.pciSlotNumber = "33"
    ethernet1.pciSlotNumber = "34"
    sound.pciSlotNumber = "35"
    ehci.pciSlotNumber = "36"
    vmci0.pciSlotNumber = "37"
    vmotion.checkpointFBSize = "16777216"
    usb:1.present = "TRUE"
    ethernet0.generatedAddressOffset = "0"
    ethernet1.generatedAddressOffset = "10"
    vmci0.id = "735896009"
    usb:1.deviceType = "hub"

    What's the output of "/etc/init.d/o2cb status"?
    Have you formatted the partition, i.e. mkfs -t ocfs2 /dev/sdb1?
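    A minimal sketch of those two checks (assuming the standard o2cb init script shipped with ocfs2-tools, and using /dev/sdb1 from the reply above; substitute the actual shared device):
    /etc/init.d/o2cb status                               # the cluster stack must be loaded and online before mounting
    /etc/init.d/o2cb load
    /etc/init.d/o2cb online ocfs2                         # cluster name as defined in /etc/ocfs2/cluster.conf
    mkfs.ocfs2 /dev/sdb1                                  # format the shared partition from one node only (destroys existing data)
    mount -t ocfs2 -o datavolume,nointr /dev/sdb1 /ocfs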

  • IS IT SUPPORTED TO CENTRALLY MOUNT THE ORACLE_HOME IN A NON-RAC ENVIRONMENT

    SR 7250090.993 : (http://qmon.oraclecorp.com/qmon3/quickpicks.pl?t=t&q=7250090.993)
    Technical Summary:
    Customer is planning to install Oracle 10.2.0.4 and 11.1.0.x software on Red Hat 5 with NetApp storage.
    Customer came across the following:
    For single instance installations (as opposed to RAC installations), you must create a separate Oracle home directory for each installation. Run the software in this Oracle home directory only from the system that you used to install it. For Oracle Real Application Clusters (RAC) installations, you can use a single Oracle home directory mounted from each node in the cluster. You must mount this Oracle home directory on each node so that it has the same directory path on all nodes.
    mentioned in the 10gR2 documentation link :
    http://download.oracle.com/docs/cd/B19306_01/install.102/b15667/app_nas.htm#BCFIDEJA
    Requirements/Expectations:
    As the above statement that the customer came across is not present in the 9i documentation, the customer wants to understand whether it is actually supported to centrally mount the 10g/11g ORACLE_HOME on many servers that are not RAC-enabled.
    Also, I would like to understand whether the statements in the documentation indicate that it is not generally recommended to centrally mount the 10g/11g ORACLE_HOME, or whether they mean that it is not supported to centrally mount the ORACLE_HOME in a non-RAC environment.
    Please advise.

    The binaries (executables) in an Oracle home are "linked" (link edited?) to the OS libraries on each server where the software is installed.
    Unless the OS is IDENTICAL on each of the IDENTICAL (hardware) servers that would share the Oracle home, you could be in trouble.
    The only supported configuration (I know of) where the Oracle binaries are shared between servers is 9i RAC. On 10g RAC the binaries are installed on each server.
    Otherwise I'd say it's NOT recommended; besides, you don't save anything (except a couple of gigs of disk space).
    :p

  • ASM disk added without scan on second node

    Hi All ,
    Oracle Version:11.2.0.3
    I need one help for one issue with ASM disk addition.
    It is a two node RAC and one disk group was filled.
    One disk was available as UNUSED001, so we renamed it, ran a scan disk, and added the disk to the diskgroup on node 1.
    But, as we did not run the scan disk on the second node, the name there is still showing as UNUSED001, assigned to the diskgroup and therefore showing as MEMBER.
    Also, the renamed disk is showing as MEMBER but not assigned to any diskgroup.
    Usually, when this happens, we have to reboot the node to fix the issue, but we would like to know whether this can be fixed without bouncing the nodes.

    Hi ,
    + Probably the disk addition failed with ORA-15075, as the same-named device was not visible after renaming the disk.
      As this validation takes place after writing the disk header, it is showing as MEMBER.
    + Get downtime for the cluster on the 2nd node and run scandisks on the 2nd node.
    + Now the renamed disk should show up on node 2.
    + If it shows up, then validate:
    -- whether all expected diskgroups are mounted on both nodes
    sql> select inst_id,name,state from gv$asm_diskgroup;
    -- if mounted, validate the renamed disk's group_number and mount_status
    sql> col path for a30
    sql> select inst_id,group_number,path,mount_status from gv$asm_disk;
    + If group_number is 0 and mount_status is CLOSED, then it is not part of any mounted diskgroup.
      Add that disk again with the force option in the same diskgroup.
    sql> alter diskgroup <diskgroup_name> add disk 'ORCL:<LABEL_NAME>' force;
    And allow rebalance to complete.
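    For reference, a minimal sketch of the scandisk step above (assuming ASMLib is in use, as the ORCL: disk path suggests; run as root on node 2):
    /usr/sbin/oracleasm scandisks      # re-scan so node 2 picks up the renamed disk label
    /usr/sbin/oracleasm listdisks      # the renamed label should now be listed instead of UNUSED001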
    Regards,
    Aritra

  • Sun cluster failed when switching, mount /global/ I/O error .

    Hi all,
    I am having a problem during switching two Sun Cluster nodes.
    Environment:
    Two nodes with Solaris 8 (Generic_117350-27), 2 Sun D2 arrays & Vxvm 3.2 and Sun Cluster 3.0.
    Problem description:
    scswitch failed, then we ran scshutdown and booted up both nodes. One node failed because of a VxVM boot failure.
    The other node boots up normally but cannot mount the /global directories. Manual mounting works fine.
    # mount /global/stripe01
    mount: I/O error
    mount: cannot mount /dev/vx/dsk/globdg/stripe-vol01
    # vxdg import globdg
    # vxvol -g globdg startall
    # mount /dev/vx/dsk/globdg/mirror-vol03 /mnt
    # echo $?
    0
    port:root:/global/.devices/node@1/dev/vx/dsk 169# mount /global/stripe01
    mount: I/O error
    mount: cannot mount /dev/vx/dsk/globdg/stripe-vol01
    Need help urgently
    Jeff

    I would check your patch levels. I seem to remember there was a linker patch that caused an issue with mounting /global/.devices/node@X.
    Tim
    ---

  • Mapping shared disk mount in Linux

    Hi ,
    We are using a shared disk for our SOA clustered environment. Everything is working fine, even disk sharing between the two servers, but the issue is that whenever we reboot the servers, the mount of the node 1 disk on node 2 always disappears and I have to mount the disk manually on node 2. Is there any way to automate mounting the disk after a reboot on node 2?
    OS : OEL 5.4(64bit).
    11G SOA/WLS

    Hi,
    It's a bit unclear to me - would you please describe how you managed to map the other drives/volumes?
    Please post:
    listing of /dev/mapper/*
    content of /etc/rc.local
    Are you using multipath or ASMlib?
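    One common way to make such a mount persistent, as a minimal sketch (the device path /dev/mapper/shareddisk and mount point /u01/shared are placeholders, not from the post; adjust the filesystem type to whatever is actually in use):
    # option 1: an /etc/fstab entry so the mount comes back at boot
    /dev/mapper/shareddisk  /u01/shared  ext3  defaults,_netdev  0 0
    # option 2: append the mount command to /etc/rc.local
    mount /dev/mapper/shareddisk /u01/shared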

  • /globaldevices on different file system mounts

    IHAC that has a SC3.2 on a pair of V890's Solaris9-U5 and Veritas Foundation Suite 5 for boot disks.
    I have noticed that the paths to /globaldevices are different:
    /dev/vx/dsk/bootdg/rootdg_16vol 487863 5037 434040 2% /global/.devices/node@1
    /dev/vx/dsk/bootdg/node@2 487863 5035 434042 2% /global/.devices/node@2
    Can I just go by the first name and rename the volume under /dev/vx/dsk/bootdg and /dev/vx/rdsk/bootdg, renaming rootdg_16vol to node@1, or is there a different method?

    Yes, you can just rename it as a normal Veritas volume using vxedit. Make sure you modify the /etc/vfstab file for node 1.
    1. umount the /global/.devices/node@1
    2. rename the volume with vxedit
    3. modify vfstab
    4. mount /global/.devices/node@1
    If possible, test with a reboot to verify.
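    For reference, a minimal sketch of those steps (the volume name rootdg_16vol and diskgroup bootdg are taken from the df output above; double-check them with vxprint before renaming):
    umount /global/.devices/node@1
    vxedit -g bootdg rename rootdg_16vol node@1      # rename the Veritas volume
    # update the /global/.devices/node@1 entry in /etc/vfstab on node 1 to the new volume name
    mount /global/.devices/node@1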
    Actually, it does not make any difference if you leave it that way.
    -LP

  • 4 Node Q-Master Cluster issue with Nodes stuck at "Waiting"

    I have a 4-node QMaster cluster configured with no problem: one Cluster Controller and 3 service nodes. When I export a project from FCP to Compressor I can assign the cluster with no problem, but then the only node that starts to process is the cluster controller; the service nodes stay at Waiting. If I make any other service node the Cluster Controller, then THAT node starts to process and the 3 service nodes are stuck at Waiting.
    I can mount the cluster storage on all nodes and I am able to read/write to that volume.
    The source material (a FireWire drive) is also mounted on all nodes.
    The setup is a new MacBook Pro running FCP 5.0, Compressor 2.0 and QMaster 2.0; the cluster nodes are all 1 GHz G4s.
    Any ideas?
    Many thanks
    MACBOOK PRO 2.1   Mac OS X (10.4.6)   Multiple G4's

    I don't recommend submitting to a cluster directly from the timeline. Instead, export a QT reference movie and submit that to the cluster via Compressor using a setting that has 'Allow Job Segmenting' checked. If that is not checked, then you are telling QMaster to let only one machine work on that file - this is recommended for high-quality, multi-pass VBR MPEG-2 jobs because of how multi-pass and VBR work.
    If you submit from outside FCP while doing multi-pass VBR, you can still send your cluster 4 jobs that will all run at the same time, though.
    Basically, to use all the nodes, QMaster needs to have the whole file and be allowed to break it into segments and send those segments to the nodes. When you export off the timeline, FCP sends the data to Compressor a frame at a time, not as a whole file; thus, no segmenting.

  • RAC with 10G using shared directories

    We want to test Oracle 10g with Real Application Clusters, but we do not have a SAN yet. Can we use a disk from a normal server, share this disk, and create a mapped network drive on the two servers where I want to install the RAC, and use it as a shared disk?

    This is the article about what I was refering:
    Setting Up Linux with FireWire-Based Shared Storage for Oracle9i RAC
    By Wim Coekaerts
    If you’re all fired up about FireWire and you want to set up a two-node cluster for development and testing purposes for your Oracle RAC (Real Application Clusters) database on Linux, here’s an installation and configuration QuickStart guide to help you get started. But first, a caveat: Neither Oracle nor any other vendor currently supports the patch; it is intended for testing and demonstration only.
    The QuickStart instructions step you through the installation of the Oracle database and the use of our patched kernel for configuring Linux for FireWire as well as the installation and configuration of Oracle Cluster File System (OCFS) on a FireWire shared-storage device. Oracle RAC uses shared storage in conjunction with a multinode extension of a database to allow scalability and provide failover security.
    The hardware typically used for shared storage (a fibre-channel system) is expensive (see my column on clustering with FireWire on Oracle Technology Network (OTN) for some background on shared-storage solutions and the new kernel patch). However, once you’ve installed and set up the kernel patch, you will be on your way to setting up a Linux cluster suitable for your development team to use for demo testing and QA—a solution that costs considerably less than the traditional ones.
    The patch is available to the Linux and open source community under the GNU General Public License (GPL). You can download it from the Linux Open Source Projects page, available from the Community Code section of OTN. See the Toolbox sidebar for more information.
    Figure 1: Two-node Linux cluster using FireWire shared drive
    By following this guide, you’ll install the patched kernel on each machine that will comprise a node of the cluster. You’ll basically build a two-node test configuration composed of two machines connected over a 10Base-T network, with each machine linked via FireWire to the drive used for shared storage, as shown in Figure 1.
    If you haven’t used FireWire on either machine before, be sure to install and configure the FireWire interconnect in each machine and test it with a FireWire drive or other device before you get started, to ensure that the baseline system is working. The FireWire interconnects we tested are based on Texas Instruments (TI, one of the coauthors of the IEEE specification on which FireWire is based) chipsets, and we used a 120GB Western Digital External FireWire (IEEE 1394) hard drive.
    Table 1 lists the minimum hardware requirements per node for a two-node cluster and some of the additional requirements for clusters of more than two nodes. You can use a standard laptop equipped with a PCMCIA FireWire card for any of the nodes in the cluster. We’ve successfully tested a laptop-based cluster following the same installation process described in this article.
    As shown in Table 1, for more than two nodes, you must add a four- or five-port FireWire hub to the configuration, to support connections from the additional machines to the drive. Just plug each Linux box into a port in the hub, and plug the FireWire drive into the hub as well. Without a hub, the configuration won’t have enough power for the total cable length on the bus.
    The instructions in this article are for a two-node cluster configuration. To create a cluster of more than two nodes, configure each additional node (node 3, node 4) by repeating these steps for each of the additional nodes and also be sure to do the following:
    Modify the command syntax or script files to account for the proper node number, machine name, and other details specific to the node.
    Create an extra set of log files and undo tablespaces on the shared storage for each additional node.
    It’s not yet possible to use our patched FireWire drivers to build a cluster of more than four nodes.
    Step 1: Download Everything You Need
    Before you get started, spend some time downloading all the software you’ll need from OTN. If you’re not an OTN member, you’ll have to join first, but it’s free.
    Keep in mind that these Linux kernel FireWire driver patches are true open source projects. You can download the source code and customize it for your own implementations as long as you adhere to the GPL agreement.
    See "Toolbox" for a list of the software you should download and have available before you get started.
    Step 2. Install Linux
    Once you’ve downloaded or purchased the Red Hat Linux Advanced Server 2.1 distribution (or another distribution that you’ve already gotten to work with Oracle9i Database, Release 2), you can install Linux on the local hard drive of each node (this takes about 25 minutes per node). We’ll keep the configuration basic, but you should configure one of the network cards on each machine for a private LAN (this provides the interconnect between nodes in the cluster); for example:
    hostname: node1
    ip address: 192.168.1.50
    hostname: node2
    ip address: 192.168.1.51
    Because this is a private LAN, you don’t need "real" IP addresses. Just make sure that if you do hook up either of these machines to a live network, the IP addresses don’t conflict with those of other machines. Also, be sure you download all the software you need for these machines before configuring the private network if you haven’t also configured or don’t have a second network interface card (NIC) in the machines.
    Step 3. Install Oracle9i Database
    If you haven’t done so already, you must download the Oracle software set for Oracle9i Database Release 2 (9.2.0.1.0) for Linux or, if you’re an OTN TechTracks subscriber, obtain it from your subscription media.
    For each machine that will comprise a node in the cluster, you must do the following:
    Create a mount point, /oracle/home, for the Oracle software files on the local hard disk of each machine.
    Create a new user, oracle (in either the dba or the oracle group), in /home/oracle on each machine.
    Start the Oracle Universal Installer from the CD or the mount point on the local hard disk to which you’ve copied the installation files; that is, enter runInstaller. The Oracle Universal Installer menu displays.
    From the menu, choose Cluster Manager as the first product to install, and install it with only its own node name as public and private nodes for now. Cluster Manager is just a few megabytes, so installation should take only a minute or two.
    When the installation is complete, exit from the Oracle Universal Installer and restart it (using the runInstaller script). Choose the database installation option, and do a full software-only installation (don’t create a database).
    Step 4. Configure FireWire (IEEE 1394)
    If you haven’t done so already, download the patched Linux kernel file (fw-test-kernel-2.4.20-image.tar.gz) from OTN’s Community Code area.
    Assuming that fw-test-kernel-2.4.19-image.tar.gz is available at the root mount point on each node, now do the following:
    Log on to each machine as the root user and execute these commands to uncompress and unpack the files that comprise the modules:
    cd /
    tar zxvf /fw-test-kernel-2.4.19-image.tar.gz
    modify /etc/grub.conf
    If you’re using the lilo bootloader utility instead of grub, replace grub.conf in the last statement above with /etc/lilo.conf.
    To the bottom of /etc/grub.conf or /etc/lilo.conf, add the name of the new kernel:
    title FireWire Kernel (2.4.19)
    root (hd0,0)
    kernel /vmlinuz-2.4.19 ro root=/dev/hda3
    Now reboot the system by using this kernel on both nodes. To simplify the startup process so that you don’t have to modify the boot-up commands each time, you should also add the following statements to /etc/modules.conf on each node:
    options sbp2 sbp2_exclusive_login=0
    post-install sbp2 insmod sd_mod
    post-remove sbp2 rmmod sd_mod
    During every system boot, load the FireWire drivers on each node; for example:
    modprobe ohci1394
    modprobe sbp2
    If you use dmesg (display messages from the kernel ring buffer), you should see a log message similar to the following:
    Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
    SCSI device sda: 35239680 512-byte hdwr sectors (18043 MB)
    sda: sda1 sda2 sda3
    This particular message indicates that the Linux kernel has recognized an 18GB disk with three partitions.
    The first time you use the FireWire drive, run fdisk from one of the nodes and partition the disk as you like. (If both nodes have the modules loaded while you’re running fdisk on one node, you should reboot the other system or unload and reload all the FireWire and SCSI modules to make sure the new partition table is loaded.)
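    A minimal sketch of that step, assuming the drive appears as /dev/sda as in the dmesg output above:
    fdisk /dev/sda            # create the partition(s) interactively, then write the table with 'w'
    # on the other node, either reboot or unload and reload the modules so it sees the new partition table:
    rmmod sbp2 ; rmmod ohci1394
    modprobe ohci1394 ; modprobe sbp2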
    Step 5. Configure OCFS
    We strongly recommend that you use OCFS in conjunction with the patched kernel so that you don’t have to partition your disks manually. If you haven’t done so already, download the precompiled modules (fw-kernel-ocfs.tar.gz) from OTN’s Community Code area. (See the "Toolbox" sidebar for more information.)
    Untar the file on each node, and use ocfsformat on one node to format the file system on the shared disk, as in the following example:
    ocfsformat -f -l /dev/sda1 -c 128 -v ocfsvol
    -m /ocfs -n node1 -u 1011 -p 755 -g 1011
    where 1011 is the UID and GID of the Oracle account and 755 is the directory permission. The partition that we’ll use is /dev/sda1, and -c 128 means that we’ll use a 128KB cluster size; the cluster size can be 4, 8, 16, 32, 128, 256, 512, or 1,024KB.
    As the root user, create an /ocfs mountpoint directory on each node.
    To configure and load the kernel module on each node, create a configuration file /etc/ocfs.conf. For example:
    ipcdlm:
    ip_address = 192.168.1.50
    ip_port = 9999
    subnet_mask = 255.255.252.0
    type = udp
    hostname = node1 (on node2, put node2’s hostname here)
    active = yes
    Be sure that each node has the correct values with respect to IP addresses, subnet masks, and node names. Assuming that you’re using the example configuration, node 1 uses the IP address 192.168.1.50, while on node 2 you put 192.168.1.51.
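    For clarity, the matching /etc/ocfs.conf on node 2 would look like this under the example configuration (only the ip_address and hostname lines differ):
    ipcdlm:
    ip_address = 192.168.1.51
    ip_port = 9999
    subnet_mask = 255.255.252.0
    type = udp
    hostname = node2
    active = yes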
    Use the insmod command to load the OCFS driver on each node. The basic syntax is as follows:
    insmod ocfs.o name=<nodename>
    For example:
    insmod /root/ocfs.o name=node1
    Each time the system boots, the module must be loaded on each node that comprises the cluster.
    To mount the OCFS partition, enter the following on each node:
    mount -t ocfs /dev/sda1 /ocfs
    You now have a shared file system, owned by user oracle, mounted on each node. The shared file system will be used for all data, log, and control files. The modules have also been loaded, and the Oracle database software has been installed.
    You’re now ready for the final steps—configuring the Cluster Manager software and creating a database. To streamline this process, you can create a small script (env.sh) in the Oracle home to set up the environment, as follows:
    export ORACLE_HOME=/home/Oracle/9i
    export ORACLE_SID=node1
    export LD_LIBRARY_PATH=/home/Oracle/9i/lib
    export PATH=$ORACLE_HOME/bin:$PATH
    You can do the same for the second node—just change the second line above to export ORACLE_SID=node2.
    Execute (source) this file (env.sh) when you log in or from .login scripts as root or oracle.
    Step 6. Configure Cluster Manager
    Cluster Manager maintains the status of the nodes and the Oracle instances across the cluster and runs on each node of the cluster.
    As user root or oracle, go to $ORACLE_HOME/oracm/admin on each node and create or change the cmcfg.ora and the ocmargs.ora files according to Listing 1.
    Be sure that the HostName in the cmcfg.ora file is correct for the machine — that is, node 1 has a file that contains node1, and node 2 has a file that contains node2.
    Before starting the database, make sure the Cluster Manager software is running. For convenience’s sake, add Cluster Manager to the rc script. As user root on each node, set up the Oracle environment variables (source env.sh):
    cd $ORACLE_HOME/oracm/bin
    ./ocmstart.sh
    The file ocmstart.sh is an Oracle-provided sample startup script that starts both the Watchdog daemon and Cluster Manager.
    Step 7. Configure Oracle init.ora, and Create a Database
    Listing 2 contains an example init.ora in $ORACLE_HOME/dbs. You can use it on each node to create initnode1.ora and initnode2.ora, respectively, by making the appropriate adjustments—that is, change node1 to node2 throughout the listing.
    You must now create the directories for the log files on node 1, as follows:
    cd $ORACLE_HOME
    mkdir admin ; cd admin ; mkdir node1 ; cd node1 ;
    mkdir udump ; mkdir bdump ; mkdir cdump
    Again, do the same for node 2, replacing node1 in the syntax example with node2.
    Make a link for the Oracle password file on each node (these files may not yet exist):
    cd $ORACLE_HOME/dbs
    ln -sf /ocfs/orapw orapw
    Now that you have the setup, the next step is to create a database. To simplify this process, use the shell script (create.sh) in Listing 3. Be sure to run the script from node 1 only, and be sure to run it only once. Run this script as user oracle, and if all has gone well, you will have created the database, added a second undo tablespace, and added and enabled a second log thread.
    You can start the database from either node in the cluster, as follows:
    sqlplus '/ as sysdba'
    startup
    Finally, you can configure the Oracle listener, $ORACLE_HOME/network/admin/listener.ora, as you normally would on both nodes and start that as well.
    You should now be all set up!
    Wim Coekaerts ( [email protected]) is principal member of technical staff, Corporate Architecture, Development. His team works on continuing enhancements to the Linux kernel and publishes source code under the GPL in OTN’s Community Code section. For more information about Oracle and Linux, visit the OTN Linux Center or the Linux Forum.
    Toolbox
    Don’t tackle this as your first "getting to know Linux and Oracle project." This article is brief and doesn’t provide detailed, blow-by-blow instructions for beginners. You should be comfortable with the UNIX operating system and with Oracle database installation in a UNIX environment. You’ll need all the software and hardware items in this list:
    Oracle9i Database Release 2 (9.2.0.1.0) for Linux (Intel). Download the Enterprise Edition, which is required for Oracle RAC.
    Linux distribution. We recommend Red Hat Linux Advanced Server 2.1, but you can download Red Hat 8.0 free from Red Hat. (However, please note that Red Hat doesn’t support the downloaded version.)
    Linux kernel patch for FireWire driver support, available under the Firewire Patches section. (Note that we’re updating these constantly, so the precise name may have changed.)
    OCFS for Linux. OCFS is not strictly required, but we recommend that you use it because it simplifies installation and configuration of the storage for the cluster. The file you need is fw-kernel-ocfs.tar.gz.
    Two Intel-based PCs
    Two NICs in each machine (although we’re only concerned in these instructions with configuring the private LAN that provides the heartbeat communication between the nodes in the cluster)
    Two FireWire interconnect cards
    One large FireWire drive for shared storage
    To supplement this QuickStart, you should also take a look at the supporting documentation, especially these materials:
    Release Notes for Oracle9i for Linux (Intel)
    Oracle9i Real Application Clusters Setup and Configuration
    Oracle Cluster Management Software for Linux (Appendix F in the Oracle9i Administrator’s Reference Release 2 (9.2.0.1.0) for UNIX Systems)
    Table 1: Hardware inventory and worksheet for FireWire-based cluster
    Per-node minimum requirements (record your configuration details for Node 1 and Node 2):
    Minimum CPU: 500 MHz (Celeron, AMD, Pentium)
    Minimum RAM: 256 MB
    Local hard drive free space: 3 GB
    FireWire card: 1 (TI chipset)
    Network interface cards: 2 (1 for node interconnect; 1 for public network)
    Per-cluster minimum requirements (record your configuration details):
    FireWire hard drive: 1 (300 GB)
    4-port FireWire hub: required for a 3-node cluster
    5-port FireWire hub: required for a 4-node cluster
    http://otn.oracle.com/oramag/webcolumns/2003/techarticles/coekaertsfirewiresetup.html
    Joel Pérez
    http://otn.oracle.com/experts

  • Oracle kernel sources

    Hi everybody,
    I've installed an Oracle 10.1.0.3 RAC on 2 RHEL 3 Update 2 machines (following the guidelines on OTN).
    I've replaced the original kernel with 2.4.21-27.0.2ELorafw1smp provided by Oracle. I had to compile an sk98lin module for the second network adapters that I added to those machines.
    I also had some problems with the ASM modules install; now it's up and running.
    However, even though finding module dependencies at boot time yields OK, there are some unresolved symbols for sk98lin when running a depmod command.
    Randomly, one server freezes, and when rebooting it, the other one also freezes. I have to shut down both, start them, and manually run commands from the /etc/init.d/init.crs script in order to have the RAC daemons back online.
    The provider of the NICs also delivers a CD which will compile an sk98lin module, but it requires the kernel sources for that. I downloaded the file kernel-2.4.21-27.0.2.ELorafw1.src.rpm and tried to install it; however, no kernel source is installed and no folder for the sources is created in /usr/src. Has anyone installed those kernel sources?
    If not, any other idea about where I can get a kernel for FireWire concurrent connections which also has sources?
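    For reference, a minimal sketch of how a kernel .src.rpm is normally unpacked on RHEL 3 (paths are the RHEL defaults and the spec file name is an assumption; run as root):
    rpm -ivh kernel-2.4.21-27.0.2.ELorafw1.src.rpm       # drops sources and spec under /usr/src/redhat
    cd /usr/src/redhat/SPECS
    rpmbuild -bp --target=$(arch) kernel-2.4.spec        # unpacked source tree appears under /usr/src/redhat/BUILD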
    Thanks,
    Sandu

    Hello,
    Just wondering if you've found a solution to your problem. I have a 2-node RAC with Dell PowerEdge 2850s and a Maxtor 250 GB shared FireWire drive with VIA Tech 1394 PCI cards, on exactly the same configuration: 10g 10.1.0.3 and 2.4.21-27.0.2ELorafw1smp. The entire platform is erratic. It hangs on boot at mounting OCFS. If everything mounts OK, it will work for a while, between 5 minutes and an hour, and then any one of the nodes will freeze. I have installed CRS but haven't got the 10g RDBMS installed yet because I can't get the system stable long enough to finish the installation.
    Any assistance would be appreciated.
    Thanks,
    Moe.

  • Failed attempt to move log and database paths

    Hi. Can anyone offer any advice on what might have caused an attempt to move Exchange 2010 (SP3) mailbox database and log folder paths to fail? I can't diagnose it, and would appreciate any advice.
    We have two databases in a two-node DAG, one mounted on each node. I removed the passive copy of each database before the move attempts. The first moved ok. The second failed. Here's a summary of what happened:
    Task 1:
    DAG-NODE-A ----------------------------DAG-NODE-B
    MAILBOX-DB-01                               MAILBOX-DB-01
    MOUNTED                                        HEALTHY (passive)
    - Dismount database
    - Remove passive copy.
    - Move logfolder path from R: to L:
      and move EDB path from S: to M:
    - Succeeded.
    - Mount database.
    - Re-add passive copy
    All ok.
    Task 2:
    DAG-NODE-A ----------------------------DAG-NODE-B
    MAILBOX-DB-02                               MAILBOX-DB-02
    HEALTHY (passive)                           MOUNTED
    - Dismount database
    - Remove passive copy.                                                
    - Move logfolder path from R: to L:
    and move EDB path from S: to M:
    - Failed. Paths not moved.
    - Database now won't mount.
    - Database eventually mounts
    after approx. 30 auto retries.
    - Abandon attempt to move paths.
    - Re-add passive copy.                                            
    END
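    For reference, the cmdlets behind those steps, as a minimal Exchange Management Shell sketch (the paths follow the Move-DatabasePath parameters recorded in the Management log quoted further below; the copy and server names follow the task summaries above):
    Dismount-Database -Identity Mailbox-DB-02
    Remove-MailboxDatabaseCopy -Identity Mailbox-DB-02\DAG-NODE-A -Confirm:$false
    Move-DatabasePath -Identity Mailbox-DB-02 -EdbFilePath M:\Mailbox-DB-02\Mailbox-DB-02.edb -LogFolderPath L:\Mailbox-DB-02\Logs
    Mount-Database -Identity Mailbox-DB-02
    Add-MailboxDatabaseCopy -Identity Mailbox-DB-02 -MailboxServer DAG-NODE-A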
    Here's some more detail, including event log and shell output:
    So, as per the above, with Mailbox-DB-01 (active copy on DAG-NODE-A), Exchange moved the paths without a hitch and I was then able to mount the database and re-add the passive copy. I then tried to move the paths for Mailbox-DB-02 (active copy on DAG-NODE-B),
    but after a few minutes Exchange aborted the move, outputting errors to the Shell, the application log and the MSExchange Management log. A second attempt failed because Exchange found that "The .edb file path is not available. There is already a file
    named M:\Mailbox-DB-02\Mailbox-DB-02.edb" - Exchange had created an EDB file in the target location, but the source EDB file was still in the source location and the path was unchanged. Rather than re-trying the move again, I decided to try to mount the
    database, to check it was ok, but it wouldn’t mount.
    I left it alone for an hour or so to have a think about what to do, came back to it and found Exchange had mounted the database (without moving the paths) after around 30 automatic retries. I was then able to re-add the passive copy.
    I’m not keen to re-try the move without knowing why it failed, why the database then wouldn’t mount, and why it subsequently recovered. Any advice would be appreciated. Many thanks.
    The output included
    - a WinRM error in the Shell;
    - an App Log error event from my attempt to mount the database after the move failure, saying an attempt to open the EDB file for read / write access failed with system error 32 because the file was being used by another process;
    - events recording failure to configure and start the database, and noting a serious error which caused it to terminate its functional activity;
    - Event ID 3154 from MSExchangeRepl, saying an Active Manager operation failed / database action failed / MapiExceptionCallFailed;
    - Event ID 10056 from source MSExchangeIS Mailbox Store in the App Log just before Exchange successfully mounted the troublesome database, saying “Patch all ID counters for database Mailbox-DB-02.”
    One other thing which may or may not be relevant – the MSExchange Service Host service stops soon after being started on DAG-NODE-B (reboots have made no difference). On DAG-NODE-A, it runs consistently.
    Here’s what seems to me to be the most relevant output from the Application Log and the MSExchange Management Log on DAG-NODE-B, and from the Exchange Shell:
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 10:46:19 ESE 327 General "Information Store (3720) Mailbox-DB-02: The database engine detached a database (4, S:\Mailbox-DB-02\Database\Mailbox-DB-02.edb). (Time=6 seconds)
    Internal Timing Sequence: [1] 0.000, [2] 0.000, [3] 0.000, [4] 0.000, [5] 0.000, [6] 6.188, [7] 0.156, [8] 0.016, [9] 0.015, [10] 0.016, [11] 0.031.
    Revived Cache: 0"
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 10:46:19 ESE 103 General "Information Store (3720) Mailbox-DB-02: The database engine stopped the instance (4).
    Dirty Shutdown: 0
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 10:46:19 MSExchangeIS Mailbox Store 9539 General The Microsoft Exchange Information Store database "Mailbox-DB-02" was stopped.
    ==================================================================================
    DAG-NODE-B, MSExchange Management log
    Information DD/MM/14 10:46:19 MSExchange CmdletLogs 1 General Cmdlet succeeded. Cmdlet Dismount-Database, parameters {Identity=Mailbox-DB-02}.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 10:46:19 MSExchangeRepl 3161 Service Active Manager dismounted database Mailbox-DB-02 on server DAG-NODE-B.company.corp.
    ==================================================================================
    DAG-NODE-B, MSExchange Management log
    Error DD/MM/14 10:46:20 MSExchange CmdletLogs 6 General Cmdlet failed. Cmdlet Get-PublicFolderDatabase, parameters {Status=True, Identity=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 10:46:21 MSExchange Assistants 9002 Assistants Service MSExchangeMailboxAssistants. Stopped processing database Mailbox-DB-02 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 10:46:21 MSExchange Assistants 9002 Assistants Service MSExchangeMailSubmission. Stopped processing database Mailbox-DB-02 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 10:51:08 MSExchange Search Indexer 104 General Exchange Search Indexer failed to enable the Mailbox Database Mailbox-DB-02 (GUID = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) after 10 tries. The last failure was: MapiExceptionMdbOffline: Unable to
    get CI watermark (hr=0x80004005, ec=1142)
    ==================================================================================
    DAG-NODE-B, MSExchange Management log
    Warning DD/MM/14 11:06:44 MSExchange CmdletLogs 4 General Cmdlet stopped. Cmdlet Move-DatabasePath, parameters {Identity=Mailbox-DB-02, EdbFilePath=M:\Mailbox-DB-02\Mailbox-DB-02.edb, LogFolderPath=L:\Mailbox-DB-02\Logs}.
    ==================================================================================
    DAG-NODE-B, Exchange Management Shell output
    Processing data from remote server failed with the following error message: The WinRM client cannot complete the operation within the time specified. Check if the machine name is valid and is reachable over the network and firewall exception for Windows Remote
    Management service is enabled. For more information, see the about_Remote_Troubleshooting Help topic.
    + CategoryInfo : OperationStopped: (System.Manageme...pressionSyncJob:PSInvokeExpressionSyncJob) [], PSRemotingTransportException
    + FullyQualifiedErrorId : JobFailure
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 11:23:42 MSExchangeIS Mailbox Store 1000 General Attempting to start the Information Store "Mailbox-DB-02".
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 11:23:43 ESE 102 General "Information Store (3720) Mailbox-DB-02: The database engine (14.03.0162.0000) is starting a new instance (4).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 11:23:43 ESE 105 General "Information Store (3720) Mailbox-DB-02: The database engine started a new instance (4). (Time=0 seconds)
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:53 ESE 490 General "Information Store (3720) Mailbox-DB-02: An attempt to open the file ""S:\Mailbox-DB-02\Database\Mailbox-DB-02.edb"" for read / write access failed with system error 32 (0x00000020): ""The
    process cannot access the file because it is being used by another process. "". The open file operation will fail with error -1032 (0xfffffbf8).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:53 MSExchangeIS 9519 General "The following error occurred while starting database Mailbox-DB-02: 0xfffffbf8.
    Failed to configure MDB. "
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:53 MSExchangeIS 9519 General "The following error occurred while starting database Mailbox-DB-02: 0xfffffbf8.
    Start DB failed.. "
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:53 ExchangeStoreDB 215 Database recovery At 'DD/MM/2014 11:23:53' the Microsoft Exchange Information Store Database 'Mailbox-DB-02' copy on this server experienced a serious error which caused it to terminate its functional activity.
    The error returned by the remount attempt was "There is only one copy of this mailbox database (Mailbox-DB-02). Automatic recovery is not available.". Consult the event log on the server for other storage and "ExchangeStoreDb" events for
    more specific information about the failures.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:53 ExchangeStoreDB 231 Database recovery At 'DD/MM/2014 11:23:53', the copy of database 'Mailbox-DB-02' on this server encountered an error during the mount operation. For more information, consult the Event log on the server for "ExchangeStoreDb"
    or "MSExchangeRepl" events. The mount operation will be tried again automatically.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 11:23:54 ESE 103 General "Information Store (3720) Mailbox-DB-02: The database engine stopped the instance (4).
    Dirty Shutdown: 0
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:54 ExchangeStoreDB 231 Database recovery At 'DD/MM/2014 11:23:54', the copy of database 'Mailbox-DB-02' on this server encountered an error during the mount operation. For more information, consult the Event log on the server for "ExchangeStoreDb"
    or "MSExchangeRepl" events. The mount operation will be tried again automatically.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Error DD/MM/2014 11:23:54 MSExchangeRepl 3154 Service Active Manager failed to mount database Mailbox-DB-02 on server DAG-NODE-B.company.corp. Error: An Active Manager operation failed. Error The database action failed. Error: Operation failed with message:
    MapiExceptionCallFailed: Unable to mount database. (hr=0x80004005, ec=-1032)
    ==================================================================================
    NOTE: events similar to some of the above keep repeating until Exchange eventually manages to mount the database:
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:56 MSExchangeIS Mailbox Store 1000 General Attempting to start the Information Store "Mailbox-DB-02".
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:56 ESE 102 General "Information Store (3720) Mailbox-DB-02: The database engine (14.03.0162.0000) is starting a new instance (4).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:56 ESE 105 General "Information Store (3720) Mailbox-DB-02: The database engine started a new instance (4). (Time=0 seconds)
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:57 ESE 326 General "Information Store (3720) Mailbox-DB-02: The database engine attached a database (5, S:\Mailbox-DB-02\Database\Mailbox-DB-02.edb). (Time=0 seconds)
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:57 MSExchangeIS Mailbox Store 10056 General Patch all ID counters for database Mailbox-DB-02.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:57 MSExchangeIS Mailbox Store 1133 General Allocating message database resources for database "Mailbox-DB-02".
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:58 MSExchangeIS Mailbox Store 9523 General "The Microsoft Exchange Database ""Mailbox-DB-02"" has been started.
    Database File: S:\Mailbox-DB-02\Database\Mailbox-DB-02.edb
    Transaction Logfiles: R:\Mailbox-DB-02\Logs\
    Base Name (logfile prefix): E03
    System Path: R:\Mailbox-DB-02\Logs\
    (Start Duration=00:00:01.844) "
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:15:58 MSExchangeRepl 3156 Service Active Manager successfully mounted database Mailbox-DB-02 on server DAG-NODE-B.company.corp.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:16:06 MSExchange Assistants 9001 Assistants Service MSExchangeMailSubmission. Started to process mailbox database Mailbox-DB-02 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:16:09 MSExchange Assistants 9001 Assistants Service MSExchangeMailboxAssistants. Started to process mailbox database Mailbox-DB-02 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx).
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:16:12 MSExchange Assistants 9017 Assistants Service MSExchangeMailboxAssistants. Managed Folder Mailbox Assistant for database Mailbox-DB-02 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) is entering a work cycle. There are 125 mailboxes on
    this database.
    ==================================================================================
    DAG-NODE-B, Application Log:
    Information DD/MM/2014 12:16:30 MSExchange Search Indexer 108 General Exchange Search Indexer has enabled indexing for the Mailbox Database Mailbox-DB-02 (GUID = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx).
    ==================================================================================
      

    If you used the steps outlined in this TechNet article - http://technet.microsoft.com/en-us/library/dd979782.aspx - then you should have been OK. I will say that I read one
    article that recommended working on one database at a time, making sure it is operational before going back to do the next, and so on.
    (I also assume from your shorthand at the beginning of this document that you performed the tasks on the active node when you did this work. If so, that's the best way to do it, so if it failed, it wasn't because you used the wrong steps or did them
    on the wrong system.)
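
    For reference, here is a minimal, hedged sketch of the "one database at a time" approach in the Exchange Management Shell. The database name and the drive paths below are placeholders rather than values taken from the logs above, and the sketch assumes the database has only a single copy (as the events above indicate), since as far as I know Move-DatabasePath cannot be used while a database still has replicated copies:

        # Check the copy is healthy before touching it
        Get-MailboxDatabaseCopyStatus -Identity Mailbox-DB-01

        # Dismount, move the EDB and log files, then remount -- one database at a time
        Dismount-Database -Identity Mailbox-DB-01 -Confirm:$false
        Move-DatabasePath -Identity Mailbox-DB-01 `
            -EdbFilePath M:\Mailbox-DB-01\Mailbox-DB-01.edb `
            -LogFolderPath L:\Mailbox-DB-01\Logs
        Mount-Database -Identity Mailbox-DB-01

        # Confirm it is mounted and healthy before starting on the next database
        Get-MailboxDatabaseCopyStatus -Identity Mailbox-DB-01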

  • Setup Cluster using Solaris Container data service

    We have a two-node cluster on which we would like to either create a zone cluster or use the Solaris Container data service to create a scalable (or multiple-master) data service of two zones, one on each node. We have an app running in the zone, CiscoWorks, that has a local database of jobs that are scheduled to run to configure Cisco switches. I was curious how we set up the storage. If each zone is running on local disks, how do the zones stay in sync and the database stay updated with job information? Would I set up a device group of the disks where the zones will reside on each node? Can I use SAN as the local disk so the zones can be replicated to a Disaster Recovery location?
    Thanks for any help,
    Chuck

    Chuck,
    Sadly, I think I'm going to make your implementation decisions a lot more complicated because there are three ways you can use zones within Solaris Cluster.
    1. Create a failover zone using the HA Solaris Container Data Service. Here the zone root moves between the cluster nodes as the zone fails over.
    2. Create static zones between which resource groups can migrate. Each zone root is local to the physical node. However, the configuration of the zones can be subtly different.
    3. Create a virtual cluster (zone cluster) using static zones within which resource groups can migrate. Each zone root is local to the physical node. However, the configuration of the zones is forced to be the same.
    Note also that a ZFS zpool can only be mounted on one node or zone at any one time, although it can be mounted read/write in one zone and read-only in other zones on the same node (IIRC).
    I would be inclined to put your database into an HA configuration, i.e. one that runs on one node at any one time. I would then constrain it in a zone cluster that is bound to a project with restricted resources, i.e. CPUs and memory. Any other tiers of the application could then be placed either in the global zone (main cluster) or in another zone cluster and equally constrained.
    I don't know if that's any help. I can recommend a good book on the subject <shameless plug "Oracle Solaris Cluster Essentials"/>. The example chapters should be of help.
    Regards,
    Tim
    ---
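
    As a rough illustration of option 3 above, here is a hedged sketch of configuring a small two-node zone cluster with clzonecluster; every name, path and address in it is an invented placeholder, and the exact properties vary with the Solaris Cluster release, so treat it as a starting point rather than a recipe:

        # Contents of /tmp/cw-zc.cfg (zonecfg-style command file, hypothetical values)
        create
        set zonepath=/zones/cw-zc
        add node
        set physical-host=clnode1
        set hostname=cw-zc-node1
        add net
        set address=192.168.10.51
        set physical=e1000g0
        end
        end
        add node
        set physical-host=clnode2
        set hostname=cw-zc-node2
        add net
        set address=192.168.10.52
        set physical=e1000g0
        end
        end
        commit

        # From one global-zone node: create, install and boot the zone cluster
        clzonecluster configure -f /tmp/cw-zc.cfg cw-zc
        clzonecluster install cw-zc
        clzonecluster boot cw-zc

    The HA database (or CiscoWorks itself) would then live in a fail-over resource group created inside that zone cluster, so it runs in only one of the two zones at any one time, with its storage in a device group or failover zpool that moves with it.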

  • OCR and voting disks on ASM, problems in case of fail-over instances

    Hi everybody
    in case at your site you:
    - have an 11.2 fail-over cluster using Grid Infrastructure (CRS, OCR, voting disks),
    where you have yourself created additional CRS resources to handle single-node db instances,
    their listeners, their disks and so on (which are started only on one node at a time,
    can fail over from that node and restart on another);
    - have put OCR and voting disks into an ASM diskgroup (as strongly suggested by Oracle);
    then you might have problems (as we had) because you might:
    - reach the maximum number of diskgroups handled by an ASM instance (only 63, above which you get ORA-15068);
    - experience delays (especially in case of multipath), find fake CRS resources, etc.
    whenever you dismount disks from one node and mount them on another.
    So (if both conditions are true) you might be interested in this story;
    please keep reading for the boring details.
    One step backward (I'll try to keep it simple).
    Oracle Grid Infrastructure is mainly used by RAC db instances,
    which means that any db you create usually has one instance started on each node,
    and all instances access read / write the same disks from each node.
    So, ASM instance on each node will mount diskgroups in Shared Mode,
    because the same diskgroups are mounted also by other ASM instances on the other nodes.
    ASM instances have a spfile parameter CLUSTER_DATABASE=true (and this parameter implies
    that every diskgroup is mounted in Shared Mode, among other things).
    In this context, it is quite obvious that Oracle strongly recommends putting OCR and voting disks
    inside ASM: this (usually called CRS_DATA) will become diskgroup number 1
    and ASM instances will mount it before CRS starts.
    Then, additional diskgroups will be added by users, for DATA, REDO, FRA etc. of each RAC db,
    and will be mounted later when a RAC db instance starts on the specific node.
    In case of fail-over cluster, where instances are not RAC type and there is
    only one instance running (on one of the nodes) at any time for each db, it is different.
    The diskgroups of db instances don't need to be mounted in Shared Mode,
    because they are used by only one instance at a time
    (on the contrary, they should be mounted in Exclusive Mode).
    Yet, if you follow Oracle's advice and put OCR and voting inside ASM, then:
    - at installation OUI will start ASM instance on each node with CLUSTER_DATABASE=true;
    - the first diskgroup, which contains OCR and votings, will be mounted Shared Mode;
    - all other diskgroups, used by each db instance, will be mounted Shared Mode, too,
    even if you take care that they are mounted by only one ASM instance at a time.
    At our site, for our three-node cluster, this fact has two consequences.
    One consequence is that we hit the ORA-15068 limit (max 63 diskgroups) earlier than expected:
    - none of the instances on this cluster are Production (only Test, Dev, etc.);
    - we planned to have usually 10 instances on each node, each of them with 3 diskgroups (DATA, REDO, FRA),
    so 30 diskgroups per node, for a total of 90 diskgroups (30 instances) on the cluster;
    - in case one node failed, the surviving two should take over the resources of the failed node,
    in the worst case: one node with 60 diskgroups (20 instances), the other one with 30 diskgroups (10 instances);
    - in case two nodes failed, the only surviving node would not be able to mount additional diskgroups
    (because of the limit of max 63 diskgroups mounted by an ASM instance), so all the others would remain unmounted
    and their db instances stopped (they are not Production instances).
    But it didn't work: since ASM has the parameter CLUSTER_DATABASE=true, you cannot mount 90 diskgroups;
    you can mount 62 globally (once a diskgroup is mounted on one node, it is given a number between 2 and 63,
    and other diskgroups mounted on other nodes cannot reuse that number).
    So as a matter of fact we can mount only 21 diskgroups (about 7 instances) on each node.
    The second consequence is that, every time our handmade CRS scripts dismount diskgroups
    from one node and mount them on another, there are delays in the range of seconds (especially with multipath).
    Also, we found in the CRS log that, whenever we mounted diskgroups (on one node only),
    additional fake resources of type ora*.dg were created on the fly behind the scenes,
    maybe to accommodate the fact that on the other nodes those diskgroups were left unmounted
    (once again, instances are single-node here, not RAC type).
    That's all.
    Did anyone run into similar problems?
    We opened an SR with Oracle asking what options we have here, and we are disappointed by their answer.
    Regards
    Oscar
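
    A quick, hedged way to see how close a cluster is to the same wall (run on any node as the Grid Infrastructure owner; v$asm_diskgroup and the cluster_database parameter are generic, nothing here is site-specific):

        # Connect to the local ASM instance
        sqlplus -s "/ as sysasm"

        -- Per the behaviour described above: once a group_number is taken on any node
        -- it cannot be reused elsewhere while CLUSTER_DATABASE=TRUE, which is what
        -- makes the 63-diskgroup limit effectively cluster-wide
        SELECT group_number, name, state, type FROM v$asm_diskgroup ORDER BY group_number;
        SHOW PARAMETER cluster_database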

    Hi Klaas-Jan
    - best practices require that online redo log files are also in a separate diskgroup, in case of ASM logical corruption (we are a little bit paranoid): if the DATA dg gets corrupted, you can restore the full backup plus archived redo logs plus online redo logs (otherwise you will stop at the latest archived log).
    So we have 3 diskgroups for each db instance: DATA, REDO, FRA.
    - in case of a fail-over cluster (active-passive), Oracle provides some templates of CRS scripts (in $CRS_HOME/crs/crs/public) that you can edit and change at will; you can also create additional scripts for any additional resources you might need (Oracle agents, backup agents, file systems, monitoring tools, etc.)
    About our problem, the only solution is to move OCR and voting disks out of ASM and change the pfile of all ASM instances (parameter CLUSTER_DATABASE from true to false).
    Oracle's answers were a little bit odd:
    - first they told us to use Grid Standalone (without CRS, OCR, voting at all), but we told them that we needed a fail-over solution;
    - then they told us to use RAC One Node, which actually has some better features: in case of planned fail-over it might be able to migrate
    client sessions without causing a reconnect (for SELECTs only, not in case of a running transaction), but we already have a few fail-over clusters and we cannot change them all.
    So we plan to move OCR and voting disks into block devices (we think that the other solution, which needs a Shared File System, will take longer).
    Thanks Marko for pointing us to OCFS2 pros / cons.
    We asked Oracle for confirmation that it is supported; they said yes, but it is discouraged (and also doesn't work with OUI or ASMCA).
    Anyway, that's the simplest approach; this is a non-Prod cluster, we'll start here, and if everything is fine, after a while we'll do it also on the Prod ones.
    - Note 605828.1, paragraph 5, Configuring non-raw multipath devices for Oracle Clusterware 11g (11.1.0, 11.2.0) on RHEL5/OL5
    - Note 428681.1: OCR / Vote disk Maintenance Operations: (ADD/REMOVE/REPLACE/MOVE)
    -"Grid Infrastructure Install on Linux", paragraph 3.1.6, Table 3-2
    Oscar
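
    For anyone following the same route, the shape of the move described in Note 428681.1 is roughly the following; the device paths are placeholders, the diskgroup name is the CRS_DATA mentioned earlier in the thread, the commands are run as root with Clusterware up, and this is a hedged outline rather than a tested procedure:

        # Add an OCR location outside ASM, then drop the one stored in the diskgroup
        ocrconfig -add /dev/mapper/ocr_disk1
        ocrconfig -delete +CRS_DATA
        ocrcheck

        # On 11.2 the voting files are replaced in one step rather than added/removed one by one
        crsctl replace votedisk /dev/mapper/vote_disk1
        crsctl query css votedisk

    Changing CLUSTER_DATABASE from true to false on the ASM instances is a separate step (spfile/pfile change plus an ASM restart) and is not shown here.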
