ASM disk header corruption
Hi.
If the header of an ASM disk is corrupted and I drop the corrupted disk and re-add it, will the data on the disk be lost?
Thanks.
fsck requires the file system driver to support repairing the file system. In other words, fsck itself does not do the fixing (it does not understand the underlying file system format). Quoting the manpage, "fsck is simply a front-end for the various file system checkers". So if there is no fsck.asmcfs (or similar), fsck cannot do anything in terms of fixing invalid structures on that file system.
As for the kernel keeping copies of a file system's superblock - not sure why it would be doing that for ASM. It does not manage it as a file system or service I/O to it (the ASM devices are opened using the O_DIRECT flag).
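To make the front-end point concrete, here is a minimal sketch of the kind of lookup involved. The function name find_fsck_helper and the directory list are my own assumptions for illustration (they are not taken from the fsck source), and fsck.asmcfs is the hypothetical checker name used above.

```shell
#!/bin/sh
# Sketch only: report whether a file-system-specific checker exists for a type.
# find_fsck_helper is a hypothetical helper; the directory list is an assumption
# about where fsck.<type> programs are commonly installed.
find_fsck_helper() {
    fstype=$1
    for dir in /sbin /usr/sbin /etc/fs /etc; do
        if [ -x "$dir/fsck.$fstype" ]; then
            echo "$dir/fsck.$fstype"
            return 0
        fi
    done
    # Fall back to a PATH lookup; returns non-zero when nothing is found
    command -v "fsck.$fstype"
}

if helper=$(find_fsck_helper asmcfs); then
    echo "fsck would delegate to $helper"
else
    echo "no fsck.asmcfs helper found, so fsck has nothing to run"
fi
```

On a system without an ASM-specific checker the second message is what you would expect to see.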
Similar Messages
-
Can The ASM Disk Header and Disk Group Be Renamed?
I have a requirement to mount multiple SAN clones (EMC Symclone) of a production ASM instance on the same development server. RMAN is too slow to be an option, unless I find a white paper explaining any way to make RMAN perform a clone faster than the SAN.
The default response is normally, "Do your clone to dedicated hardware." However, that answer is not an option for obvious financial reasons, especially since I am working with a RAC environment with ~10 development and test environments.
The only way to do this (that I can think of) is to rename the disk header and probably the disk group prior to making it available to the ASM instance.
I have heard that Oracle has come up with an undocumented solution for one or more businesses. I believe it has something to do with the kfed library located in $ORACLE_HOME/rdbms/lib.
Has anyone out there managed to do this? If so, can you share your solution or point me in the right direction?
I know I am not the only one out there looking for a solution to this issue...
Thanks in advance.
This is absolutely possible. It can be done via kfed. I never found the answer searching the net and had to figure it out. Here it is, in the hope that it helps someone else. DISCLAIMER: if you screw up your disks, don't blame me.
The procedure is basically this:
- compile kfed
- dump the disk header with kfed
- Modify the dump file
- write the dump back to the disk header.
** Changing ANYTHING other than the diskgroup name will render your disks useless.
Here is a script to do the work for you:
for file in /dev/vx/rdsk/as1_pccdw/asmdata*
do
    echo "Processing DATA disk $file ..."
    search=ASCDW_DATA
    replace=AS1CDW_DATA
    # Length of the new name (${#var} avoids the trailing newline that wc -m counts)
    newlength=${#replace}
    shortname=$(basename "$file")
    # Dump the header, patch the name and its length on line 24 of the dump, save the result
    kfed op=read dev="$file" | sed -e "24,24s/ $search / $replace /" -e "24,24s/length=.*/length=$newlength/" > "/tmp/$shortname.kfed"
    # Write the patched dump back to the disk header
    kfed op=write dev="$file" text="/tmp/$shortname.kfed" CHKSUM=YES
done
for file in /dev/vx/rdsk/as1_pccdw/asmredo*
do
    echo "Processing REDO disk $file ..."
    search=ASCDW_REDO
    replace=AS1CDW_REDO
    newlength=${#replace}
    shortname=$(basename "$file")
    kfed op=read dev="$file" | sed -e "24,24s/ $search / $replace /" -e "24,24s/length=.*/length=$newlength/" > "/tmp/$shortname.kfed"
    kfed op=write dev="$file" text="/tmp/$shortname.kfed" CHKSUM=YES
done
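Before pointing anything at a real device, the sed step can be tried on its own against a saved text dump. The sketch below is an illustration under assumptions: patch_dg_name is a hypothetical helper name, the sample line only imitates what a kfed dump line for the group name looks like, and it matches on the field name kfdhdb.grpname rather than on line 24, which is a variation on the script above.

```shell
#!/bin/sh
# Sketch only: rewrite the disk group name and its length field in a kfed
# text dump read from stdin. patch_dg_name is a hypothetical helper name.
# Assumes the names contain only letters, digits and underscores, so they
# need no escaping inside the sed expressions.
patch_dg_name() {
    search=$1
    replace=$2
    sed -e "/kfdhdb\.grpname/s/ $search / $replace /" \
        -e "/kfdhdb\.grpname/s/length=.*/length=${#replace}/"
}

# Illustrative input line; the exact layout of a real dump is an assumption here.
sample='kfdhdb.grpname:              ASCDW_DATA ; 0x048: length=10'
printf '%s\n' "$sample" | patch_dg_name ASCDW_DATA AS1CDW_DATA
```

Diffing the patched dump against the original before any write makes it easy to confirm that only the group name line changed.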
Edited by: user4630111 on Nov 10, 2008 6:03 PM -
Hi,
A few days ago, I asked a question about "ASM disk header corruption" (see the thread of that name).
Because it was only an assumption on my part, I didn't think about it deeply. This morning I encountered a problem in a thread. It said that the diskgroup couldn't be mounted, and ORA-15196 appeared in the alert.log.
It occurred to me: if the diskgroup cannot be mounted and the header of an ASM disk is corrupted, how should I deal with it? Will the data on the ASM disk be lost?
Please help me with this problem.
Thanks in advance.
user526904 wrote:
I hope you have a backup. In case it is a media failure and you can determine the corrupt datafiles, just restore the corrupt datafiles from backup and recover them.
Database backups cannot restore a corrupted ASM header. RMAN backs up the Oracle data files, not the physical ASM disk itself with the disk's headers.
To restore that, you will need a physical disk backup. A physical disk backup will need all processes using the disks to terminate in order to ensure that all file handles are closed and that the backup is consistent. Not something that is easily done in today's 24x7 environments. RAID is usually used to address this type of failure (e.g. via hot swappable disks, where you simply replace the faulty disk with a new one, while the storage system is running).
So where you do not have that physical redundancy and have to deal with a "physical disk" error (like corrupted header blocks), you need to be extremely careful about how you try to recover from it. I would not even touch that disk. I would ensure that no processes touch that disk at all, create a duplicate disk (same size) and manually "mirror" the data (using dd, for example). This serves two purposes. It tests whether physical reads on the problem disk succeed (is this actual media failure, or logical failure?). And it creates a second disk that can be used for testing/playing purposes, prior to trying any fixes on the problem disk. -
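The dd "mirror" suggested above could look roughly like the sketch below. mirror_disk is a hypothetical helper name, and the demo runs against scratch files rather than devices; on a real system the two arguments would be the problem disk and the same-sized duplicate.

```shell
#!/bin/sh
# Sketch only: copy a source to a same-sized destination and verify the copy.
# mirror_disk is a hypothetical helper name.
mirror_disk() {
    src=$1
    dst=$2
    # Without conv=noerror, dd stops at the first unreadable block, which is
    # the signal wanted here: a failed read points to a media problem.
    if ! dd if="$src" of="$dst" bs=1048576 2>/dev/null; then
        echo "read or write error while copying $src" >&2
        return 1
    fi
    # Confirm the copy is byte-identical (assumes source and copy are the same size)
    cmp -s "$src" "$dst"
}

# Demo on scratch files only; nothing here touches a real disk.
scratch=$(mktemp -d)
printf 'pretend disk contents' > "$scratch/src.img"
mirror_disk "$scratch/src.img" "$scratch/copy.img" && echo "copy verified"
rm -rf "$scratch"
```

If the copy completes and compares clean, the reads themselves are fine and the damage is more likely logical than physical.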
Hi! I have Oracle RAC on CentOS with one active node and shared storage (HP MSA1500, 12x500 GB FC). I also have two servers on which I want to install a new RAC database.
[oracle@server ~]$ crsctl query css votedisk
0. 0 /u04/sync/oracrs/CSSFile
located 1 votedisk(s).
[oracle@server ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262120
Used space (kbytes) : 1672
Available space (kbytes) : 260448
ID : 1521772939
Device/File Name : /u04/sync/oracrs/CRSFile
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
[oracle@server ~]$
CRS and CSS are installed on OCFS.
Then, via oracleasm:
[oracle@server ~]$ /etc/init.d/oracleasm listdisks
VOL8
VOL9
[oracle@server ~]$
[root@server ~]# /etc/init.d/oracleasm querydisk /dev/sda1
Disk "/dev/sda1" is not marked an ASM disk
[root@server ~]# /etc/init.d/oracleasm querydisk /dev/sdb1
Disk "/dev/sdb1" is marked an ASM disk with the label ""
[root@server ~]# /etc/init.d/oracleasm querydisk /dev/sdc1
Disk "/dev/sdc1" is marked an ASM disk with the label ""
[root@server ~]# /etc/init.d/oracleasm querydisk /dev/sdd1
Disk "/dev/sdd1" is marked an ASM disk with the label "VOL8"
[root@server ~]# /etc/init.d/oracleasm querydisk /dev/sdd2
Disk "/dev/sdd2" is marked an ASM disk with the label "VOL9"
[root@server ~]# /etc/init.d/oracleasm querydisk /dev/sdd3
Disk "/dev/sdd3" is not marked an ASM disk
[root@server ~]#
[root@server ~]# fdisk -l
Disk /dev/cciss/c0d0: 72.8 GB, 72833679360 bytes
255 heads, 32 sectors/track, 17433 cylinders
Units = cylinders of 8160 * 512 = 4177920 bytes
Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 * 1 50 203984 83 Linux
/dev/cciss/c0d0p2 51 1305 5120400 82 Linux swap
/dev/cciss/c0d0p3 1306 17433 65802240 83 Linux
Disk /dev/sda: 1048 MB, 1048657920 bytes
33 heads, 61 sectors/track, 1017 cylinders
Units = cylinders of 2013 * 512 = 1030656 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 1017 1023580 83 Linux
Disk /dev/sdb: 500.0 GB, 500071791104 bytes
255 heads, 63 sectors/track, 60796 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 60796 488343838+ 83 Linux
Disk /dev/sdc: 499.0 GB, 499025092608 bytes
255 heads, 63 sectors/track, 60669 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 60669 487323711 83 Linux
Disk /dev/sdd: 500.0 GB, 500073750528 bytes
255 heads, 63 sectors/track, 60797 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdd1 1 1000 8032468+ 83 Linux
/dev/sdd2 1001 2000 8032500 83 Linux
/dev/sdd3 2001 3000 8032500 83 Linux
[root@server ~]#
Then in sqlplus I can see which disks ASM uses:
SQL> select name, total_mb, free_mb,path from v$asm_disk;
NAME  TOTAL_MB  FREE_MB  PATH
VOL1    476898   431798  /dev/raw/raw1
VOL2    475902   475379  /dev/raw/raw2
SQL> select name, total_mb, free_mb from v$asm_diskgroup;
NAME           TOTAL_MB  FREE_MB
DATA             476898   431798
RECOVERY_AREA    475902   475379
SQL> select name, type from V$asm_diskgroup;
NAME           TYPE
DATA           EXTERN
RECOVERY_AREA  EXTERN
I can also see which devices the DATA and RECOVERY_AREA disk groups use:
[root@server ~]# cat /etc/sysconfig/rawdevices
# This file and interface are deprecated.
# Applications needing raw device access should open regular
# block devices with O_DIRECT.
# raw device bindings
# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw1 /dev/sdb1
/dev/raw/raw2 /dev/sdc1
/dev/raw/raw8 /dev/sdd1
/dev/raw/raw9 /dev/sdd2
[root@server ~]#
[root@server ~]# mount
/dev/cciss/c0d0p3 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/cciss/c0d0p1 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
configfs on /config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sda1 on /u04/sync type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
oracleasmfs on /dev/oracleasm type oracleasmfs (rw)
[root@server ~]#
[root@server ~]# ls -l /dev/oracleasm/disks/
total 0
brw-rw---- 1 oracle dba 8, 49 Aug 8 19:34 VOL8
brw-rw---- 1 oracle dba 8, 50 Aug 8 19:34 VOL9
[root@server ~]#
How can I resize an ASM disk, for example from 450 GB to 100 GB, for my new RAC installation?
Sorry for my English :(
Hi mmusette, hi all,
please allow a little clarification. ASM allows resizing disks:
From: http://download.oracle.com/docs/cd/B19306_01/server.102/b14231/storeman.htm#sthref1727
Resizing Disks in Disk Groups
The RESIZE clause of ALTER DISKGROUP enables you to perform the following operations:
- Resize all disks in the disk group
- Resize specific disks
- Resize all of the disks in a specified failure group
If you do not specify a new size in the SIZE clause then ASM uses the size of the disk as returned by the operating system. This could be a means of recovering disk space when you had previously restricted the size of the disk by specifying a size smaller than disk capacity.
The new size is written to the ASM disk header record and if the size of the disk is increasing, then the new space is immediately available for allocation. If the size is decreasing, rebalancing must relocate file extents beyond the new size limit to available space below the limit. If the rebalance operation can successfully relocate all extents, then the new size is made permanent, otherwise the rebalance fails.
However, if you have setup the ASM disk on a physical disk partition, you probably will not be able to resize the partition without destroying the data on the disk. If you, however, used a volume manager to create volumes and you based your ASM disks on those volumes (through RAW devices or directly) AND your volume manager allows resizing the volumes, you should be able to make use of the command mentioned above.
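As a rough sanity check before attempting a shrink, the space in use on the disk has to fit below the new size. The sketch below is only that arithmetic: can_shrink is a hypothetical helper name, the figures are the VOL1 numbers from the v$asm_disk output earlier in this thread, and passing the check is a necessary condition, not a guarantee that the rebalance will succeed.

```shell
#!/bin/sh
# Sketch only: can_shrink TOTAL_MB FREE_MB NEW_SIZE_MB
# Succeeds when the space currently in use would fit below the new size.
# can_shrink is a hypothetical helper name.
can_shrink() {
    total_mb=$1
    free_mb=$2
    new_mb=$3
    used_mb=$((total_mb - free_mb))
    [ "$used_mb" -le "$new_mb" ]
}

# VOL1 from the query above: TOTAL_MB=476898, FREE_MB=431798; target 100 GB = 102400 MB
if can_shrink 476898 431798 102400; then
    echo "used space fits below the new size; a resize is worth attempting"
else
    echo "used space exceeds the new size; the rebalance would fail"
fi
```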
Thanks,
Markus -
Hi all,
My env:oracle 10g 10.2.0.4,RHEL4.0
I have a problem with my ASM disk groups (4 disks): one of my ASM disks is corrupted.
How do I restore and recover it? Please let me know if anyone knows.
What is the status of the database: mounted, unmounted or open?
Do you know which datafiles are on your corrupted ASM disk?
I think you can solve the problem by restoring all of the database, or the affected part of it. -
How to simulate the ASM disk corruption & recover it back ?
Hi DBA's,
We are creating a new grid, so we would like to simulate ASM disk corruption and recover from it. Is there a way to simulate that situation?
Hi,
Please do not try this on production.
Simulate ASM disk corruption:
dd if=/dev/zero of=/dev/sdb1 bs=1024 count=4
Solution:
kfed repair /dev/sdb1    # will fix only an ASM header issue
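To see what that dd command does without risking a device, the same overwrite can be rehearsed on a scratch image file. This is a sketch under assumptions: corrupt_header and header_nonzero_bytes are hypothetical helper names, and conv=notrunc is added only because a regular file would otherwise be truncated to 4 KB, which does not happen on a block device.

```shell
#!/bin/sh
# Sketch only: zero the first 4 KB of a scratch image, mirroring the dd
# command above. Both function names are hypothetical.
corrupt_header() {
    # conv=notrunc keeps the rest of a regular file in place
    dd if=/dev/zero of="$1" bs=1024 count=4 conv=notrunc 2>/dev/null
}

# Count the non-zero bytes in the first 4 KB of a file
header_nonzero_bytes() {
    dd if="$1" bs=1024 count=4 2>/dev/null | tr -d '\000' | wc -c | tr -d ' '
}

# Demo on an 8 KB scratch file filled with the letter A
img=$(mktemp)
dd if=/dev/zero bs=1024 count=8 2>/dev/null | tr '\000' 'A' > "$img"
echo "non-zero header bytes before: $(header_nonzero_bytes "$img")"
corrupt_header "$img"
echo "non-zero header bytes after:  $(header_nonzero_bytes "$img")"
rm -f "$img"
```

Only the first four 1 KB blocks are wiped; everything after them is left as it was, which is why the damage is limited to the header area.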
Thanks,
Rajasekhar -
Recover OCR and VOTE disk after complete corruption of ASM disk groups.
Hi Gurus,
I am simulating a recovery situation to recover the OCR and Vote files after complete corruption of the ASM disks and diskgroups. I have set up my environment as follows:
Environment: RAC
OS: OEL 5.5 32-bit
GI Version: 11.2.0.2.0
ASM Disk groups: +OCR, +DATA
OCR, Vote Files location: +OCR
ASM Redundancy: External
ASM Disks: /dev/asm-disk1, /dev/asm-disk2
/dev/asm-disk1 - mapped on +OCR
/dev/asm-disk2 - mapped on +DATA
With the above configuration in place I have manually corrupted the +OCR and +DATA diskgroups with the dd command. I used this command to completely corrupt the +OCR disk group:
dd if=/dev/zero of=/dev/asm-disk1
I have manual backups as well as automatic backups of OCR and Vote disk. I am not using ASMLib.
I followed this link:
http://docs.oracle.com/cd/E11882_01/rac.112/e17264/adminoc.htm#TDPRC237
When I tried to recover OCR file, I could not do so as there is no such diskgroup which ASM can restore the OCR, Voting disk to. I could not Re-create OCR and DATA diskgroups as I cannot connect to ASM instance. If you have a solution or workaround for my situation please describe it. That will be greatly appreciated.
Thanks and Regards,
Suresh.
Please go through the following document, which has the detailed steps to restore the OCR:
How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems [ID 1062983.1] -
I just tried to resize an ASM disk and although the feedback was 'successful', there doesn't appear to have been any change.
I was attempting to shrink disk DATA_0001 from 200G to 100G. Am I missing something obvious?
SQL> select group_number, name, path, os_mb, total_mb, free_mb from v$asm_disk;
GROUP_NUMBER NAME      PATH                 OS_MB TOTAL_MB FREE_MB
           0           /dev/iscsi/rman11    20489        0       0
           0           /dev/iscsi/rmanB11  102398        0       0
           0           /dev/iscsi/rman1     20490        0       0
           0           /dev/iscsi/vote3       300        0       0
           0           /dev/iscsi/vote1       300        0       0
           0           /dev/iscsi/rmanP11  204805        0       0
           0           /dev/iscsi/vote2       300        0       0
           0           /dev/iscsi/rmanP1   204810        0       0
           0           /dev/iscsi/rmanB1   102405        0       0
           1 DATA_0000 /dev/iscsi/db1       10245    10245   10109
           2 FRA_0000  /dev/iscsi/flshbk1   20490    20490   20465
           2 FRA_0001  /dev/iscsi/flshbkR1 409605   409605  409262
           1 DATA_0001 /dev/iscsi/dbR1     204810   204810  202297
13 rows selected.
SQL> alter diskgroup data resize disk 'data_0001' size 100g;
Diskgroup altered.
SQL> select group_number, name, path, os_mb, total_mb, free_mb from v$asm_disk;
GROUP_NUMBER NAME      PATH                 OS_MB TOTAL_MB FREE_MB
           0           /dev/iscsi/rman11    20489        0       0
           0           /dev/iscsi/rmanB11  102398        0       0
           0           /dev/iscsi/rman1     20490        0       0
           0           /dev/iscsi/vote3       300        0       0
           0           /dev/iscsi/vote1       300        0       0
           0           /dev/iscsi/rmanP11  204805        0       0
           0           /dev/iscsi/vote2       300        0       0
           0           /dev/iscsi/rmanP1   204810        0       0
           0           /dev/iscsi/rmanB1   102405        0       0
           1 DATA_0000 /dev/iscsi/db1       10245    10245   10004
           2 FRA_0000  /dev/iscsi/flshbk1   20490    20490   20465
           2 FRA_0001  /dev/iscsi/flshbkR1 409605   409605  409262
           1 DATA_0001 /dev/iscsi/dbR1     204810   204810  202402
13 rows selected.
The free_mb seems to have increased, but otherwise I can't see the effect of my change. Maybe I'm looking in the wrong place??
I tried restarting the ASM instance but it made no difference.
After resizing the disk in ASM I shrunk the disk volume in our storage array. ASM was of course down at the time.
When I attempted to restart ASM I saw this ...
SQL> startup
ASM instance started
Total System Global Area 283930624 bytes
Fixed Size 2158992 bytes
Variable Size 256605808 bytes
ASM Cache 25165824 bytes
ORA-15032: not all alterations performed
ORA-15036: disk '/dev/iscsi/dbR1' is truncated
None of my diskgroups are mounted ...
SQL> select group_number, name, state from v$asm_diskgroup;
GROUP_NUMBER NAME STATE
0 DATA DISMOUNTED
0 FRA DISMOUNTED
Here are the messages from the ASM instance alert log ...
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0x5f5e3343
NOTE: cache began mount (not first) of group DATA number=1 incarn=0x5f5e3343
NOTE: cache registered group FRA number=2 incarn=0x5f5e3344
NOTE: cache began mount (not first) of group FRA number=2 incarn=0x5f5e3344
WARNING::ASMLIB library not found. See trace file for details.
NOTE: Assigning number (1,0) to disk (/dev/iscsi/db1)
NOTE: cache dismounting group 1/0x5F5E3343 (DATA)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x5F5E3343 (DATA)
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x5f5e3343
kfdp_dismount(): 1
kfdp_dismountBg(): 1
NOTE: De-assigning number (1,0) from disk (/dev/iscsi/db1)
ERROR: diskgroup DATA was not mounted
NOTE: Assigning number (2,1) to disk (/dev/iscsi/flshbkR1)
NOTE: Assigning number (2,0) to disk (/dev/iscsi/flshbk1)
NOTE: cache dismounting group 2/0x5F5E3344 (FRA)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 2/0x5F5E3344 (FRA)
NOTE: cache ending mount (fail) of group FRA number=2
incarn=0x5f5e3344
kfdp_dismount(): 2
kfdp_dismountBg(): 2
NOTE: De-assigning number (2,0) from disk (/dev/iscsi/flshbk1)
NOTE: De-assigning number (2,1) from disk (/dev/iscsi/flshbkR1)
ERROR: diskgroup FRA was not mounted
ORA-15032: not all alterations performed
ORA-15036: disk '/dev/iscsi/dbR1' is truncated
ERROR: ALTER DISKGROUP ALL MOUNT
Any clues?
Thanks,
Steve
Thanks Markus. I changed the size of the volume back to the original and was able to restart the ASM instances on both nodes. I confirmed it saw the size as the original 200G.
I then shut down ASM on the second node and issued the alter on the first node. Here's what happened ...
SQL> select group_number, name, path, os_mb, total_mb, free_mb from v$asm_disk order by name;
GROUP_NUMBER NAME      PATH                 OS_MB TOTAL_MB FREE_MB
           1 DATA_0000 /dev/iscsi/db1       10245    10245   10004
           1 DATA_0001 /dev/iscsi/dbR1     204810   204810  202402
           2 FRA_0000  /dev/iscsi/flshbk1   20490    20490   20465
           2 FRA_0001  /dev/iscsi/flshbkR1 409605   409605  409262
           0           /dev/iscsi/rman1     20490        0       0
           0           /dev/iscsi/rmanP1   204810        0       0
           0           /dev/iscsi/rmanB1   102405        0       0
           0           /dev/iscsi/rmanP11  204805        0       0
           0           /dev/iscsi/rmanB11  102398        0       0
           0           /dev/iscsi/rman11    20489        0       0
10 rows selected.
SQL> alter diskgroup data resize disk 'data_0001' size 100g;
alter diskgroup data resize disk 'data_0001' size 100g
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" may result in a data loss
Here are the details from the alert log ...
SQL> alter diskgroup data resize disk 'data_0001' size 100g
NOTE: requesting all-instance membership refresh for group=1
WARNING: cache read a corrupted block gn=1 dsk=1 blk=257 from disk 1
NOTE: a corrupted block was dumped to /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_5295.trc
ERROR: cache failed to read gn=1 dsk=1 blk=257 from disk(s): 1
ORA-15196: invalid ASM block header [kfc.c:9133] [endian_kfbh] [2147483649] [257] [0 != 1]
System State dumped to trace file /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_5295.trc
NOTE: cache initiating offline of disk 1 group 1
WARNING: initiating offline of disk 1.3688884620 (DATA_0001) with mask 0x7e
NOTE: initiating PST update: grp = 1, dsk = 1, mode = 0x15
kfdp_updateDsk(): 14
Thu May 07 15:45:38 2009
kfdp_updateDskBg(): 14
ERROR: too many offline disks in PST (grp 1)
Thu May 07 15:45:38 2009
NOTE: halting all I/Os to diskgroup DATA
Thu May 07 15:45:38 2009
SQL> alter diskgroup DATA dismount force
NOTE: active pin found: 0x0x6ddf6060
NOTE: active pin found: 0x0x6ddf6168
ERROR: ORA-15130 signalled during resize of diskgroup DATA
Thu May 07 15:45:38 2009
NOTE: membership refresh pending for group 1/0xdc0f1999 (DATA)
kfdp_query(): 15
kfdp_queryBg(): 15
SUCCESS: refreshed membership for 1/0xdc0f1999 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 1
Errors in file /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5202.trc:
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5202.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15032: not all alterations performed
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" may result in a data loss
ERROR: alter diskgroup data resize disk 'data_0001' size 100g
NOTE: cache dismounting group 1/0xDC0F1999 (DATA)
NOTE: dbwr not being msg'd to dismount
Thu May 07 15:45:41 2009
Dirty detach reconfiguration started (old inc 6, new inc 6)
List of nodes:
0
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
10 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Thu May 07 15:45:41 2009
freeing rdom 1
Thu May 07 15:45:41 2009
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0xDC0F1999 (DATA)
kfdp_dismount(): 16
kfdp_dismountBg(): 16
NOTE: De-assigning number (1,0) from disk (/dev/iscsi/db1)
NOTE: De-assigning number (1,1) from disk (/dev/iscsi/dbR1)
SUCCESS: diskgroup DATA was dismounted
SUCCESS: alter diskgroup DATA dismount force
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
Thu May 07 15:46:06 2009
SQL> alter diskgroup data resize disk 'data_0001' size 100g
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ERROR: alter diskgroup data resize disk 'data_0001' size 100g
Looks like my earlier attempt has indeed screwed up something, so even though the instances start OK and mount the diskgroup, I think there's a fair chance something would go splat sooner rather than later.
As this is a test database, I think I'll cut my losses and rebuild the diskgroup then restore it from backup. I'm assuming the effort involved in correcting the corruption will be greater than rebuilding and restoring.
Do you agree with this?
Then I'll try again and hopefully get it right.
Thanks for your help!!!
Steve -
Hi,
In an interview I was asked a question about ASM: how do you recover a corrupted ASM disk group (all the disks)?
Please provide steps to recover.
Thanks
If you don't know the answer, then you may not be qualified to support this type of configuration. Always start with the documentation.
And if you are answering this level of interview question, saying "I don't know, but can learn" will be a lot better than getting in over your head. Because the follow-on questions will show whether or not you are trying to BS your way through the answer. And if they are any good at interviewing, you won't know that you failed until you don't hear back from them. -
ORA-15196: invalid ASM block header
asm alert shows:
WARNING: cache read a corrupted block group=1(CRSDG) dsk=0 blk=2 from disk 0(CRSVOL)
Errors in file /u01/app/11.2.0/grid/log/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_16585.trc:
ORA-15196: invalid ASM block header [kfc.c:25578] [endian_kfbh] [2147483648] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:25427] [endian_kfbh] [2147483648] [2] [0 != 1]
Here are my challenges to the answer providers; I am going to number my questions. You are welcome to answer any question listed below, just please quote my question number first. Thanks.
1) "cache read a corrupted block group" , what exactly is the "cache" here?
2) "group=1(CRSDG) dsk=0 blk=2 from disk 0(CRSVOL)" we know that ASM is stored in AUs, here only gives dsk=0, blk=2, but without knowing the AU number, how can I use kfed to read the proper block?
3) "kfc.c:25578" what is kfc.c? I can find explanation about kfbh.<>, kfdhdb.<>, but have no clue about what kfc.c is.
4) "[endian_kfbh] [2147483648] [2] [0 != 1]" I know endian_kfbh is about the platform's Endian, but what is 2147483648 referring to here?
5) "[endian_kfbh] [2147483648] [2] [0 != 1]" I am guessing [2] refers to block number 2, is this correct?
6) overall, is all it is trying to tell me "the platform should be little endian (1), but now it is marked as big endian (2) at some header place"?
7) of course, I would appreciate your solutions very much. Your solution is ...?
8) Is there any good article or reference about this stuff?
Listing every single question is my favorite style of seeking help; doing this helps me learn better. I am expecting your contributions.
Thanks in advance!
1) "cache read a corrupted block group", what exactly is the "cache" here?
I think I need not ask this question. Cache here simply means reading the block from disk to the cache.
However, questions 2 ~ 7 are still valid. -
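On questions 2, 4 and 5 above, the numbers in the error can at least be pulled apart with plain arithmetic. The sketch below rests on assumptions, not on documented behaviour: the split of the third argument into a high type nibble and a low number is a guess that happens to match the two logs on this page (dsk=0 with 2147483648, dsk=1 with 2147483649), and the 1 MB allocation unit and 4 KB metadata block sizes are assumed defaults. decode_obj and block_to_au are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: decode the numeric arguments of an ORA-15196 message.
# Both helper names and the bit layout are assumptions for illustration.

# Split the third argument into an assumed type nibble and a low number
decode_obj() {
    obj=$1
    printf 'hex=0x%X type=0x%X numb=%d\n' "$obj" "$((obj >> 28))" "$((obj & 0x0FFFFFFF))"
}

# block_to_au BLK [AU_BYTES] [BLOCK_BYTES]
# Map a block number to an allocation unit and a block offset inside it
block_to_au() {
    blk=$1
    au_bytes=${2:-1048576}
    blk_bytes=${3:-4096}
    per_au=$((au_bytes / blk_bytes))
    printf 'aun=%d blkn=%d\n' "$((blk / per_au))" "$((blk % per_au))"
}

decode_obj 2147483648    # value from the CRSDG alert log above
block_to_au 2            # blk=2 from the same message
block_to_au 257          # blk=257 from the DATA_0001 resize log earlier on this page
```

If those assumptions hold, the resulting allocation unit and block offset are the kind of values one would hand to kfed's aun and blkn read arguments to look at the block in question.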
Questions on asm disk discovery:
1) What is the relationship between asm_diskstring in the init.ora and DiscoveryString in the GPNP profile.xml?
2) Which one of the above two finally accounts for the disk discovery process?
3) We know that asmlib disks are self-describing at the disk header. This overcomes the disk name/path persistence issue, as we no longer need to rely on the path to discover the ASM disks: by setting asm_diskstring='ORCL:*', the ASM instance will identify the right disks automatically. However, I am not sure whether setting asm_diskstring='ORCL:*' is the most economical way to do the discovery, as I am not sure whether Oracle has to probe all the disks on the OS to determine the right ones. If Oracle has to screen all the disks in this way, then I think setting asm_diskstring='<path_to_asmlib_disk>' will be much faster, although this will be open to the persistence problem. Is my understanding correct?
Thanks.
From my understanding, all the disks you see in /dev/oracleasm/disks are the disks in your system that have been discovered by asmlib at the discovery stage.
Currently, due to bug 13465545, ASM instance will discover disks from both locations, ASM_DISKSTRING and gpnp profile, which can cause some mess in disk representation for asm. You can check the settings using asmcmd command: dsget, and set to be the same using dsset.
I think it's more secure to set ASM_DISKSTRING to only the disks used by the ASM instance.
ASMCMD> dsget
Regards
Ed -
Questions on asm disk discover:
1) What is the relationship between asm_diskstring in the init.ora and DiscoveryString in the GPNP profile.xml?
2) Which one finally accounts for the disk discovery process?
3) We know that asmlib disks are self-describing at the disk header. This overcomes the disk name/path persistence issue, as we do not rely on the path to discover the asmlib disks. asm_diskstring='ORCL:*' will identify the right disks. I am not sure whether setting 'ORCL:*' is the most economical way, as I am not sure whether Oracle has to scan all the disks on the OS and probe the disks that it has rights to in order to determine which disks belong to ASM. If Oracle has to screen all the disks in this way, then I think setting asm_diskstring='<path_to_asmlib_disk>' will be much faster. However, this will be open to the persistence problem. Is my understanding correct?
Thanks. -
Want to move datafiles, controlfiles, redolog on new ASM Disks (11gR2 RAC)
Hi Guys,
Setup: Two Node 11gR2 (11.2.0.1) RAC on RHEL 5.4
Existing disks are from Old SAN & New Disks are from New SAN.
Can I move all datafiles (+DATA), controlfiles (+CTRL) and redo logs (+REDO) to the new ASM disks by adding the new disks to the same diskgroup and dropping the older disks from it, taking advantage of the ASM rebalancing feature?
1) add required disks in the DATA Diskgroups,
ALTER DISKGROUP DATA ADD DISK
'/dev/oracleasm/disks/NEWDATA3' NAME NEWDATA_0003,
'/dev/oracleasm/disks/NEWDATA4' NAME NEWDATA_0004,
'/dev/oracleasm/disks/NEWDATA5' NAME NEWDATA_0005
REBALANCE POWER 11;
Check rebalance status from v$ASM_OPERATION.
2) When rebalance completes, drop the old disks.
ALTER DISKGROUP DATA DROP DISK
NEWDATA_0000,
NEWDATA_0001
REBALANCE POWER 11;
Check rebalance status from v$ASM_OPERATION.
3) Do it same for Redo log groups & Controlfile Diskgroups.
I hope I can do this activity even while the database is up. Is there any possibility of database block corruption? (Or is it necessary to perform the above steps when the database is down?)
Your quick responses would be appreciated.
It's an urgent requirement. Thanks.
Regards,
Manish
Hi Manish,
Yes, you can do that by adding the new disks to the existing diskgroup and dropping the old disks. The good thing is that this can be done online. However, you need to make sure the rebalance power fits your business window: a higher rebalance power makes the rebalance complete faster, but it also consumes more resources.
Cheers -
Difference between ASM Disk Group, ADVM Volume and ACFS File system
Q1. What is the difference between an ASM Disk Group and an ADVM Volume ?
To my mind, an ASM Disk Group is effectively a logical volume for Database files ( including FRA files ).
11gR2 seems to have introduced the concepts of ADVM volumes and ACFS File Systems.
An 11gR2 ASM Disk Group can contain :
ASM Disks
ADVM volumes
ACFS file systems
Q2. ADVM volumes appear to be dynamic volumes.
However, is this not effectively layering a logical volume ( the ADVM volume ) beneath an ASM Disk Group ( conceptually a logical volume as well ) ?
Worse still, if you have left ASM Disk Group redundancy to the hardware RAID / SAN level ( as Oracle recommends ), you could effectively have 3 layers of logical disk ? ( ASM on top of ADVM on top of RAID/SAN ) ?
Q3. If it is 2 layers of logical disk ( i.e. ASM on top of ADVM ), what makes this better than 2 layers using a 3rd party volume manager ( e.g. ASM on top of a 3rd party LVM ) - something Oracle discourages ?
Q4. ACFS file systems seem to be clustered file systems for non-database files including ORACLE_HOMEs, application exe's etc ( but NOT GRID_HOME, OS root, OCRs or voting disks ).
Can you create / modify ACFS file systems using ASM ?
The Oracle topology diagram for ASM in the 11gR2 ASM Admin guide shows ACFS as part of ASM. I am not sure from this if ACFS is part of ASM or ASM sits on top of ACFS ?
Q5. Connected to Q4. there seems to be a number of different ways, ACFS file systems can be created ? Which of the below are valid methods ?
through ASM ?
through native OS file system creation ?
through OEM ?
through acfsutil ?
my head is exploding
Any help and clarification greatly appreciated
Jim
Q1 - An ADVM volume is a special type of file created in the ASM disk group. Once created, it exposes a block device on the OS itself that can be used just like any other block device. http://docs.oracle.com/cd/E16655_01/server.121/e17612/asmfilesystem.htm#OSTMG30000
Q2 - The ASM disk group is a disk group, not really a logical volume. It combines attributes of both when used for database purposes, as the database and certain other applications know how to talk the "ASM" protocol. However, you won't find any general purpose applications that can do so. In addition, some customers prefer to deal directly with file systems and volume devices, which is what ADVM is made for. In your way of thinking, you could have 3 layers of logical disk, but each of them provides different attributes and characteristics. This is not a bad thing though, as each has a slightly different focus: OS file system/device, database specific, and storage centric.
Q3 - ADVM is specifically developed to extend the characteristics of ASM for use by general OS applications. It understands the database performance characteristics and is tuned to work well in that situation. Because it is developed in house, it takes advantage of the ASM design model. Additionally, rather than having to contact multiple vendors for support, you only need to call Oracle, a one-stop shop.
Q4 - You can create and modify ACFS file systems using command line tools and ASMCA. Creating and modifying logical volumes happens through SQL (ASM), asmcmd, and ASMCA. EM can also be used for both. ACFS sits on top of ADVM, which is a file in an ASM disk group. ACFS is aware of the characteristics of ASM/ADVM volumes, and tunes its I/O to make best use of those characteristics.
Q5 - several ways:
1) Connect to ASM with SQL and use 'alter diskgroup add volume', as Mihael points out. This creates an ADVM volume. Then format the volume using 'mkfs' (*nix) or acfsformat (Windows).
2) Use ASMCA - a GUI to create a volume and format a file system. Probably the easiest if your head is exploding.
3) Use 'asmcmd' to create a volume, and 'mkfs' to format the ACFS file system.
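As a rough illustration of option 3 on Linux, the sketch below only prints the commands in order rather than running them. The function name acfs_plan, the disk group DATA, the volume name acfsvol1, the 10G size, the device suffix and the mount point /u01/acfs are all hypothetical; the real device name under /dev/asm is assigned when the volume is created and is reported by 'asmcmd volinfo'.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for the asmcmd + mkfs route.
# Nothing is executed; all argument values used below are made-up examples.
acfs_plan() {
    dg=$1; vol=$2; size=$3; device=$4; mountpoint=$5
    echo "asmcmd volcreate -G $dg -s $size $vol"   # create the ADVM volume
    echo "asmcmd volinfo -G $dg $vol"              # shows the volume device name
    echo "mkfs -t acfs $device"                    # format it as ACFS (as root)
    echo "mount -t acfs $device $mountpoint"       # mount it (as root)
}

acfs_plan DATA acfsvol1 10G /dev/asm/acfsvol1-123 /u01/acfs
```

Printing the plan first makes it easy to check the disk group and device names before anything is formatted.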
Here is information on ASMCA, with examples:
http://docs.oracle.com/cd/E16655_01/server.121/e17612/asmca_acfs.htm#OSTMG94348
Information on command line tools, with examples:
Basic Steps to Manage Oracle ACFS Systems -
Problem with create asm disk group
Hi all
I am about to configure ASM, so I have downloaded Grid Infrastructure 11g (32 bit) and created the required parameters and directories.
I ran the installer but got stuck at step 3, where I have to change the discovery path. I typed /dev as the path, which is where I created 3 partitions: sdb1, sdc1 and sdd1.
Is there anything I should do to the partitions, or any parameters I should set, before I go through the installation?
Thanks for help
You can use the link below to install ASMLib:
http://gssdba.wordpress.com/category/asm/
REFERENCE: Doc ID 580153.1
There are two different methods to configure ASM on Linux:
ASM with ASMLib I/O: This method creates all Oracle database files on raw block devices managed by ASM using ASMLib calls. RAW devices are not required with this method as ASMLib works with block devices.
ASM with Standard Linux I/O: This method creates all Oracle database files on raw character devices managed by ASM using standard Linux I/O system calls. You will be required to create RAW devices for all disk partitions used by ASM.
You can download the ASMLib RPMs from the URL below:
http://www.oracle.com/technetwork/server-storage/linux/downloads/rhel5-084877.html
STEP 01: LOG IN AS ROOT USER AND INSTALL THE RPMS
[root@node1 ASMLIB]# rpm -Uvh oracleasm-2.6.18-164.el5-2.0.5-1.el5.i686.rpm \
> oracleasmlib-2.0.4-1.el5.i386.rpm \
> oracleasm-support-2.1.8-1.el5.i386.rpm
warning: oracleasm-2.6.18-164.el5-2.0.5-1.el5.i686.rpm: Header V3 DSA signature: NOKEY, key ID 1e5e0159
Preparing... ########################################### [100%]
1:oracleasm-support ########################################### [ 33%]
2:oracleasm-2.6.18-164.el########################################### [ 67%]
3:oracleasmlib ########################################### [100%]
STEP 02: CONFIGURE ASMLIB
[root@node1 ASMLIB]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: done
Initializing the Oracle ASMLib driver: [ OK ]
Scanning the system for Oracle ASMLib disks: [ OK ]
STEP 03: CREATE ASM DISKS
[root@node1 ASMLIB]# /etc/init.d/oracleasm listdisks
[root@node1 ASMLIB]#
[root@node1 ~]# /etc/init.d/oracleasm createdisk VOL1 /dev/sdb1
Marking disk "VOL1" as an ASM disk: [ OK ]
[root@node1 ~]# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc1
Marking disk "VOL2" as an ASM disk: [ OK ]
[root@node1 ~]# /etc/init.d/oracleasm createdisk VOL3 /dev/sdd1
Marking disk "VOL3" as an ASM disk: [ OK ]
[root@node1 ~]# /etc/init.d/oracleasm createdisk VOL4 /dev/sde1
Marking disk "VOL4" as an ASM disk: [ OK ]
[root@node1 ~]# /etc/init.d/oracleasm createdisk VOL5 /dev/sdf1
Marking disk "VOL5" as an ASM disk: [ OK ]
[root@node1 ~]# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
VOL5
[root@node1 ~]#
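For the three partitions mentioned in the question, the createdisk calls above could be generated with a small loop. This is a dry-run sketch only: the function name createdisk_plan is made up, the VOL1, VOL2, ... labels simply follow the naming used in the transcript, and the commands are printed rather than executed so they can be checked before being run as root.

```shell
#!/bin/sh
# Dry-run sketch: prints one createdisk command per partition, numbering
# the labels VOL1, VOL2, ... in the order given. Nothing is executed.
createdisk_plan() {
    n=1
    for part in "$@"; do
        echo "/etc/init.d/oracleasm createdisk VOL$n $part"
        n=$((n + 1))
    done
}

createdisk_plan /dev/sdb1 /dev/sdc1 /dev/sdd1
```

Once the printed commands look right, they can be run one by one and confirmed with '/etc/init.d/oracleasm listdisks', as in the transcript.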