ASM disk corrupted

Hi,
A few days ago I asked a question in the thread "ASM disk header corruption".
Because it was only a hypothetical at the time, I didn't think about it deeply. This morning I came across a thread in which a diskgroup couldn't be mounted and ORA-15196 appeared in the alert.log.
It occurred to me: if the diskgroup can't be mounted and the header of an ASM disk is corrupted, how should I deal with it? Will the data on the ASM disk be lost?
Please help me with this problem.
Thanks in advance.

user526904 wrote:
I hope you have a backup. In case it's a media failure and you can determine the corrupt datafiles, just restore the corrupt datafiles from backup and recover them.
Database backups cannot restore a corrupted ASM header. RMAN backs up the Oracle data files, not the physical ASM disk itself with the disk's headers.
To restore that, you will need a physical disk backup. A physical disk backup requires all processes using the disks to terminate, in order to ensure that all file handles are closed and that the backup is consistent. That is not something easily done in today's 24x7 environments. RAID is usually used to address this type of failure (e.g. via hot-swappable disks, where you simply replace the faulty disk with a new one while the storage system is running).
So where you do not have that physical redundancy, and have to deal with a "physical disk" error (like corrupted header blocks), you need to be extremely careful about how you try to recover it. I would not even try to touch that disk. I would ensure that no processes touch that disk at all, create a duplicate disk (same size), and manually "mirror" the data (using dd, for example). This serves two purposes: it tests whether physical reads on the problem disk succeed (is this actual media failure, or logical failure?), and it creates a second disk that can be used for testing/playing purposes before trying any fixes on the problem disk.
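The "mirror it with dd first" step described above could look roughly like the sketch below. This is only an illustration: the paths are made-up placeholders, and scratch files stand in for the problem disk and the spare so that it can be run safely. On a real system SRC and DST would be the block devices, and nothing else should have them open.

```shell
#!/bin/sh
# Sketch only: scratch files stand in for the problem disk (SRC) and the
# same-sized spare (DST). All names here are hypothetical placeholders.
set -eu
WORK=$(mktemp -d)
SRC="$WORK/problem_disk.img"
DST="$WORK/spare_disk.img"

# Stand-in "disk": 8 blocks of 1 MiB filled with arbitrary data.
dd if=/dev/urandom of="$SRC" bs=1048576 count=8 2>/dev/null

# Full read pass over the source. If dd fails here, physical reads are
# failing (media failure), not just a logically damaged header.
if dd if="$SRC" of="$DST" bs=1048576 2>/dev/null; then
    echo "read pass OK"
else
    echo "read errors on $SRC" >&2
    exit 1
fi

# Confirm the copy is byte-for-byte identical before experimenting on it.
if cmp -s "$SRC" "$DST"; then
    COPY_STATUS=identical
else
    COPY_STATUS=different
fi
echo "copy is $COPY_STATUS"
```

Any repair attempt would then be tried against the copy first, leaving the original disk untouched.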

Similar Messages

How to simulate the ASM disk corruption & recover it back ?

Hi DBA's,
We are creating a new grid, so we would like to simulate ASM disk corruption and recover from it. Is there a way to simulate this situation?

Hi,
>>>>>>>>>>>>>>>>>>>>> Please do not try this on production. >>>>>>>>>>>>>>>>>>>>>
Simulate ASM disk corruption:
dd if=/dev/zero of=/dev/sdb1 bs=1024 count=4
Solution:
kfed repair /dev/sdb1 >>>>>>> will fix only asm header issue
Thanks,
Rajasekhar
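A more cautious variant of the test above, in case you want to be able to undo it without relying on kfed: save the 4 KB you are about to overwrite, and write it back afterwards. This is a sketch under assumptions, not a tested recovery procedure. The image file is a hypothetical stand-in for /dev/sdb1, and conv=notrunc is only there because the stand-in is a regular file.

```shell
#!/bin/sh
# Sketch only: a scratch image file plays the role of the ASM disk.
set -eu
WORK=$(mktemp -d)
DISK="$WORK/asm_disk.img"      # hypothetical stand-in for /dev/sdb1
SAVED="$WORK/header.bak"       # copy of the blocks we are about to zero

# Fill the stand-in disk with arbitrary data (64 KB).
dd if=/dev/urandom of="$DISK" bs=1024 count=64 2>/dev/null
BEFORE=$(cksum < "$DISK")

# 1. Save the first 4 KB before touching anything.
dd if="$DISK" of="$SAVED" bs=1024 count=4 2>/dev/null

# 2. Simulate the corruption: zero the same 4 KB, as in the post above.
dd if=/dev/zero of="$DISK" bs=1024 count=4 conv=notrunc 2>/dev/null
DAMAGED=$(cksum < "$DISK")

# 3. Undo it by writing the saved blocks back to the same offset.
dd if="$SAVED" of="$DISK" bs=1024 count=4 conv=notrunc 2>/dev/null
AFTER=$(cksum < "$DISK")

echo "before : $BEFORE"
echo "damaged: $DAMAGED"
echo "after  : $AFTER"
```

Writing the saved blocks back only makes sense if nothing else changed the disk in between, so the same warning applies: not on production.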

Recover OCR and VOTE disk after complete corruption of ASM disk groups.

Hi Gurus,
I am simulating a recovery scenario to practice recovering the OCR and voting files after complete corruption of the ASM disks and diskgroups. I have set up my environment as follows:
Environment: RAC
OS: OEL 5.5 32-bit
GI Version: 11.2.0.2.0
ASM Disk groups: +OCR, +DATA
OCR, Vote Files location: +OCR
ASM Redundancy: External
ASM Disks: /dev/asm-disk1, /dev/asm-disk2
/dev/asm-disk1 - mapped on +OCR
/dev/asm-disk2 - mapped on +DATA
With the above configuration in place, I manually corrupted the +OCR and +DATA diskgroups with the dd command. I used this command to completely corrupt the +OCR diskgroup:
dd if=/dev/zero of=/dev/asm-disk1
I have manual backups as well as automatic backups of the OCR and voting disk. I am not using ASMLib.
I followed this link:
http://docs.oracle.com/cd/E11882_01/rac.112/e17264/adminoc.htm#TDPRC237
When I tried to recover the OCR file, I could not do so because there is no diskgroup to which ASM can restore the OCR and voting disk. I could not re-create the OCR and DATA diskgroups because I cannot connect to the ASM instance. If you have a solution or workaround for my situation, please describe it. That will be greatly appreciated.
Thanks and Regards,
Suresh.

Please go through the following document, which has the detailed steps to restore the OCR:
How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems [ID 1062983.1]

ASM disk header corruption

Hi.
If the header of an ASM disk is corrupted and I drop the corrupted disk and re-add it, will the data on the disk be lost?
Thanks.

fsck requires the file system driver to support repairing the file system - in other words, fsck itself does not do the fixing (it does not understand the underlying file system format). Quoting the manpage, "fsck is simply a front-end for the various file system checkers". So if there is no fsck.asmcfs (or similar), fsck cannot do anything in terms of fixing invalid structures on that file system.
As for the kernel keeping copies of a file system's superblock - not sure why it would be doing that for ASM. It does not manage it as a file system or service I/O to it (the ASM devices are opened using the O_DIRECT flag).

Want to move datafiles, controlfiles, redolog on new ASM Disks (11gR2 RAC)

Hi Guys,
Setup: Two Node 11gR2 (11.2.0.1) RAC on RHEL 5.4
Existing disks are from Old SAN & New Disks are from New SAN.
Can I move all datafiles (+DATA), controlfiles (+CTRL), and redo logs (+REDO) to the new ASM disks by adding the new disks to the same diskgroup and dropping the older disks from it, taking advantage of the ASM rebalancing feature?
1) add required disks in the DATA Diskgroups,
ALTER DISKGROUP DATA ADD DISK
'/dev/oracleasm/disks/NEWDATA3' NAME NEWDATA_0003,
'/dev/oracleasm/disks/NEWDATA4' NAME NEWDATA_0004,
'/dev/oracleasm/disks/NEWDATA5' NAME NEWDATA_0005
REBALANCE POWER 11;
Check rebalance status from v$ASM_OPERATION.
2) When rebalance completes, drop the old disks.
ALTER DISKGROUP DATA DROP DISK
NEWDATA_0000,
NEWDATA_0001
REBALANCE POWER 11;
Check rebalance status from v$ASM_OPERATION.
3) Do the same for the redo log and controlfile diskgroups.
I hope I can do this activity even while the database is up. Is there any possibility of database block corruption? (Or is it necessary to perform the above steps while the database is down?)
Your quick responses would be appreciated.
It's an urgent requirement. Thanks.
Regards,
Manish

Hi Manish,
Yes, you can do that by adding the new disks to the existing diskgroup and dropping the old disks from it. The good thing is that this can be done online. However, you need to make sure the rebalance power fits your business window: a higher rebalance power completes the rebalance faster, but it also consumes more resources.
Cheers
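One variation worth considering: ASM accepts ADD DISK and DROP DISK in the same ALTER DISKGROUP statement, so the data is rebalanced once instead of twice. The sketch below only writes such a statement to a file for review; it does not connect to ASM. The disk paths, disk names and power value are copied from the question above, and the script layout itself is just an illustration.

```shell
#!/bin/sh
# Sketch only: generate the SQL for review; nothing is executed against ASM.
set -eu
DISKGROUP=DATA
POWER=11                      # value used in the question above
SQL_FILE=$(mktemp)

cat > "$SQL_FILE" <<EOF
-- Add the new disks and drop the old ones in one statement,
-- so a single rebalance moves the data.
ALTER DISKGROUP $DISKGROUP
  ADD DISK
    '/dev/oracleasm/disks/NEWDATA3' NAME NEWDATA_0003,
    '/dev/oracleasm/disks/NEWDATA4' NAME NEWDATA_0004,
    '/dev/oracleasm/disks/NEWDATA5' NAME NEWDATA_0005
  DROP DISK
    NEWDATA_0000,
    NEWDATA_0001
  REBALANCE POWER $POWER;

-- Progress check: no rows returned means no rebalance is running.
SELECT group_number, operation, state, power, est_minutes
  FROM v\$asm_operation;
EOF

cat "$SQL_FILE"
```

The old disks should only be removed from the SAN once the rebalance has finished and they no longer appear as members of the diskgroup.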

Shrinking ASM disk

I just tried to resize an ASM disk and although the feedback was 'successful', there doesn't appear to have been any change.
I was attempting to shrink disk DATA_0001 from 200G to 100G. Am I missing something obvious?
SQL> select group_number, name, path, os_mb, total_mb, free_mb from v$asm_disk;
GROUP_NUMBER NAME                 PATH                                OS_MB   TOTAL_MB    FREE_MB
          0                      /dev/iscsi/rman11                   20489          0          0
          0                      /dev/iscsi/rmanB11                 102398          0          0
          0                      /dev/iscsi/rman1                    20490          0          0
          0                      /dev/iscsi/vote3                      300          0          0
          0                      /dev/iscsi/vote1                      300          0          0
          0                      /dev/iscsi/rmanP11                 204805          0          0
          0                      /dev/iscsi/vote2                      300          0          0
          0                      /dev/iscsi/rmanP1                  204810          0          0
          0                      /dev/iscsi/rmanB1                  102405          0          0
          1 DATA_0000            /dev/iscsi/db1                      10245      10245      10109
          2 FRA_0000             /dev/iscsi/flshbk1                  20490      20490      20465
GROUP_NUMBER NAME                 PATH                                OS_MB   TOTAL_MB    FREE_MB
          2 FRA_0001             /dev/iscsi/flshbkR1                409605     409605     409262
          1 DATA_0001            /dev/iscsi/dbR1                    204810     204810     202297
13 rows selected.
SQL> alter diskgroup data resize disk 'data_0001' size 100g;
Diskgroup altered.
SQL> select group_number, name, path, os_mb, total_mb, free_mb from v$asm_disk;
GROUP_NUMBER NAME                 PATH                                OS_MB   TOTAL_MB    FREE_MB
          0                      /dev/iscsi/rman11                   20489          0          0
          0                      /dev/iscsi/rmanB11                 102398          0          0
          0                      /dev/iscsi/rman1                    20490          0          0
          0                      /dev/iscsi/vote3                      300          0          0
          0                      /dev/iscsi/vote1                      300          0          0
          0                      /dev/iscsi/rmanP11                 204805          0          0
          0                      /dev/iscsi/vote2                      300          0          0
          0                      /dev/iscsi/rmanP1                  204810          0          0
          0                      /dev/iscsi/rmanB1                  102405          0          0
          1 DATA_0000            /dev/iscsi/db1                      10245      10245      10004
          2 FRA_0000             /dev/iscsi/flshbk1                  20490      20490      20465
GROUP_NUMBER NAME                 PATH                                OS_MB   TOTAL_MB    FREE_MB
          2 FRA_0001             /dev/iscsi/flshbkR1                409605     409605     409262
          1 DATA_0001            /dev/iscsi/dbR1                    204810     204810     202402
13 rows selected.
The free_mb seems to have increased, but otherwise I can't see the effect of my change. Maybe I'm looking in the wrong place?
I tried restarting the ASM instance but it made no difference.
After resizing the disk in ASM I shrank the disk volume in our storage array. ASM was of course down at the time.
When I attempted to restart ASM I saw this ...
SQL> startup
ASM instance started
Total System Global Area 283930624 bytes
Fixed Size                  2158992 bytes
Variable Size             256605808 bytes
ASM Cache                  25165824 bytes
ORA-15032: not all alterations performed
ORA-15036: disk '/dev/iscsi/dbR1' is truncated
None of my diskgroups are mounted ...
SQL> select group_number, name, state from v$asm_diskgroup;
GROUP_NUMBER NAME            STATE
          0 DATA            DISMOUNTED
          0 FRA             DISMOUNTED
Here are the messages from the ASM instance alert log ...
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: cache registered group DATA number=1 incarn=0x5f5e3343
NOTE: cache began mount (not first) of group DATA number=1 incarn=0x5f5e3343
NOTE: cache registered group FRA number=2 incarn=0x5f5e3344
NOTE: cache began mount (not first) of group FRA number=2 incarn=0x5f5e3344
WARNING::ASMLIB library not found. See trace file for details.
NOTE: Assigning number (1,0) to disk (/dev/iscsi/db1)
NOTE: cache dismounting group 1/0x5F5E3343 (DATA)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x5F5E3343 (DATA)
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x5f5e3343
kfdp_dismount(): 1
kfdp_dismountBg(): 1
NOTE: De-assigning number (1,0) from disk (/dev/iscsi/db1)
ERROR: diskgroup DATA was not mounted
NOTE: Assigning number (2,1) to disk (/dev/iscsi/flshbkR1)
NOTE: Assigning number (2,0) to disk (/dev/iscsi/flshbk1)
NOTE: cache dismounting group 2/0x5F5E3344 (FRA)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 2/0x5F5E3344 (FRA)
NOTE: cache ending mount (fail) of group FRA number=2 incarn=0x5f5e3344
kfdp_dismount(): 2
kfdp_dismountBg(): 2
NOTE: De-assigning number (2,0) from disk (/dev/iscsi/flshbk1)
NOTE: De-assigning number (2,1) from disk (/dev/iscsi/flshbkR1)
ERROR: diskgroup FRA was not mounted
ORA-15032: not all alterations performed
ORA-15036: disk '/dev/iscsi/dbR1' is truncated
ERROR: ALTER DISKGROUP ALL MOUNT
Any clues?
Thanks,
Steve

Thanks Markus. I changed the size of the volume back to the original and was able to restart the ASM instances on both nodes. I confirmed it saw the size as the original 200G.
I then shut down ASM on the second node and issued the alter on the first node. Here's what happened ...
SQL> select group_number, name, path, os_mb, total_mb, free_mb from v$asm_disk order by name;
GROUP_NUMBER NAME       PATH                                OS_MB   TOTAL_MB    FREE_MB
           1 DATA_0000 /dev/iscsi/db1                      10245      10245      10004
           1 DATA_0001 /dev/iscsi/dbR1                    204810     204810     202402
           2 FRA_0000   /dev/iscsi/flshbk1                  20490      20490      20465
           2 FRA_0001   /dev/iscsi/flshbkR1                409605     409605     409262
           0            /dev/iscsi/rman1                    20490          0          0
           0            /dev/iscsi/rmanP1                  204810          0          0
           0            /dev/iscsi/rmanB1                  102405          0          0
           0            /dev/iscsi/rmanP11                 204805          0          0
           0            /dev/iscsi/rmanB11                 102398          0          0
           0            /dev/iscsi/rman11                   20489          0          0
10 rows selected.
SQL> alter diskgroup data resize disk 'data_0001' size 100g;
alter diskgroup data resize disk 'data_0001' size 100g
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" may result in a data loss
Here are the details from the alert log ...
SQL> alter diskgroup data resize disk 'data_0001' size 100g
NOTE: requesting all-instance membership refresh for group=1
WARNING: cache read a corrupted block gn=1 dsk=1 blk=257 from disk 1
NOTE: a corrupted block was dumped to /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_5295.trc
ERROR: cache failed to read gn=1 dsk=1 blk=257 from disk(s): 1
ORA-15196: invalid ASM block header [kfc.c:9133] [endian_kfbh] [2147483649] [257] [0 != 1]
System State dumped to trace file /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_5295.trc
NOTE: cache initiating offline of disk 1 group 1
WARNING: initiating offline of disk 1.3688884620 (DATA_0001) with mask 0x7e
NOTE: initiating PST update: grp = 1, dsk = 1, mode = 0x15
kfdp_updateDsk(): 14
Thu May 07 15:45:38 2009
kfdp_updateDskBg(): 14
ERROR: too many offline disks in PST (grp 1)
Thu May 07 15:45:38 2009
NOTE: halting all I/Os to diskgroup DATA
Thu May 07 15:45:38 2009
SQL> alter diskgroup DATA dismount force
NOTE: active pin found: 0x0x6ddf6060
NOTE: active pin found: 0x0x6ddf6168
ERROR: ORA-15130 signalled during resize of diskgroup DATA
Thu May 07 15:45:38 2009
NOTE: membership refresh pending for group 1/0xdc0f1999 (DATA)
kfdp_query(): 15
kfdp_queryBg(): 15
SUCCESS: refreshed membership for 1/0xdc0f1999 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 1
Errors in file /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5202.trc:
ORA-15130: diskgroup "DATA" is being dismounted
Errors in file /var/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_5202.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15032: not all alterations performed
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0001" may result in a data loss
ERROR: alter diskgroup data resize disk 'data_0001' size 100g
NOTE: cache dismounting group 1/0xDC0F1999 (DATA)
NOTE: dbwr not being msg'd to dismount
Thu May 07 15:45:41 2009
Dirty detach reconfiguration started (old inc 6, new inc 6)
List of nodes:
0
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
10 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Thu May 07 15:45:41 2009
freeing rdom 1
Thu May 07 15:45:41 2009
WARNING: dirty detached from domain 1
NOTE: cache dismounted group 1/0xDC0F1999 (DATA)
kfdp_dismount(): 16
kfdp_dismountBg(): 16
NOTE: De-assigning number (1,0) from disk (/dev/iscsi/db1)
NOTE: De-assigning number (1,1) from disk (/dev/iscsi/dbR1)
SUCCESS: diskgroup DATA was dismounted
SUCCESS: alter diskgroup DATA dismount force
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA
Thu May 07 15:46:06 2009
SQL> alter diskgroup data resize disk 'data_0001' size 100g
ORA-15032: not all alterations performed
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ERROR: alter diskgroup data resize disk 'data_0001' size 100g
Looks like my earlier attempt has indeed screwed something up, so even though the instances start OK and mount the diskgroup, I think there's a fair chance something would go splat sooner rather than later.
As this is a test database, I think I'll cut my losses and rebuild the diskgroup then restore it from backup. I'm assuming the effort involved in correcting the corruption will be greater than rebuilding and restoring.
Do you agree with this?
Then I'll try again and hopefully get it right.
Thanks for your help!!!
Steve
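For anyone repeating this: in the first listing of this thread, TOTAL_MB for DATA_0001 was still 204810 after the "successful" resize, and the volume was then cut to 100G, which is what ORA-15036 complains about. A simple pre-check along these lines would have caught it. It is only a sketch; the function name is made up and the numbers are the ones from the listings above.

```shell
#!/bin/sh
# safe_to_shrink TOTAL_MB NEW_VOLUME_MB
# Succeeds only if the size ASM reports for the disk (TOTAL_MB in
# v$asm_disk) already fits inside the planned volume size.
safe_to_shrink() {
    total_mb=$1
    new_volume_mb=$2
    [ "$total_mb" -le "$new_volume_mb" ]
}

TARGET_MB=$((100 * 1024))     # the 100G target, expressed in MB
REPORTED_MB=204810            # TOTAL_MB of DATA_0001 after the resize

if safe_to_shrink "$REPORTED_MB" "$TARGET_MB"; then
    RESULT="ok to shrink the volume"
else
    RESULT="do not shrink: ASM still maps $REPORTED_MB MB"
fi
echo "$RESULT"
```

In other words, the storage volume should only be reduced once v$asm_disk shows TOTAL_MB at or below the new size.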

How to reinstall Grid Infrastructure and use old ASM disk groups

Hello to all,
I installed Grid Infrastructure 11gR2 on a standalone server (the OS is Linux),
configured ASM, and created my database on ASM.
Suppose that my OS disk gets corrupted, the OS doesn't start, the Grid home is on that disk,
and I have to install the OS again.
My ASM disks are safe. How can I install Grid Infrastructure again so that it can use the previous ASM disks
and disk groups, and I am not obliged to create my database again?
In step 2 of the Grid Infrastructure installation there are four options:
1. Install and configure Oracle Grid Infrastructure for a Cluster
2. Configure Oracle Grid Infrastructure for a Standalone Server
3. Upgrade Oracle Grid Infrastructure or Oracle Automatic Storage Management
4. Install Oracle Grid Infrastructure Software Only
If I select option 2, it wants to create a disk group again.
I guess that I need to select option 4 and then do some configuration, but I don't know what I must configure.
Do you know the answer to my question? If yes, please explain the steps.
Thank you so much.

Hi,
No, you are not obliged to recreate your database. However, there is a small flaw in the installation procedure, which means it is not 100% easy...
When you installed Oracle Restart (standalone GI), your ASM diskgroup ended up containing the SPFILE of the ASM instance. This is exactly the small flaw you will encounter. So you have 2 options for "recovery":
1.) Do a software-only install (4) and run roothas.pl. This, however, will not create any ASM entries. You would have to add them manually (using srvctl), and you can specify the ASM spfile with the srvctl command. The problem here is that you have to know where your ASM spfile was. If you have a backup of your OLR and a backup of the GPnP profile, this might be easier to find out.
2.) Do a new installation (2) and configure a new diskgroup (with a "spare" disk or small LUN and a new name), so that Oracle Restart creates the ASM instance and a new ASM spfile for you.
Then you can simply mount the diskgroup containing your database in addition. You should, however, then move your new ASM spfile to the new diskgroup (or simply exchange it with the existing one). In this case it is easier to find out where it was - however, you will need a spare (though small) LUN for the new spfile (temporarily, until you exchange it).
In either case, after you have your ASM instance back (and access to your old diskgroup), you have to re-register your database and services if you do not have an OLR backup.
Again: it is doable, and you can simply mount the ASM diskgroup containing your database. However, I suggest you try this once to learn what really needs to be done in this case.
Regards
Sebastian

Rebuild ASM Disk - Copying multiple datafiles from one disk to another

Hi,
I have an environment of four 11GR2 Oracle databases on a Linux server. Each database has its own ASM disk.
DB1 -> ASM_DISK1
DB2 -> ASM_DISK2
DB3 -> ASM_DISK3
DB4 -> ASM_DISK4
I need to rebuild one of the ASM disks (ASM_DISK1), but first I need to copy all of the datafiles to another disk (ASM_DISK2). I tried backing up the database using RMAN, but it was taking too long (nearly two days when I cancelled it). So now I am going to copy the files using ASMCMD CP command.
Basically my task is as follows:
1. Shutdown database.
2. Copy all data from ASM_DISK1 to ASM_DISK2.
3. Drop ASM_DISK1.
4. Re-create ASM_DISK1.
5. Copy all data back to ASM_DISK1.
6. Start database.
Database size is 700GB.
I am using the below script to copy the files.
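Copy Script
================
# List the datafiles in the source diskgroup (overwrite any previous list).
asmcmd ls +ASM_DISK1/DB1/DATAFILE > asm_list.txt
# Copy each file to the backup location; asmcmd output is appended to the log.
for FILENAME in $(cat asm_list.txt)
do
asmcmd >> asm_LOG.log <<EOF
cp +ASM_DISK1/DB1/DATAFILE/$FILENAME +ASM_DISK2/DB1_BACKUP/DATAFILE/$FILENAME.dbf
EOF
done
================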
I will then rename each file in the database like so:
alter database rename file '+ASM_DISK1/DB1/DATAFILE/filename' to '+ASM_DISK1/DB1/DATAFILE/filename.dbf'
My questions are as follows.
Is this approach a valid solution?
Will renaming the files during copy corrupt the files?
When I copy the files back to the original disk after rebuild, then rename them, will the database be able to start?
Rgs,
Rob

rgilligan_tnf wrote:
Hi,
I have an environment of four 11GR2 Oracle databases on a Linux server. Each database has its own ASM disk.
DB1 -> ASM_DISK1
DB2 -> ASM_DISK2
DB3 -> ASM_DISK3
DB4 -> ASM_DISK4
I need to rebuild one of the ASM disks (ASM_DISK1), but first I need to copy all of the datafiles to another disk (ASM_DISK2). I tried backing up the database using RMAN, but it was taking too long (nearly two days when I cancelled it). So now I am going to copy the files using ASMCMD CP command.
And how do you propose to update the controlfile to point to the new location?
Unless your datafiles are offline and/or the database is down, you will corrupt them and have an unusable database when you finish.
How were you doing this with RMAN? Depending on the size of your database (700G), it very well could take some time. I have restored databases at a rate of >300G/hr from scratch. You will need to shut down at some point to relocate the controlfiles and the system and redo logfiles.
Just curious, what is the problem with diskgroup ASM_DISK1 that makes you want to rebuild it?

[FATAL] [INS-30508] Invalid ASM disks.

Dear Gurus,
Please help me troubleshoot the "Invalid ASM disks" error on Solaris.
Oracle Grid 11.2.0.3.0
Solaris 10 with EMC PowerPath partitions
-bash-3.2$ ./runInstaller -silent -responseFile /aaa/Oracle11g_SunSPARC_64bit/grid/response/grid_install.rsp
Starting Oracle Universal Installer...
Checking Temp space: must be greater than 180 MB.   Actual 90571 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 90667 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2013-09-15_12-17-47PM. Please wait ...
-bash-3.2$ [FATAL] [INS-30508] Invalid ASM disks.
   CAUSE: The disks [/dev/rdsk/emcpower2a, /dev/rdsk/emcpower6a, /dev/rdsk/emcpower8a] were not valid.
   ACTION: Please choose or enter valid ASM disks.
A log of this session is currently saved as: /tmp/OraInstall2013-09-15_12-17-47PM/installActions2013-09-15_12-17-47PM.log. Oracle recommends that if you want to keep this log, you should move it from the temporary location to a more permanent location.

I'm grateful for your response and your time.
We resolved the issue ourselves.
[/dev/rdsk/emcpower2a, /dev/rdsk/emcpower6a, /dev/rdsk/emcpower8a]
The partitions above refer to slice 0 by default.
We referred to the link below:
ASM Create Partitions in Solaris and add them as disks in AS
On the SPARC architecture, a Solaris disk is subdivided into 8 slices.
Below is the common configuration of these eight slices:
slice 0: Holds files and directories that make up the operating system.*
slice 1: Swap. Provides virtual memory, or swap space.
slice 2: Refers to the entire disk, by convention. The size of this slice should not be changed.**
slice 3: /export. Holds alternative versions of the operating system.
slice 4: /export/swap. Provides virtual memory space for client systems.***
slice 5: /opt. Holds application software added to a system.
slice 6: /usr. Holds operating system commands (also known as executables) designed to be run by users.
slice 7: /home. Holds files created by users.
* Cannot be used as an ASM disk. Using this slice causes disk corruption and may render the disk unusable.
** Should not be used as an ASM disk, as this slice refers to the entire disk (including partition tables).
*** Is the recommended slice to be used for an ASM disk.
Following this recommendation, we used slice 4.
After a detailed diagnosis, we found that we have to use
[/dev/rdsk/emcpower2e, /dev/rdsk/emcpower6e, /dev/rdsk/emcpower8e]
Here "e" refers to slice 4.
We made the change below in the silent response file, and it is working fine:
oracle.install.asm.diskGroup.disks=/dev/rdsk/emcpower2e,/dev/rdsk/emcpower6e,/dev/rdsk/emcpower8e
sample logs
INFO: Starting Output Reader Threads for process /tmp/OraInstall2013-09-17_05-03-21PM/ext/bin/kfod
INFO: Parsing 2560 CANDIDATE /dev/rdsk/emcpower2e oracle oinstall
INFO: The process /tmp/OraInstall2013-09-17_05-03-21PM/ext/bin/kfod exited with code 0
INFO: Waiting for output processor threads to exit.
INFO: Parsing 2560 CANDIDATE /dev/rdsk/emcpower6e oracle oinstall
INFO: Parsing 2560 CANDIDATE /dev/rdsk/emcpower8e oracle oinstall
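The slice-to-letter mapping used above (slice 0 is "a", so slice 4 is "e") can be sketched as a small helper. The function name is made up for illustration, and the a-to-h naming is taken from what was observed in this thread for the emcpower devices, so treat it as an assumption for other device types.

```shell
#!/bin/sh
# slice_device BASE SLICE
# Prints BASE followed by the letter for that slice (0 = a, 1 = b, ... 7 = h).
slice_device() {
    base=$1
    slice=$2
    case $slice in
        [0-7]) ;;
        *) echo "slice must be 0-7" >&2; return 1 ;;
    esac
    # Pick the (slice + 1)-th letter from the list a..h.
    letter=$(printf 'abcdefgh' | cut -c $((slice + 1)))
    printf '%s%s\n' "$base" "$letter"
}

# Slice 4 of the three devices from the response file above.
for dev in /dev/rdsk/emcpower2 /dev/rdsk/emcpower6 /dev/rdsk/emcpower8
do
    slice_device "$dev" 4
done
```

For the three devices in this thread, slice 4 gives the emcpower2e, emcpower6e and emcpower8e paths that the installer accepted.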

Asm disk group recover

Hi,
I was asked a question about ASM in an interview: how do you recover a corrupted ASM disk group (all the disks)?
Please provide the steps to recover it.
Thanks

If you don't know the answer, then you may not be qualified to support this type of configuration. Always start with the documentation.
And if you are answering this level of interview question, saying "I don't know, but can learn" will be a lot better than getting in over your head. Because the follow-on questions will show whether or not you are trying to BS your way through the answer. And if they are any good at interviewing, you won't know that you failed until you don't hear back from them.

ASM disks

Hi all,
My environment: Oracle 10g 10.2.0.4, RHEL 4.0.
I have a problem with my ASM disk group (4 disks): one of the ASM disks is corrupted.
How do I restore and recover that disk? Please let me know if anyone knows.

What is the status of the database: mounted, not mounted, or open?
Do you know which datafiles are on your corrupted ASM disk?
I think you can solve the problem by restoring all or part of your database.

Asm disk removed taking too much time to boot

Hi,
A local machine was configured with ASM through oracleasm, using one whole disk. As it was only for training purposes, I removed the disk (e.g. /dev/sdb) bluntly, i.e. just pulled it from the box. Then I tried fsck -c -c -f /dev/ in rescue mode; it did not work and did not even mount /mnt/sysimage. It complained about an ext2fs error, mounting the file system, etc.
After many days of this, I got fed up with the issue and reinstalled the OS.
But my question is: what exactly should I have done? Obviously reinstallation is not the right way to handle this.
Regards

If /dev/sdb was an ASM disk, then removing it should not impact the o/s. The ASM instance itself will fail with an error saying something like it was not able to mount the disk group.
If your system failed to boot correctly after this disk was removed, then /dev/sdb contained more than just ASM data.
We dynamically add and remove ASM (multipath'ed) disks via kpartx - while the o/s is running. No reboot. No problems.
I fail to see how an ASM disk could cause the type of problems you describe - unless it was more than just a disk used by ASM alone.

Questions on asm disk discovery:

Questions on asm disk discovery:
1) What is the relationship between asm_diskstring in the init.ora and DiscoveryString in the GPnP profile.xml?
2) Which of the two ultimately governs the disk discovery process?
3) We know that ASMLib disks are self-describing at the disk header. This overcomes the disk name/path persistence issue, as we no longer need to rely on the path to discover the ASM disks: with asm_diskstring='ORCL:*', the ASM instance will identify the right disks automatically. However, I am not sure whether setting asm_diskstring='ORCL:*' is the most economical way to do the discovery, because I am not sure whether Oracle has to probe all the disks on the OS to determine the right ones. If Oracle has to screen all the disks in this way, then I think setting asm_diskstring='<path_to_asmlib_disk>' would be much faster, although it would be exposed to the persistence problem. Is my understanding correct?
Thanks.

From my understanding, all the disks you see in /dev/oracleasm/disks are the disks in your system that were discovered by ASMLib at the discovery stage.
Currently, due to bug 13465545, the ASM instance will discover disks from both locations, ASM_DISKSTRING and the GPnP profile, which can cause some mess in how disks are presented to ASM. You can check the setting using the asmcmd command dsget, and make the two the same using dsset.
I think it is more secure to set ASM_DISKSTRING to only the disks used by the ASM instance.
ASMCMD> dsget
Regards
Ed

Questions on asm disk discover:

Questions on asm disk discover:
1) What is the relationship between asm_diskstring in the init.ora and DiscoveryString in the GPnP profile.xml?
2) Which one ultimately governs the disk discovery process?
3) We know that ASMLib disks are self-describing at the disk header. This overcomes the disk name/path persistence issue, as we do not rely on the path to discover the ASMLib disks; asm_diskstring='ORCL:*' will identify the right disks. I am not sure whether setting 'ORCL:*' is the most economical way, because I am not sure whether Oracle has to scan all the disks on the OS and probe the disks it has rights to in order to determine which disks belong to ASM. If Oracle has to screen all the disks in this way, then I think setting asm_diskstring='<path_to_asmlib_disk>' would be much faster. However, this would be exposed to the persistence problem. Is my understanding correct?
Thanks.

Questions on asm disk discovery:
1）What is the relationship btween asm_diskstring in the init.ora and DiscoveryString in the GPNP profile.xml?
2) Which one of the above two finally accounts for the disk discovery process?
3) We know that asmlib disks are self describing at the disk header. This overcomes the disk name/path persistency issue as we no long need to rely on the path to discover the asm disks, by setting asm_diskstring='ORCL:*' , ASM instance will identify the right disks automatically. However, I am not sure if setting asm_diskstring='ORCL:*' is the most economic way to do the discovery as I am not sure if Oracle will have to probe all the disks on the OS to determine the right disks. If Oracle has to screen all the disks in this way, then I think setting asm_diskstring='<path_to_asmlib_disk>' will be much faster, although this will be open to the persistent problem. Is my understanding correct?
Thanks.

Please Help - When I try to add ASM Disk to ASM Diskgroup it crashes Server

We are using a Pillar SAN with LUNs created on it, and we are using the following multipath devices. (I'm a DBA more than anything else; I am fairly familiar with Linux, SAN hardware not so much.)
Device       Size   Mount Point
/dev/dpda1   11G    /u01
The device above is working fine. Below are the ASM disks being created:
Device       Size   Oracle ASM Disk Name
/dev/dpdb1   198G   ORCL1
/dev/dpdc1   21G    SIRE1
/dev/dpdd1   21G    CART1
/dev/dpde1   21G    SRTS1
/dev/dpdf1   21G    CRTT1
I try to create the first ASM disk:
/etc/init.d/oracleasm createdisk ORCL1 /dev/dpdb1
Marking disk "ORCL1" as an ASM disk: [FAILED]
So I check the oracleasm log:
#cat /var/log/oracleasm
Device "/dev/dpdb1" is not a partition
I did some research and found that this is a common problem with multipath devices, and that the workaround is to use asmtool:
# /usr/sbin/asmtool -C -l /dev/oracleasm -n ORCL1 -s /dev/dpdb1 -a force=yes
asmtool: Device "/dev/dpdb1" is not a partition
asmtool: Continuing anyway
Now I scan and list the disks:
# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]
# /etc/init.d/oracleasm listdisks
ORCL1
Here is what's going on in /var/log/messages when I run the oracleasm scandisks command:
# date
Fri Aug 14 13:51:58 MST 2009
# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [ OK ]
# cat /var/log/messages | grep "Aug 14 13:5"
Aug 14 13:52:06 seer kernel: dpdb: dpdb1
Aug 14 13:52:06 seer kernel: dpdc: dpdc1
Aug 14 13:52:06 seer kernel: dpdd: dpdd1
Aug 14 13:52:06 seer kernel: dpde: dpde1
Aug 14 13:52:06 seer kernel: dpdf: dpdf1
Aug 14 13:52:06 seer kernel: dpdg: dpdg1
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: printk: 30 messages suppressed.
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: sda : READ CAPACITY failed.
Aug 14 13:52:06 seer kernel: sda : status=1, message=00, host=0, driver=08
Aug 14 13:52:06 seer kernel: sd: Current: sense key: Illegal Request
Aug 14 13:52:06 seer kernel: Add. Sense: Logical unit not supported
Aug 14 13:52:06 seer kernel:
Aug 14 13:52:06 seer kernel: sda: test WP failed, assume Write Enabled
Aug 14 13:52:06 seer kernel: sda: asking for cache data failed
Aug 14 13:52:06 seer kernel: sda: assuming drive cache: write through
Aug 14 13:52:06 seer kernel: sda:end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: Dev sda: unable to read RDB block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sda, logical block 0
Aug 14 13:52:06 seer kernel: unable to read partition table
Aug 14 13:52:06 seer kernel: SCSI device sdb: 21502464 512-byte hdwr sectors (11009 MB)
Aug 14 13:52:06 seer kernel: sdb: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdb: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdb: sdb1
Aug 14 13:52:06 seer kernel: SCSI device sdc: 421476864 512-byte hdwr sectors (215796 MB)
Aug 14 13:52:06 seer kernel: sdc: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdc: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdc: sdc1
Aug 14 13:52:06 seer kernel: SCSI device sdd: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdd: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdd: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdd: sdd1
Aug 14 13:52:06 seer kernel: SCSI device sde: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sde: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sde: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sde: sde1
Aug 14 13:52:06 seer kernel: SCSI device sdf: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdf: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdf: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdf: sdf1
Aug 14 13:52:06 seer kernel: SCSI device sdg: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdg: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdg: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdg: sdg1
Aug 14 13:52:06 seer kernel: SCSI device sdh: 2107390464 512-byte hdwr sectors (1078984 MB)
Aug 14 13:52:06 seer kernel: sdh: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdh: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdh: sdh1
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdi, sector 0
Aug 14 13:52:06 seer kernel: Buffer I/O error on device sdi, logical block 0
Aug 14 13:52:06 seer kernel: sdi : READ CAPACITY failed.
Aug 14 13:52:06 seer kernel: sdi : status=1, message=00, host=0, driver=08
Aug 14 13:52:06 seer kernel: sd: Current: sense key: Illegal Request
Aug 14 13:52:06 seer kernel: Add. Sense: Logical unit not supported
Aug 14 13:52:06 seer kernel:
Aug 14 13:52:06 seer kernel: sdi: test WP failed, assume Write Enabled
Aug 14 13:52:06 seer kernel: sdi: asking for cache data failed
Aug 14 13:52:06 seer kernel: sdi: assuming drive cache: write through
Aug 14 13:52:06 seer kernel: sdi:end_request: I/O error, dev sdi, sector 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdi, sector 0
Aug 14 13:52:06 seer last message repeated 4 times
Aug 14 13:52:06 seer kernel: Dev sdi: unable to read RDB block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdi, sector 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdi, sector 0
Aug 14 13:52:06 seer kernel: unable to read partition table
Aug 14 13:52:06 seer kernel: SCSI device sdj: 21502464 512-byte hdwr sectors (11009 MB)
Aug 14 13:52:06 seer kernel: sdj: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdj: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdj: sdj1
Aug 14 13:52:06 seer kernel: SCSI device sdk: 421476864 512-byte hdwr sectors (215796 MB)
Aug 14 13:52:06 seer kernel: sdk: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdk: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdk: sdk1
Aug 14 13:52:06 seer kernel: SCSI device sdl: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdl: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdl: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdl: sdl1
Aug 14 13:52:06 seer kernel: SCSI device sdm: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdm: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdm: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdm: sdm1
Aug 14 13:52:06 seer kernel: SCSI device sdn: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdn: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdn: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdn: sdn1
Aug 14 13:52:06 seer kernel: SCSI device sdo: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdo: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdo: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdo: sdo1
Aug 14 13:52:06 seer kernel: SCSI device sdp: 2107390464 512-byte hdwr sectors (1078984 MB)
Aug 14 13:52:06 seer kernel: sdp: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdp: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdp: sdp1
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdq, sector 0
Aug 14 13:52:06 seer kernel: sdq : READ CAPACITY failed.
Aug 14 13:52:06 seer kernel: sdq : status=1, message=00, host=0, driver=08
Aug 14 13:52:06 seer kernel: sd: Current: sense key: Illegal Request
Aug 14 13:52:06 seer kernel: Add. Sense: Logical unit not supported
Aug 14 13:52:06 seer kernel:
Aug 14 13:52:06 seer kernel: sdq: test WP failed, assume Write Enabled
Aug 14 13:52:06 seer kernel: sdq: asking for cache data failed
Aug 14 13:52:06 seer kernel: sdq: assuming drive cache: write through
Aug 14 13:52:06 seer kernel: sdq:end_request: I/O error, dev sdq, sector 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdq, sector 0
Aug 14 13:52:06 seer last message repeated 5 times
Aug 14 13:52:06 seer kernel: Dev sdq: unable to read RDB block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdq, sector 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdq, sector 0
Aug 14 13:52:06 seer kernel: unable to read partition table
Aug 14 13:52:06 seer kernel: SCSI device sdr: 21502464 512-byte hdwr sectors (11009 MB)
Aug 14 13:52:06 seer kernel: sdr: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdr: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdr: sdr1
Aug 14 13:52:06 seer kernel: SCSI device sds: 421476864 512-byte hdwr sectors (215796 MB)
Aug 14 13:52:06 seer kernel: sds: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sds: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sds: sds1
Aug 14 13:52:06 seer kernel: SCSI device sdt: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdt: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdt: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdt: sdt1
Aug 14 13:52:06 seer kernel: SCSI device sdu: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdu: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdu: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdu: sdu1
Aug 14 13:52:06 seer kernel: SCSI device sdv: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdv: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdv: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdv: sdv1
Aug 14 13:52:06 seer kernel: SCSI device sdw: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdw: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdw: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdw: sdw1
Aug 14 13:52:06 seer kernel: SCSI device sdx: 2107390464 512-byte hdwr sectors (1078984 MB)
Aug 14 13:52:06 seer kernel: sdx: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdx: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdx: sdx1
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdy, sector 0
Aug 14 13:52:06 seer kernel: sdy : READ CAPACITY failed.
Aug 14 13:52:06 seer kernel: sdy : status=1, message=00, host=0, driver=08
Aug 14 13:52:06 seer kernel: sd: Current: sense key: Illegal Request
Aug 14 13:52:06 seer kernel: Add. Sense: Logical unit not supported
Aug 14 13:52:06 seer kernel:
Aug 14 13:52:06 seer kernel: sdy: test WP failed, assume Write Enabled
Aug 14 13:52:06 seer kernel: sdy: asking for cache data failed
Aug 14 13:52:06 seer kernel: sdy: assuming drive cache: write through
Aug 14 13:52:06 seer kernel: sdy:end_request: I/O error, dev sdy, sector 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdy, sector 0
Aug 14 13:52:06 seer last message repeated 5 times
Aug 14 13:52:06 seer kernel: Dev sdy: unable to read RDB block 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdy, sector 0
Aug 14 13:52:06 seer kernel: end_request: I/O error, dev sdy, sector 0
Aug 14 13:52:06 seer kernel: unable to read partition table
Aug 14 13:52:06 seer kernel: SCSI device sdz: 21502464 512-byte hdwr sectors (11009 MB)
Aug 14 13:52:06 seer kernel: sdz: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdz: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdz: sdz1
Aug 14 13:52:06 seer kernel: SCSI device sdaa: 421476864 512-byte hdwr sectors (215796 MB)
Aug 14 13:52:06 seer kernel: sdaa: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdaa: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdaa: sdaa1
Aug 14 13:52:06 seer kernel: SCSI device sdab: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdab: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdab: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdab: sdab1
Aug 14 13:52:06 seer kernel: SCSI device sdac: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdac: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdac: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdac: sdac1
Aug 14 13:52:06 seer kernel: SCSI device sdad: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdad: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdad: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdad: sdad1
Aug 14 13:52:06 seer kernel: SCSI device sdae: 43006464 512-byte hdwr sectors (22019 MB)
Aug 14 13:52:06 seer kernel: sdae: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdae: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdae: sdae1
Aug 14 13:52:06 seer kernel: SCSI device sdaf: 2107390464 512-byte hdwr sectors (1078984 MB)
Aug 14 13:52:06 seer kernel: sdaf: Write Protect is off
Aug 14 13:52:06 seer kernel: SCSI device sdaf: drive cache: write through w/ FUA
Aug 14 13:52:06 seer kernel: sdaf: sdaf1
Aug 14 13:52:06 seer kernel: scsi_wr_disk: unknown partition table
Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sda, sector 0
Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdi, sector 0
Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdq, sector 0
Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdy, sector 0
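
A log this long is easier to read once it is reduced to the devices that actually fail. The following is a minimal Python sketch that counts kernel "I/O error, dev ..." messages per device; the function name is hypothetical, and the sample lines are copied from the log above.

```python
import re

# Matches kernel lines such as "end_request: I/O error, dev sda, sector 0".
IO_ERROR = re.compile(r"I/O error, dev (\w+),")


def devices_with_io_errors(log_lines):
    """Return {device: error_count} for kernel 'I/O error, dev X' messages."""
    counts = {}
    for line in log_lines:
        match = IO_ERROR.search(line)
        if match:
            dev = match.group(1)
            counts[dev] = counts.get(dev, 0) + 1
    return counts


sample = [
    "Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sda, sector 0",
    "Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdi, sector 0",
    "Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdq, sector 0",
    "Aug 14 13:52:07 seer kernel: end_request: I/O error, dev sdy, sector 0",
    "Aug 14 13:52:06 seer kernel: sdb: Write Protect is off",
]
print(sorted(devices_with_io_errors(sample)))
```

In the log above, the only devices reporting these errors are sda, sdi, sdq and sdy, and each of them also reports "Logical unit not supported".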
Here's some extra info:
# /sbin/blkid | grep asm
/dev/sdc1: LABEL="ORCL1" TYPE="oracleasm"
/dev/sdk1: LABEL="ORCL1" TYPE="oracleasm"
/dev/sds1: LABEL="ORCL1" TYPE="oracleasm"
/dev/sdaa1: LABEL="ORCL1" TYPE="oracleasm"
/dev/dpdb1: LABEL="ORCL1" TYPE="oracleasm"
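
The blkid output above shows the same ASM label on five device nodes, which is what one would expect if four single-path sd devices and the multipath dpdb1 device all point at the same LUN. Here is a minimal Python sketch, assuming blkid output in the format shown, that groups device nodes by label; the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as: /dev/sdc1: LABEL="ORCL1" TYPE="oracleasm"
BLKID_LINE = re.compile(
    r'^(?P<dev>\S+):\s.*LABEL="(?P<label>[^"]*)".*TYPE="oracleasm"'
)


def group_by_label(blkid_output):
    """Map each oracleasm label to the device nodes that carry it."""
    groups = defaultdict(list)
    for line in blkid_output.splitlines():
        match = BLKID_LINE.match(line.strip())
        if match:
            groups[match.group("label")].append(match.group("dev"))
    return dict(groups)


sample = """/dev/sdc1: LABEL="ORCL1" TYPE="oracleasm"
/dev/sdk1: LABEL="ORCL1" TYPE="oracleasm"
/dev/sds1: LABEL="ORCL1" TYPE="oracleasm"
/dev/sdaa1: LABEL="ORCL1" TYPE="oracleasm"
/dev/dpdb1: LABEL="ORCL1" TYPE="oracleasm"
"""
for label, devices in group_by_label(sample).items():
    print(label, len(devices), devices)
```

A label that appears on more than one node is a hint that ASMLib may pick up a single-path device instead of the multipath one unless the scan order or exclusions say otherwise.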
I have learned that excluding devices in the oracleasm configuration file eliminates those I/O errors in /var/log/messages:
# cat /etc/sysconfig/oracleasm
# This is a configuration file for automatic loading of the Oracle
# Automatic Storage Management library kernel driver. It is generated
# By running /etc/init.d/oracleasm configure. Please use that method
# to modify this file
# ORACLEASM_ENABELED: 'true' means to load the driver on boot.
ORACLEASM_ENABLED=true
# ORACLEASM_UID: Default user owning the /dev/oracleasm mount point.
ORACLEASM_UID=oracle
# ORACLEASM_GID: Default group owning the /dev/oracleasm mount point.
ORACLEASM_GID=oinstall
# ORACLEASM_SCANBOOT: 'true' means scan for ASM disks on boot.
ORACLEASM_SCANBOOT=true
# ORACLEASM_SCANORDER: Matching patterns to order disk scanning
ORACLEASM_SCANORDER="dp sd"
# ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
ORACLEASM_SCANEXCLUDE="sdc sdk sds sdaa sda"
# ls -la /dev/oracleasm/disks/
total 0
drwxr-xr-x 1 root root 0 Aug 14 10:47 .
drwxr-xr-x 4 root root 0 Aug 13 15:32 ..
brw-rw---- 1 oracle oinstall 251, 33 Aug 14 13:46 ORCL1
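
It can help to check which devices the ORACLEASM_SCANEXCLUDE value above actually covers. The sketch below assumes that each pattern is matched as a prefix of the kernel device name; that matching rule is an assumption here, not something shown in the post, and the function name and the device list are chosen for illustration from names that appear in the log.

```python
def is_excluded(device, patterns):
    """True if the device name starts with any exclude pattern (assumed prefix match)."""
    return any(device.startswith(pattern) for pattern in patterns)


# Value taken from the /etc/sysconfig/oracleasm listing above.
scan_exclude = "sdc sdk sds sdaa sda".split()

# A few device names that appear in the kernel log and blkid output.
devices = ["sda", "sdi", "sdq", "sdy", "sdc", "sdk", "sds", "sdaa", "sdab", "dpdb"]

still_scanned = [d for d in devices if not is_excluded(d, scan_exclude)]
print(still_scanned)
```

Under that assumption, the pattern sda also covers sdaa through sdaf, while sdi, sdq and sdy would still be scanned. By the same logic, a broader pattern such as sd would leave only the dp devices in the scan.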
Now I can go into dbca and create the ASM instance, which starts up fine. When I create a new diskgroup, I see ORCL1 as a provisioned ASM disk, select it, and click OK.
CRASH!!! The box hangs and I have to reboot it.
I have gotten back to the point right before clicking OK, and here is what is in the ASM alert log so far:
Fri Aug 14 14:42:02 2009
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.1.0/db_1/dbs/arch
Autotune of undo retention is turned on.
IMODE=BR
ILAT =0
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.6.0.
Using parameter settings in server-side spfile /u01/app/oracle/product/11.1.0/db_1/dbs/spfile+ASM.ora
System parameters with non-default values:
large_pool_size = 12M
instance_type = "asm"
diagnostic_dest = "/u01/app/oracle"
Fri Aug 14 14:42:04 2009
PMON started with pid=2, OS id=3300
Fri Aug 14 14:42:04 2009
VKTM started with pid=3, OS id=3302 at elevated priority
VKTM running at (20)ms precision
Fri Aug 14 14:42:04 2009
DIAG started with pid=4, OS id=3306
Fri Aug 14 14:42:04 2009
PSP0 started with pid=5, OS id=3308
Fri Aug 14 14:42:04 2009
DSKM started with pid=6, OS id=3310
Fri Aug 14 14:42:04 2009
DIA0 started with pid=7, OS id=3312
Fri Aug 14 14:42:04 2009
MMAN started with pid=8, OS id=3314
Fri Aug 14 14:42:04 2009
DBW0 started with pid=9, OS id=3316
Fri Aug 14 14:42:04 2009
LGWR started with pid=6, OS id=3318
Fri Aug 14 14:42:04 2009
CKPT started with pid=10, OS id=3320
Fri Aug 14 14:42:04 2009
SMON started with pid=11, OS id=3322
Fri Aug 14 14:42:04 2009
RBAL started with pid=12, OS id=3324
Fri Aug 14 14:42:04 2009
GMON started with pid=13, OS id=3326
ORACLE_BASE from environment = /u01/app/oracle
Fri Aug 14 14:42:04 2009
SQL> ALTER DISKGROUP ALL MOUNT
Fri Aug 14 14:42:41 2009
At this point I don't want to click OK until I am sure someone is in the office to reboot the machine manually if I hang it again. I hung it twice yesterday; however, I did not have the devices excluded in the oracleasm configuration file then, as I do now.
Edited by: user10193377 on Aug 14, 2009 3:23 PM
Well, clicking OK hung it again, and I am waiting to get back in to see what new information might be gleaned.
Does anyone have any ideas on what to check or where to look? I will update more once I can log back in.

Hi Mark,
It looks like something is not correct with your raw device partition, based on these error messages:
Aug 14 13:52:06 seer kernel: Add. Sense: Logical unit not supported
Aug 14 13:52:06 seer kernel:
Aug 14 13:52:06 seer kernel: sda: test WP failed, assume Write Enabled
Aug 14 13:52:06 seer kernel: sda: asking for cache data failed
Aug 14 13:52:06 seer kernel: sda: assuming drive cache: write through
Aug 14 13:52:06 seer kernel: sda:end_request: I/O error, dev sda, sector 0
It could be a number of things. I would check with your vendor and Oracle Support to see whether the multipath software driver is supported and whether there is a potential workaround for ASM. Sorry this is not quite a solution, but it's what comes to mind based on issues with multipath software and storage vendors for ASM with Linux and Oracle. Have you checked the validation matrix available on Metalink?
Cheers,
Ben
