Database Crash/Recovery ?
Our NT server crashed with the Oracle database on it. We were able to back up the physical data files (user datafile, redo log file and control file).
Is there a way we can install Oracle, create a database and make use of these Datafiles to recover the data ?
Mandar Walvekar
Absolutely.
Do this --
1. Restore the datafiles and redo log files from tape to disk, in exactly their original locations.
2. Install only the Oracle software, without the option to create a database.
3. Create an instance using the oradim command-line utility.
4. Make the necessary edits to the restored init.ora initialization file.
Start up the database using this init.ora.
Try this. If it doesn't work, the next step is to recreate the control files and check again.
Similar Messages
-
Database Crash - Media Recovery Required - [SOLVED]
I am trying to understand why media recovery was required when our database crashed (this was 2 months ago). This is just for my understanding, and I would appreciate it if you could explain it to me.
We have a 10g database on Solaris 10 with an EMC SAN for storage. One of the technicians came in to look at an electrical issue and accidentally turned off the power supply to the SAN.
This caused the database instance to crash.
When the DBA tried to start up the database, he said he was getting errors that some datafiles were corrupted, and concluded that media recovery was required.
He restored the backups taken 12 hours earlier and applied the archived redo logs. Everything was fine except for the downtime (the whole process took 6 hours).
I have several questions:
1. Why would the datafiles get corrupted? Was the disk in the process of writing and unable to complete the task?
2. Why did the database not do crash recovery, i.e. use the redo logs? Why did we have to use a backup?
Thanks.
Message was edited by:
sayeed
>>1. why would the datafiles get corrupted? was the disk in the process of writing and could not complete the task?
This can happen if a datafile is being accessed at the moment of the power failure; any file can be corrupted that way. In this case it was a datafile, which resulted in the database crash.
>>2. why did the database not do crash recovery i.e use redo logs? why did we have to use backup?
Since the file is corrupted, you need to replace it with the last backup and apply the logs to bring it to a consistent state.
Again, this is what I have understood so far; correct me if I am wrong.
-
ORACLE 9I DATABASE - CRASH & RECOVER
I AM USING ORACLE 9I FOR LINUX IN A DUAL XEON PROCESSOR SERVER. NOW DATABASE DOES NOT OPEN PROPERLY AND SHOW FOLLOWING MESSAGES:-
1) for SYSDBA user message for startup database
SQL*Plus: Release 9.2.0.1.0 - Production on Thu Aug 21 18:49:14 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Enter user-name: sys as sysdba
Enter password:
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 621875780 bytes
Fixed Size 451140 bytes
Variable Size 335544320 bytes
Database Buffers 285212672 bytes
Redo Buffers 667648 bytes
Database mounted.
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
SQL>
2) Then we reset the logs the message is:
SQL> alter database open resetlogs;
alter database open resetlogs
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/C/app/oracle9ir2/oradata/tepiapp/system01.dbf'
SQL>
3) When connecting with different user the message is :
Enter user-name: mis-dba
Enter password:
ERROR:
ORA-01033: ORACLE initialization or shutdown in progress
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
[oracle@tepiapp oracle]$
Please guide me to a detailed solution for this problem.
Regards,
FSL
Hmm, OK, so it seems that you don't use RMAN to back up your database.
It seems that you have a controlfile problem after a disk crash. I hope your database is in ARCHIVELOG mode.
Did your database crash during activity, or was it idle?
1. Mount the database
2. set autorecovery on ;
recover database using backup controlfile ;
media recovery complete <= this means that all is ok now
3. alter database open resetlogs ;
If step 2 fails because of a redo log problem, such as a corrupted redo log or transactions left "blocked" in the redo logs by a crash during database activity, Oracle will tell you that a redo log is corrupted and give you the time of the redo log entry it needs for recovery.
You will then have to perform an incomplete recovery until a given time, which means you will lose some data (not much, but some).
For example, if your redo entry was corrupted at 12:00:50:
1. recover database until time '2003-08-25:12:00:00' ;
2. alter database open resetlogs ;
3. Make a full backup of your database !
Fred -
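The incomplete-recovery steps above depend on choosing a cutoff just before the corrupted redo entry. A minimal sketch in Python, assuming the cutoff is simply rounded down to the last whole minute before the reported time; the function names are hypothetical, and the generated text is only meant to be reviewed by a DBA before being typed into SQL*Plus:

```python
from datetime import datetime, timedelta


def until_time_before(corrupt_at: datetime) -> str:
    """Return a cutoff strictly before corrupt_at, truncated to the minute,
    in the 'YYYY-MM-DD:HH24:MI:SS' form used in the post above."""
    cutoff = (corrupt_at - timedelta(seconds=1)).replace(second=0, microsecond=0)
    return cutoff.strftime("%Y-%m-%d:%H:%M:%S")


def recovery_script(corrupt_at: datetime) -> str:
    """Build the two statements from the post for a given corruption time."""
    cutoff = until_time_before(corrupt_at)
    return "\n".join([
        f"recover database until time '{cutoff}';",
        "alter database open resetlogs;",
    ])


if __name__ == "__main__":
    # Corruption reported at 12:00:50, as in the example above.
    print(recovery_script(datetime(2003, 8, 25, 12, 0, 50)))
```

Rounding down to the minute is only one possible policy; any time safely before the corrupted entry works, at the cost of losing the changes made after it.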
Crash recovery of production db very slow
We had to shut down a production database with db2_kill because it could not be stopped normally and had a problem with a full FAILARCHPATH. (After the TSM server had problems, archiving to TSM no longer succeeded, even once the TSM server was up again. We have had this problem before.)
The crash recovery is taking very long. Sometimes even db2 list utilities <show details> seems to hang.
With db2pd -everything I can see the progress of the crash recovery:
Database Partition 0 -- Database PC1 -- Active -- Up 0 days 01:57:14 -- Date 05/07/2008 11:34:59
Recovery:
Recovery Status 0x00000C01
Current Log S0003363.LOG
Current LSN 061F2B330DBA
Job Type CRASH RECOVERY
Job ID 1
Job Start Time (1210145904) Wed May 7 09:38:24 2008
Job Description Crash Recovery
Invoker Type User
Total Phases 2
Current Phase 1
Progress:
Address            PhaseNum Description StartTime               CompletedWork   TotalWork
0x000000020018E580 1        Forward     Wed May 7 09:38:24 2008 786766439 bytes 1998253346 bytes
0x000000020018E670 2        Backward    NotStarted              0 bytes         1998253346 bytes
So the database has now processed about 39% of the bytes of the forward phase, and the backward phase is still to come!
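The progress figures can be read straight from the two phase rows of the db2pd output. A minimal sketch in Python that parses rows shaped like the ones above and reports per-phase and overall completion; the regular expression and function names are assumptions written for this output only, and bytes processed are not necessarily a good proxy for time remaining:

```python
import re

# The two phase rows exactly as they appear in the db2pd output above.
SAMPLE = """\
0x000000020018E580 1 Forward Wed May 7 09:38:24 2008 786766439 bytes 1998253346 bytes
0x000000020018E670 2 Backward NotStarted 0 bytes 1998253346 bytes
"""

# address, phase number, description, start time (ignored),
# then "<completed> bytes <total> bytes" at the end of the line
PHASE_RE = re.compile(
    r"^0x[0-9A-Fa-f]+\s+(\d+)\s+(\w+)\s+.*?(\d+) bytes\s+(\d+) bytes\s*$"
)


def parse_phases(text):
    """Return a list of (phase_number, description, completed, total)."""
    phases = []
    for line in text.splitlines():
        match = PHASE_RE.match(line.strip())
        if match:
            num, desc, done, total = match.groups()
            phases.append((int(num), desc, int(done), int(total)))
    return phases


def percent(done, total):
    """Completion in percent; 0.0 when the total is unknown or zero."""
    return 100.0 * done / total if total else 0.0


if __name__ == "__main__":
    phases = parse_phases(SAMPLE)
    for num, desc, done, total in phases:
        print(f"phase {num} ({desc}): {percent(done, total):.1f}%")
    overall = percent(sum(p[2] for p in phases), sum(p[3] for p in phases))
    print(f"overall: {overall:.1f}%")
```

Summing both phases treats the backward phase as if it had to process the full byte count, which is how the reported total is laid out but need not reflect the real work involved.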
In db2diag.log there are no further entries after the start of the crash recovery at 09:38.
We have moved one log file out of the FAILARCHPATH directory (which was 100% full) to a different directory, to be sure that the slow crash recovery has nothing to do with the full FAILARCHPATH.
The log_dir directory has 20 log files (LOGPRIMARY + LOGSECOND) in it; no more could be allocated there because log_dir is sized according to the log parameters.
Parameter UTIL_HEAP_SZ = 150.000
Does anybody have an idea why the crash recovery is so slow?
Kind regards,
Uta
Hello Ralph,
the required log files were all there and we did not need to restore any log files from TSM (the "active" log files needed for crash recovery should always reside in log_dir).
At 2008-05-07-14.17.07.357544 crash recovery completed successfully.
At 2008-05-07-13.56.41.297552 the database started archiving to TSM again:
ADM1844I Started archive for log file "S0003329.LOG".
According to my DBA colleagues, the crash recovery was only 50% finished and then suddenly everything was done. Since "db2 list utilities" takes both the forward and the backward phase into account for its percentage, I assume that the backward phase was very fast.
The DBA colleagues also noticed that log_dir contained log files which had already been archived to TSM. They moved them out of log_dir, and additional log files could then be allocated (before that, no additional log file could be allocated). I cannot say whether this was the reason the recovery finished afterwards.
The only problem is that the database does not want to archive log files S0003329 to S0003350. It is also strange that log file 3329 was archived to the FAILARCHPATH successfully yesterday,
2008-05-06-12.27.10.316403+120 E4284459A420 LEVEL: Warning
PID : 3907 TID : 1 PROC : db2logmgr (PC1) 0
INSTANCE: db2pc1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3170
MESSAGE : ADM1846I Completed archive for log file "S0003329.LOG" to
"/db2/PC1/log_archive/db2pc1/PC1/NODE0000/C0000009/" from
"/db2/PC1/log_dir/".
and now the db searches in the log_dir:
2008-05-07-13.57.02.525715+120 E25224816A315 LEVEL: Warning
PID : 28182 TID : 1 PROC : db2logmgr (PC1) 0
INSTANCE: db2pc1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3108
MESSAGE : ADM1844I Started archive for log file "S0003329.LOG".
2008-05-07-13.57.02.526949+120 I25225132A364 LEVEL: Error
PID : 28182 TID : 1 PROC : db2logmgr (PC1) 0
INSTANCE: db2pc1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogVendor, probe:1630
RETCODE : ZRC=0x860F000A=-2045837302=SQLO_FNEX "File not found."
DIA8411C A file "" could not be found.
2008-05-07-13.57.02.527866+120 E25225497A367 LEVEL: Warning
PID : 28182 TID : 1 PROC : db2logmgr (PC1) 0
INSTANCE: db2pc1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3150
MESSAGE : ADM1848W Failed archive for log file "S0003329.LOG" to "TSM chain 9"
2008-05-07-13.57.02.528352+120 I25225865A370 LEVEL: Error
PID : 28182 TID : 1 PROC : db2logmgr (PC1) 0
INSTANCE: db2pc1 NODE : 000
FUNCTION: DB2 UDB, data protection, sqlpgArchiveLogFile, probe:3160
MESSAGE : Failed to archive log file S0003329.LOG to TSM chain 9 from
/db2/PC1/log_dir/ with rc = -2045837302.
and this was not one of the log files that my colleague moved out of log_dir.
Has anybody seen a situation where the database could not archive from FAILARCHPATH to TSM after a failure? We don't want to have to check every FAILARCHPATH after TSM failures.
Kind regards,
Uta -
Location of oracle database crash date time
Hi,
After a system crash happens and the Oracle database has been recovered from it, where can I find the entry in Oracle that shows the time of the system crash?
I tried the following to get the system crash date and time:
When I audit a user by logon and the user logs on, Oracle creates an entry in the dba_audit_session table with the logon time in the TIMESTAMP column. But if the system crashes while the user is logged in, no entry is made in the LOGOFF_TIME column of dba_audit_session.
In the following example, the orcl user is being audited by logon.
orcl logs in to the system at 11:36 am and my system crashes.
When orcl logs in again after instance recovery, logoff_time is blank and a new entry for orcl is shown.
SQL > select username,to_char(timestamp,'dd-mm-yyyy hh:mi:ss'),to_char(logoff_time,'dd-mm-yyyy hh:mi:ss') from dba_audit_session where username like 'TMS' order by timestamp;
USERNAME TO_CHAR(TIMESTAMP,' TO_CHAR(LOGOFF_TIME
ORCL 16-10-2012 11:36:16
ORCL 16-10-2012 11:46:33
My aim is to get the date & time of the database crash.
Hi;
As mentioned here, the only way is to check the alert.log. If ASM is in use, you also need to check the ASM log and related log files.
If you have OSWatcher on your system, you can also check which processes were running and what happened on your server.
PS: Please don't forget to change the thread status to answered, if possible, when you believe your thread has been answered; this keeps other forum users from losing time on open questions that have in fact been answered. Thanks for understanding.
Regards
Helios -
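Checking the alert.log by hand can be narrowed down with a small script. A minimal sketch in Python, assuming the old text alert log layout in which a timestamp line such as "Mon Jun 16 06:47:10 2008" precedes each entry; the sample lines are invented for illustration and the function names are hypothetical. The log does not record the crash itself, so the sketch only brackets it: the last entry written before the restart and the moment crash recovery began.

```python
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Mon Jun 16 06:47:10 2008"

# Invented sample in the old text alert log layout (not from a real system).
SAMPLE = """\
Tue Oct 16 11:40:02 2012
Thread 1 advanced to log sequence 42
Tue Oct 16 11:45:10 2012
Starting ORACLE instance (normal)
Tue Oct 16 11:45:31 2012
alter database open
Tue Oct 16 11:45:32 2012
Beginning crash recovery of 1 threads
"""


def parse_ts(line):
    """Return a datetime if the line is a timestamp line, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def crash_windows(lines):
    """Yield (last_entry_before_restart, crash_recovery_start) pairs."""
    current = None               # timestamp of the entry being read
    previous = None              # timestamp of the entry before it
    last_before_restart = None
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            previous, current = current, ts
            continue
        if line.startswith("Starting ORACLE instance"):
            last_before_restart = previous
        elif line.startswith("Beginning crash recovery"):
            yield last_before_restart, current
            last_before_restart = None


if __name__ == "__main__":
    for before, recovery in crash_windows(SAMPLE.splitlines()):
        print(f"crash happened after {before} and before recovery at {recovery}")
```

Newer releases write the alert log in a different (XML and ISO-timestamp) layout, so the timestamp format would need adjusting there.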
Aborting crash recovery due to error 354
Help, What do I do?
alter database open
Mon Jun 16 06:47:10 2008
Beginning crash recovery of 1 threads
Mon Jun 16 06:47:10 2008
Started redo scan
Mon Jun 16 06:47:34 2008
Errors in file c:\oraclexe\app\oracle\admin\xe\udump\xe_ora_3496.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 18350 change 56285546348545 time 06/04/2008 13:42:02
ORA-00334: archived log: 'C:\ORACLEXE\APP\ORACLE\FLASH_RECOVERY_AREA\XE\ONLINELOG\O1_MF_2_2R411073_.LOG'
Mon Jun 16 06:47:34 2008
Aborting crash recovery due to error 354
Mon Jun 16 06:47:34 2008
Errors in file c:\oraclexe\app\oracle\admin\xe\udump\xe_ora_3496.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 18350 change 56285546348545 time 06/04/2008 13:42:02
ORA-00312: online log 2 thread 1: 'C:\ORACLEXE\APP\ORACLE\FLASH_RECOVERY_AREA\XE\ONLINELOG\O1_MF_2_2R411073_.LOG'
ORA-354 signalled during: alter database open...
Mon Jun 16 07:09:38 2008
db_recovery_file_dest_size of 10240 MB is 0.98% used.
You are certified but you don't know how to recover a database with corrupt redo logs? That is my problem with certificates: they don't tell you jack about the true knowledge of the DBA and his problem-solving skills. (Or is this exactly what you are saying with "book knowledge"? I am not a native English speaker, so I might have misunderstood.)
Just enter your ORA error in a Metalink search and you will find notes that explain exactly what to do (the how-to's you mention ;))
-
Dear Members
Let's look at a scenario.
I have a primary database and a standby database.
My production database crashed.
How can I recover my standby database up to the last committed transaction? How can I get my transactions from the online redo logs that were on the primary database?
My primary database server has crashed and I can't log in to the primary database.
Are the standby database's redo logs in sync with the primary database's online redo logs?
Please help me out.
Regards
Hi,
You first need to check whether all redo logs up to that point in time have been applied to the DB.
Check whether any logs still need to be applied. If all the logs are applied, you can stop real-time apply on the standby, cancel the recovery on the standby, and try to open the DB in read-write mode for access.
Is there any other standby apart from this one?
If not, first take a backup of this standby immediately; then recover the primary DB up to the time of the crash and try to mount it.
Then you can make it the standby.
- Pavan Kumar N
Oracle 9i/10g - OCP
http://oracleinternals.blogspot.com/ -
Can anyone tell me the scenarios in which database crashes
hi
Can anyone tell me the scenarios in which a database crashes, other than backup and recovery concepts, when the database goes live?
regards,
swapna.s
Those sorts of errors would mostly be TNS errors when the database server runs out of resources (memory configured for Oracle, swapping, the limit on the number of network connections), etc.
Other than that, if the application is badly written (which I hope is not the case), users may face deadlock situations, but those are handled by Oracle automatically.
Of course, these errors can be handled by properly tuning the database or the application.
-
TABLESPACE BACKUP - Database crash
Hello guys,
I have a little question about online/hot backup and a crashing instance.
The following worst case happens:
1) Tablespace TABSP_USER01 is put into backup mode at 03:00 am
2) There is a problem while copying the datafiles, and tablespace TABSP_USER01 stays in backup mode the whole time
3) Database crashes at 04:30 am and the tablespace TABSP_USER01 is still in backup mode
4) While the online/hot backup was running, an archive log backup ran at 04:00 am and the saved archive log files were deleted from disk
I know what happens to the datafiles during the online backup: the SCN is frozen when backup mode is set, and complete block images are written to the redo log files.
All dirty blocks are written to the datafiles while backup mode is on, but the SCN is not updated.
But my questions are now:
1) If I restart my crashed instance at 05:03 am, are the archive logs (which were already backed up and deleted) needed for checking at startup?
2) Does Oracle verify the data in the redo/archive logs with the ones in the datafiles?
3) Or does Oracle only set the current SCNs in the datafile headers (the current SCN being taken from the controlfile)?
Thanks and Regards
Stefan
>>1) If I restart my crashed instance at 05:03 am, are the archive logs (which were already backed up and deleted) needed for checking at startup?
Yes, it will ask for recovery at the mount stage. You need to take the backup again.
>>2) Does Oracle verify the data in the redo/archive logs with the ones in the datafiles? & 3) Or does Oracle only set the current SCNs in the datafile headers (the current SCN being taken from the controlfile)?
The answer is yes and no. It checks both the header and the information.
Since it is asking for recovery, it follows that it has found the header out of sync and will apply all the changes to bring it into sync (all the sequences are captured in the redo logs and archived logs).
Kindly go through this document for clarification:
http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14220/backrec.htm#i1007289
Thanks
-- Raman -
Estimation/Indication of Database Crash
OS Information:Any version
OS Information:Windows and/or Linux
Hello,
This is an open question: can you tell me the errors/warnings (at least 3) which, if present in the alert log right now, mean that your database will crash within the next hour? I mean which ORA-NNNNN or other alert log messages indicate that your database (not instance) is definitely going to crash within the next hour, so that I can take a backup or export. Please let me know whether the question is clear.
If you can elaborate with respect to different Oracle versions and/or operating systems, that would be an additionally helpful reply.
Regards
Girish Sharma
"there is already a service disruption"
If the archiver stops, there is no service disruption until all redo log groups are full, so you do have time to fix the problem before the database hangs.
Most archiver hangs are due to a full disk, which you can avoid if you monitor disk usage and flashback area usage, and have automatic jobs to back up and purge archivelogs.
Archiver hangs also happen due to bugs, controlfile locks, disk errors (NFS etc.) or network errors (Data Guard), and in these cases you can't predict them with monitoring, so you won't know about them until you get alert log errors.
Another error with a similar result would be ORA-19809 :-
ARC0: Error 19809 Creating archive log file to '/oraexport/d1122/DB1122/archivelog/2010_01_06/o1_mf_1_240_%u_.arc'
Wed Jan 06 16:05:00 2010
Errors in file /opt/oracle/product/diag/rdbms/db1122/db1122/trace/db1122_arc2_3408.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 104857600 bytes is 100.00% used, and has 0 remaining bytes available.
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
Errors in file /opt/oracle/product/diag/rdbms/db1122/db1122/trace/db1122_arc2_3408.trc:
ORA-19809: limit exceeded for recovery files
ORA-19804: cannot reclaim 35865088 bytes disk space from 104857600 limit -
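Monitoring for this condition can start from the usage line itself. A minimal sketch in Python that extracts the limit and percentage from a db_recovery_file_dest_size message, covering the two wordings that appear in the logs quoted on this page; the 85% threshold and the function names are assumptions chosen for illustration, not Oracle defaults:

```python
import re

# Matches both wordings seen in the alert logs quoted on this page:
#   "... of 104857600 bytes is 100.00% used, and has 0 remaining bytes available."
#   "... of 10240 MB is 0.98% used."
USAGE_RE = re.compile(
    r"db_recovery_file_dest_size of (\d+) (bytes|MB) is ([\d.]+)% used"
)


def recovery_area_usage(line):
    """Return limit, unit and percent used from an alert log line, or None."""
    match = USAGE_RE.search(line)
    if match is None:
        return None
    return {
        "limit": int(match.group(1)),
        "unit": match.group(2),
        "pct_used": float(match.group(3)),
    }


def should_alert(usage, threshold_pct=85.0):
    """True when usage was parsed and is at or above the (assumed) threshold."""
    return usage is not None and usage["pct_used"] >= threshold_pct


if __name__ == "__main__":
    full = ("ORA-19815: WARNING: db_recovery_file_dest_size of 104857600 bytes "
            "is 100.00% used, and has 0 remaining bytes available.")
    usage = recovery_area_usage(full)
    print(usage, should_alert(usage))
```

Scanning the alert log only catches the problem once Oracle has already logged it; polling the recovery area usage from the database on a schedule would give earlier warning.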
If ur database crash u have only export of database how u create database f
if ur database crash u have only export of database how u create database from starting with help of export give steps
What is an "export give steps?"
If ur's database crashes, how is it u's responsibility to restore it?
Now, if my database crashes, I just restart it in crash recovery mode and not even worry about the backup or export. It is not needed. -
Crash recovery: recreating invalid indexes
I want to discuss a phenomenon that we have often seen during crash recoveries:
The "startsap db" hangs because, after the crash recovery itself has completed (according to db2diag.log), the connection to the database still hangs while the database recreates invalid indexes.
sf503:db2p02 10% db2 list utilities
ID = 2
Type = RESTART RECREATE INDEX
Database Name = P02
Partition Number = 0
Description = Recreating Invalid Index Objects
Start Time = 05/08/2007 09:37:42.721072
Sometimes it can take up to 25 minutes until this is finished and the connect completes.
The strange thing is:
If a second connection to the database is made during "recreating indexes", this second connection is successful and selects from t000 succeed.
So, it is possible to execute "startsap r3" while the first connection is still blocked with recreating indexes.
I don't understand this behaviour:
- If the database really needs to recreate the invalid index objects for successful database operation, then it should block ALL connections and not only the first one.
- If it's not urgently necessary, the database should recreate the indexes in the background and not block the first connection.
Also, in the IBM Recovery & High Availability handbook I found nothing about the "recreating invalid index objects" feature.
Kind regards
Uta
Hello Jens,
thanks for the detailed answer!
As you wrote, the only workaround, changing the db cfg parameter from "restart" to "access", needs some manual intervention if we don't want to risk exceeding the maximum dialog runtime. So for now we will leave it as it is. We have a workaround of starting SAP in a second window once the crash recovery is finished according to db2diag.log and the first connection (startsap db) is still recreating invalid indexes.
How can one determine which invalid indexes need recreation?
Joachim: we don't have a real problem at our site. Fortunately we have crash recoveries only in the rare cases of system crashes, etc., so we are lucky not to have them every day.
Hopefully the behaviour will be changed to background index recreation in some later DB release.
-
Hello,
Who initiates crash recovery? The SMON process or some other process?
In earlier versions of Oracle (Oracle 7.3), it was instance recovery.
Now in Oracle 9i there are two types of recovery: crash recovery and instance recovery.
I want to know whether, prior to Oracle 9i, there were two such types of recovery or only one.
Thanks.
crash recovery
The automatic application of online redo records to a database after either a single-instance database crashes or all instances of an Oracle Real Application Clusters configuration crash. Crash recovery only requires redo from the online logs: archived redo logs are not required.
In crash recovery, an instance automatically recovers the database before opening it. In general, the first instance to open the database after a crash or SHUTDOWN ABORT automatically performs crash recovery.
instance recovery
In an Oracle Real Application Clusters configuration, the application of redo data to an open database by an instance when this instance discovers that another instance has crashed.
Hope it helps.
http://download.oracle.com/docs/cd/B10501_01/server.920/a96519/glossary.htm#432431
Adith -
Crash recovery everytime starting an instance
Is it normal that crash recovery is started every time I start
an Oracle instance on a WinNT 4.0 system?
The xxxALRT.log file contains messages like this:
alter database open
Beginning crash recovery of 1 threads
Thread recovery: start rolling forward thread 1
and so on ...
The database startup takes a long time, but after that the db
works normally.
Oracle always does recovery processing on startup. If you
shutdown abort, your startup will take longer because there is
more recovery processing that has to be done. Shutting down
immediate takes longer because it performs cleanup processing, so
that when you start up not as much recovery processing is needed.
-
Crash recovery/ instance recovery
Hi,
How does Oracle identify that crash recovery/instance recovery
is required?
Regards,
Mathew
Hi,
>>But how does Oracle identify that the database went down abnormally and crash recovery is required?
I think it is that the checkpoint information is out of sync between the redo log files and the datafiles. It is necessary to understand what a checkpoint is and what the CKPT process does. A checkpoint is a moment in time when all the changes (dirty blocks) made in the database buffer cache are written to the data files. The checkpoint is performed by the CKPT process, and it creates an entry in the control file to identify the point in the online redo log file from which instance recovery should begin in case of an instance failure. One of the ways a checkpoint is initiated is by the database writer (DBWR) process. The DBWR process initiates a checkpoint by writing all modified data blocks in the data buffers (dirty buffers) to the data files. After a checkpoint is performed, all committed transactions are written to the data files. If the instance were to crash at this point, only new transactions that occurred after this checkpoint would need to be applied to the database to enable a complete recovery. Therefore, the checkpoint process determines which transactions from the redo logs need to be applied to the database in the event of a failure and subsequent recovery.
Cheers
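The comparison of checkpoint information described above can be pictured with a toy model. This is a deliberately simplified sketch in Python, not Oracle's actual logic: it assumes that the control file tracks, per datafile, the checkpoint SCN it expects and a stop SCN that is only set on a clean shutdown. All class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Datafile:
    name: str
    header_checkpoint_scn: int        # SCN recorded in the datafile header
    controlfile_checkpoint_scn: int   # SCN the control file expects for this file
    stop_scn: Optional[int]           # set on clean shutdown, None while open


def recovery_needed(files: List[Datafile]) -> str:
    """Classify startup as needing 'media', 'crash' or no ('none') recovery."""
    # A header older than the control file expects suggests a file restored
    # from backup: online redo alone may not cover the gap (toy assumption).
    if any(f.header_checkpoint_scn < f.controlfile_checkpoint_scn for f in files):
        return "media"
    # No stop SCN means the files were never closed cleanly: redo generated
    # after the last checkpoint has to be replayed from the online logs.
    if any(f.stop_scn is None for f in files):
        return "crash"
    return "none"


if __name__ == "__main__":
    clean = [Datafile("system01.dbf", 500, 500, 500)]
    crashed = [Datafile("system01.dbf", 500, 500, None)]
    restored = [Datafile("system01.dbf", 300, 500, None)]
    for label, files in (("clean", clean), ("crashed", crashed), ("restored", restored)):
        print(label, "->", recovery_needed(files))
```

The three cases line up with the threads on this page: a clean shutdown needs nothing, an instance crash is handled automatically from the online redo logs, and a restored or damaged datafile calls for media recovery with archived logs.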