RMAN Hanging

Oracle version: 10.2.0.4
OS version: Sun SPARC 5.10
Backup type: Differential incremental backup
We are on a 2-node RAC. Our RMAN backup fails at least twice every week on random days.
When I checked the log, it always hangs here:
crosschecked backup piece: found to be 'AVAILABLE'
backup piece handle=/u05/lmprod_p4lor85q_1_1.bkp recid=9945 stamp=730702011
crosschecked backup piece: found to be 'AVAILABLE'
backup piece handle=/u05/ctrl_file/RMAN_CTL_10022:730702065:1.bkp recid=9946 stamp=730702072
crosschecked backup piece: found to be 'AVAILABLE'
backup piece handle=/u05/lmprod_control_c-510928216-20100926-00.ctl recid=9947 stamp=730702087
Crosschecked 44 objects
released channel: ORA_DISK_1
released channel: ORA_DISK_2
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=1270 instance=lmprod2 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=2639 instance=lmprod1 devtype=DISK
--------hanging
Running ps -ef | grep rman shows that the RMAN process is still present. Every time RMAN has been hung for more than 5 hours, I kill these hung processes.
Our backup location, /u05, is on an NFS mount. I have checked the cross-mounting of the archive log locations; it is fine.
How do I diagnose and fix this issue?
Edited by: Jack on Sep 27, 2010 5:27 AM
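
The manual check described above (look at ps, kill after 5 hours) could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, the 5-hour default simply mirrors the threshold mentioned in the post, and it assumes bash plus a ps that accepts -o pid,etime,args.

```shell
#!/bin/bash
# Convert a ps "etime" value ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
  local t="$1" days=0 h=0 m=0 s=0
  case "$t" in
    *-*) days="${t%%-*}"; t="${t#*-}" ;;
  esac
  local IFS=:
  set -- $t
  if [ "$#" -eq 3 ]; then
    h="$1"; m="$2"; s="$3"
  else
    m="$1"; s="$2"
  fi
  # 10# forces base 10 so values such as "08" are not read as octal.
  echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Print PID, elapsed time and command line of processes whose command
# line contains "rman" and that are older than the given number of hours
# (hypothetical default: 5). It only lists them; it does not kill anything.
list_long_running_rman() {
  local limit_hours="${1:-5}" pid etime args secs
  ps -eo pid,etime,args | grep '[r]man' |
  while read -r pid etime args; do
    secs=$(etime_to_seconds "$etime")
    if [ "$secs" -ge $(( limit_hours * 3600 )) ]; then
      echo "$pid $etime $args"
    fi
  done
}
```

Calling list_long_running_rman 5 from cron and mailing the output would at least flag the hang earlier; note that the grep also matches unrelated commands that merely contain "rman".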

Hello Jack,
Try attaching a system call tracer to the RMAN process in question and see exactly which system call(s) RMAN is waiting on.
I'm not very familiar with Solaris, but in AIX this command is called truss and can be run, for example, as:
AIX> truss -o <output file> -D -p <pid>
A similar command must exist in Solaris.
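
For what it is worth, Solaris ships a truss command as well, and it accepts the same -o, -D and -p options. A small sketch that builds the command line for a given PID is below; the helper name and the default output path are assumptions for illustration only, and the helper prints the command rather than running it so that it can be reviewed before attaching to a production process.

```shell
#!/bin/bash
# Print (not run) a truss command line for the given PID.
# The default output file location is an illustrative assumption.
truss_cmd() {
  local pid="$1"
  local out="${2:-/tmp/rman_truss_${pid}.out}"
  printf 'truss -o %s -D -p %s\n' "$out" "$pid"
}
```

Run the printed command as the owner of the process (or root), let it sit while RMAN appears hung, then interrupt it and look at the last system calls in the output file.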

Similar Messages

Rman hangs at exit if locally connected

Hi,
I have a nasty problem. If I connect RMAN locally ('rman nocatalog target /'), it hangs at exit.
I also tried connecting from the RMAN prompt, with the same result.
I'm using 8.1.6.1.0 on Linux.
# su - oracle
[oracle]$ rman nocatalog
Recovery Manager: Release 8.1.6.1.0 - Production
RMAN> connect target;
RMAN-06005: connected to target database: ACCSRV02 (DBID=1987850870)
RMAN-06009: using target database controlfile instead of recovery catalog
RMAN> exit
Recovery Manager complete.
(rman hangs here)
Can anyone confirm this? Is there a workaround? Any hints?
thanks for your time,
ciao -ap

Hi!
I got the same results on Linux RH 6.2 with 8.1.6.1.0; with 8.1.7 it works fine ;)
You're right!
A10!

RMAN hangs when started at Linux command prompt

Running 9.2.0.4 on Red Hat Linux 8.0. Has anyone experienced RMAN hanging when "rman" is entered at the command prompt?
Thanks!

Ah!!!! I think I had the same once! Use the Linux command 'which rman' to determine what is actually being run! I bet it is some other Linux program with the same name instead. If so, change your PATH settings.
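
On some Linux installations an unrelated program that is also called rman (a man page converter shipped with X11) sits in a directory that comes before $ORACLE_HOME/bin in PATH. A minimal sketch of the check suggested above follows; the function name and return codes are made up for this example, and it assumes ORACLE_HOME is set in the environment.

```shell
#!/bin/bash
# Report which "rman" the shell would run and whether it is the one in
# $ORACLE_HOME/bin. Returns 0 if it is, 1 if none is found, 2 otherwise.
check_rman_path() {
  local resolved
  resolved="$(command -v rman)" || { echo "rman not found in PATH"; return 1; }
  case "$resolved" in
    "${ORACLE_HOME}/bin/rman") echo "OK: $resolved" ;;
    *) echo "WARNING: $resolved is not \$ORACLE_HOME/bin/rman"; return 2 ;;
  esac
}
```

If the warning shows up, either call $ORACLE_HOME/bin/rman by its full path or move $ORACLE_HOME/bin to the front of PATH.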

RMAN hangs on v$rman_status select

RMAN hangs after performing a backup or even simple commands.
I found that the "guilty" sql is:
select /*+ rule */ round(sum(MBYTES_PROCESSED)), round(sum(INPUT_BYTES)), round(sum(OUTPUT_BYTES)) from V$RMAN_STATUS START WITH RECID = :row_id and STAMP = :row_stamp CONNECT BY PRIOR RECID = parent_recid
I tried to do a simple select on v$rman_status, which also hangs.
I couldn't even desc v$rman_status.
I tried to gather statistics on the underlying tables (X$KRBMRST, X$KSFQP, X$KCCRSR), to no avail (the problem has been described for Oracle 10.1, see DocID 375386.1).
Another possible solution I found would be to recreate the control file, which is not the best solution on a production database.
Environment: Oracle SE 11.2 on Suse SLES 10.2
Any idea ?
Edited by: user10770996 on Feb 28, 2010 5:38 PM

Thanks for your help Hemant,
The V$SESSION view just informs the following for the "waiting" session:
EVENT SQL*Net message from client
STATE WAITED KNOWN TIME
WAIT_CLASS Idle
As this doesn't seem very useful to me, I used the oradebug utility to run a hang analysis (which should show any latch contention, shouldn't it?).
Here is (part of) the result:
is not in a wait:
last wait: 40 min 45 sec ago
blocking: 0 sessions
current sql: select * from v$rman_status
wait history:
1. event: 'SQL*Net message from client'
time waited: 38.355161 sec
wait id: 13 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000001 sec
2. event: 'SQL*Net message to client'
time waited: 0.000001 sec
wait id: 12 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000012 sec
3. event: 'SQL*Net message from client'
time waited: 0.000017 sec
wait id: 11 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
To complement this, I looked at the corresponding process at the OS level using a simple top -p. This showed constant CPU consumption and increasing memory usage.
It seems that the process is not blocked but is lost in its processing, like an infinite loop...

RMAN hangs during restore using TSM

Hi all,
I need some advice. My database version is 11.2.0.2.0. The TSM client version is 5.5.2.
I'm testing a restore using Tivoli Storage Manager. I'm trying to restore a specific tablespace using RMAN:
RMAN> connect target /
connected to target database (not started)
RMAN> run {
2> startup mount;
3> allocate channel t1 device type sbt_tape parms 'ENV=(TDPO_OPTFILE=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)';
4> restore tablespace 'ECRIXAR';
5> recover tablespace 'ECRIXAR';
6> alter database open;
7> }
RMAN-06196: Oracle instance started
RMAN-06199: database mounted
Total System Global Area 2137886720 bytes
Fixed Size 2221336 bytes
Variable Size 1308625640 bytes
Database Buffers 822083584 bytes
Redo Buffers 4956160 bytes
RMAN-06009: using target database control file instead of recovery catalog
RMAN-08030: allocated channel: t1
RMAN-08500: channel t1: SID=63 device type=SBT_TAPE
RMAN-08526: channel t1: Data Protection for Oracle: version 5.5.2.0
RMAN-03090: Starting restore at 07-MAR-11
RMAN-08016: channel t1: starting datafile backup set restore
RMAN-08089: channel t1: specifying datafile(s) to restore from backup set
RMAN-08610: channel t1: restoring datafile 00014 to /dbspace/oradata/CMSDB/ecrixar01.dbf
RMAN-08003: channel t1: reading from backup piece CMSDB_25_20110304.bkp
And on this place RMAN just hangs and nothing happens further...
I searched for information and tried writing trace files during the restore, like this:
oracle@hostname:/> rman target / trace /home/oracle/log/rman.trc debug
connected to target database (not started)
RMAN> run {
2> startup mount;
3> allocate channel t1 device type sbt_tape parms 'ENV=(TDPO_OPTFILE=/usr/tivoli/tsm/client/oracle/bin64/tdpo.opt)' trace=2;
4> restore tablespace 'ECRIXAR';
5> recover tablespace 'ECRIXAR';
6> alter database open;
7> }
But I cannot find any errors in trace files...
What else can I do to check what's happening?

"But I cannot find any errors in trace files..."
This indicates the problem is not on the RMAN side but on the TSM side. So you have to check the TSM logs; you can also switch on tracing:
http://publib.boulder.ibm.com/tividd/td/DPON/SH26-4112-02/en_US/HTML/anou0009.htm
"And on this place RMAN just hangs and nothing happens further..."
How long did you wait?
Werner

RMAN Hangs

Kernel: 2.4.21-47.EL
Oracle: 9.2.0.4.0 (x64)
On rman DB:
RMAN> connect catalog rman/rman@catdb
connected to recovery catalog database
recovery catalog is not installed
RMAN> create catalog;
recovery catalog created
RMAN> exit
On production db: ( tested tnsping and sqlplus to catdb is fine from production server)
rman target / catalog rman/rman@catdb
Recovery Manager: Release 9.2.0.4.0 - 64bit Production
Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.
++++ Hangs here +++++++
Has anyone seen this? Thanks for your help,
R-

'rman target / catalog rman/rman@catdb' connects to 2 databases, so you have to check where the problem is.
First try 'rman target /' alone. Then start 'rman' on its own and, inside the session, run 'connect catalog ...'.
Werner
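
The isolation test above can also be scripted so that each connection is tried on its own with a time limit. This is a rough sketch, not a tested procedure: the probe helper is hypothetical, it relies on the GNU coreutils timeout command (which may not exist on older systems), and it assumes rman exits once its standard input reaches end-of-file.

```shell
#!/bin/bash
# Run one command with a time limit and report the outcome.
# Usage: probe <label> <seconds> <command...>
probe() {
  local label="$1" limit="$2" rc=0
  shift 2
  # stdin comes from /dev/null so an interactive tool sees EOF at once.
  timeout "$limit" "$@" </dev/null >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0)   echo "$label: ok" ;;
    124) echo "$label: no response within ${limit}s" ;;  # timeout's exit code
    *)   echo "$label: failed (exit $rc)" ;;
  esac
}

# Example calls (connect strings taken from the post above):
# probe "target only"  60 rman target /
# probe "catalog only" 60 rman catalog rman/rman@catdb
```

Whichever probe reports "no response" tells you which of the two connections to investigate first.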

RMAN Hangs when executing

Hello friends,
I am running Oracle 9i R2 on RHEL4 32-bit.
Hardware: HP ProLiant servers.
I installed this setup just today, and the database is doing fine, except for RMAN.
When I run the rman command at the Unix prompt, it just hangs. Nothing comes out of it.
I see neither an error nor a prompt. It just gets stuck after I type rman and hit Enter.
Please help.

I am putting up the entire content of the bash profile, from top to bottom, just in case I messed something up somewhere.
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
# Set the LD_ASSUME_KERNEL environment variable only for Red Hat 9,
# RHEL AS 3, and RHEL AS 4 !!
# Use the "Linuxthreads with floating stacks" implementation instead of NPTL:
export LD_ASSUME_KERNEL=2.4.1 # for RH 9 and RHEL AS 3
export LD_ASSUME_KERNEL=2.4.19 # for RHEL AS 4
# Oracle Environment
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/9.2.0
export ORACLE_SID=hp001
export ORACLE_TERM=xterm
# export TNS_ADMIN= Set if sqlnet.ora, tnsnames.ora, etc. are not in $ORACLE_HOME/network/admin
export NLS_LANG=AMERICAN;
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
# Set shell search paths
export PATH=$PATH:$ORACLE_HOME/bin
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib
export CLASSPATH
export PATH
unset USERNAME
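
One observation on the profile above: it appends $ORACLE_HOME/bin to the end of PATH (export PATH=$PATH:$ORACLE_HOME/bin), so any other program named rman in an earlier directory would be found first. Whether that is the cause here is not confirmed anywhere in this thread. The sketch below only shows one way to put Oracle's bin directory first; prepend_path is a helper name invented for this example.

```shell
#!/bin/bash
# Put a directory at the front of a PATH-style string, dropping any
# later duplicate of it. Prints the new value; does not modify PATH.
prepend_path() {
  local dir="$1" path="$2" out="$1" part
  local IFS=:
  for part in $path; do
    [ "$part" = "$dir" ] || out="$out:$part"
  done
  echo "$out"
}
```

In the profile this would be used as PATH="$(prepend_path "$ORACLE_HOME/bin" "$PATH")" followed by export PATH.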

Rman hangs during restoration with no errors

Hi,
Platform: Solaris 10
Oracle version: Oracle 10g
I am restoring a backup from tape using RMAN.
We have production instances in /data1 to /data11; they are connected to tape through a fibre cable with a speed of 256MBS/sec.
Our restore target is /aux_data01 to /aux_data11 on the same server (fairport80).
But our restore is taking a very long time. Restoring 900 GB has taken 42 hours.
a) During validation it did not give any error.
b) During the restore it has not given any error either (so far).
The last update in the restore log was this morning at 2:00 am. Since then the restore has been stalled; it does not seem to be processing anything. Please assist me.
We are restoring the database as of 16-oct-2007.
As of 16-oct-2007 the database size was 1,763 GB. 1,742 GB has been restored; since then there has been no progress at all (from 2 am today).
Validate:
During validate it did not give any error.
Restoration:
During the restore it has not thrown any error at all, but there has been no progress from 24-oct-2007 2:00 am onwards.
All data files were restored, but the control files were not.
Conclusion:
Last time we hit an error after 30 hours, but now after 84 hours it is stalled.
Can anyone help with this, please?
With Regards
Boo

hi,
can we have a look at your restore command?
regards
Alan

Backup with rman hang

I tested this RMAN backup step on my newly installed 10g R2 SE on an AIX box, but it seemed to hang there (after at least 2 hours, nothing had progressed).
RMAN> run {
backup as compressed backupset incremental level 0 cumulative device type disk tag 'BAANBK$LEVEl_0' database;
Starting backup at 10-JAN-07
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed incremental level 0 datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00006 name=/dbbaan/oradata/baanbk/baandbs02.dbf
input datafile fno=00007 name=/dbbaan/oradata/baanbk/baandbs03.dbf
input datafile fno=00008 name=/dbbaan/oradata/baanbk/baandbs04.dbf
input datafile fno=00009 name=/dbbaan/oradata/baanbk/baandbs05.dbf
input datafile fno=00010 name=/dbbaan/oradata/baanbk/baandbs06.dbf
input datafile fno=00011 name=/dbbaan/oradata/baanbk/baanidx01.dbf
input datafile fno=00012 name=/dbbaan/oradata/baanbk/baanidx02.dbf
input datafile fno=00013 name=/dbbaan/oradata/baanbk/baanidx03.dbf
input datafile fno=00014 name=/dbbaan/oradata/baanbk/baanidx04.dbf
input datafile fno=00015 name=/dbbaan/oradata/baanbk/baanidx05.dbf
input datafile fno=00016 name=/dbbaan/oradata/baanbk/baanidx06.dbf
input datafile fno=00017 name=/dbbaan/oradata/baanbk/baanidx07.dbf
input datafile fno=00001 name=/dbbaan/oradata/baanbk/system01.dbf
input datafile fno=00002 name=/dbbaan/oradata/baanbk/undotbs01.dbf
input datafile fno=00003 name=/dbbaan/oradata/baanbk/undotbs02.dbf
input datafile fno=00004 name=/dbbaan/oradata/baanbk/sysaux01.dbf
input datafile fno=00005 name=/dbbaan/oradata/baanbk/baandbs01.dbf
input datafile fno=00018 name=/dbbaan/oradata/baanbk/users01.dbf
channel ORA_DISK_1: starting piece 1 at 10-JAN-07
That's all I got on screen. Where is the RMAN log, so I can check what was going on? The same script runs well on my 10gR2 EE on RHEL3. I tried without "compressed" and got the same hang.

$ rman target / nocatalog
Recovery Manager: Release 10.2.0.1.0 - Production on Wed Jan 10 13:20:24 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
connected to target database: BAANBK (DBID=905495458)
using target database control file instead of recovery catalog
RMAN> show all;
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 3 DAYS;
CONFIGURE BACKUP OPTIMIZATION ON;
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/dbrecovery/flash_recovery_area/%U';
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/oracleapp/oracle/product/10.2.0/db_1/dbs/snapcf_baanbk.f'; # default
This is the same configuration as that of RHEL3
This is a dev box, so no other activity.
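
A common way to tell a slow backup from a hung one is to query V$SESSION_LONGOPS from a second session while RMAN is running and see whether SOFAR keeps increasing. The sketch below only prints such a query; the function name is illustrative, and feeding it to sqlplus as a DBA user is an assumption about how you would run it.

```shell
#!/bin/bash
# Print a query that lists RMAN operations still in progress, with the
# work done so far and the total expected work.
longops_sql() {
  cat <<'EOF'
select sid, serial#, opname, sofar, totalwork,
       round(sofar / totalwork * 100, 2) as pct_done
from   v$session_longops
where  opname like 'RMAN%'
and    totalwork <> 0
and    sofar <> totalwork;
EOF
}
```

For example: longops_sql | sqlplus -s "/ as sysdba". Run it twice a few minutes apart; if SOFAR does not change, the channel is probably really stuck rather than slow.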
Message was edited by:
user508054

RMAN BACKUP hangs up on archive logs

Hi,
In 9i on Linux, my RMAN backup script is:
RMAN> run {
2> allocate channel t1 type disk;
3> backup incremental level=0 format '/mnt/rman/MYDB/full_%d_%t_%s_%p' database;
4> sql 'alter system switch logfile';
5> backup format '/mnt/rman/MYDB/al_%d_%t_%s_%p'
6> archivelog all delete input;
7> backup format '/mnt/rman/MYDB/ctl_%d_%t_%s_%p' current controlfile;
8> }
It works well until:
backup format '/mnt/rman/MYDB/al_%d_%t_%s_%p' archivelog all delete input;
Here it hangs (maybe because there are a great many archive log files). What do you propose? How can we ask RMAN to back up only the archive logs from recent dates? How can we delete most of the old archive logs? RMAN backups failed many times, so the archive logs were not deleted, and now the RMAN backup cannot finish. Many thanks for your help.

Hi,
I launched the following last night, but it is still waiting:
RMAN> crosscheck archivelog all;
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=110 devtype=DISK
What can I do? Is there any other way to tell RMAN that the archived logs are not available?
Many thanks.
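
On the question of backing up only recent archive logs: RMAN's archivelog clause accepts a "from time" condition, so the archivelog step can be limited to a recent window instead of "all". The sketch below just prints such a command block. The function name is made up, the channel and format mirror the script in the post, and the 1-day default is an arbitrary example value, not a recommendation.

```shell
#!/bin/bash
# Print an RMAN command block that backs up only archived logs generated
# in the last N days (default 1) and deletes the ones it backed up.
recent_archivelog_block() {
  local days="${1:-1}" dest="${2:-/mnt/rman/MYDB}"
  cat <<EOF
run {
  allocate channel t1 type disk;
  backup format '${dest}/al_%d_%t_%s_%p'
    archivelog from time 'sysdate-${days}' delete input;
}
EOF
}
```

The output can be piped straight into RMAN, for example recent_archivelog_block 2 | rman target /. Older logs that are skipped this way are not backed up at all, so they still need a decision (back up separately or remove) before the backlog is really gone.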

RMAN backup command hangs for datafiles, works for archivelog backup

The archivelog backup runs fine, but when we run the datafile backup it hangs on the Oracle side, apparently because it cannot provide a file handle.
An RMAN session started with debug shows only:
DBGRPC: ENTERED krmqgns [expect.cpp/673]
16:00:40.19 DBGRPC: krmqgns: looking for work for channel default (krmqgns) [expect.cpp/673]
16:00:40.19 DBGRPC: krmqgns: commands remaining to be executed: (krmqgns) [expect.cpp/673]
16:00:40.19 DBGRPC: CMD type=backup cmdid=1 status=STARTED [expect.cpp/673]
16:00:40.19 DBGRPC: 1 STEPstepid=1 cmdid=1 status=STARTED [expect.cpp/673]
16:00:40.19 DBGRPC: krmqgns: no work found for channel default (krmqgns) [expect.cpp/673]
16:00:40.19 DBGRPC: (krmqgns) [expect.cpp/673]
16:00:40.19 DBGRPC: EXITED krmqgns with status 1 [expect.cpp/673]
16:00:40.19 DBGRPC: krmxpoq - returning rpc_number: 17 with status: STARTED16 for channel sqlbt_ch1 [expect.cpp/673]
16:00:40.19 DBGRPC: krmxr - sleeping for 10 seconds [expect.cpp/673]
Has anybody seen this type of message from RMAN?
Thanks
Shirish

OK. Did you miss the second Oracle document?
RMAN Debug For Backup Shows "krmxr: sleeping for x seconds" [ID 458259.1]
Also a ton of information here :
RMAN backup database as copy from file system to ASM diskgroup very slow
and here :
RMAN Error > Please assist!
and here :
RMAN and Amazon Web Services
http://oravdba.blogspot.com/2011_01_01_archive.html
Best Regards
mseberg

RMAN Cloning Hangs

Hi, during RMAN cloning, when I tried to execute the script below:
run{
2> allocate auxiliary channel aux1 device type disk;
3> allocate auxiliary channel aux2 device type disk;
4> allocate auxiliary channel aux3 device type disk;
5> duplicate target database to auxclone;
6> }
It hangs at the below point
contents of Memory Script:
shutdown clone;
startup clone nomount ;
executing Memory Script
After a long time I got the following error:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 06/14/2010 18:34:18
RMAN-03015: error occurred in stored script Memory Script
RMAN-06136: ORACLE error from auxiliary database: ORA-01013: user requested cancel of current operation
Any suggestions, please?
Regards
Balaji
Edited by: user13254016 on Jun 20, 2010 12:08 PM

Did you interrupt the RMAN cloning process partway through? You only get the error
RMAN-06136: ORACLE error from auxiliary database: ORA-01013: user requested cancel of current operation
in that case. Start up in NOMOUNT and then execute the duplicate script.

RMAN Backup Seems To Hang - Never Finishes

We have a short RMAN block tracking backup script running on four different 11gR1 RAC DBs (11.1.0.7.5). This script only backs up changed records, and it has worked fine on all four RHEL 4-based DBs for many months without issue and without any changes to the script. Recently, it has encountered an issue on just one of the RAC DBs. On this DB it runs for days and I eventually have to kill it. The only error I can see in the alert logs is:
ORA-00235: control file read without a lock inconsistent due to concurrent update
I don't know if this error is related to our issue, as I see it before the backup starts and quite often in subsequent alert logs (and trace files) during the backup. I also realize that this error can often be ignored as being attributable to concurrent activity and that retries can sometimes solve the issue. The first message I see in the log after the backup starts is "Incremental restore complete if datafile 45 to datafile copy /flash/RAC1/datafile/01_mf_mpd_02_6bw0cdy6_.dbf." We use Oracle Managed Files (OMF), and the backup directs the files to be copied over to a flash area to then be scanned for changes.
I typically run the script from cron, but this error happens even when I run it from the command line. I normally run it from the 2nd node of the 4-node cluster. Of additional note is a large amount of two-way Streams replication that we run on this DB to/from another DB. The latter DB has no issues with the same backup. I mention this because of the large amount of LogMiner activity that is sometimes going on along with regular users.
The backup seems to be doing all of the correct things, as it always has: the backup log shows messages indicating channels allocated, starting incremental backups, recovering data files, reading backup pieces from flash, and skipping archive logs previously backed up (a lot of previously backed up archived logs). Again, NO ERRORS appear in the log. The last few lines of the log (before I terminate the run) are:
input archived log thread=1 sequence=110567 RECID=485848 STAMP=746169926
input archived log thread=1 sequence=110568 RECID=485849 STAMP=746170951
input archived log thread=1 sequence=110569 RECID=485850 STAMP=746171860
input archived log thread=1 sequence=110570 RECID=485854 STAMP=746173097
channel ORA_DISK_4: starting piece 1 at 22-MAR-11 <---- this is always the last line of output in the log
I would include the entire log, but our company does not allow us to do this for security reasons. However, I have copied the script in question below to help in problem resolution. This hang condition occurs regardless of whether there is high or low activity on the DB, which is a TEST DB. I just need to get this fixed.
---------------------SCRIPT---------------------
# backup_hot.sh
#!/bin/bash
if [ -z "$1" ] || [ ! -z "$2" ]; then
echo; echo "usage: backup_hot.sh DB_NAME"; echo; exit
fi
TGT=`echo $1 | tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ`
LOG_DIR=/apps/oracle/oracle_home/RMAN/scripts/Log
# Setup ORACLE_SID, ORACLE_BASE, ORACLE_HOME, PATH
case $TGT in
RAC1) . ~oracle/rac1.env ;;
RAC2) . ~oracle/rac2.env ;;
FRAC1) . ~oracle/frac1.env ;;
FRAC2) . ~oracle/frac2.env ;;
*) echo; echo "exit: unrecognized database target"; echo; exit ;;
esac
/apps/oracle/product/11.1.0/db_1/bin/rman target sys/mypwd CATALOG rman/[email protected] log $LOG_DIR/`date +%F_%T`_${TGT}_backup_hot << EOFrun
recover copy of database with tag 'incr' until time 'sysdate-7';
backup incremental level 1 cumulative copies=1 section size 1g for recover of copy with tag 'incr' database;
report schema;
backup archivelog all not backed up skip inaccessible;
backup current controlfile;
backup spfile;
change archivelog all crosscheck;
crosscheck backup;
crosscheck copy;
delete noprompt obsolete;
delete noprompt expired backup;
restore database validate preview;
restore spfile validate;
restore controlfile validate;
sql 'alter database backup controlfile to trace';
EOF
--------------------END OF SCRIPT------------------------
Thanks for any help.
Matt

No idea, but some questions to clarify:
Are they all running the exact same script, or copies? If the latter, look for spaces after the EOF. Check the date on the .env file, too, in case someone modified it and left some garbage that makes itself known later.
Is EOFrun a copy-paste typo, or is that an indicator of something screwy in your script?
I believe there is some log buffering going on, so there may be more output that has not been written to the log yet. So I think it is getting past the "starting piece 1" line. If you break up the subsequent commands into their own run statements, maybe it will become clear what is actually stopping.
I believe the large pool is used; what are your large pool settings?
Is NFS anywhere in your configuration?
What is $2?
Maybe the controlfile is the problem; maybe the crosschecks are hanging because something else is locking them out. Are you sure you don't have other instances of this script still running? What processes are accessing the controlfile? Is anything trying to copy it?
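
The "spaces after the EOF" check mentioned above is easy to automate: a here-document terminator followed by trailing whitespace is not recognised by the shell, so everything after it is fed to RMAN as well. A minimal sketch, with a made-up function name and assuming the terminator word is EOF unless told otherwise:

```shell
#!/bin/bash
# List lines consisting of the terminator word followed by trailing
# whitespace. Output is "line_number:line", as printed by grep -n.
# Exit status is non-zero when no such line exists.
find_bad_terminator() {
  local file="$1" word="${2:-EOF}"
  grep -n "^${word}[[:space:]][[:space:]]*\$" "$file"
}
```

Running find_bad_terminator backup_hot.sh on each copy of the script should print nothing; any line it does print is worth fixing before looking further.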

OSB/RMAN to tape fails (hangs)

We are trying to back up an ODA RAC database to an SL150 tape library; the backup hangs with no errors. Oracle Support has been trying to solve the problem, with no luck. Any help, please.
In summary:
OSB/RMAN to disk: OK
OSB/RMAN to fake tape: OK
OSB/filesystem to tape: OK
OSB/RMAN to tape: fails (hangs)
We have checked device permissions and RMAN parameters; all seem to be OK.

There's not enough information here to begin. Please supply full "obtool lsdev -lvg" output and also the full log files from the jobs.
Thanks
Rich

RMAN - Connect target / Hang

Hello,
For the last three days our backup has failed due to a strange error.
We have been running this backup from the same shell script for almost three years. It was working fine until last Friday and then suddenly stopped working.
The DB was upgraded about three months ago from 10.2.0.3 to 10.2.0.4. As far as I know there have been no other DB or OS changes.
If I connect directly to the catalog, it works fine:
rman
Recovery Manager: Release 10.2.0.4.0 - Production on Tue Jul 13 15:26:59 2010
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN> connect catalog XXX/XX@RMAN_XXX
connected to recovery catalog database
So it connects to the catalog fine.
But when I try to connect to the target, it hangs forever:
rman
Recovery Manager: Release 10.2.0.4.0 - Production on Tue Jul 13 15:29:45 2010
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN> connect target /
Hang
hang
hang
Here is the backup logs
Backup Logs :
XXXXX Online Backup WITH CATALOG Started At : Monday, July 12, 2010 7:00:15 PM CDT
Started running on Mon Jul 12 19:00:15 CDT 2010
Recovery Manager: Release 10.2.0.4.0 - Production on Mon Jul 12 19:00:15 2010
Copyright (c) 1982, 2007, Oracle. All rights reserved.
RMAN>
DB version : oracle 10.2.0.4
Catalog : 10.2.0.3
Os : SunOS 5.10
Experts, please shed some light on this.
Thanks in advance

NO clue in Alert log.
We have enough space for backup
Here is the output
12263: door_info(7, 0xFFFFFD7FFFDF2AD0) = 0
12263: door_call(7, 0xFFFFFD7FFFDF2B30) = 0
12263: getgid() = 19900 [19900]
12263: getuid() = 623 [623]
12263: getpid() = 12263 [12240]
12263: brk(0x0663CC20) = 0
12263: brk(0x06648C20) = 0
12263: brk(0x06648C20) = 0
12263: brk(0x0664CC20) = 0
12263: brk(0x0664CC20) = 0
12263: brk(0x06650C20) = 0
12263: so_socket(PF_INET, SOCK_DGRAM, IPPROTO_IP, "", SOV_DEFAULT) = 21
12263: bind(21, 0x0664C5F0, 16, SOV_SOCKBSD) = 0
12263: getsockname(21, 0xFFFFFD7FFFDF67D0, 0xFFFFFD7FFFDF6810, SOV_DEFAULT) = 0
12263: getpeername(21, 0xFFFFFD7FFFDF67D0, 0xFFFFFD7FFFDF6810, SOV_DEFAULT) Err#134 ENOTCONN
12263: getsockopt(21, SOL_SOCKET, SO_SNDBUF, 0xFFFFFD7FFFDF6904, 0xFFFFFD7FFFDF6900, SOV_DEFAULT) = 0
12263: getsockopt(21, SOL_SOCKET, SO_RCVBUF, 0xFFFFFD7FFFDF6904, 0xFFFFFD7FFFDF6900, SOV_DEFAULT) = 0
12263: fcntl(21, F_SETFD, 0x00000001) = 0
12263: ioctl(21, FIONBIO, 0xFFFFFD7FFFDF69A8) = 0
12263: brk(0x06650C20) = 0
12263: brk(0x06658C20) = 0
12263: access("/var/tmp/.oracle", F_OK) = 0
12263: chmod("/var/tmp/.oracle", 01777) Err#1 EPERM [ALL]
12263: so_socket(PF_UNIX, SOCK_STREAM, 0, "", SOV_DEFAULT) = 22
12263: access("/var/tmp/.oracle/sprocr_local_conn_0_PROC", F_OK) = 0
12263: connect(22, 0xFFFFFD7FFFDF0668, 110, SOV_DEFAULT) = 0
12263: fcntl(22, F_SETFD, 0x00000001) = 0
12263: sigaction(SIGPIPE, 0xFFFFFD7FFFDF0870, 0xFFFFFD7FFFDF0960) = 0
12263: ioctl(22, FIONBIO, 0xFFFFFD7FFFDF1DD8) = 0
12263: write(22, " 0\0\0\001\001\001\001\0".., 48) = 48
12263: pollsys(0x0664A540, 2, 0x00000000, 0x00000000) = 1
12263: pollsys(0x0664A540, 2, 0x00000000, 0x00000000) = 1
12263: read(22, " 0\0\0\001\001\001\001\0".., 32768) = 48
12263: write(22, " 4\0\0\001\002\001010101".., 52) = 52
12265: read(13, 0x06512FA6, 2064) (sleeping...)
12240: read(12, 0x012F1056, 2064) (sleeping...)
12263: pollsys(0x0664A540, 2, 0x00000000, 0x00000000) (sleeping...)
12240: Received signal #2, SIGINT, in read() [caught]
12240: read(12, 0x012F1056, 2064) Err#91 ERESTART
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: kill(12263, SIGURG) = 0
12263: Received signal #21, SIGURG, in pollsys() [caught]
12263: siginfo: SIGURG pid=12240 uid=623
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12263: pollsys(0x0664A540, 2, 0x00000000, 0x00000000) Err#91 ERESTART
12240: setcontext(0xFFFFFD7FFFDD8410)
12263: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12263: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12263: lwp_sigmask(SIG_SETMASK, 0x9FBED057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12263: setcontext(0xFFFFFD7FFFDF0A10)
12240: Received signal #2, SIGINT, in read() [caught]
12240: read(12, 0x012F1056, 2064) Err#91 ERESTART
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: setcontext(0xFFFFFD7FFFDD8410)
12240: Received signal #2, SIGINT, in read() [caught]
12240: read(12, 0x012F1056, 2064) Err#91 ERESTART
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: setcontext(0xFFFFFD7FFFDD8410)
12240: Received signal #2, SIGINT, in read() [caught]
12240: read(12, 0x012F1056, 2064) Err#91 ERESTART
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: lwp_sigmask(SIG_SETMASK, 0x9FBEF057, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]
12240: setcontext(0xFFFFFD7FFFDD8410)
12263: pollsys(0x0664A540, 2, 0x00000000, 0x00000000) (sleeping...)
12240: read(12, 0x012F1056, 2064) (sleeping...)
