Backup Suspended
I am facing an issue on one of my production servers. The backup
of a database appears suspended: it stops progressing after reaching 99% completion. I restarted the SQL service and ran the backup for the DB again, but it is still suspended when it reaches 99% completion. The size of the DB is 1 GB. It has been in a suspended state
for the last 6 days. I am seeing this on SQL Server 2008 Enterprise Edition. I need help with this; the wait type is CMEMTHREAD.
Did you run DBCC CHECKDB, and was it successful? Do you have a corrupt full-text catalog? If so, can you rebuild it and retry the backup?
Satish Kartan www.sqlfood.com
Similar Messages
-
Backup Suspended on Agent Unreachable - 11GR2 Windows
Hi, as the title suggests, an RMAN backup has gone into a state of Suspended on Agent Unreachable.
I have a 2-node RAC and have run emctl status dbconsole on each node. On node 1 I get the message: EM Daemon is running. On node 2 I get the message: Oracle enterprise Manager 11g is running.
Why are these messages different? And could this be the reason why the job has gone into this state?
Thanks in advance
I strongly suggest not using RMAN via OEM, but the good old command line instead.
The messages are different because on one node there is the real web app (oc4j...) while on the other one only the agent is running, so this is expected.
Alessio -
How to temporarily suspend backups for a host?
We are running production OSB backups for a site with approximately 50 filesystems,
consisting of full, incremental, quarterly archive, and on-demand archive backups.
Our installation consists of
o administration master - obtool version 10.4.0.2.0 (64-bit OEL 5.8)
o media servers - obtool version 10.4.0.2 (64-bit OEL 5.8)
o SL8500 library - managed by ACSLS 8.1 with admin server attach point
Problem:
We recently lost a ZFS storage appliance storage pool due to pool corruption.
All backed-up filesystems were restored to another host pending a rebuild of
the corrupted ZFSSA.
Currently all backups are failing because they cannot communicate with the down
host. We want to suspend all attempts to back up the corrupted host pending
the rebuild. We have tried taking the host out of service via 'chhost -O'
but the scheduler launches backups that hang indefinitely with a session
status of 'awaiting resource availability'.
Is there any way to suspend backups of a host temporarily without having to
remove it from all the schedules it's in? Seems there should be but I've
been unable to find this in the docs.
Appreciate any help.
Thanks,
-pc
You can just comment it out of the dataset using a #
Then when you're ready to put it back, just uncomment it again.
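As a rough illustration of the comment-out approach, here is a minimal Python sketch that works on the text of a dataset file. It assumes the host is named on its own line (the `include host ...` lines and the host names `zfssa1` and `dbhost2` are made up for the example), and it only transforms text; it does not talk to OSB. Any path lines that belong to that host may need the same treatment by hand.

```python
def comment_out_host(dataset_text: str, host: str) -> str:
    """Prefix '#' to every active line that names `host` as a whole word."""
    out = []
    for line in dataset_text.splitlines():
        is_active = not line.lstrip().startswith("#")
        if is_active and host in line.split():
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out)


def restore_host(dataset_text: str, host: str) -> str:
    """Remove one leading '#' from commented lines that name `host`.

    Note: this also uncomments ordinary remarks that happen to mention
    the host, so review the result before saving it.
    """
    out = []
    for line in dataset_text.splitlines():
        if line.startswith("#") and host in line[1:].split():
            out.append(line[1:])
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical dataset content, for illustration only.
sample = "include host zfssa1\ninclude host dbhost2"
suspended = comment_out_host(sample, "zfssa1")
print(suspended)
print(restore_host(suspended, "zfssa1") == sample)
```

Keeping the change to a single leading `#` means the file can be put back exactly as it was once the host is rebuilt.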
Thanks
Rich -
Rman backup job Suspended on Agent Unreachable
Hi,
I have Oracle 10.2.0 installed on Windows 2003 Server. I configured the Oracle-suggested RMAN backup, which worked fine until now, but now it shows a
Suspended on Agent Unreachable message and the backup doesn't happen.
If I check:
C:\Documents and Settings\Administrator>emctl status agen
Oracle Enterprise Manager 10g Database Control Release 10
Copyright (c) 1996, 2005 Oracle Corporation. All rights
Agent is Not Running
C:\Documents and Settings\Administrator>emctl start agent
Oracle Enterprise Manager 10g Database Control Release 10
Copyright (c) 1996, 2005 Oracle Corporation. All rights
These Windows services are started:
Application Experience Lookup Service
Application Management
Automatic Updates
COM+ Event System
COM+ System Application
Computer Browser
Cryptographic Services
DCOM Server Process Launcher
DefWatch
Distributed File System
Distributed Transaction Coordinator
DNS Client
DNS Server
Error Reporting Service
Event Log
File Replication Service
FTP Publishing Service
HTTP SSL
IIS Admin Service
Indexing Service
Intel PDS
Intersite Messaging
IPSEC Services
Kerberos Key Distribution Center
Logical Disk Manager
LogMeIn
LogMeIn Maintenance Service
Net Logon
Network Connections
Network Location Awareness (NLA)
NT LM Security Support Provider
OracleDBConsoleneo
OracleOraDb10g_home1iSQL*Plus
OracleOraDb10g_home1TNSListener
OracleServiceNEO
Plug and Play
Print Spooler
Protected Storage
Remote Access Connection Manager
Remote Procedure Call (RPC)
Remote Registry
Secondary Logon
Security Accounts Manager
Server
Shell Hardware Detection
SQL Server (MSSQLSERVER)
SQL Server Agent (MSSQLSERVER)
SQL Server Analysis Services (MSSQLSERVER)
SQL Server FullText Search (MSSQLSERVER)
SQL Server Integration Services
SQL Server Reporting Services (MSSQLSERVER)
Symantec AntiVirus Server
System Event Notification
Task Scheduler
TCP/IP NetBIOS Helper
Telephony
Terminal Services
Windows Management Instrumentation
Windows Search
Windows Time
Workstation
World Wide Web Publishing Service
The command completed successfully.
This means my agent was stopped, but I have now started it.
I found on Google that this type of error happens when you change something in the database setup, such as the IP address or SID, but I haven't made any changes.
I have faced the same problem many times.
Now, how do I reschedule the existing backup job so that it starts working again?
Thanks
umesh
Thanks for the reply.
I did it.
C:\Documents and Settings\Administrator>emctl status dbconsole
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://Techwave:5500/em/console/aboutApplication
EM Daemon is not running.
Logs are generated in directory C:\oracle\product\10.2.0\db_1/Techwave_neo/sysman/log
C:\Documents and Settings\Administrator>emctl stop dbconsole
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://Techwave:5500/em/console/aboutApplication
The OracleDBConsoleneo service is stopping.....
The OracleDBConsoleneo service was stopped successfully.
C:\Documents and Settings\Administrator>emctl start dbconsole
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://Techwave:5500/em/console/aboutApplication
Starting Oracle Enterprise Manager 10g Database Control ...The OracleDBConsoleneo service is starting................
The OracleDBConsoleneo service was started successfully.
Now it is working fine.
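Since the agent has stopped more than once, the status check above could be scripted. The following is only a sketch: it classifies captured `emctl status` text using the phrases that appear in the outputs quoted in these threads, and it does not run `emctl` itself. The function name and the return values are made up for the example.

```python
def em_state(status_output: str) -> str:
    """Classify captured emctl status text as stopped, running, or unknown."""
    text = status_output.lower()
    # Check the negative phrase first, because "is running" is not a
    # substring of "is not running" but both could appear in longer output.
    if "is not running" in text:
        return "stopped"
    if "is running" in text:
        return "running"
    return "unknown"


print(em_state("EM Daemon is not running."))
print(em_state("Oracle enterprise Manager 11g is running."))
```

A scheduled task could call such a check and raise an alert (or attempt `emctl start dbconsole`) when the result is "stopped", so that a stopped agent is noticed before the next backup is due.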
Now the issue is how to reschedule my suspended job.
I get this error:
Edit is not supported for this job type, only general information about the job can be updated No scheduled executions - this job can not be modified. -
How to change backup location in suspend mode?
Hello
I am trying to change the backup location in an image that is in suspend mode to a different location in a different deployment share.
I have tried modifying variables.dat in Minint so that it reflects the new location in the new deployment share, but it did not work.
Thanks
The variables.dat was my suggestion.
I'm struggling to understand why you can't just restart the deployment.
Blog: http://scriptimus.wordpress.com -
How to ASR 9k auto backup to external FTP server
How can I take an automatic backup of the running configuration whenever the commit command is used on an ASR9k? Our ASR9k software version is 4.0.1[Default]. I use the commands:
ftp client password encrypted 050D121F345F4B1B4
ftp client username cisco
ftp client source-interface GigabitEthernet0/0/0/14
ftp client anonymous-password cisco
(Password information changed)
configuration commit auto-save filename ftp://10.10.10.3/ASR9K/asr_conf
But the automatic backup is not happening. The following error is shown after every COMMIT command:
( Error:Couldn't save file /ftp://10.10.10.3/ASR9k/asr_conf.
Error:'CfgMgr' detected the 'warning' condition 'Operation is temporarily suspended.' )
Manually, I am able to take a running-configuration backup through FTP to the same location (
ftp://10.10.10.3/ASR9K/asr_conf ), but it does not happen automatically. Please help.
I have not personally seen this issue before, but you may find verifying your config helpful.
SUMMARY STEPS
1. configure
2. show running-config
3. describe hostname hostname
4. end
5. show sysdb trace verification shared-plane | include path
6. show sysdb trace verification location node-id
7. show cfgmgr trace
8. show configuration history commit
9. show configuration commit changes {last | since | commit-id}
10. show config failed startup
11. cfs check --> Verifies the Configuration File System (CFS)
However I did notice the following:
Error:Couldn't save file /ftp://10.10.10.3/ASR9k/asr_conf. Notice the preceding '/'?
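To make the comparison concrete, here is a small Python sketch that compares the configured target with the path quoted in the error message. Both strings are copied from the post; the function name is made up. Besides the leading '/', the two strings also differ in letter case (ASR9K versus ASR9k). That may just be a typo in the post, but it could matter if the FTP server's filesystem is case-sensitive.

```python
def describe_differences(configured: str, reported: str) -> list:
    """List the visible differences between the configured and reported paths."""
    notes = []
    if reported.startswith("/") and not configured.startswith("/"):
        notes.append("reported path has a leading '/'")
        reported = reported[1:]  # strip it so the rest can be compared
    if reported != configured:
        if reported.lower() == configured.lower():
            notes.append("paths differ only in letter case")
        else:
            notes.append("paths differ")
    return notes


configured = "ftp://10.10.10.3/ASR9K/asr_conf"   # from the auto-save command
reported = "/ftp://10.10.10.3/ASR9k/asr_conf"    # from the error message
for note in describe_differences(configured, reported):
    print(note)
```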
Thanks -
How to migrate a suspended domain from one server to another?
I am using Oracle VM Server 2.2.2 and I am trying to create a disaster recovery strategy for my environment.
Initially I successfully paused the VM VM1 using the "xm save VM1 suspended_file" command, which saved its memory to a file named suspended_file.
Then I backed up the directory under OVS/running_pool to magnetic tape, including the suspended file.
When I tried to restore VM1 on a new server, I was not able to restore the VM using the xm restore command. I had imported the virtual machine from VM Manager, so the paths in the configuration file correctly point to the images in the new server pool.
However, the following error appears when I run xm restore suspended_file:
Error :disk image does not exit /old_path_to_the_image
For an unknown reason the suspended file contains the old path to the disk image. Is there any way to alter the suspended file so that it points to the new directory?
If I start the VM ignoring the suspend file, could that cause any problems?
Thanks in advance.
I am waiting for your responses.
If all of the database files, including the binaries, are on a SAN you might be able to...
1. Cold backup of everything - database files and binaries.
2. Dismount the SAN volumes from the old server.
3. Mount the SAN volumes to the new server - using the same mount points.
4. Start up the database on the new server.
If the OS version/release are the same between new and old, that should be about all you need to do.
If the OS is same but upgraded to new version/release you will want to relink the Oracle executables before starting up. -
How do I Access purple backups from external hard drive for Time Machine
My hard drive crashed and I installed a new one.
Did internet recovery and upgraded to my previous OS.
I am trying to get my data but the backup is purple that I need and I cannot access it.
How do I do this.
Also if I did something wrong, how do I setup Time machine to restore my whole system and set it up so the backups are accessible instead of purple.
Thanks in advance.
I am using Mavericks
External HD connected with USB
donavonknight
Very impressive that it is that easy but trying to get my data is a pain.
Time Machine is a backup of your computer SYSTEM, not idealized as a data archive.
Consider other options for the future >
Data Storage Platforms; their Drawbacks & Advantages
#1. Time Machine / Time Capsule
Drawbacks:
1. Time Machine is not bootable, if your internal drive fails, you cannot access files or boot from TM directly from the dead computer.
2. Time machine is controlled by complex software, and while you can delve into the TM backup database for specific file(s) extraction, this is not ideal or desirable.
3. Time machine can and does have the potential for many error codes in which data corruption can occur and your important backup files may not be saved correctly, at all, or even damaged. This extra link of failure in placing software between your data and its recovery is a point of risk and failure. A HD clone is not subject to these errors.
4. Time machine mirrors your internal HD, in which cases of data corruption, this corruption can immediately spread to the backup as the two are linked. TM is perpetually connected (or often) to your computer, and corruption spread to corruption, without isolation, which TM lacks (usually), migrating errors or corruption is either automatic or extremely easy to unwittingly do.
5. Time Machine does not keep endless copies of changed or deleted data, and you are often not notified when it deletes them; likewise you may accidentally delete files off your computer and this accident is mirrored on TM.
6. Restoring from TM is quite time intensive.
7. TM is a backup and not a data archive, and therefore by definition a low-level security of vital/important data.
8. TM's working premise is a “black box” backup of OS, APPS, settings, and vital data that nearly 100% of users never verify until an emergency hits or their computer's internal SSD or HD is corrupt or dead, and this is an extremely bad working premise for vital data.
9. Given that data created and stored is growing exponentially, the fact that TM operates as a “store-it-all” backup nexus makes TM inherently incapable to easily backup massive amounts of data, nor is doing so a good idea.
10. TM working premise is a backup of a users system and active working data, and NOT massive amounts of static data, yet most users never take this into consideration, making TM a high-risk locus of data “bloat”.
11. TM like all HD-based data is subject to ferromagnetic and mechanical failure.
12. *Level-1 security of your vital data.
Advantages:
1. TM is very easy to use either in automatic mode or in 1-click backups.
2. TM is a perfect novice level simplex backup single-layer security save against internal HD failure or corruption.
3. TM can easily provide a seamless no-gap policy for active data, which is often not easily achieved with HD clones or HD archives (and then only if the user is lazy in making data saves).
#2. HD archives
Drawbacks:
1. Like all HD-based data is subject to ferromagnetic and mechanical failure.
2. Unless the user ritually copies working active data to HD external archives, then there is a time-gap of potential missing data; as such users must be proactive in archiving data that is being worked on or recently saved or created.
Advantages:
1. Fills the gap left in a week or 2-week-old HD clone, as an example.
2. Simplex no-software data storage that is isolated and autonomous from the computer (in most cases).
3. HD archives are the best idealized storage source for storing huge and multi-terabytes of data.
4. Best-idealized 1st platform redundancy for data protection.
5. *Perfect primary tier and level-2 security of your vital data.
#3. HD clones (see below for full advantages / drawbacks)
Drawbacks:
1. HD clones can be incrementally updated to hourly or daily, however this is time consuming and HD clones are, often, a week or more old, in which case data between today and the most fresh HD clone can and would be lost (however this gap is filled by use of HD archives listed above or by a TM backup).
2. Like all HD-based data is subject to ferromagnetic and mechanical failure.
Advantages:
1. HD clones are the best, quickest way to get back to 100% full operation in mere seconds.
2. Once a HD clone is created, the creation software (Carbon Copy Cloner or SuperDuper) is no longer needed whatsoever, and unlike TM, which requires complex software for its operational transference of data, a HD clone is its own bootable entity.
3. HD clones are unconnected and isolated from recent corruption.
4. HD clones allow a “portable copy” of your computer that you can likewise connect to another same Mac and have all your APPS and data at hand, which is extremely useful.
5. Rather than, as many users do, thinking of a HD clone as a “complementary backup” to the use of TM, a HD clone is superior to TM both in ease of returning to 100% quickly, and its autonomous nature; while each has its place, TM can and does fill the gap in, say, a 2 week old clone. As an analogy, the HD clone itself is the brick wall of protection, whereas TM can be thought of as the mortar, which will fill any cracks in data on a week, 2-week, or 1-month old HD clone.
6. Best-idealized 2nd platform redundancy for data protection, and 1st level for system restore of your computers internal HD. (Time machine being 2nd level for system restore of the computer’s internal HD).
7. *Level-2 security of your vital data.
#4. Online archives
Drawbacks:
1. Subject to server failure or due to non-payment of your hosting account, it can be suspended.
2. Subject, due to lack of security on your part, to being attacked and hacked/erased.
Advantages:
1. In case of house fire, etc. your data is safe.
2. In travels, and propagating files to friends and likewise, a mere link by email is all that is needed and no large media needs to be sent across the net.
3. Online archives are the perfect and best-idealized 3rd platform redundancy for data protection.
4. Supremely useful in data isolation from backups and local archives in being online and offsite for long-distance security in isolation.
5. *Level-1.5 security of your vital data.
#5. DVD professional archival media
Drawbacks:
1. DVD single-layer disks are limited to 4.7Gigabytes of data.
2. DVD media are, given rough handling, prone to scratches and light-degradation if not stored correctly.
Advantages:
1. Archival DVD professional blank media is rated for in excess of 100+ years.
2. DVD is not subject to mechanical breakdown.
3. DVD archival media is not subject to ferromagnetic degradation.
4. DVD archival media correctly sleeved and stored is currently a supreme storage method of archiving vital data.
5. DVD media is once written and therefore free of data corruption if the write is correct.
6. DVD media is the perfect ideal for “freezing” and isolating old copies of data for reference in case newer generations of data become corrupted and an older copy is needed to revert to.
7. Best-idealized 4th platform redundancy for data protection.
8. *Level-3 (highest) security of your vital data.
[*Level-4 data security under development as once-written metallic plates and synthetic sapphire and likewise ultra-long-term data storage] -
Hi,
I suddenly faced this problem while I was trying out some indexes on a local dummy table (using the 'sa' user ID) in the DB COSMOS. After working with the indexes I saved the file and closed Query Analyzer. On logging in again with the 'sa' user ID I got the following error: "A connection was successfully established but then an error occurred during the login process.(provider : Shared Memory Provider, error:0 - No process is on the other end of the pipe.) "
Then I logged in with a Windows Authentication account. However, I couldn't access the DB COSMOS as it had been put in Suspended Mode. On checking the event viewer I found the error below:
"SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xb42e9a1f; actual: 0xb45a9f73). It occurred during a read of page (2:0) in database ID 7 at offset 0000000000000000 in file 'E:\COSMIC\COSMOS_log.ldf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online."
How can the above error be repaired? There are no backups of this DB. I cannot run the DBCC CHECKDB command for COSMOS as the database has been suspended.
Please give me a solution to the above problem.
Thanks
Hi,
Thanks for the info. Lekss, thanks for that article; it was quite useful.
Please check the error below that I found in the error log.
2009-12-29 21:09:45.81 spid20s Error: 824, Severity: 24, State: 2.
2009-12-29 21:09:45.81 spid20s SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xb42e9a1f; actual: 0xb45a9f73). It occurred during a read of page (2:0) in database ID 7 at offset 0000000000000000 in file 'E:\COSMIC\COSMOS_log.ldf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
2009-12-29 21:09:45.81 spid20s Error: 3414, Severity: 21, State: 2.
2009-12-29 21:09:45.81 spid20s An error occurred during recovery, preventing the database 'COSMOS' (database ID 7) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
2009-12-29 21:09:45.85 spid53 EMERGENCY MODE DBCC CHECKDB (COSMOS, repair_allow_data_loss) WITH no_infomsgs executed by HOME\ADMIN terminated abnormally due to error state 5. Elapsed time: 0 hours 0 minutes 1 seconds.
By putting my DB in emergency mode I could get to the tables. Luckily there were only ten tables. However, how do I take a backup of the entire database in this case? As mentioned above, on executing DBCC CheckDB ('DBname', REPAIR_ALLOW_DATA_LOSS) after setting the DB to single-user mode, the system still reported that the DB was in suspect mode and could not be opened. The other system DBs, including master, are also damaged, although they are in Online mode. How do I get damage-free DBs? What should be done in a real live situation where the data is very important? -
Backup method in Noarchive mode
Hi All,
I have one question regarding backup methods. All my databases are on Oracle 11gR2 in Noarchive mode. I want to define a backup method but have no idea which one is preferable in Noarchive mode. A cold backup is not possible either. Can I take a tablespace/datafile backup? Is there any other option I can use in Noarchive mode, or should I temporarily put the database in Archive mode and perform the backup?
Please guide me.
Thanks...
user12115 wrote:
Hi All,
First of all, thanks for your replies and suggestions. I have not lost interest; I am reading and thinking about all your suggestions.
About expdp: yes, I think this is the only option I was considering before posting the question here, but as we know it should not be considered a backup.
For Archive mode: this is an R&D database and we have a disk space issue, so we cannot put the database in Archive mode.
For cold backup: again not possible, because the single database is used in different time zones, so the DB should be up and running almost all the time.
I think the operative word there is "almost". How big is the database? How long does it take to shut it down, back it up, and bring it back up? Unless the database is really big (why would an r&d database be really big, unless r&d was specific to VLDB) you should be able to do the job during the guy's lunch break. Even for an r&d database, you might be able to justify noarchivelog, but you will really feel the pain if you get a physically corrupted database and all you have is an export.
>
The only option I am not familiar with is the one John specified. John, can you please suggest or forward me a link with a backup example for more details? That would be really helpful. From your suggested link I understand this is possible, but after putting the database in suspend mode, how do I copy the tablespaces/datafiles: with a simple Linux cp command, or are there other options as well?
Edited by: user12115 on Mar 5, 2013 5:08 AM
Edited by: user12115 on Mar 5, 2013 5:15 AM -
Trying to email database backup notification, boxes are greyed out
Hi all,
I am running OEM Grid Control version 10.2.0.3.0. I have properly setup an SMTP server, with all of the notification rules, and it all works fine, as far as notifying me when a database is down, tablespace gets full, etc...
However, for some reason, I can't get it to send an email notification from a db backup job. I have an existing job that does a full backup of a 9.2 database, and it all works fine. But now when I go in and edit the job, then click on the Access tab, at the bottom I see the "Email notification for owner" section, and I see the 5 boxes that could be checked: Scheduled, Running, Suspended, Completed, Problems. But all of the checkboxes are greyed out - I can't check any of them. Does anyone know why this might be? I thought it might be because this is backing up a 9i database, but then I went and checked my backup jobs against 10g databases, and I see the same issue. I am the owner of all of these jobs, and I am a "Super Administrator", so I don't know what else to check...
Any thoughts?
Thanks!!
Hi all,
Just wanted to update this issue, in case anyone was interested. I ended up logging an SR with Oracle, and got a reply back that this is a known issue, here's what they said:
This is a known issue. I assume here you've gone into the database pages and scheduled a backup job from there, then when you look in the jobs tab you cannot edit the notifications for the job. You cannot edit the notifications for this type of job.
Here is what you can do:
1. Go to your Database --> Maintenance --> Schedule Backup
2. Work through the backup wizard until the review page.
3. Copy the RMAN script from this page.
4. Click on the Jobs tab --> Create Job "RMAN Script" --> Go
5. Paste the copied script from 3. under the parameters tab.
6. Go to the Access tab and you will now be able to set E-Mail Notification for Owner -
Snapshot Backups on HP EVA SAN
Hi everyone,
We are implementing a new HP EVA SAN for our SAP MaxDB Wintel environment. As part of the SAN setup we will be utilising the EVAs snapshot technology to perform a nightly backup.
Currently HP Data Protector does not support MaxDB for its "Zero Downtime Backup" concept (ZDB), thus we need to perform LUN snapshots using the EVAs native commands. ZDB would have been nice as it integrates into SAP and lets the DB/SAP know when a snapshot backup has occurred. However as I mentioned this feature is not available on MaxDB (only SAP on Oracle).
We are aware that SAP supports snapshots on external storage devices as stated in OSS notes 371247 and 616814.
To perform the snapshot we would do something similar (if not exactly) like note 616814 describes as below:
To create the split mirror or snapshot, proceed as follows:
dbmcli -d <database_name> -u < dbm_user>,<password>
util_connect < dbm_user>,<password>
util_execute suspend logwriter
==> Create the snapshot on the EVA
util_execute resume logwriter
util_release
exit
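One risk with the sequence above is that a failed snapshot could leave the log writer suspended. A hedged Python sketch of the same ordering follows, using try/finally so that the resume and release steps always run. The `run_dbm` and `take_snapshot` callables are hypothetical stand-ins: `run_dbm` is assumed to send one command to an open dbmcli session, and `take_snapshot` is assumed to trigger the EVA snapshot. Neither is implemented here.

```python
def snapshot_with_suspended_logwriter(run_dbm, take_snapshot, credentials):
    """Run the suspend / snapshot / resume sequence from note 616814.

    run_dbm(command)  -- hypothetical: sends one command to a dbmcli session
    take_snapshot()   -- hypothetical: creates the snapshot on the array
    credentials       -- the "<dbm_user>,<password>" string for util_connect
    """
    run_dbm("util_connect " + credentials)
    try:
        run_dbm("util_execute suspend logwriter")
        try:
            take_snapshot()
        finally:
            # Always resume, even if the snapshot step raised an error.
            run_dbm("util_execute resume logwriter")
    finally:
        run_dbm("util_release")


# Dry run that only records the order of the steps.
steps = []
snapshot_with_suspended_logwriter(
    steps.append, lambda: steps.append("<create snapshot on the EVA>"), "<dbm_user>,<password>"
)
print("\n".join(steps))
```

The dry run prints the commands in the same order as the manual procedure, which makes it easy to compare against the note before wiring in real calls.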
Obviously MaxDB and SAP are unaware that a "backup" has been performed. This poses a couple of issues, and I would like to see if anyone has a solution to them.
a. To enable automatic log backup, MaxDB must know that it has first completed a "full" backup. Is it possible to make MaxDB aware that a snapshot backup of the database has been taken, thus allowing us to enable automatic log backup?
b. SAP also likes to know it has been backed up. EarlyWatch Alert reports start to get a little upset when you don't perform a backup on the system for a while.
Also, DB12 will mention that the system isn't in a recoverable state when in fact it is. Are any workarounds available here?
Cheers
Shaun
Hi Shaun,
interesting thread so far...
> It would be nice to see HP and SAP(MaxDB) take the snapshot technology one or two steps further, to provide a guaranteed consistent backup, and can be block level verified. I think HPs ZDB (zero downtime backup eg snapshots) technology for SAP on Oracle using Data Protector does this now?!??!
Hmm... I guess the keyword here is 'market'. If there is enough market potential visible, I tend to believe that both SAP and HP would happily try to deliver such tight integration.
I don't know how this ZDB stuff works with Oracle, but how could the HP software possibly know what an Oracle block should look like?
No, there are just these options to actually check for block consistency in Oracle: use RMAN, use DBV or use SQL to actually read your data (via EXP, EXPDB, ANALYZE, custom SQL)
Even worse, you might come across block corruptions that are not covered by these checks really.
> Data corruption can mean so many things. If you're talking structure corruption or block corruption, then you do hope that your consistency checks and database backup block checks will bring this to the attention of the DBA. Hopefully recovery of the DB from tape and rolling forward would resolve this.
Yes, I was talking about data block corruption. Why? Because there is no reliable way to actually perform a semantic check of your data. None.
We (SAP) simply rely on that, whatever we write to the database by the Updater is consistent from application point of view.
Having handled far too much remote consulting messages concerning data rescue due to block corruptions I can say: getting all readable data from the corrupt database objects is really the easy part of it.
The problems begin to get big, once the application developers need to think of reports to check and repair consistency from application level.
> However, if you're talking data corruption as in "crap data" has been loaded into the database, or a rogue ABAP has corrupted several million rows of data, then this becomes a little more tricky. If the issue is identified immediately, restoring from backup is a feasible option for us.
> If the issue happened over 48hrs ago, then restoring from a backup is not an option. We are a 24x7x365 manufacturing operation, shipping goods all around the world. We produce and ship too much product in a 24hr window that can not be rekeyed (or so the business says) if the data is lost.
Well in that case you're doomed. Plain and simple. Don't put any effort into getting "tricky", just let never ever run any piece of code that had not passed the whole testfactory. That's really the only chance.
> We would have to get tricky and do things such as restore a copy of the production database to another server, and extract the original "good" documents from the copy back into the original, or hopefully the rogue ABAP can correct whatever mistake they originally made to the data.
That's not a recovery plan - that is praying for mercy.
I know quite a few customer systems that went to this "solution" and had inconsistencies in their system for a long long time afterwards.
> Look...there are hundreds of corruption scenarios we could talk about, but each issue will have to be evaluated, and the decision to restore or not would be decided based on the issue at hand.
I totally agree.
The only thing that must not happen is: open a conference call and talk about what a corruption is in the first place, why it happened, how it could happen at all ... I have spent hours of precious lifetime in such nonsense conference calls, only to see that there is no plan for this on the customer side.
> I would love to think that this is something we could do daily to a sandpit system, but with a 1.7TB production database, our backups take 6hrs, a restore would take about 10hrs, and the consistency check ... well a while.
We have customers saving multi-TB databases in far less time - it is possible.
> And what a luxury to be able to do this ... do you actually know of ANY sites that do this?
Quick Backups? Yes, quite a few. Complete Backup, Restore, Consistency Check cycle? None.
So why is that? I believe it's because there is no single button for it.
It's not integrated into the CCMS and/or the database management software.
It might also be (hopefully) that I never hear of these customers. See, as a DB Support Consultant I don't get in touch with "success stories". I see failures and bugs all day.
To me the correct behaviour would be to actually stop the database once the last verified backup is too old. Just like everybody is used to it, when he hits a LOGFULL /ARCHIVER STUCK situation.
Until then - I guess I will have a lot more data rescue to do...
> Had a read ... being from New Zealand I could easily relate to the sheep =)
> That's not what I meant. Like I said we are a 24x7x365 system. We get a maximum of 2hrs downtime for maintenance a month. Not that we need it these days as the systems practically run themselves. What I meant was that between 7am and 7pm are our busiest peak hours, but we have dispatch personnel, warehouse operations, shift supervisors ..etc.. as well as a huge amount of batch running through the "night" (and day). We try to maintain a good dialog response during the core hours, and then try to perform all the "other" stuff around these hours, including backups, opt stats, and business batch, large BI extractions ..etc..
> Are we busy all day and night ... yes ... very.
Ah ok - got it!
Especially in such situations I would not try to implement consistency checks on your prod. database.
Basically, running a CHECK DATA there does not mean anything. Right after the check of a table has finished, that table can get corrupted while the check is still running on other tables. So you never really have a guaranteed consistent state in a running database.
On the other hand, what you really want to know is not: "Are there any corruptions in the database?" but "If there were any corruptions in the database, could I get my data back?".
This latter question can only be answered by checking the backups.
> Noted and agreed. Will do daily backups via MaxDB kernel, and a full verification each week.
One more customer on the bright side
> One last question. If we "restored" from an EVA snapshot, and had the DB logs up to the current point-in-time, can you tell MaxDB just to roll forward using these logs even though a restore wasn't initiated via MaxDB?
I don't see a reason why not - if you restore the data and log area and bring the DB to admin mode, then it uses the last successful savepoint for startup.
If you then use recover_start to supply more logs, that should work.
But as always this is something that needs to be checked on your system.
That has been a really nice discussion - I hope you don't take my comments as offensive; they really aren't meant that way.
KR Lars -
Solaris 11.1 x86, lots of "snapshot already exists" in backup pool
Experts,
This is strange. I have been using time-slider and a synchronization script every 15 minutes since the OpenSolaris days. The recent update to 11.1 has caused the backup pool randomly to refuse to delete phantom snapshots during a weekly trim of the pool. Just now I saw several errors during a snapshot delete:
deleting: backup/export/site@zfs-auto-snap_daily-2012-11-05-14h48
cannot destroy 'backup/export/site@zfs-auto-snap_daily-2012-11-05-14h48': snapshot already exists
The solution is to export then import the pool, but this happened earlier today and I could not export, even with the -f flag because the pool was busy (?). I suspended the synchronization and could find no processes with open files on mount points. I settled on a reboot.
Does this happen to anyone else? Is there a less disruptive way to recover?
Also, I have noticed that CPU utilization in pool and I/O operations is markedly higher on 11.1 vs. 11. Server is HP DL585 with 4x dual core Opteron 8216. Before the 11.1 upgrade, scrubbing three pools could sustain 400MB/s throughput with about 30% CPU utilization. After 11.1, the scrub will consume about 70% CPU with 320-ish MB/s throughput. Nothing else has changed. Similar effects happen during the regular snapshot synchronizations.
Any insight on either issue is appreciated.
Thanks,
Marty
I also get this - it's quite annoying.
This may help:
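#!/bin/sh
# Temporarily disable time-slider while the holds are released.
svcadm disable -t time-slider
# Release the zfs-send plugin's hold on every snapshot.
for f in `zfs list -H -t snapshot -o name`; do zfs release -r org.opensolaris:time-slider-plugin:zfs-send "$f"; done
# Clear the plugin service's maintenance state, then re-enable time-slider.
svcadm clear svc:/application/time-slider/plugin:zfs-send
svcadm enable time-slider
-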
How do you "unsuspend" a suspended job in GRID?
Hello all,
I have a GRID job that is running a RMAN backup, an incremental level=1 backup.
This job runs at the end of each day. For several days now...it just fails with the error message:
"An execution in the current run or one of the previous runs was suspended by the system"
I cannot seem to figure out a way to unsuspend this job so it will start to run again. I've tried to use the RESUME button on this job, but it tells me:
"Error
Cannot resume job: job is not suspended but executions may be suspended. If one or more executions are suspended, please resume executions individually."
I'm stuck..and so far, I can't find anything on Oracle Support to remedy this.
Any suggestions?
Thank you,
cayenne
Try this:
Log in to the console, go to the job tab, click on the job activity menu and advanced search.
On the status field select "Problems" and enter the target name (database name as it is registered), then click go.
If I am right you should get a list of all the backup job runs for that database with problems, in the view field change to "executions".
I think by default the executions should be ordered by scheduled date/time. Go to the first one with issues and click on it; that will take you to the execution detail. In there should be a "Retry" button. Click on it, and if the target is up and the job is correct, it will execute properly and the next automatically scheduled backup should run OK.
If it does not complete properly, then check the output and fix the issue, or post the output here...
ef -
Oracle Management Console 10g - Job Status - Suspended on Agent Unreachable
Recently we updated our RDBMS from 10.2.0.1 to 10.2.0.4.0 PATCH 25.
Ever since we upgraded, we have had trouble with our RMAN backups. Now when a scheduled backup begins it never ends. Now whenever we schedule a job, from a backup to any simple system command, the status immediately returns "Suspended on Agent Unreachable".
We can start and stop the dbconsole successfully, and I can use the OEM to monitor the database, and make changes to that. However, I cannot run any scheduled database jobs through the OEM. However, I can run the rman jobs via the command line.
The database server and the OEM console are on the same server. I am not running RAC. Everything database related is on this one server.
Here are the results for emctl status agent:
E:\oracle\product\10.2.0\db_1\BIN>emctl status agent
Oracle Enterprise Manager 10g Database Control Release 10.2.0.4.0
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
Agent Version : 10.1.0.6.0
OMS Version : 10.1.0.6.0
Protocol Version : 10.1.0.2.0
Agent Home : E:\oracle\product\10.2.0\db_1\content.mydomain.com_ORCL
Agent binaries : E:\oracle\product\10.2.0\db_1
Agent Process ID : 34372
Agent URL : http://content.mydomain.com:3938/emd/main
Started at : 2010-01-11 14:58:24
Started by user : SYSTEM
Last Reload : 2010-01-11 14:58:24
Last successful upload : (none)
Last attempted upload : (none)
Total Megabytes of XML files uploaded so far : 0.00
Number of XML files pending upload : 5016
Size of XML files pending upload(MB) : 42.77
Available disk space on upload filesystem : 37.94%
Agent is Running and Ready
Here are the results for emctl upload:
E:\oracle\product\10.2.0\db_1\BIN>emctl upload
Oracle Enterprise Manager 10g Database Control Release 10.2.0.4.0
Copyright (c) 1996, 2007 Oracle Corporation. All rights reserved.
EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet..
I think the EMD upload error may be the problem, but I'm unsure how to resolve this.
What do I need to do in order to resolve this issue?
If any more info would be useful, please let me know and I will post it immediately.
thanks.
Rondeyli,
Thanks, that was it. I followed your instructions, and was able to get the system to work. I had to alter the commands a bit to get them to work on my system, so here is what I did.
I performed the following steps:
1. Ran the following command:
emctl stop dbconsole
2. Deleted all files in $AGENT_HOME/sysman/emd/upload and $AGENT_HOME/sysman/emd/state
3. Ran the following command:
emctl clearstate dbconsole
4. Ran the following command:
emctl secure dbconsole
5. Ran the following command:
emctl start dbconsole
This got everything running for me.
thanks.
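For anyone repeating the file cleanup in step 2, it could be scripted roughly as below. This is a minimal sketch, not an Oracle tool: the function name and the dry-run default are my own assumptions, it expects AGENT_HOME to be set in the environment, and it only removes regular files while leaving the directory structure in place. Stop dbconsole first, as in step 1.

```python
import os
from pathlib import Path
from typing import List

# Directories named in step 2, relative to the agent home.
STATE_SUBDIRS = ("sysman/emd/upload", "sysman/emd/state")


def clear_agent_files(agent_home: str, dry_run: bool = True) -> List[Path]:
    """List (dry run) or delete every regular file below the upload and state directories."""
    affected = []
    for sub in STATE_SUBDIRS:
        directory = Path(agent_home) / sub
        if not directory.is_dir():
            # Skip silently if this agent home has no such directory.
            continue
        for entry in sorted(directory.rglob("*")):
            if entry.is_file():
                affected.append(entry)
                if not dry_run:
                    entry.unlink()
    return affected


if __name__ == "__main__":
    home = os.environ.get("AGENT_HOME")
    if home:
        # Dry run by default: only show what would be removed.
        for path in clear_agent_files(home, dry_run=True):
            print("would delete", path)
```

Running it once as a dry run and reading the list before calling it with dry_run=False is the safer order, since the deletion cannot be undone.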