Recovery Mechanism in Solaris

Hi to all,
I am new to Solaris (coming from the HP-UX world) and I was wondering whether there is a tool in the Solaris world for making an exact image of the system and using it afterwards to restore the system as it was at the moment the image was taken.
HP-UX has such a tool, Ignite's make_tape_recovery, and it is very handy for this purpose.
Is there something like this in Solaris?

dejan.stojcevski wrote:
Thanks a lot Ivan.
This answered my question.
I will search around to learn some more about flash archives and see what they can do too.
Anyway, a little comparison with HP's make_tape_recovery:
1. make_tape_recovery creates a bootable tape. No need to boot from the installation CD. You boot directly from the tape. ufsdump does not do this.
2. make_tape_recovery does not require you to partition the underlying root disk. It does this automatically. ufsdump does not have this functionality.
3. make_tape_recovery is a fully automated backup/recovery mechanism, meaning that after you boot from the tape you can come back in around 1 hour and you will have a completely recovered system. ufsdump requires mounting/unmounting of slices.
This sounds a lot like SCO's root/boot floppy/tape restore solution.
Yet I think that this comparison is not correct because Sun's ufsdump and HP's make_tape_recovery are two different types of software (different philosophy). Sun's ufsdump is like HP's fsbackup utility - tools for full file system backups. HP's make_tape_recovery <=> Sun's ??? (flash archives maybe?)
I don't think Sun has anything like this, and the closest you could get would be a Flash archive or a JumpStart server. And then you would still have to do a restore after the machine has been booted up.
The closest you could get to something like this in the Sun world would probably be "Bare Metal Restore" from Veritas, now Symantec.
alan

Similar Messages

  • Error Handling/Recovery Mechanism in ODI

    Can you please provide some information on the error handling/recovery mechanism in ODI?
    Say, for instance, a link breaks down while moving data from source to staging and/or staging to target. What will happen? Will the records processed so far be written into the target table, or will no records be moved into the target table?
    Does ODI work on an "all or nothing" basis?
    I really need help on this.

    There is a Restart option in the Operator. When you right-click a session and choose Restart, ODI will start again from the step that failed.
    I believe that if the database is down then a restart can help you, but if the agent is down you might need to start the session again from the beginning. The reason is that when the agent sends the SQL to the database, it waits until the database has processed it and sent the result back. If the agent goes down during that interval, the database will have processed the records but the agent won't be there to read the result, and by the time you bring the agent back up the session will have died, so you will need to start the session again.
    You can test and see whether the Restart option helps you.

  • Instance recovery mechanism

    Hi,
    I am trying to understand the mechanism of instance recovery in Oracle 8i/Oracle 9i databases and I am a little confused, since I have received contradictory information about it.
    Please find my questions below:
    1. Is an SCN assigned to all transactions or ONLY to committed transactions?
    2. Is the checkpoint number the highest SCN after a checkpoint event?
    3. After a checkpoint event, dirty buffers are written to the database files by DBWR. Does DBWR write only the dirty buffers of committed transactions, or does it write out all dirty buffers in the DB buffer cache?
    Thanks in advance for your answers!
    regards
    Peter

    1. The system change number (SCN) is updated whenever there is a commit.
    2. If you are talking about the checkpoint_change# in the v$database table, yes
    3. All dirty buffers are written to disk. Data files can and do contain uncommitted data
    Justin
    Distributed Database Consulting, Inc.
    http://www.ddbcinc.com/askDDBC
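To make answer 3 concrete, here is a toy model in Python of why uncommitted data on disk is not a problem: recovery first reapplies every logged change, then undoes the changes of transactions that never committed. This is only a sketch under simplifying assumptions. The record layout `(txn, key, old_value, new_value)` and the `recover` function are invented for illustration and are not Oracle's redo or undo format.

```python
# Toy model only: each redo record is (txn, key, old_value, new_value).
def recover(datafile, redo_log, committed):
    """Roll forward every redo record, then roll back uncommitted changes."""
    data = dict(datafile)
    # Roll forward: reapply all logged changes, committed or not.
    for txn, key, old, new in redo_log:
        data[key] = new
    # Roll back: undo the changes of uncommitted transactions, newest first.
    for txn, key, old, new in reversed(redo_log):
        if txn not in committed:
            if old is None:
                data.pop(key, None)   # the key did not exist before
            else:
                data[key] = old
    return data

# The datafile on disk already holds T2's uncommitted write to "b".
datafile = {"a": 1, "b": 99}
redo_log = [("T1", "a", 1, 2), ("T2", "b", 5, 99), ("T1", "c", None, 7)]
recovered = recover(datafile, redo_log, committed={"T1"})
print(recovered)  # {'a': 2, 'b': 5, 'c': 7}
```

T1's committed changes survive even though they had not reached the datafile, and T2's uncommitted value 99 is replaced by the earlier value 5.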

  • Want info about eventing mechanism in Solaris.

    Does Solaris have facility to register for events such as "link down" other than snmp ?? If yes, where can I find information about it ?
    Thanks,
    Dev

    Try looking at syseventd. This is the mechanism to expose events from the kernel/drivers to userland. I'm not sure if network link events are currently exposed.

  • Auto-recovery mechanism(S) of oracle database???

    Hi,
    I read that if the database is closed abnormally, say due to a power failure, then once the database is re-opened the RECO foreground process will recover all transactions that were in doubt, that is, neither committed nor rolled back. How does it do that? Can anyone explain?

    This is instance recovery. Do not confuse it with database recovery.
    If you suddenly power off your DB server while it is running, in most cases only the uncommitted transactions are lost.
    You should read the documentation.

  • Does Linux filesystem undermine Oracle's recovery mechanism?

    I've been an Oracle DBA for 10 years and have been using Oracle
    on Linux for several months, but am not a Linux expert by any
    means. A client told me something about the filesystem Linux uses
    (x2?) that I find hard to believe. Can anyone shed some light on
    this?
    The claim is that the Linux filesystem does not implement
    synchronous writes correctly. The implication is that when a user
    commits a transaction and Oracle flushes the redo log to disk,
    Oracle may think the redo information has been successfully
    written when in fact it's still sitting in a buffer somewhere
    waiting to write. If a drive failure occurs, the redo might never
    get written, but meanwhile the user has already been informed
    their transaction has been committed.
    Oracle does not flush data block buffers to disk when you commit
    a transaction. Only the redo is flushed. If the instance were to
    fail, Oracle reads the redo when you restart the instance and
    performs instance recovery automatically.
    If the Linux filesystem does not implement synchronous writes
    legitimately, then the recovery mechanisms in Oracle are
    compromised--indeed a successful commit is not a guarantee of
    data permanence.
    It's hard to believe that this could be true; I don't see how
    Oracle Corporation could put so much effort into porting their
    flagship products to Linux if data permanence cannot be
    guaranteed.
    Is my client mistaken in their understanding of the Linux
    filesystem? Any insights from the Linux gurus out there would be
    gratefully appreciated!
    Regards,
    Roger Schrag
    Database Specialists, Inc.

    Roger Schrag (guest) wrote:
    : I've been an Oracle DBA for 10 years and have been using
    : Oracle on Linux for several months, but am not a Linux expert
    : by any means. A client told me something about the filesystem
    : Linux uses (x2?) that I find hard to believe. Can anyone shed
    : some light on this?
    : The claim is that the Linux filesystem does not implement
    : synchronous writes correctly. The implication is that when a
    : user commits a transaction and Oracle flushes the redo log to
    : disk, Oracle may think the redo information has been
    : successfully written when in fact it's still sitting in a
    : buffer somewhere waiting to write. If a drive failure occurs,
    : the redo might never get written, but meanwhile the user has
    : already been informed their transaction has been committed.
    The problem doesn't lie with Linux - fsync() and O_SYNC are
    supported and, AFAIK, behave correctly.
    The problem is that Oracle doesn't appear to use them. The redo
    logs aren't fsync'ed on commit, nor do they appear to be opened
    with O_SYNC.
    Data loss will result, as you point out, if the RDBMS doesn't
    tell the operating system to save the data synchronously.
    Play with strace()ing the RDBMS background processes and confirm
    this for yourself.
    I'm not in a position to progress this with Oracle. Someone,
    obviously, should, otherwise the next article on ZDNet could be
    "Linux causes massive Oracle dataloss"..
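For readers who have not used the two calls named in the reply, here is a minimal Python sketch of what "telling the operating system to save the data synchronously" looks like from an application. It says nothing about what Oracle actually does. The file name `redo.log` and both helper functions are hypothetical, and `O_SYNC` is looked up with `getattr` because not every platform defines it.

```python
import os
import tempfile

def durable_append(path, payload):
    """Append payload and return only after fsync() has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # ask the OS to flush this file's data to the device
    finally:
        os.close(fd)

def open_sync(path):
    """Open for append with O_SYNC (where defined) so each write is synchronous."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | getattr(os, "O_SYNC", 0)
    return os.open(path, flags, 0o600)

# Hypothetical log file in a scratch directory.
log_path = os.path.join(tempfile.mkdtemp(), "redo.log")

durable_append(log_path, b"commit txn 42\n")   # write, then fsync

fd = open_sync(log_path)                       # O_SYNC variant
os.write(fd, b"commit txn 43\n")
os.close(fd)

with open(log_path, "rb") as f:
    contents = f.read()
```

Running a process like this under strace should show either an `fsync` call after the write or `O_SYNC` in the `open` flags, which is the kind of evidence the reply suggests looking for.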

  • Help required with inittab mechanism on Solaris 11

    Hi
    we have an application running on Solaris 11 which uses /etc/inittab entries to spawn and maintain a number of daemon processes. I know that use of /etc/inittab is deprecated in Solaris 11 but at the moment we would prefer not to have to modify our software to use SMF. However saying that we have hit an issue which may force our hand....
    We have inittab entries which look like this:
    app1:34:respawn: su - appuser -c "exec /APP/myDaemon.sh >> /APP/myDaemon.log 2>&1" > /dev/null 2>&1 0<&1
    When init state 3 is entered we see two processes associated with myDaemon:
    # uname -a
    SunOS host40054 5.11 11.0 sun4v sparc sun4v
    # ps -Af |grep myDa
    appuser 21383 21380 0 19:26:58 ? 0:00 -ksh -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    root 21380 1 0 19:26:58 ? 0:00 su - appuser -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    If I drop the OS to init state 2 only the parent process is terminated:
    # /usr/sbin/init 2
    # ps -Af |grep myDa
    appuser 21383 21380 0 19:26:58 ? 0:00 -ksh -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    leaving the daemon process still running. Returning to init state 3 we now see:
    # /usr/sbin/init 3
    # ps -Af |grep myDa
    appuser 26997 1 0 19:30:54 ? 0:00 -ksh -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    root 26994 1 0 19:30:54 ? 0:00 su - appuser -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    appuser 21383 1 0 19:26:58 ? 0:00 -ksh -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    So now we've ended up with two daemon processes running (not what we want).
    If I compare with Solaris 10 and earlier, the inittab works very well for us. We only ever see the single daemon process running and this is dropped and respawned as required.
    # uname -a
    SunOS host40041 5.10 Generic_147440-04 sun4u sparc SUNW,Sun-Fire-V210
    # ps -Af | grep myD
    appuser 2390 27941 0 19:38:21 ? 0:00 -ksh -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    # /usr/sbin/init 2
    # ps -Af | grep myD
    root 2665 1187 0 19:39:19 pts/1 0:00 grep myD
    # /usr/sbin/init 3
    # ps -Af | grep myD
    appuser 2795 27941 0 19:39:30 ? 0:00 -ksh -c exec /APP/myDaemon.sh >> /APP/myDaemon.log
    With Solaris 11 it's as if the su shell process has been made explicit and now the mechanism, as far as we are concerned, is broken.
    Can anyone suggest a way to get back to the Solaris 10 way of working, or some kind of workaround for this issue, before we have to dust off our XML skills and wrestle with SMF.
    Many thanks
    Dave.
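One general technique for the "parent dies, child survives" symptom is to run the daemon in its own process group and signal the whole group rather than a single PID. The Python sketch below illustrates only that technique. It does not change how init on Solaris 11 behaves, and the `sh -c "sleep 60 & wait"` command is a hypothetical stand-in for the `su - appuser -c ...` wrapper and its child.

```python
import os
import signal
import subprocess
import time

def start_in_own_group(argv):
    """Start a command as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)

def stop_group(proc, timeout=5.0):
    """Send SIGTERM to every process in the child's group, then reap the leader."""
    os.killpg(proc.pid, signal.SIGTERM)
    return proc.wait(timeout=timeout)

# Hypothetical stand-in for the su wrapper: a shell that forks a child.
wrapper = start_in_own_group(["sh", "-c", "sleep 60 & wait"])
time.sleep(0.5)              # give the shell time to fork its child
status = stop_group(wrapper)
print(status)                # negative signal number of the terminated leader
```

Because the group leader's PID doubles as the process group ID, `os.killpg(proc.pid, ...)` reaches the shell and the `sleep` it forked, so no orphan is left behind to be duplicated on the next respawn.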

    If you use an Access Policy to trigger Provision/Revoke, it will cancel all the tasks and you won't be able to use the same instance. On each Enable you'll see a new instance of the Seibel RO, which would be wrong.
    Workaround:
    Create two tasks, Enable and Disable, and configure them properly with Disable Effect and Enable Effect.
    In the Disable task, attach the adapter that is attached to the Delete User task and map it to Disabled.
    In the Enable task, attach the adapter that is attached to the Create User task and map it to Provisioned.

  • Controlfile recovery (Oracle 9i Solaris 9)

    Hi Guys,
    In my environment I had 2 control files, 1 and 2. I think 1 got corrupted, so I deleted it and then made a copy of 1 to 2. But now nothing seems to work.
    SQL> shutdown immediate;
    ORA-00210: cannot open the specified controlfile
    ORA-00202: controlfile: '/database/oradata/lldev/control02.lldev'
    ORA-27041: unable to open file
    SVR4 Error: 13: Permission denied
    Additional information: 3
    SQL>
    Pls assist.
    Thank you All.

    To summarize, did you -
    1) copy the control file while the database was up?
    2) copy the correct control file and not the corrupted one?
    Best practice is to shut down the database, add a new location with the correct control file copy, and comment out (hash) the corrupt file location. Once the database starts up properly, the corrupt control file can be deleted and the corresponding entry removed from the initialization parameters at the next database restart.
    Could you also post the relevant alert log entries?
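Since the error stack ends in "SVR4 Error: 13: Permission denied", it may also be worth checking each control file copy from the account that runs the instance. The Python sketch below is one way to do that. The function name and the demo paths are hypothetical, and it only reports what the calling user can see, so it should be run as the Oracle software owner rather than root.

```python
import os
import tempfile

def check_control_files(paths):
    """Return {path: status} where status is 'ok', 'missing' or 'not accessible'."""
    report = {}
    for path in paths:
        if not os.path.exists(path):
            report[path] = "missing"
        elif os.access(path, os.R_OK | os.W_OK):
            report[path] = "ok"
        else:
            report[path] = "not accessible"
    return report

# Demo with scratch files; in practice pass the paths listed in control_files.
demo_dir = tempfile.mkdtemp()
present = os.path.join(demo_dir, "control01.ctl")
absent = os.path.join(demo_dir, "control02.ctl")
with open(present, "wb") as f:
    f.write(b"\0")

report = check_control_files([present, absent])
for path, status in report.items():
    print(status, path)
```

A copy made as a different user (for example root) commonly ends up with the wrong owner or mode, which would show up here as "not accessible".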

  • Enable password recovery in cisco 2950 with AAA

    Hello friends,
    I need to recover the enable password on a switch on which I have already configured AAA. When I follow the procedure below, it ends with "Authorization failed". How can I recover the enable password?
    Regards,
    Haris
    If I try to recover password like this description says
    http://www.cisco.com/en/US/docs/switches/lan/catalyst2960/software/release/12.2_25_see/configuration/guide/swtrbl.html#wp1090048
    Step 1 Connect a terminal or PC with terminal-emulation software to the switch console port.
    Step 2 Set the line speed on the emulation software to 9600 baud.
    Step 3 Power off the switch. Reconnect the power cord to the switch and, within 15 seconds, press the Mode button while the System LED is still flashing green.
    Base ethernet MAC Address: 00:0x:xx:xx:xx:xx
    Xmodem file system is available.
    The password-recovery mechanism is enabled.
    The system has been interrupted prior to initializing the
    flash filesystem. The following commands will initialize
    the flash filesystem, and finish loading the operating
    system software:
    flash_init
    load_helper
    boot
    switch:
    Step 4 switch: flash_init
    Initializing Flash...
    flashfs[0]: 600 files, 19 directories
    flashfs[0]: 0 orphaned files, 0 orphaned directories
    flashfs[0]: Total bytes: 32514048
    flashfs[0]: Bytes used: 7713792
    flashfs[0]: Bytes available: 24800256
    flashfs[0]: flashfs fsck took 10 seconds.
    ...done Initializing Flash.
    Boot Sector Filesystem (bs) installed, fsid: 3
    Setting console baud rate to 9600...
    Step5 switch:load_helper
    Step6 switch: dir flash:
    Directory of flash:/
    2 -rwx 916 <date> vlan.dat
    5 drwx 192 <date> c2960-lanbase-mz.122-25.SEE1
    620 -rwx 5488 <date> config.text
    621 -rwx 5 <date> private-config.text
    24800256 bytes available (7713792 bytes used)
    Step7 switch: rename flash:config.text flash:config.text.old
    Step8 switch: boot
    Loading "flash:c2960-lanbase-mz.122-25.SEE1/c2960-lanbase-mz.122-25.SEE1.bin"...
    Initializing flashfs...
    flashfs[1]: 600 files, 19 directories
    flashfs[1]: 0 orphaned files, 0 orphaned directories
    flashfs[1]: Total bytes: 32514048
    flashfs[1]: Bytes used: 7713792
    flashfs[1]: Bytes available: 24800256
    flashfs[1]: flashfs fsck took 1 seconds.
    flashfs[1]: Initialization complete....done Initializing flashfs.
    64K bytes of flash-simulated non-volatile configuration memory.
    Base ethernet MAC Address : 00:0x:xx:xx:xx:xx
    Motherboard assembly number : xxxxxxxxxx
    Power supply part number : xxxxxxxxxxx
    Motherboard serial number : xxxxxxxxxxx
    Power supply serial number : xxxxxxxxxxx
    Model revision number : B0
    Motherboard revision number : B0
    Model number : WS-C2960G-24TC-L
    System serial number : xxxxxxxxxxxx
    Top Assembly Part Number : xxxxxxxxxxxx
    Top Assembly Revision Number : B0
    Version ID : V02
    CLEI Code Number : xxxxxxxxxxxxx
    Hardware Board Revision Number : 0x01
    Switch Ports Model SW Version SW Image
    * 1 24 WS-C2960G-24TC-L 12.2(25)SEE1 C2960-LANBASE-M
    Press RETURN to get started!
    Step9 Hit <Enter>
    Would you like to terminate autoinstall? [yes]: yes
    Step10
    --- System Configuration Dialog ---
    Would you like to enter the initial configuration dialog? [yes/no]no
    Switch>
    Step11 Switch> enable
    Step12 Switch# rename flash:config.text.old flash:config.text
    Destination filename [config.text]? <Enter>
    Step13 Switch# copy flash:config.text system:running-config
    Destination filename [running-config]?<Enter>
    5488 bytes copied in 0.940 secs (5838 bytes/sec)
    Step14 NewSwitchName#conf t
    % Authorization failed.
    Doesn't this procedure work any more ?

    The password recovery worked, but you copied your problematic config back to the switch. Skip Step 13 and paste only the working part of the config to the switch.
    You can see your renamed config with "more flash:config.text.old".
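If the saved config is long, a small script can separate the lines to paste from the lines to review by hand. The Python sketch below holds back every line that starts with `aaa`. That prefix list is an assumption about which part of the config is "problematic", the sample config is invented, and lines elsewhere that refer to AAA method lists are not detected, so the held-back list is a starting point for review and not a guarantee.

```python
def split_config(config_text, prefixes=("aaa ",)):
    """Split config lines into (keep, held_back) by command prefix."""
    keep, held_back = [], []
    for line in config_text.splitlines():
        target = held_back if line.strip().startswith(prefixes) else keep
        target.append(line)
    return keep, held_back

# Invented sample; a real config.text.old will differ.
sample = """hostname NewSwitchName
aaa new-model
aaa authorization exec default group tacacs+ local
interface Vlan1
 ip address 192.0.2.10 255.255.255.0"""

keep, held_back = split_config(sample)
print("\n".join(keep))
print("--- held back for review ---")
print("\n".join(held_back))
```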

  • PRD system recovery - "HOLD" & "To Be Delivered" messages

    Hi Gurus,
    I'd like to discuss this crucial thing.
    In our systems this problem sometimes occurs: messages in "HOLD" status and in "To Be Delivered" status.
    HOLD status occurs in serialized queues (EOIO): when an error occurs in one message in the queue, all the others are set to status HOLD. Which is OK. The question is: when we cancel the message that caused the error, do we need to restart the other messages in the queue manually? Or is there some "auto-recovery" mechanism?
    And what about the status TBD? When does this one occur, and what steps can be taken to solve the problem?
    Thank you all,
    Olian

    Hi Olian,
    I was experiencing the same issue the other day with an IDOC - Soap scenario.
    I had given a wrong URL in the target adapter, and every time I sent an IDOC it went to Waiting status and then, after some time, gave the error message "system error".
    If a system error occurs, it blocks the subsequent messages in that queue.
    Solutions:
    1. If you are using IDOC on the sender side, go to Interface Determination and deselect the checkbox "maintain order at runtime" (this applies because you don't have a sender CC for IDOC).
    2. Find the message that resulted in the system error and resend or cancel it.
    3. Find the cause of that error message and fix it so that it won't happen next time.
    Regards,
    Nikhil.

  • Purpose of ONLINE REDO LOG FILES - Media or Instance recovery or BOTH ?

    Hi
    Currently studying this topic for the 1z0-031 exam and am a little confused.
    my books (from instructor led class) say
    -redo logs are a mean to provide redo transactions in the event of a DATABASE recovery
    -redo log buffer gets flushed to redo log files to provide a recovery mechanism in case of MEDIA FAILURE
    Then it says
    -Online redo log files are used in a situation such as an INSTANCE FAILURE to recover uncommitted data which has not yet been written to the data files
    - online redo log files are used for RECOVERY only.
    Am I misunderstanding? Or are redo log files for both MEDIA and INSTANCE recovery? Or just INSTANCE?
    confused....
    Amanjit

    Online Redo Log Files are used, in a sense, for both Media and Instance Recovery. If your database is in NoArchive Mode then you will only be able to use the Redo Log Files for instance recovery. But if you are running in Archive Log Mode then the Redo Log Files are archived and will allow you to recover from media failure.

  • ORACLE8 OPS BACKUP & RECOVERY

    Product : ORACLE SERVER
    Date written : 2004-08-16
    ORACLE8 OPS BACKUP & RECOVERY
    =============================
    SCOPE
    In Standard Edition, the Real Application Clusters feature is supported from 10g (10.1.0) onward.
    Explanation
    Database backup and recovery in OPS is similar to backup of a single instance. That is, every backup method available for a single instance is also supported in OPS.
    1. Backup methods
    All of the following backup methods can be used. This note describes method 2), backup using OS commands.
    1) Recovery Manager (RMAN) : see <Bulletin 11451>
    2) Backup using OS commands
    Noarchive log mode : full offline backup only
    Archive log mode : full or partial, offline or online backup
    3) export : see <Bulletin 10080> : ORACLE 7 backup and recovery methods
    2. Considerations when defining a backup policy
    1) If no loss from disk crashes, user errors and the like can be tolerated, ARCHIVE LOG MODE must be used.
    2) In most cases, every instance uses automatic archiving.
    3) Any data backup operation can be run from any instance.
    4) Media recovery uses the archive files of all threads.
    5) Instance recovery is performed automatically by SMON of a surviving instance.
    3. Noarchive log mode : Full offline backup
    1) Query the following views to identify the files that need to be backed up.
    V$DATAFILE or DBA_DATA_FILES
    V$LOGFILE
    V$CONTROLFILE
    2) Shut down all instances.
    3) Copy the identified files to the backup destination.
    4. Archive log mode : Partial or Full Online Backup
    1) Before starting the backup, run ALTER SYSTEM ARCHIVE LOG CURRENT. (This command makes every instance perform a log switch, and the archive that follows it, on the current redo log of every node, including those that are not currently running.)
    2) Run ALTER TABLESPACE tablespace BEGIN BACKUP.
    3) Wait until the ALTER TABLESPACE command has completed successfully.
    4) Back up the datafiles that belong to the tablespace with a suitable OS command (tar, cpio, cp, etc.).
    5) Wait until the OS-level backup has finished.
    6) Run ALTER TABLESPACE tablespace END BACKUP.
    7) Back up the control file by running ALTER DATABASE BACKUP CONTROLFILE TO filename or ALTER DATABASE BACKUP CONTROLFILE TO TRACE.
    If the archive log files are being backed up as well, run ALTER SYSTEM ARCHIVE LOG CURRENT after the END BACKUP command so that all redo log files up to the END BACKUP point are captured.
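The ordering of the online backup steps matters, so here is a small Python sketch that only builds the ordered step list for one tablespace, without connecting to a database or copying anything. The function name, the tablespace name and the paths are hypothetical. The statements themselves are the ones listed in section 4, with `cp` standing in for whichever OS copy command is used.

```python
def online_backup_plan(tablespace, datafiles, backup_dir):
    """Return ordered (kind, command) steps for an OS-level online backup."""
    steps = [
        ("sql", "ALTER SYSTEM ARCHIVE LOG CURRENT"),
        ("sql", f"ALTER TABLESPACE {tablespace} BEGIN BACKUP"),
    ]
    # Copy the datafiles only while the tablespace is in backup mode.
    steps += [("os", f"cp {path} {backup_dir}/") for path in datafiles]
    steps += [
        ("sql", f"ALTER TABLESPACE {tablespace} END BACKUP"),
        ("sql", "ALTER DATABASE BACKUP CONTROLFILE TO TRACE"),
        # Archive again so that redo up to END BACKUP is available.
        ("sql", "ALTER SYSTEM ARCHIVE LOG CURRENT"),
    ]
    return steps

# Hypothetical names and paths, for illustration only.
plan = online_backup_plan("USERS", ["/u01/oradata/users01.dbf"], "/backup")
for kind, command in plan:
    print(f"{kind:>3}: {command}")
```

Keeping the plan as data makes it easy to review before handing the SQL steps to a SQL client and the OS steps to a shell.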
    5. Important Parameters
    1) Redo Log History in the control file (MAXLOGHISTORY)
    By specifying MAXLOGHISTORY in the CREATE DATABASE or CREATE CONTROLFILE command, you can have the control file keep a history of the redo log files that have been filled in a parallel server. Once the database has been created, the control file must be re-created in order to increase or decrease the log history value.
    MAXLOGHISTORY specifies how much archive history can be stored in the control file; the default is platform-specific. If it is set to a non-zero value, the LGWR process records the following information in the control file at every log switch.
    thread number, log sequence number, low SCN, low SCN timestamp, next SCN (the lowest SCN of the next log)
    (This information is stored in the control file when the log switch occurs, not after the redo log file has been archived.)
    When log history has to be stored beyond the value specified by MAXLOGHISTORY, the oldest history entry is overwritten. During automatic media recovery in OPS, the log history is used to locate and reconstruct the appropriate archive log files based on SCN and thread number. Log history is not needed when the database runs in exclusive mode with a single thread.
    Log history information can be queried through V$LOG_HISTORY.
    Querying V$RECOVERY_LOG in Server Manager returns information about the archive logs needed for media recovery.
    For multiplexed redo log files, the log history does not use multiple entries. Each entry holds information about the group of multiplexed log files, not about the individual files.
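The overwrite behaviour of the log history can be pictured as a fixed-size ring buffer. The Python sketch below models it with `collections.deque`. The class, the field names and the SCN values are invented for illustration, the low SCN timestamp is left out for brevity, and nothing here reflects the real control file layout.

```python
from collections import deque, namedtuple

LogHistoryEntry = namedtuple("LogHistoryEntry", "thread sequence low_scn next_scn")

class LogHistory:
    """Fixed-size history: once full, each new entry overwrites the oldest."""

    def __init__(self, max_log_history):
        # maxlen=0 keeps nothing, like a MAXLOGHISTORY of 0 in this toy model.
        self.entries = deque(maxlen=max_log_history)

    def record_switch(self, thread, sequence, low_scn, next_scn):
        self.entries.append(LogHistoryEntry(thread, sequence, low_scn, next_scn))

    def find(self, thread, scn):
        """Return the entry whose SCN range covers scn for this thread, or None."""
        for entry in self.entries:
            if entry.thread == thread and entry.low_scn <= scn < entry.next_scn:
                return entry
        return None

history = LogHistory(max_log_history=2)
history.record_switch(thread=1, sequence=1, low_scn=100, next_scn=200)
history.record_switch(thread=1, sequence=2, low_scn=200, next_scn=300)
history.record_switch(thread=1, sequence=3, low_scn=300, next_scn=400)

found = history.find(thread=1, scn=250)
lost = history.find(thread=1, scn=150)
print(found)  # LogHistoryEntry(thread=1, sequence=2, low_scn=200, next_scn=300)
print(lost)   # None
```

The entry for sequence 1 has been overwritten, so a lookup by SCN and thread can no longer map SCN 150 to a log, which is why the value has to be sized for the recovery window you need.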
    2) Parameters for Archive Log Mode
    To switch to archive log mode in OPS, mount the database in exclusive mode and then make the change.
    a. LOG_ARCHIVE_FORMAT
    Parameter     Description     Example
    %T     thread number, left-zero-padded     arch0000000001
    %t     thread number, not padded     arch1
    %S     log sequence number, left-zero-padded     arch0000000251
    %s     log sequence number, not padded     arch251
    Of these, %T and %t are valid only in OPS.
    The format must be the same on all instances, and in an OPS environment it must include the thread number.
    Example) log_archive_format = %t_%s.arc
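To see how the four placeholders combine into a file name, here is a small Python sketch that expands a LOG_ARCHIVE_FORMAT string. It is an illustration and not Oracle code. The function name is made up, and the padding width of 10 digits is an assumption taken from the examples in the table above.

```python
import re

def archive_log_name(fmt, thread, sequence, pad_width=10):
    """Expand %T, %t, %S and %s in a LOG_ARCHIVE_FORMAT-style string."""
    values = {
        "T": str(thread).zfill(pad_width),    # thread number, left-zero-padded
        "t": str(thread),                     # thread number, not padded
        "S": str(sequence).zfill(pad_width),  # sequence number, left-zero-padded
        "s": str(sequence),                   # sequence number, not padded
    }
    return re.sub(r"%([TtSs])", lambda m: values[m.group(1)], fmt)

print(archive_log_name("%t_%s.arc", thread=1, sequence=251))  # 1_251.arc
print(archive_log_name("arch%S", thread=1, sequence=251))     # arch0000000251
print(archive_log_name("arch%T", thread=1, sequence=251))     # arch0000000001
```

Including the thread number keeps the names from two instances apart even when both are archiving the same log sequence number.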
    b. LOG_ARCHIVE_START
    - Automatic archiving : If this is set to TRUE when the instance is started, the ARCH background process archives automatically. For a closed thread, a running thread performs the log switch and the archiving on behalf of the closed thread. This happens when a log switch is forced in order to keep the SCNs on all nodes close to one another.
    - Manual archiving : If FALSE, archiving stops and waits unless a command explicitly tells it to start. In OPS, each instance may use a different LOG_ARCHIVE_START value.
    Manual archiving can be done in the following ways.
    Run the ALTER SYSTEM ARCHIVE LOG SQL command.
    Run the ALTER SYSTEM ARCHIVE LOG START command to enable automatic archiving.
    Manual archiving runs only on the node where the command is issued, and in that case the archiving work is not handled by the ARCH process.
    c. LOG_ARCHIVE_DEST
    Specifies the directory in which archive log files are created.
    Example) log_archive_dest = /arch2/arc
    6. OPS Recovery
    1) Instance failure
    An instance failure occurs when an instance can no longer continue working, for any of several reasons: a software or hardware problem, a power outage, a failure in a background process, a shutdown abort, an OS crash, and so on.
    In a single-instance environment, an instance failure is resolved by restarting the instance and opening the database. In the step between mount and open, SMON reads the online redo log files and performs instance recovery.
    In OPS, instance recovery is performed differently after an instance failure. Because instances on other nodes can keep running even when one node fails, an instance failure does not mean that the database is unavailable.
    Instance recovery is performed by the SMON process that first detects the dead instance. The following happens while recovery is in progress.
    - A surviving instance reads the redo log files of the failed instance and applies their contents to the datafiles.
    - During this period, the surviving nodes cannot write out the contents of their buffer caches either.
    - No DBWR disk I/O takes place.
    - DML users cannot make lock requests.
    a. Single-node Failure
    While one instance is recovering another, failed instance, the healthy instance reads the failed instance's redo log entries and applies the results of committed transactions to the database. Committed data is therefore not lost. Transactions that the failed instance had not committed are rolled back, and the resources they were using are released.
    b. Multiple-node Failure
    If all instances in OPS fail, instance recovery is performed automatically as soon as any one instance is opened. The instance being opened need not be one of the failed instances, and recovery runs regardless of whether the database is mounted in shared or exclusive mode.
    Whether Oracle runs in shared or exclusive mode, the recovery procedure is the same, except that a single instance performs recovery for all of the failed instances.
    2) Media failure
    A media failure occurs when there is a problem with the storage media that holds the files Oracle uses. In this situation data generally cannot be read or written.
    After a media failure, recovery must be performed in the same way as for a single instance. In both cases, transaction recovery must be carried out using the archive log files.
    3) Node failure
    In an OPS environment, when an entire node fails, the instance and the IDLM components running on that node fail as well. Before instance recovery can take place, the IDLM must reconfigure itself in order to remaster the locks.
    When a node fails, the Cluster Manager or another GMS product must report the failure and perform reconfiguration. Only after this has been done is communication with the LMD0 processes running on the other nodes possible.
    Oracle learns of the failure when it accesses lock information held by the failed node, or when the LMON process detects through its heartbeat that the failed node is no longer available.
    Once the IDLM has been reconfigured, instance recovery is performed.
    Instance recovery can suspend work on the whole database for a time in order to avoid contention for resources during recovery. Setting the FREEZE_DB_FOR_FAST_INSTANCE_RECOVERY initialization parameter to TRUE makes the whole database pause temporarily. TRUE is the default when the datafiles use fine-grain locks. If the parameter is set to FALSE, only the data that needs recovery is temporarily frozen. FALSE is the default when the datafiles use hash locks.
    4) IDLM failure
    If only the IDLM process on a node fails, for example because another related process failed or because of a memory fault, LMON on another node detects the problem and starts the lock reconfiguration process.
    While this is in progress, lock-related operations are suspended and some users wait to acquire PCM locks or other resources.
    5) Interconnect failure (GMS failure)
    If the interconnect between nodes fails, each node assumes that the IDLM and GMS on the other node have failed. The GMS checks the state of the system by other means, such as a quorum disk or pinging the node. In this case, one or both of the nodes on the failed connection are shut down.
    Under the Oracle 8 recovery mechanism, when a node or instance has been forced to fail, the IDLM or the instance cannot be started up. In some cases you can write your own cluster validation code to check whether IDLM communication between nodes is available. This lets you diagnose the problem and then trigger a shutdown, something the GMS does not provide.
    Such code needs a routine that keeps updating a single data block covered by a single PCM lock. Running this program on two connected nodes makes it possible to diagnose an interconnect failure.
    If the cluster consists of several nodes, updating for each interconnect a data block covered by a different PCM lock reveals which node's interconnect has the problem.
    7. Parallel Recovery
    The goal of parallel recovery is to use compute and I/O parallelism to reduce the time taken by crash recovery, single-instance recovery and media recovery.
    Parallel recovery is most effective when several datafiles spread over several disks are recovered at the same time.
    It can be parallelized in two ways.
    - Set the RECOVERY_PARALLELISM parameter
    - Specify it as an option of the RECOVER command
    The Oracle server can have one process read the log files sequentially and pass the redo information to several recovery processes, which apply the changes recorded in the log files to the datafiles.
    Because the recovery processes are started automatically by Oracle, there is no need to use more than one session when performing recovery.
    The value of RECOVERY_PARALLELISM cannot exceed the value of the PARALLEL_MAX_SERVERS parameter.
    Reference Document
    Oracle8 OPS manual

    Configuration files of the Oracle Application Server can be backed up with the "Backup and Recovery Tool".
    Please refer to the documentation:
    http://download.oracle.com/docs/cd/B32110_01/core.1013/b32196/part5.htm#i436649
    Also, the "backup to tapes" feature is not yet supported by this tool.
    thanks,
    Murugesh

  • Fail-over Recovery ?

    Hi All,
    Can anyone tell me whether the "Fail-Over Recovery" concept is available in Hyperion Essbase 11.1.1.3?
    If it is, please explain how it can be done.
    Regards

    Rajesh Kumar wrote:
    Hi
    I am working on a database fail-over recovery mechanism. I am working with a WebLogic 6.1 SP1 server installed on a Unix machine. We use a J2EE architecture in our application and entity beans for database transactions.
    My main objective is to allow my application to switch over to a secondary database if the primary database fails.
    I have already developed a prototype which works fine for a client application's request, but I can't use it for entity beans with container-managed persistence.
    So what I want to ask is this:
    Is there a way to switch between databases for container-managed entity beans? If yes, how do I implement it?
    Thank you
    Rajesh
    Easy. Define a multipool that taps a pool to the regular database first and, when that DBMS is down, taps a second pool to the fallback DBMS. Define a TxDataSource for the multipool, and have the beans use that DataSource.
    Joe
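The idea behind the multipool (try the regular database first, fall back to the second one only when the first is down) can be sketched in a few lines of Python. This is not the WebLogic multipool API. The function and the two connection factories are hypothetical and only show the ordering logic.

```python
def connect_with_fallback(factories):
    """Try each (name, factory) in order and return the first connection made."""
    errors = []
    for name, factory in factories:
        try:
            return name, factory()
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all pools failed: " + "; ".join(errors))

# Hypothetical factories standing in for the two connection pools.
def primary_pool():
    raise ConnectionError("primary DBMS is down")

def fallback_pool():
    return object()  # placeholder for a real connection

used, connection = connect_with_fallback(
    [("primary", primary_pool), ("fallback", fallback_pool)]
)
print(used)  # fallback
```

Callers only ever see the combined entry point, which is the same reason the beans are pointed at a single DataSource defined over the multipool.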

  • Application Restart and Recovery APIs doesn't work for windows services

    I am using the Application Restart and Recovery mechanism (provided in the Windows API Code Pack for Microsoft .NET Framework) to collect some information (i.e. stack information when there's an unhandled exception) before my Windows service crashes.
    It works well for windows form applications, but the callback method wouldn't be called if the host is a windows service. 
    I have checked the article: https://msdn.microsoft.com/zh-cn/subscriptions/downloads/cc303708
    But it doesn't specify clearly whether it works for a windows service. It seems that the recovery will only be activated when the user interacts with the error dialog of Windows Error Reporting (clicking "close" on the dialog, for example).
    So I am wondering is my guess right that the Application Restart and Recovery mechanism doesn't work for windows services. Or is there a better way to meet my requirement? 

    I would suggest trying ARR if that's what you want to use.  The restart portion won't work, but it doesn't need to as if you fail out of your service, the Windows service controller will handle recovery (up to and including restarting your service).
     You configure those recovery actions either through code or one of the built in administrative tools for services such as services.msc.  
    DebugDiag/ADplus and similar tools ultimately do use built-in APIs; you don't need to add anything external to collect debugging information.  You do however have to write a good deal of code to do some things.  It's pretty simple to use the unmanaged function that I pointed out before and MiniDumpWriteDump to write a minidump when you hit an unexpected error (the dbghelp.dll that comes installed with Windows has it so you don't need anything additional installed).  You can even write a basic debugger that literally debugs a process using only kernel32 functions (see https://msdn.microsoft.com/en-us/library/windows/desktop/ms679301(v=vs.85).aspx if you're interested).
    WinSDK Support Team Blog: http://blogs.msdn.com/b/winsdk/

  • Old 1760 With password recovery disabled, no way to factory reset

    Hi
    I have an old 1760 router with the password recovery functionality disabled.
    I don't care about its current configuration; I need a factory reset.
    I followed the well-documented procedure:
    Normal boot
    Self decompressing the image : #################################################
    ################################################################ [OK]
    Smart Init is disabled. IOMEM set to: 15
    PMem allocated: 57042944 bytes; IOMem allocated: 10065920 bytes
                  Restricted Rights Legend
    Use, duplication, or disclosure by the Government is
    subject to restrictions as set forth in subparagraph
    (c) of the Commercial Computer Software - Restricted
    Rights clause at FAR sec. 52.227-19 and subparagraph
    (c) (1) (ii) of the Rights in Technical Data and Computer
    Software clause at DFARS sec. 252.227-7013.
               cisco Systems, Inc.
               170 West Tasman Drive
               San Jose, California 95134-1706
    Cisco Internetwork Operating System Software
    IOS (tm) C1700 Software (C1700-SV8Y7-M), Version 12.3(6d), RELEASE SOFTWARE (fc1
    Copyright (c) 1986-2004 by cisco Systems, Inc.
    Compiled Fri 15-Oct-04 03:46 by kellythw
    Image text-base: 0x80008120, data-base: 0x81440804
    Send break at this time , then :
    Do you want to reset the router to factory default
    configuration and proceed [y/n] ? y
    Reset router configuration to factory default.
    cisco 1760 (MPC860P) processor (revision 0x500) with 55706K/9830K bytes of memory.
    Processor board ID FOC07450X9P (3881152211), with hardware revision 0000
    MPC860P processor: part number 5, mask 2
    Bridging software.
    X.25 software, Version 3.0.0.
    1 FastEthernet/IEEE 802.3 interface(s)
    32K bytes of non-volatile configuration memory.
    32768K bytes of processor board System flash (Read/Write)
    WARNING:
    Executing this command will disable password recovery mechanism.
    Do not execute this command without another plan for
    password recovery.
    Are you sure you want to continue? [yes/no]: y
    The router boots up normally anyway, still with the original password in place instead of a fresh factory default configuration.
    Any hints, please?
    Thank you

    Federico,
    There is something quite strange going on but one thing has caught my attention in particular. This is a part of your transcript:
    Send break at this time , then :
    Do you want to reset the router to factory default
    configuration and proceed [y/n] ? y
    Reset router configuration to factory default.
    cisco 1760 (MPC860P) processor (revision 0x500) with 55706K/9830K bytes of memory.
    Processor board ID FOC07450X9P (3881152211), with hardware revision 0000
    MPC860P processor: part number 5, mask 2
    Bridging software.
    X.25 software, Version 3.0.0.
    1 FastEthernet/IEEE 802.3 interface(s)
    32K bytes of non-volatile configuration memory.
    32768K bytes of processor board System flash (Read/Write)
    WARNING:
    Executing this command will disable password recovery mechanism.
    Do not execute this command without another plan for
    password recovery.
    Are you sure you want to continue? [yes/no]: y
    Notice that the first question is whether you want to erase the configuration - you respond with yes, and the router continues booting. The second question displayed clearly shows that the router continues loading the configuration file and in particular processes the no service password-recovery command.
    What would happen if you answered n to this second question, preventing the router from accepting the stored no service password-recovery command? Could you reload the router afterwards and try the password recovery procedure again?
    Also, if this router has a removable Flash card, would you be able to enter the ROMMON and set the configuration register to 0x2142 if you removed the card and tried booting the router?
    Best regards,
    Peter
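
As a side note on the 0x2142 value mentioned above, here is a minimal sketch of how the configuration register is usually read bit by bit. The bit meanings are the commonly documented ones for Cisco IOS configuration registers and should be treated as an assumption to verify against the documentation for the specific platform; the comparison value 0x2102 (the usual default) and the helper name `describe_confreg` are not from the thread.

```python
# Commonly documented configuration register bits (assumption: verify
# against the documentation for the specific router platform).
BOOT_FIELD_MASK = 0x000F      # bits 0-3: boot field
IGNORE_NVRAM_CONFIG = 0x0040  # bit 6: skip the startup configuration in NVRAM
BREAK_DISABLED = 0x0100       # bit 8: Break key ignored once the system is up


def describe_confreg(value: int) -> dict:
    """Decode a few well-known bits of a configuration register value."""
    return {
        "boot_field": value & BOOT_FIELD_MASK,
        "ignore_startup_config": bool(value & IGNORE_NVRAM_CONFIG),
        "break_disabled": bool(value & BREAK_DISABLED),
    }


if __name__ == "__main__":
    for reg in (0x2102, 0x2142):
        print(hex(reg), describe_confreg(reg))
    # The two values differ only in bit 6.
    print(hex(0x2142 ^ 0x2102))
```

Under these assumptions, booting with 0x2142 makes the router skip the stored configuration, which is why it is the value normally used for password recovery.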
