Split brain handling in Oracle 10g

How does Oracle 10g deal with split brain? Does it use the voting disk or files, or does it use any of the SCSI protocols?

It uses the CRS voting disk to determine which nodes should and should not continue.

Similar Messages

  • Language Handling in Oracle 10g Database

    I want to know how language handling works in Oracle 10g at a low level. What actually happens when we insert a record into the database, and how does the database handle it,
    from insertion to storage to displaying a stored record? What exactly happens?
    How does Oracle 10g handle it?

    That would be a big, long session.
    You can start from the following link:
    http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/toc.htm

  • Split brain syndrome in RAC

    As per split brain syndrome in Oracle RAC, in case of interconnect failures the master node will evict the other/dead nodes.
    Say in a 2-node RAC configuration node 1 is defined as the master node (by some parameter such as load); in case of a network failure, node 1 will evict node 2 from the cluster.
    What happens if the master node, in this case node 1, fails? What will evict node 1, and will node 2 then become the master node?

    Hi,
    It occurs when the instance members in a RAC fail to ping/connect to each other via the private interconnect, but the servers are all physically up and running and the database instance on each of these servers is also running. These individual nodes are running fine and can conceptually accept user connections and work independently. So basically, due to the lack of communication, each instance thinks that the other instance it cannot reach is down, and it needs to do something about the situation. The problem is that if we leave these instances running, the same block might get read and updated in these individual instances, and there would be a data-integrity issue: blocks changed in one instance would not be locked and could be overwritten by another instance. Oracle has implemented an efficient check for the split brain syndrome.
    In RAC, if any node becomes inactive, or if other nodes are unable to ping/connect to a node in the RAC, then the node which first detects that one of the nodes is not accessible will evict that node from the RAC group. For example, if there are 4 nodes in a RAC cluster and node 3 becomes unavailable, and node 1 tries to connect to node 3 and finds it not responding, then node 1 will evict node 3 out of the RAC group and leave only node 1, node 2 and node 4 to continue functioning.
    The split brain concept can become more complicated in large RAC setups. For example, say there are 10 RAC nodes in a cluster and 4 nodes are not able to communicate with the other 6, so two groups are formed in this 10-node cluster (one group of 4 nodes and one of 6). The nodes will quickly try to affirm their membership by locking the controlfile; the node that locks the controlfile will then check the votes of the other nodes. The group with the largest number of active nodes gets preference and the others are evicted. That said, I have only seen this node-eviction issue with one node getting evicted while the rest kept functioning, so I cannot testify from experience that this is exactly how it works, but this is the theory behind it.
    When a node is evicted, Oracle RAC will usually reboot that node and try to do a cluster reconfiguration to include the evicted node back.
    You will see Oracle error ORA-29740 when there is a node eviction in RAC. There are many reasons for a node eviction, such as a heartbeat not being received by the controlfile, or being unable to communicate with the clusterware.
    You can also go through Metalink Note ID 219361.1.

  • Sun Fire V490 x 2 servers with Oracle RAC facing Split brain problem

    Hi all,
    I have two Sun Fire V490 servers with Oracle RAC and they faced a split brain problem. One node's database instance has gone down. The DBA claims it is due to a network problem, but as far as we can tell the networks are OK. We use the on-board CE1 interface for the cluster interconnect and CE0 as the public interface.
    Did anybody face this kind of problem? Could this be a hardware/OS patch problem?
    I kept a continuous ping running for 24 hours after this happened last time and the output shows no packet loss.
    Many thanks in advance.
    Ushas Symon

    In order to diagnose this properly, you'll need to provide too much detail and far too many log files for a generic discussion forum to handle.
    Use your service contract and open a support case.
    Because a cluster environment is involved you'll likely end up talking to the cluster support staff.
    They can analyze hardware and software errors as well as review whether you configured the systems in a supportable fashion.
    Be prepared to make a direct connection to each system and gather data, for example by using the Explorer tool. The technical support staff will tell you what they actually need.

  • Oracle 10G JMS error : Failed to handle dispatch message ... exception ORABPEL-05002

    Hi All,
    We are using a JMS adapter as a partner link in a BPEL process to produce messages to a JMS queue.
    We are getting an error while producing a message to the JMS queue.
    The error occurs for 1 or 2 instances out of 5 to 10 instances (all of these instances ran within 1 or 2 seconds).
    We are using the Oracle 10G SOA Suite.
    This error occurs in the production environment.
    The error is as follows (taken from the server log).
    Failed to handle dispatch message ... exception ORABPEL-05002
    Message handle error.
    An exception occurred while attempting to process the message "com.collaxa.cube.engine.dispatch.message.invoke.InvokeInstanceMessage"; the exception is: faultName: {{http://schemas.oracle.com/bpel/extension}bindingFault}
    messageType: {{http://schemas.oracle.com/bpel/extension}RuntimeFaultMessage}
    parts: {{summary=file:/sw/appaia/product/SOA/bpel/domains/default/tmp/.bpel_CurrencyExchangeListEbizJMSProducer_1.0_4ec528ec93a8a6ff0278fab9701dcc71.tmp/CurrencyExchangeListEbizJMSProducerV1.wsdl [ Produce_Message_ptt::Produce_Message(OutputParameters) ] - WSIF JCA Execute of operation 'Produce_Message' failed due to: Adapter Framework unable to create outbound JCA connection.
    file:/sw/appaia/product/SOA/bpel/domains/default/tmp/.bpel_CurrencyExchangeListEbizJMSProducer_1.0_4ec528ec93a8a6ff0278fab9701dcc71.tmp/CurrencyExchangeListEbizJMSProducerV1.wsdl [ Produce_Message_ptt::Produce_Message(OutputParameters) ] - : The Adapter Framework was unable to establish an outbound JCA connection due to the following issue: oracle.j2ee.connector.proxy.ProxyInterceptException: javax.resource.ResourceException: RollbackException: Transaction has been marked for rollback: null [Caused by: RollbackException: Transaction has been marked for rollback: null]
    ; nested exception is:
    ORABPEL-12511
    Adapter Framework unable to create outbound JCA connection.
    I tried the JMS tunings below in different combinations, but there was no improvement.
                <property name="useJCAConnectionPool">true</property>
                <property name="maxSizeJCAConnectionPool">500</property>
                <property name="retryMaxCount">10</property>
                <property name="retryInterval">60</property>
    Please let me know if any other logs or any other information is required.
    Thanks,
    Pershad


  • Oracle 10G New Feature........Part 1

    Dear all,
    For the last couple of days I have been very busy with my Oracle 10g box, so I think this is the right time to
    share some interesting 10g features and some internal stuff with all of you.
    Have a look:
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Oracle 10g Memory and Storage Feature.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1.Automatic Memory Management.
    2.Online Segment Shrink
    3.Redolog Advisor, checkpointing
    4.Multiple Temporary tablespace.
    5.Automatic Workload Repository
    6.Active Session History
    7.Misc
    a)Rename Tablespace
    b)Bigfile tablespace
    c)flushing buffer cache
    8.ORACLE INTERNAL
    a)undocumented parameter (_log_blocks_during_backup)
    b)X$ view (x$messages view)
    c)Internal Structure of Controlfile
    1.Automatic memory management
    ================================
    This feature reduces the overhead on the Oracle DBA. Previously we often had to set the different Oracle SGA parameters
    for better performance with the help of our own experience, the advisory views, and by monitoring the behaviour
    of the database.
    That was just a time-consuming activity.
    Now this feature makes life easier for the Oracle DBA.
    Just set the SGA_TARGET parameter and Oracle automatically allocates memory to the different SGA components.
    It covers
    DB_CACHE_SIZE
    SHARED_POOL_SIZE
    LARGE_POOL_SIZE
    JAVA_POOL_SIZE
    and sets them automatically as
    __db_cache_size
    __shared_pool_size
    __large_pool_size
    __java_pool_size
    You can check this in the alert log.
    MMAN (memory manager) is a new background process in 10g and is responsible for the SGA tuning task.
    It automatically increases and decreases the SGA component values as required.
    Benefit: maximum utilization of the available SGA memory.
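    A minimal sketch of how this looks in practice (the 1G target is just an assumed value, and SCOPE=BOTH assumes an spfile is in use):
    -- enable Automatic Shared Memory Management
    ALTER SYSTEM SET sga_target = 1G SCOPE=BOTH;
    -- watch how the auto-tuned components are sized
    SELECT component, current_size
    FROM v$sga_dynamic_components;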
    2.Online Segment Shrink.
    ==========================
    Again a new feature from Oracle to reduce downtime. Oracle now focuses heavily on availability,
    which is why it keeps trying to reduce downtime by introducing new features.
    In previous versions, lowering the high water mark (HWM) of a table was possible by
    exp/imp
    or
    ALTER TABLE ... MOVE, but with these methods the table was not available for normal use for a long time if it held a lot of data.
    In 10g, with just a few commands, we can lower the HWM of a table.
    This feature is available for ASSM tablespaces.
    1. alter table emp enable row movement;
    2. alter table emp shrink space;
    The second command has two phases:
    in the first phase the segment is compacted, and DML operations are still allowed;
    in the second (shrink) phase Oracle lowers the HWM of the table, and DML operations are blocked for a short duration.
    So if we want to lower the HWM of a table we should use two separate commands (see the consolidated example below):
    first compact the segment, then shrink it during non-peak hours.
    alter table emp shrink space compact; (this command does not block DML operations)
    alter table emp shrink space; (this command should be run during non-peak hours)
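    For reference, the whole sequence as it would be run in SQL*Plus (emp is just the example table used above):
    ALTER TABLE emp ENABLE ROW MOVEMENT;
    ALTER TABLE emp SHRINK SPACE COMPACT;  -- phase 1: compact only, DML still allowed
    ALTER TABLE emp SHRINK SPACE;          -- phase 2: lower the HWM, brief DML block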
    Benefit: better full table scans.
    3.Redolog Advisor and checkpointing
    ================================================================
    Oracle will now suggest the size of the redo log files via V$INSTANCE_RECOVERY:
    SELECT OPTIMAL_LOGFILE_SIZE
    FROM V$INSTANCE_RECOVERY;
    This value is influenced by the value of FAST_START_MTTR_TARGET.
    Checkpointing
    Automatic checkpoint tuning is enabled after setting FAST_START_MTTR_TARGET to a non-zero value.
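    A minimal sketch (the 300-second target is just an assumed value, and SCOPE=BOTH assumes an spfile is in use):
    ALTER SYSTEM SET fast_start_mttr_target = 300 SCOPE=BOTH;
    SELECT optimal_logfile_size FROM v$instance_recovery;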
    4.Multiple Temporary tablespace.
    ==================================
    Now we can manage multiple temporary tablespaces under one group.
    We can create a tablespace group implicitly by including the TABLESPACE GROUP clause in a CREATE TEMPORARY TABLESPACE or ALTER TABLESPACE statement when the specified tablespace group does not yet exist.
    For example, if group1 does not exist, the following statement creates the group along with a new tablespace:
    CREATE TEMPORARY TABLESPACE temp1 TEMPFILE '/u02/oracle/data/temp01.dbf'
    SIZE 50M
    TABLESPACE GROUP group1;
    --Add an existing temporary tablespace to the group:
    ALTER TABLESPACE temp2 TABLESPACE GROUP group1;
    --We can also assign the tablespace group as the database default temporary tablespace:
    ALTER DATABASE <db name> DEFAULT TEMPORARY TABLESPACE group1;
    Benefit: better I/O.
    One SQL statement can use more than one temporary tablespace.
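    To check group membership afterwards, assuming access to the standard dictionary view:
    SELECT group_name, tablespace_name
    FROM dba_tablespace_groups;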
    5.AWR(Automatic Workload Repository):-
    ==================================
    AWR is a built-in repository and the central point of Oracle 10g; Oracle's self-managing activities depend fully on it.
    By default, every hour Oracle captures database usage information and stores it in the AWR with the help of the
    MMON (Manageability Monitor) process. This information is kept for 7 days by default and is then automatically purged.
    We can generate an AWR report with
    SQL> @?/rdbms/admin/awrrpt
    It is just like a Statspack report, but an advanced and different version of Statspack; it provides more information about the database as well as the OS.
    It can produce the report in HTML and text format.
    We can also take a manual AWR snapshot with
    BEGIN
    DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
    END;
    **The STATISTICS_LEVEL initialization parameter must be set to TYPICAL or ALL to enable the Automatic Workload Repository.
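    If the defaults do not suit, the snapshot interval and retention can be changed; the 30-minute interval and 14-day retention below are just assumed example values:
    BEGIN
    DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS (
    interval  => 30,        -- minutes between snapshots
    retention => 14*24*60); -- retention, in minutes
    END;
    /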
    [oracle@RMSORA1 oracle]$ sqlplus / as sysdba
    SQL*Plus: Release 10.1.0.2.0 - Production on Fri Mar 17 10:37:22 2006
    Copyright (c) 1982, 2004, Oracle. All rights reserved.
    Connected to:
    Oracle Database 10g Enterprise Edition Release 10.1.0.2.0 - Production
    With the Partitioning, OLAP and Data Mining options
    SQL> @?/rdbms/admin/awrrpt
    Current Instance
    ~~~~~~~~~~~~~~~~
    DB Id DB Name Inst Num Instance
    4174002554 RMSORA 1 rmsora
    Specify the Report Type
    ~~~~~~~~~~~~~~~~~~~~~~~
    Would you like an HTML report, or a plain text report?
    Enter 'html' for an HTML report, or 'text' for plain text
    Defaults to 'html'
    Enter value for report_type: text
    Type Specified: text
    Instances in this Workload Repository schema
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    DB Id Inst Num DB Name Instance Host
    * 4174002554 1 RMSORA rmsora RMSORA1
    Using 4174002554 for database Id
    Using 1 for instance number
    Specify the number of days of snapshots to choose from
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Entering the number of days (n) will result in the most recent
    (n) days of snapshots being listed. Pressing <return> without
    specifying a number lists all completed snapshots.
    Listing the last 3 days of Completed Snapshots
    Instance DB Name Snap Id Snap Started Snap Level
    rmsora RMSORA 16186 16 Mar 2006 17:33 1
    16187 16 Mar 2006 18:00 1
    16206 17 Mar 2006 03:30 1
    16207 17 Mar 2006 04:00 1
    16208 17 Mar 2006 04:30 1
    16209 17 Mar 2006 05:00 1
    16210 17 Mar 2006 05:31 1
    16211 17 Mar 2006 06:00 1
    16212 17 Mar 2006 06:30 1
    16213 17 Mar 2006 07:00 1
    16214 17 Mar 2006 07:30 1
    16215 17 Mar 2006 08:01 1
    16216 17 Mar 2006 08:30 1
    16217 17 Mar 2006 09:00 1
    Specify the Begin and End Snapshot Ids
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Enter value for begin_snap: 16216
    Begin Snapshot Id specified: 16216
    Enter value for end_snap: 16217
    End Snapshot Id specified: 16217
    Specify the Report Name
    ~~~~~~~~~~~~~~~~~~~~~~~
    The default report file name is awrrpt_1_16216_16217.txt. To use this name,
    press <return> to continue, otherwise enter an alternative.
    Benefit: now the DBA has more free time to play games.....................:-)
    An advanced version of Statspack.
    More DB and OS information, with self-managing capability.
    New automatic alerts and database advisors built on top of AWR.
    6.Active Session History:-
    ==========================
    V$ACTIVE_SESSION_HISTORY is a view that contains recent session history.
    The memory for ASH comes from the SGA and cannot exceed 5% of the shared pool.
    So we can get the latest active-session data from the V$ACTIVE_SESSION_HISTORY view, and historical session data
    from DBA_HIST_ACTIVE_SESS_HISTORY.
    V$ACTIVE_SESSION_HISTORY includes some important columns such as the following (a sample query is shown after the list):
    ~SQL identifier of SQL statement
    ~Object number, file number, and block number
    ~Wait event identifier and parameters
    ~Session identifier and session serial number
    ~Module and action name
    ~Client identifier of the session
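    A minimal sketch of the kind of query this enables, here the top wait events sampled over the last 30 minutes:
    SELECT event, COUNT(*) AS samples
    FROM v$active_session_history
    WHERE sample_time > SYSDATE - 30/1440
    AND event IS NOT NULL
    GROUP BY event
    ORDER BY samples DESC;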
    7.Misc:-
    ========
    Rename Tablespace:-
    =================
    In 10g we can even rename a tablespace:
    alter tablespace <tb_name> rename to <tb_name_new>;
    This command updates the controlfile, the data dictionary and the datafile headers, but the datafile names themselves stay the same.
    **We cannot rename the SYSTEM and SYSAUX tablespaces.
    Bigfile tablespace:-
    ====================
    A bigfile tablespace contains only one datafile.
    A bigfile tablespace with 8K blocks can contain a 32-terabyte datafile.
    Bigfile tablespaces are supported only for locally managed tablespaces with automatic segment space management.
    We can take advantage of bigfile tablespaces when we are using ASM or another logical volume manager with RAID;
    without ASM or RAID they give poor response times.
    Syntax:
    CREATE BIGFILE TABLESPACE bigtbs
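    A complete statement would look something like this (the datafile path and size are just assumed for illustration):
    CREATE BIGFILE TABLESPACE bigtbs
    DATAFILE '/u02/oracle/data/bigtbs01.dbf' SIZE 50G;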
    Flushing Buffer Cache:-
    ======================
    This option is similar to flushing the shared pool, but it is only available from 10g onwards.
    I am not sure what the use of this command is on a production database,
    but we can try it on a test server when tuning and testing queries.
    SQL> alter system flush buffer_cache;
    System altered.
    ++++++++++++++++++
    8.Oracle Internal
    ++++++++++++++++++
    Here is some stuff that is not strictly 10g-related but contains some interesting things.
    a)undocumented parameter "_log_blocks_during_backup"
    ++++++++++++++++++++++++
    As we know, Oracle generates more redo during hot backup mode because it has to write a complete copy of each changed block
    into the redo log to protect against split blocks.
    We can change this behaviour by setting this parameter to FALSE if the Oracle block size equals the operating system block size,
    thus reducing the amount of redo generated during a hot backup.
    WITHOUT ORACLE SUPPORT, DON'T SET IT ON A PRODUCTION DATABASE. THIS DOCUMENT IS JUST FOR INFORMATIONAL PURPOSES.
    b)some X$ views (X$messages)
    ++++++++++++++++
    If you are interested in Oracle's internal architecture, then the X$ views are the right place to find some interesting things.
    X$MESSAGES: it shows all the actions that the background processes perform.
    select * from x$messages;
    like:-
    lock memory at startup MMAN
    Memory Management MMAN
    Handle sga_target resize MMAN
    Reset advisory pool when advisory turned ON MMAN
    Complete deferred initialization of components MMAN
    lock memory timeout action MMAN
    tune undo retention MMNL
    MMNL Periodic MQL Selector MMNL
    ASH Sampler (KEWA) MMNL
    MMON SWRF Raw Metrics Capture MMNL
    reload failed KSPD callbacks MMON
    SGA memory tuning MMON
    background recovery area alert action MMON
    Flashback Marker MMON
    tablespace alert monitor MMON
    Open/close flashback thread RVWR
    RVWR IO's RVWR
    kfcl instance recovery SMON
    c)Internal Structure of Controlfile
    ++++++++++++++++++++++++++++++++++++
    The contents of the current controlfile can be dumped in text form.
    Dump Level Dump Contains
    1 only the file header
    2 just the file header, the database info record, and checkpoint progress records
    3 all record types, but just the earliest and latest records for circular reuse record types
    4 as above, but includes the 4 most recent records for circular reuse record types
    5+ as above, but the number of circular reuse records included doubles with each level
    the session must be connected AS SYSDBA
    alter session set events 'immediate trace name controlf level 5';
    This dump shows lots of interesting information.
    It also shows RMAN records if this controlfile has been used for RMAN backups.
    Thanks
    Kuljeet Pal Singh

    You can find each doc in HTML and PDF format in the Documentation Library.
    You can also download all the documentation in HTML format to have it all on your own computer here (445.8MB).

    Nicolas.

  • Election problem after repeated split-brains with two nodes

    Hi
    I'm using a customized source based on BDB 5.1.19 (excxx_repquote)
    with two sites, one MASTER and the other SLAVE...
    nsite=2
    ack=quorum
    - the master is writing to quotedb at a rate of 10 txn per sec
    - the test consists of isolating the client from the master (split brain) and reconnecting it after a random time between 1 sec and 10 sec
    the test runs fine about 10 times, but at some point the slave process receives DB_EVENT_REP_ELECTION_FAILED
    and the master enters election mode and never exits from CLIENT mode. I must say that to freeze the client I decided to kill it (kill -9 my pid) when I receive such an event...
    here is the verbose log on the master...
    [1307872770:871621][6510/47655809107168] MASTER: rep_send_function returned: 110
    [1307872770:973655][6510/47655809107168] MASTER: bulk_msg: Send buffer after copy due to PERM
    [1307872770:973667][6510/47655809107168] MASTER: send_bulk: Send 266 (0x10a) bulk buffer bytes
    [1307872770:973672][6510/47655809107168] MASTER: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 68 eid -1, type bulk_log, LSN [21][986648] perm
    [1307872770:973693][6510/47655809107168] MASTER: will await acknowledgement: need 1
    [1307872771:26623][6510/47655809107168] MASTER: rep_send_function returned: 110
    [1307872771:126380][6510/1162996032] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type log, LSN [21][946345]
    [1307872771:126407][6510/1162996032] MASTER: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 68 eid -1, type dupmaster, LSN [0][0] nobuf
    [1307872771:126695][6510/1162996032] MASTER: rep_start: Found old version log 17
    [1307872771:126753][6510/1162996032] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 68 eid -1, type newclient, LSN [0][0] nobuf
    [1307872771:126833][6510/1183975744] CLIENT: starting election thread
    [1307872771:126876][6510/1183975744] CLIENT: Start election nsites 2, ack 1, priority 100
    [1307872771:126890][6510/1183975744] CLIENT: Election thread owns egen 69
    [1307872771:127423][6510/1173485888] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type newclient, LSN [0][0]
    [1307872771:130079][6510/1183975744] CLIENT: Tallying VOTE1[0] (2147483647, 69)
    [1307872771:130113][6510/1183975744] CLIENT: Beginning an election
    [1307872771:130134][6510/1183975744] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 68 eid -1, type vote1, LSN [21][986728] nobuf
    [1307872771:130147][6510/1173485888] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 68 eid -1, type master_req, LSN [0][0] nobuf
    [1307872771:130438][6510/1152506176] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type vote1, LSN [21][946437]
    [1307872771:130460][6510/1162996032] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type alive, LSN [21][986728]
    [1307872771:130467][6510/1152506176] CLIENT: Updating gen from 68 to 70
    [1307872771:130482][6510/1162996032] CLIENT: Received ALIVE egen of 71, mine 69
    [1307872771:130503][6510/1162996032] CLIENT: Election finished in 0.003602000 sec
    [1307872771:130515][6510/1162996032] CLIENT: Election done; egen 70
    [1307872771:130534][6510/1152506176] CLIENT: Received vote1 egen 71, egen 71
    [1307872771:130581][6510/1152506176] CLIENT: Tallying VOTE1[0] (0, 71)
    [1307872771:130593][6510/1089075520] CLIENT: starting election thread
    [1307872771:130619][6510/1152506176] CLIENT: Incoming vote: (eid)0 (pri)100 ELECTABLE (gen)70 (egen)71 [21,946437]
    [1307872771:130642][6510/1152506176] CLIENT: Not in election, but received vote1 0x282c 0x8
    [1307872771:130674][6510/1089075520] CLIENT: Start election nsites 2, ack 1, priority 100
    [1307872771:130692][6510/1089075520] CLIENT: Election thread owns egen 71
    [1307872771:130704][6510/1194465600] CLIENT: starting election thread
    [1307872771:130733][6510/1194465600] CLIENT: Start election nsites 2, ack 1, priority 100
    [1307872771:132922][6510/1089075520] CLIENT: Tallying VOTE1[1] (2147483647, 71)
    [1307872771:132949][6510/1089075520] CLIENT: Accepting new vote
    [1307872771:132958][6510/1089075520] CLIENT: Beginning an election
    [1307872771:132973][6510/1089075520] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid -1, type vote1, LSN [21][986728] nobuf
    [1307872771:132985][6510/1194465600] CLIENT: election thread is exiting
    [1307872771:133012][6510/1089075520] CLIENT: Tallying VOTE2[0] (2147483647, 71)
    [1307872771:133037][6510/1089075520] CLIENT: Counted my vote 1
    [1307872771:133048][6510/1089075520] CLIENT: Skipping phase2 wait: already got 1 votes
    [1307872771:133060][6510/1089075520] CLIENT: Got enough votes to win; election done; (prev) gen 70
    [1307872771:133071][6510/1089075520] CLIENT: Election finished in 0.002367000 sec
    [1307872771:133084][6510/1089075520] CLIENT: Election done; egen 72
    [1307872771:133111][6510/1089075520] CLIENT: Ended election with 0, e_th 1, egen 72, flag 0x2a2c, e_fl 0x0, lo_fl 0x6
    [1307872771:133170][6510/1173485888] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type alive, LSN [0][0]
    [1307872771:133187][6510/1173485888] CLIENT: Racing replication msg lockout, ignore message.
    [1307872771:173744][6510/1162996032] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type vote2, LSN [0][0]
    [1307872771:173769][6510/1162996032] CLIENT: Racing replication msg lockout, ignore message.
    [1307872771:231593][6510/1183975744] CLIENT: Ended election with 0, e_th 0, egen 72, flag 0x2a2c, e_fl 0x0, lo_fl 0x1c
    [1307872771:231629][6510/1183975744] CLIENT: election thread is exiting
    [1307872777:443794][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307872971:644194][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307873165:844583][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307873360:44955][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307873554:245347][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307873748:445736][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307873942:646117][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    [1307874136:846509][6510/1131526464] CLIENT: init connection to site 2.0.0.210:12345 with result 115
    .... and infinite stay to this situation
    My question is: why is the master suddenly turned into a CLIENT, and why does it never return to being the MASTER?
    Thanks in advance...
    here is the log for the client
    [1307872315:455113][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type log, LSN [21][984396]
    [1307872315:455134][1282/1160603968] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type log, LSN [21][984483] perm
    [1307872315:609962][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][984733] perm
    [1307872315:764958][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][984986] perm
    [1307872315:919962][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][985238] perm
    [1307872316:75018][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][985491] perm
    [1307872316:229959][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][985741] perm
    [1307872316:384949][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][985993] perm
    [1307872316:499899][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][986141] perm
    [1307872316:539895][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type log, LSN [21][986221]
    [1307872316:540078][1282/1171093824] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type log, LSN [21][986307]
    [1307872316:540100][1282/1160603968] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type log, LSN [21][986394] perm
    [1307872316:694950][1282/1171093824] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type bulk_log, LSN [21][986648] perm
    [1307872316:847349][1282/1129134400] MASTER: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid -1, type log, LSN [21][946345]
    [1307872316:847698][1282/1171093824] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type dupmaster, LSN [0][0]
    [1307872316:847999][1282/1181583680] MASTER: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type newclient, LSN [0][0]
    [1307872316:848168][1282/1171093824] MASTER: rep_start: Found old version log 17
    [1307872316:848222][1282/1181583680] CLIENT: Racing replication msg lockout, ignore message.
    [1307872316:848398][1282/1171093824] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid -1, type newclient, LSN [0][0] nobuf
    [1307872316:848504][1282/1192073536] CLIENT: starting election thread
    [1307872316:848542][1282/1192073536] CLIENT: Start election nsites 2, ack 1, priority 100
    [1307872316:848566][1282/1192073536] CLIENT: Election thread owns egen 71
    [1307872316:849634][1282/1192073536] CLIENT: Tallying VOTE1[0] (2147483647, 71)
    [1307872316:849654][1282/1192073536] CLIENT: Beginning an election
    [1307872316:849680][1282/1192073536] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid -1, type vote1, LSN [21][946437] nobuf
    [1307872316:851403][1282/1160603968] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type vote1, LSN [21][986728]
    [1307872316:851448][1282/1160603968] CLIENT: Received vote1 egen 69, egen 71
    [1307872316:851470][1282/1160603968] CLIENT: Received old vote 69, egen 71, ignoring vote1
    [1307872316:851481][1282/1160603968] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid 0, type alive, LSN [21][986728] nobuf
    [1307872316:851538][1282/1171093824] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 68 eid 0, type master_req, LSN [0][0]
    [1307872316:851558][1282/1171093824] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid 0, type alive, LSN [0][0] nobuf
    [1307872316:854254][1282/1160603968] CLIENT: /opt/bdb/ rep_process_message: msgv = 5 logv 17 gen = 70 eid 0, type vote1, LSN [21][986728]
    [1307872316:854275][1282/1160603968] CLIENT: Received vote1 egen 71, egen 71
    [1307872316:854317][1282/1160603968] CLIENT: Tallying VOTE1[1] (0, 71)
    [1307872316:854339][1282/1160603968] CLIENT: Incoming vote: (eid)0 (pri)100 ELECTABLE (gen)70 (egen)71 [21,986728]
    [1307872316:854353][1282/1160603968] CLIENT: Existing vote: (eid)2147483647 (pri)100 (gen)70 (sites)2 [21,946437]
    [1307872316:854369][1282/1160603968] CLIENT: Accepting new vote
    [1307872316:854379][1282/1160603968] CLIENT: Phase1 election done
    [1307872316:854395][1282/1160603968] CLIENT: Voting for 0
    [1307872316:854407][1282/1160603968] CLIENT: /opt/bdb/ rep_send_message: msgv = 5 logv 17 gen = 70 eid 0, type vote2, LSN [0][0] nobuf
    [1307872317:960344][1282/1192073536] CLIENT: After phase 2: votes 0, nvotes 1, nsites 2
    [1307872317:960389][1282/1192073536] CLIENT: Election finished in 1.111809000 sec
    [1307872317:960401][1282/1192073536] CLIENT: Election done; egen 72
    [1307872317:960412][1282/1192073536] CLIENT: Ended election with -30974, e_th 0, egen 72, flag 0x282c, e_fl 0x0, lo_fl 0x0
    Kill me !!
    --- my source
    on the master I run manually :
    txn_rate 1
    loop_rate 10
    loop 1 20000
    /*
     * See the file LICENSE for redistribution information.
    * Copyright (c) 2001, 2010 Oracle and/or its affiliates. All rights reserved.
    * $Id$
    * In this application, we specify all communication via the command line. In
    * a real application, we would expect that information about the other sites
    * in the system would be maintained in some sort of configuration file. The
    * critical part of this interface is that we assume at startup that we can
    * find out
    *      1) what our Berkeley DB home environment is,
    *      2) what host/port we wish to listen on for connections; and
    *      3) an optional list of other sites we should attempt to connect to.
    * These pieces of information are expressed by the following flags.
    * -h home (required; h stands for home directory)
    * -l host:port (required; l stands for local)
    * -C or -M (optional; start up as client or master)
    * -r host:port (optional; r stands for remote; any number of these may be
    *     specified)
    * -R host:port (optional; R stands for remote peer; only one of these may
    * be specified)
    * -a all|quorum (optional; a stands for ack policy)
    * -b (optional; b stands for bulk)
    * -n nsites (optional; number of sites in replication group; defaults to 0
    *     to try to dynamically compute nsites)
    * -p priority (optional; defaults to 100)
     * -v (optional; v stands for verbose)
     */
    #include <cstdlib>
    #include <cstring>
    #include <iostream>
    #include <string>
    #include <sstream>
    #include <sys/types.h>
    #include <signal.h>
    #include <db_cxx.h>
    #include "RepConfigInfo.h"
    #include "dbc_auto.h"
    using std::cout;
    using std::cin;
    using std::cerr;
    using std::endl;
    using std::ends;
    using std::flush;
    using std::istream;
    using std::istringstream;
    using std::ostringstream;
    using std::string;
    using std::getline;
    #include <stdio.h>
    #include <readline/readline.h>
    #include <readline/history.h>
    #define     CACHESIZE     (10 * 1024 * 1024)
    #define     DATABASE     "quote.db"
    #define     DATABASE2     "quote2.db"
    const char *progname = "excxx_repquote";
    #include <errno.h>
    #ifdef _WIN32
    #define WIN32_LEAN_AND_MEAN
    #include <windows.h>
    #define     snprintf          _snprintf
    #define     sleep(s)          Sleep(1000 * (s))
    extern "C" {
    extern int getopt(int, char * const *, const char *);
    extern char *optarg;
    }
    typedef HANDLE thread_t;
    typedef DWORD thread_exit_status_t;
    #define     thread_create(thrp, attr, func, arg)                    \
    (((*(thrp) = CreateThread(NULL, 0,                         \
         (LPTHREAD_START_ROUTINE)(func), (arg), 0, NULL)) == NULL) ? -1 : 0)
    #define     thread_join(thr, statusp)                         \
    ((WaitForSingleObject((thr), INFINITE) == WAIT_OBJECT_0) &&          \
    GetExitCodeThread((thr), (LPDWORD)(statusp)) ? 0 : -1)
    #else /* !_WIN32 */
    #include <pthread.h>
    typedef pthread_t thread_t;
    typedef void* thread_exit_status_t;
    #define     thread_create(thrp, attr, func, arg)                    \
    pthread_create((thrp), (attr), (func), (arg))
    #define     thread_join(thr, statusp) pthread_join((thr), (statusp))
    #endif
    // Struct used to store information in Db app_private field.
    typedef struct {
         bool app_finished;
         bool in_client_sync;
         bool is_master;
         bool no_dummy_wr;
    } APP_DATA;
    static void log(const char *);
    void checkpoint_thread (void );
    void log_archive_thread (void );
    void dummy_write_thread (void );
    class RepQuoteExample {
    public:
         RepQuoteExample();
         void init(RepConfigInfo* config);
         void doloop();
         int terminate();
         static void event_callback(DbEnv* dbenv, u_int32_t which, void *info);
         void print_stocks_size(Db *dbp);
    private:
         // disable copy constructor.
         RepQuoteExample(const RepQuoteExample &);
         void operator = (const RepQuoteExample &);
         // internal data members.
         APP_DATA          app_data;
         RepConfigInfo *app_config;
         DbEnv          cur_env;
         thread_t ckp_thr;
         thread_t lga_thr;
         thread_t dmy_thr;
         // private methods.
         void print_stocks(Db *dbp);
         void print_env(DbEnv *dbenv);
         void prompt();
    };
    RepQuoteExample *g_runner=NULL;
    RepConfigInfo *g_config=NULL;
    class DbHolder {
    public:
         DbHolder(DbEnv *env, const char *_dbname) : env(env) {
              dbp = 0;
              if (_dbname) dbname=_dbname;
              else dbname=DATABASE;
         }
         ~DbHolder() {
         try {
              close();
         } catch (...) {
              // Ignore: this may mean another exception is pending
         bool ensure_open(bool creating) {
         if (dbp)
              return (true);
         dbp = new Db(env, 0);
         u_int32_t flags = DB_AUTO_COMMIT;
         if (creating)
              flags |= DB_CREATE;
         try {
              //dbp->open(NULL, DATABASE, NULL, DB_BTREE, flags, 0);
              //dbp->open(NULL, dbname, NULL, DB_BTREE, flags, 0);
              dbp->open(NULL, NULL, dbname, DB_BTREE, flags, 0);
              return (true);
         } catch (DbDeadlockException e) {
         } catch (DbRepHandleDeadException e) {
         } catch (DbException e) {
              if (e.get_errno() == DB_REP_LOCKOUT) {
              // Just fall through.
              } else if (e.get_errno() == ENOENT && !creating) {
              // Provide a bit of extra explanation.
              log("Stock DB does not yet exist");
              } else
              throw;
         // (All retryable errors fall through to here.)
         log("please retry the operation");
         close();
         return (false);
         void close() {
         if (dbp) {
              try {
              dbp->close(0);
              delete dbp;
              dbp = 0;
              } catch (...) {
              delete dbp;
              dbp = 0;
              throw;
         operator Db *() {
         return dbp;
         Db *operator->() {
         return dbp;
    private:
         Db *dbp;
         DbEnv *env;
         const char *dbname;
    class StringDbt : public Dbt {
    public:
    #define GET_STRING_OK 0
    #define GET_STRING_INVALID_PARAM 1
    #define GET_STRING_SMALL_BUFFER 2
    #define GET_STRING_EMPTY_DATA 3
         int get_string(char **buf, size_t buf_len)
              size_t copy_len;
              int ret = GET_STRING_OK;
              if (buf == NULL) {
                   cerr << "Invalid input buffer to get_string" << endl;
                   return GET_STRING_INVALID_PARAM;
              // make sure the string is null terminated.
              memset(*buf, 0, buf_len);
              // if there is no string, just return.
              if (get_data() == NULL || get_size() == 0)
                   return GET_STRING_OK;
              if (get_size() >= buf_len) {
                   ret = GET_STRING_SMALL_BUFFER;
                   copy_len = buf_len - 1; // save room for a terminator.
              } else
                   copy_len = get_size();
              memcpy(*buf, get_data(), copy_len);
              return ret;
         size_t get_string_length()
              if (get_size() == 0)
                   return 0;
              return strlen((char *)get_data());
         void set_string(char *string)
              set_data(string);
              set_size((u_int32_t)strlen(string));
         StringDbt(char *string) :
         Dbt(string, (u_int32_t)strlen(string)) {};
         StringDbt() : Dbt() {};
         ~StringDbt() {};
         // Don't add extra data to this sub-class since we want it to remain
         // compatible with Dbt objects created internally by Berkeley DB.
    Db *g_repquote=NULL;
    RepQuoteExample::RepQuoteExample() : app_config(0), cur_env(0) {
         app_data.app_finished = 0;
         app_data.in_client_sync = 0;
         app_data.is_master = 0; // assume I start out as client
         app_data.no_dummy_wr = 0 ; //prevent to run dummy write
    int (*old_rep_process_message)
              __P((DB_ENV *, DBT *, DBT *, int, DB_LSN *));
    int my_rep_process_message __P((DB_ENV *arg1, DBT *arg2, DBT *arg3, int arg4, DB_LSN *arg5))
    {
         printf("EZ->>> my_rep_process_message:%p\n",arg5);
         return old_rep_process_message(arg1,arg2,arg3,arg4,arg5);
    }
    void RepQuoteExample::init(RepConfigInfo *config) {
         app_config = config;
         cur_env.set_app_private(&app_data);
         cur_env.set_errfile(stderr);
         app_data.no_dummy_wr=config->no_dummy_wr;
         if (app_data.no_dummy_wr)
              printf("No dummy !!!\n");
         //EZ->cur_env.set_errpfx(progname);
         cur_env.set_event_notify(event_callback);
         // Configure bulk transfer to send groups of records to clients
         // in a single network transfer. This is useful for master sites
         // and clients participating in client-to-client synchronization.
         if (app_config->bulk)
              cur_env.rep_set_config(DB_REP_CONF_BULK, 1);
         // Set the total number of sites in the replication group.
         // This is used by repmgr internal election processing.
         if (app_config->totalsites > 0)
              cur_env.rep_set_nsites(app_config->totalsites);
         // Turn on debugging and informational output if requested.
         if (app_config->verbose)
              cur_env.set_verbose(DB_VERB_REPLICATION, 1);
         cur_env.set_verbose(DB_VERB_REPMGR_MISC, 1);
         cur_env.set_verbose(DB_VERB_RECOVERY, 1);
         cur_env.set_verbose(DB_VERB_REPLICATION, 1);
         cur_env.set_verbose(DB_VERB_REP_ELECT, 1);
         cur_env.set_verbose(DB_VERB_REP_LEASE, 1);
         cur_env.set_verbose(DB_VERB_REP_SYNC, 1);
         cur_env.set_verbose(DB_VERB_REPMGR_MISC, 1);
         // Set replication group election priority for this environment.
         // An election first selects the site with the most recent log
         // records as the new master. If multiple sites have the most
         // recent log records, the site with the highest priority value
         // is selected as master.
         cur_env.rep_set_priority(app_config->priority);
         // Set the policy that determines how master and client sites
         // handle acknowledgement of replication messages needed for
         // permanent records. The default policy of "quorum" requires only
         // a quorum of electable peers sufficient to ensure a permanent
         // record remains durable if an election is held. The "all" option
         // requires all clients to acknowledge a permanent replication
         // message instead.
         cur_env.repmgr_set_ack_policy(app_config->ack_policy);
         // Set the threshold for the minimum and maximum time the client
         // waits before requesting retransmission of a missing message.
         // Base these values on the performance and load characteristics
         // of the master and client host platforms as well as the round
         // trip message time.
         cur_env.rep_set_request(20000, 500000);
         // Configure deadlock detection to ensure that any deadlocks
         // are broken by having one of the conflicting lock requests
         // rejected. DB_LOCK_DEFAULT uses the lock policy specified
         // at environment creation time or DB_LOCK_RANDOM if none was
         // specified.
         cur_env.set_lk_detect(DB_LOCK_DEFAULT);
         // The following base replication features may also be useful to your
         // application. See Berkeley DB documentation for more details.
         // - Master leases: Provide stricter consistency for data reads
         // on a master site.
         // - Timeouts: Customize the amount of time Berkeley DB waits
         // for such things as an election to be concluded or a master
         // lease to be granted.
         // - Delayed client synchronization: Manage the master site's
         // resources by spreading out resource-intensive client
         // synchronizations.
         // - Blocked client operations: Return immediately with an error
         // instead of waiting indefinitely if a client operation is
         // blocked by an ongoing client synchronization.
         cur_env.repmgr_set_local_site(app_config->this_host.host,
         app_config->this_host.port, 0);
         for ( REP_HOST_INFO *cur = app_config->other_hosts; cur != NULL;
              cur = cur->next) {
              cur_env.repmgr_add_remote_site(cur->host, cur->port,
              NULL, cur->peer ? DB_REPMGR_PEER : 0);
         // Configure heartbeat timeouts so that repmgr monitors the
         // health of the TCP connection. Master sites broadcast a heartbeat
         // at the frequency specified by the DB_REP_HEARTBEAT_SEND timeout.
         // Client sites wait for message activity the length of the
         // DB_REP_HEARTBEAT_MONITOR timeout before concluding that the
         // connection to the master is lost. The DB_REP_HEARTBEAT_MONITOR
         // timeout should be longer than the DB_REP_HEARTBEAT_SEND timeout.
         cur_env.rep_set_timeout(DB_REP_HEARTBEAT_SEND, 5000000);
         cur_env.rep_set_timeout(DB_REP_HEARTBEAT_MONITOR, 10000000);
         // The following repmgr features may also be useful to your
         // application. See Berkeley DB documentation for more details.
         // - Two-site strict majority rule - In a two-site replication
         // group, require both sites to be available to elect a new
         // master.
         // - Timeouts - Customize the amount of time repmgr waits
         // for such things as waiting for acknowledgements or attempting
         // to reconnect to other sites.
         // - Site list - return a list of sites currently known to repmgr.
         // We can now open our environment, although we're not ready to
         // begin replicating. However, we want to have a dbenv around
         // so that we can send it into any of our message handlers.
         cur_env.set_cachesize(0, CACHESIZE, 0);
         cur_env.set_flags(DB_REP_PERMANENT, 1);
         //cur_env.set_flags(DB_TXN_WRITE_NOSYNC, 1);
    /*     u_int32_t maxlocks=300000;
         if (maxlocks != 0)
              cur_env.set_lk_max_locks(maxlocks);
         u_int32_t maxlocks_o=300000;
         if (maxlocks_o != 0)
              cur_env.set_lk_max_objects(maxlocks_o);
         u_int32_t maxmutex=300000;
         if (maxmutex != 0)
              cur_env.mutex_set_max(maxmutex);
         DbEnv          *m_env=&cur_env;
         m_env->set_flags(DB_TXN_NOSYNC, 1);
         m_env->set_lk_max_lockers(60000);
         m_env->set_lk_max_objects(60000);
         m_env->set_lk_max_locks(60000);
         m_env->set_tx_max(60000);
         //m_env->repmgr_set_ack_policy(DB_REPMGR_ACKS_NONE);
         m_env->rep_set_timeout(DB_REP_ACK_TIMEOUT, 50 * 1000); //50ms
         m_env->rep_set_timeout(DB_REP_CHECKPOINT_DELAY, 0);
         //m_env->rep_set_timeout(DB_REP_CONNECTION_RETRY, 30 * 1000 * 1000); // 30 seconds
         m_env->rep_set_timeout(DB_REP_ELECTION_TIMEOUT, 1 * 1000 * 1000); // 5 seconds
         m_env->rep_set_timeout(DB_REP_FULL_ELECTION_TIMEOUT, 5 * 1000 * 1000); // 5 seconds
         m_env->rep_set_timeout(DB_REP_CONNECTION_RETRY, 5 * 1000 * 1000);
         //m_env->rep_set_timeout(DB_REP_ELECTION_RETRY, 10 * 1000 * 1000); //10 seconds
         //m_env->rep_set_timeout(DB_REP_HEARTBEAT_MONITOR, 80 * 1000 * 1000); //80 seconds
         //m_env->rep_set_timeout(DB_REP_HEARTBEAT_SEND, 500 * 1000); //500 milli seconds
         //The minimum number of microseconds a client waits before requesting retransmission
         u_int32_t rep_req_min = 40000; //40 000 microsec = 40 mili
         //The maximum number of microseconds a client waits before requesting retransmission
         u_int32_t rep_req_max = 1280000;// 1 280 000 microsec = 1.28 sec
         u_int32_t rep_limit_gbytes = 0;
         u_int32_t rep_limit_bytes = 100 * 1024 * 1024; // 100MB
         m_env->rep_set_request(rep_req_min, rep_req_max);
         m_env->rep_set_limit(rep_limit_gbytes, rep_limit_bytes);
         cur_env.open(app_config->home, DB_CREATE | DB_RECOVER |
         DB_THREAD | DB_INIT_REP | DB_INIT_LOCK | DB_INIT_LOG |
         DB_INIT_MPOOL | DB_INIT_TXN , 0);
         //keep old function for chain
         //old_rep_process_message=cur_env.get_DB_ENV()->rep_process_message;
         //derouting
         //cur_env.get_DB_ENV()->rep_process_message=my_rep_process_message;
         /*int _i;
         cur_env.log_get_config(DB_LOG_DIRECT, &_i);printf ("DB_LOG_DIRECT = %d\n",_i);
         cur_env.log_get_config(DB_LOG_DSYNC, &_i);printf ("DB_LOG_DSYNC = %d\n",_i);
         cur_env.log_get_config(DB_LOG_AUTO_REMOVE, &_i);printf ("DB_LOG_AUTO_REMOVE = %d\n",_i);
         cur_env.log_get_config(DB_LOG_IN_MEMORY, &_i);printf ("DB_LOG_IN_MEMORY = %d\n",_i);
         cur_env.log_get_config(DB_LOG_ZERO,&_i);printf ("DB_LOG_ZERO = %d\n",_i);
         // Start checkpoint and log archive support threads.
         (void)thread_create(&ckp_thr, NULL, checkpoint_thread, &cur_env);
         (void)thread_create(&lga_thr, NULL, log_archive_thread, &cur_env);
         (void)thread_create(&dmy_thr, NULL, dummy_write_thread, &cur_env);
         cur_env.repmgr_start(3, app_config->start_policy);
    }

    int RepQuoteExample::terminate() {
         try {
              // Wait for checkpoint and log archive threads to finish.
              // Windows does not allow NULL pointer for exit code variable.
              thread_exit_status_t exstat;
              (void)thread_join(lga_thr, &exstat);
              (void)thread_join(ckp_thr, &exstat);
              (void)thread_join(dmy_thr, &exstat);
              // We have used the DB_TXN_NOSYNC environment flag for
              // improved performance without the usual sacrifice of
              // transactional durability, as discussed in the
              // "Transactional guarantees" page of the Reference
              // Guide: if one replication site crashes, we can
              // expect the data to exist at another site. However,
              // in case we shut down all sites gracefully, we push
              // out the end of the log here so that the most
              // recent transactions don't mysteriously disappear.
              cur_env.log_flush(NULL);
              cur_env.close(0);
         } catch (DbException dbe) {
              cout << "error closing environment: " << dbe.what() << endl;
         return 0;
    void RepQuoteExample::prompt() {
         cout << "QUOTESERVER";
         if (!app_data.is_master)
              cout << "(read-only)";
         cout << "> " << flush;
    void log(const char *msg) {
    time_t currentTime;
    // get and print the current time
    time (&currentTime); // fill now with the current time
         char buff[255];
         strncpy(buff,ctime(&currentTime),sizeof(buff));
         char *p;
         for(p =buff ; *p != '\n'; p++);
         *p = '\0';
         cerr << buff << " - " << msg << endl;
    }
    // Simple command-line user interface:
    // - enter "<stock symbol> <price>" to insert or update a record in the
    //     database;
    // - just press Return (i.e., blank input line) to print out the contents of
    //     the database;
    // - enter "quit" or "exit" to quit.
    void RepQuoteExample::doloop() {
         DbHolder dbh1(&cur_env,DATABASE);
         DbHolder dbh2(&cur_env,DATABASE2);
         DbHolder *dbh=&dbh1;
         DbTxn *txn;
         string input;
    bool truncate = false;
         char *c;
         using_history();
         g_repquote=*dbh;
         int loop_rate = 0;
         int txn_rate = 500;
         while (prompt(), /*getline(cin, input)*/c=readline(NULL)) {
              input=std::string(c);
              add_history(c);
              free(c);
              int start_loop = 0;
              int end_loop = 0;
              int start_loop_d = 0;
              int end_loop_d = 0;
              istringstream is(input);
              string token1, token2, token3;
    truncate = false;
    start_loop = 0;
    end_loop = 0;
              // Read 0, 1 or 2 tokens from the input.
              int count = 0;
              if (is >> token1) {
                   count++;
                   if (is >> token2)
                   count++;
                   if (is >> token3)
                   count++;
              if (count == 1) {
         if (token1 == "truncate" ) {
                        truncate = true;     
                   else if (token1 == "env" ){
                        print_env(&cur_env);
                        continue;
         else if (token1 == "verbose" ) {
                        app_config->verbose = !app_config->verbose;
                        if (app_config->verbose)
                             cur_env.set_verbose(DB_VERB_REPLICATION, 1);
                             cur_env.set_verbose(DB_VERB_REPMGR_MISC, 1);
                             cur_env.set_verbose(DB_VERB_RECOVERY, 1);
                             cur_env.set_verbose(DB_VERB_REP_ELECT, 1);
                             cur_env.set_verbose(DB_VERB_REP_LEASE, 1);
                             cur_env.set_verbose(DB_VERB_REP_SYNC, 1);
                             cur_env.set_verbose(DB_VERB_REPMGR_MISC, 1);
                             log("verbose is on");
                        else
                             cur_env.set_verbose(DB_VERB_REPLICATION, 0);
                             cur_env.set_verbose(DB_VERB_REPMGR_MISC, 0);
                             cur_env.set_verbose(DB_VERB_RECOVERY, 0);
                             cur_env.set_verbose(DB_VERB_REP_ELECT, 0);
                             cur_env.set_verbose(DB_VERB_REP_LEASE, 0);
                             cur_env.set_verbose(DB_VERB_REP_SYNC, 0);
                             cur_env.set_verbose(DB_VERB_REPMGR_MISC, 0);
                             log("verbose is off");
                        continue;
         else if (token1 == "print" ) {
                   print_stocks(*dbh);
                        count = 0;      
         else if (token1 == "db1" ) {
                        dbh=&dbh1;
                        g_repquote=*dbh;
                        log( "switch to Db1");
                        count = 0;      
         else if (token1 == "db2" ) {
                        dbh=&dbh2;
                        g_repquote=*dbh;
                        log( "switch to Db2");
                        count = 0;      
                   else if (token1 == "exit" || token1 == "quit") {
                        app_data.app_finished = 1;
                        break;
                   } else {
                        log("Format: <stock> <price>");
                        continue;
    else if (count == 2)
                   if (token1 == "loop_rate" ){
         loop_rate = atoi(token2.c_str());
                        continue;
                   if (token1 == "txn_rate" ){
         txn_rate = atoi(token2.c_str());
                        continue;
    else if (count == 3)
    if (token1 == "loop" ) {
    start_loop = atoi(token2.c_str());
    end_loop = start_loop + atoi(token3.c_str());
    if (token1 == "delete" ) {
    start_loop_d = atoi(token2.c_str());
    end_loop_d = start_loop_d + atoi(token3.c_str());
              // Here we know count is either 0 or 2, so we're about to try a
              // DB operation.
              // Open database with DB_CREATE only if this is a master
              // database. A client database uses polling to attempt
              // to open the database without DB_CREATE until it is
              // successful.
              // This DB_CREATE polling logic can be simplified under
              // some circumstances. For example, if the application can
              // be sure a database is already there, it would never need
              // to open it with DB_CREATE.
              if (!dbh->ensure_open(app_data.is_master))
                   continue;
          try {
               if (count == 0) {
                    if (app_data.in_client_sync)
                         log("Cannot read data during client initialization - please try again.");
                    else
                         print_stocks_size(*dbh);
               } else if (!app_data.is_master) {
                    log("Can't update at client");
               } else {
                    if (truncate) {
                         u_int32_t no_remove;
                         txn = NULL;
                         cur_env.txn_begin(NULL, &txn, DB_TXN_NOWAIT);
                         try {
                              (*dbh)->truncate(txn, &no_remove, 0);
                              // commit
                              txn->commit(0);
                              txn = NULL;
                         } catch (DbException &e) {
                              std::cout << "Error on txn commit: " << e.what() << std::endl;
                              if (txn != NULL)
                                   (void)txn->abort();
                         }
                    } else if (start_loop) {
                         int j = 0;
                         for (int i = start_loop; i <= end_loop; i = i + txn_rate) {
                              // transaction begin
                              txn = NULL;
                              cur_env.txn_begin(NULL, &txn, 0);
                              for (j = i; j <= end_loop && j <= (i + txn_rate); j++) {
                                   Dbt key, value;
                                   std::string key1, value1;
                                   std::stringstream sstrm;
                                   sstrm << "key" << j << ends;
                                   key1 = sstrm.str();
                                   key.set_data((void *)key1.c_str());
                                   key.set_size((u_int32_t)strlen(key1.c_str()));
                                   sstrm.str("");
                                   int payload = rand() + j;
                                   sstrm << "price" << payload << ends;
                                   value1 = sstrm.str();
                                   value.set_data((void *)value1.c_str());
                                   value.set_size((u_int32_t)strlen(value1.c_str()));
                                   // Perform the database put
                                   (*dbh)->put(txn, &key, &value, 0);
                              }
                              // Deliberately kill the process after the first batch of puts
                              // to simulate a crash during the split-brain test.
                              printf("Kill me !!\n");
                              kill(getpid(), SIGKILL); // was kill(getpid(), -9): a negative signal number is invalid, SIGKILL was almost certainly intended
                              exit(0);
                              try {
                                   // commit
                                   txn->commit(0);
                                   txn = NULL;
                              } catch (DbException &e) {
                                   std::cout << "Error on txn commit: " << e.what() << std::endl;
                              }
                              if (loop_rate > 0)
                                   usleep(txn_rate * 1000 * 1000 / loop_rate);
                         }
                    } else if (start_loop_d) {
                         int j = 0;
                         for (int i = start_loop_d; i <= end_loop_d; i = i + 100) {
                              // transaction begin
                              txn = NULL;
                              cur_env.txn_begin(NULL, &txn, 0);
                              for (j = i; j <= end_loop_d && j <= (i + 100); j++) {
                                   Dbt key, value;
                                   std::string key1, value1;
                                   std::stringstream sstrm;
                                   sstrm << "key" << j << ends;
                                   key1 = sstrm.str();
                                   key.set_data((void *)key1.c_str());
                                   key.set_size((u_int32_t)strlen(key1.c_str()));
                                   // Perform the database delete
                                   (*dbh)->del(txn, &key, 0);
                              }
                              try {
                                   // commit
                                   txn->commit(0);
                                   txn = NULL;
                              } catch (DbException &e) {
                                   std::cout << "Error on txn commit: " << e.what() << std::endl;
                              }
                         }
                    } else {
                         const char *symbol = token1.c_str();
                         StringDbt key(const_cast<char*>(symbol));
                         const char *price = token2.c_str();
                         StringDbt data(const_cast<char*>(price));
                         (*dbh)->put(NULL, &key, &data, 0);
                    }
               }
          } catch (DbDeadlockException e) {
               log("please retry the operation");
               dbh->close();
          } catch (DbRepHandleDeadException e) {
               log("please retry the operation");
               dbh->close();
          } catch (DbException e) {
               if (e.get_errno() == DB_REP_LOCKOUT) {
                    log("please retry the operation");
                    dbh->close();
               } else
                    throw;
          }
     }    // end of the command-processing loop

     dbh->close();
     return (0);
     }
     void RepQuoteExample::event_callback(DbEnv* dbenv, u_int32_t which, void *info)
     {
          static char buf[256];
          APP_DATA *app = (APP_DATA *)dbenv->get_app_private();

          info = NULL;          /* Currently unused. */

          switch (which) {
          case DB_EVENT_REP_CLIENT:
               app->is_master = 0;
               app->in_client_sync = 1;
               sprintf(buf, "%s - %s", progname, "CLIENT");
               //EZ->dbenv->set_errpfx(buf);
               log("DB_EVENT_REP_CLIENT.");
               break;
          case DB_EVENT_REP_MASTER:
               app->is_master = 1;
               app->in_client_sync = 0;
               sprintf(buf, "%s - %s", progname, "MASTER");
               //EZ->dbenv->set_errpfx(buf);
               log("DB_EVENT_REP_MASTER.");
               break;
          case DB_EVENT_REP_NEWMASTER:
               log("DB_EVENT_REP_NEWMASTER.");
               app->in_client_sync = 1;
               break;
          case DB_EVENT_REP_PERM_FAILED:
               // Did not get enough acks to guarantee transaction
               // durability based on the configured ack policy. This
               // transaction will be flushed to the master site's
               // local disk storage for durability.
               log("DB_EVENT_REP_PERM_FAILED.");
               log("Insufficient acknowledgements to guarantee transaction durability.");
               break;
          case DB_EVENT_REP_STARTUPDONE:
               app->in_client_sync = 0;
               log("DB_EVENT_REP_STARTUPDONE.");
               break;
          case DB_EVENT_REP_ELECTION_FAILED:
               log("DB_EVENT_REP_ELECTION_FAILED.");
               //g_runner->init(g_config);
               // Deliberately kill the process when an election fails, as part of the test.
               printf("Kill me !!\n");
               kill(getpid(), SIGKILL); // was kill(getpid(), -9): a negative signal number is invalid
               exit(0);
               break;
          case DB_EVENT_REP_DUPMASTER:
               log("DB_EVENT_REP_DUPMASTER.");
               break;
          default:
               dbenv->errx("ignoring event %d", which);
          }
     }
     void RepQuoteExample::print_stocks_size(Db *dbp) {
          DB_BTREE_STAT *statp;

          dbp->stat(NULL, &statp, 0);
          log("db_stat");
          cout << "***************************************** >>>>>>>>>>> : database contains "
               << (u_long)statp->bt_ndata << " records\n";
          free(statp);     // stat() allocates the statistics structure; the caller must free it
     }

     void RepQuoteExample::print_env(DbEnv *dbenv) {
          dbenv->stat_print(DB_STAT_ALL);
     }
     void RepQuoteExample::print_stocks(Db *dbp) {
          StringDbt key, data;
     #define     MAXKEYSIZE     10
     #define     MAXDATASIZE     20
          char keybuf[MAXKEYSIZE + 1], databuf[MAXDATASIZE + 1];
          char *kbuf, *dbuf;

          memset(&key, 0, sizeof(key));
          memset(&data, 0, sizeof(data));
          kbuf = keybuf;
          dbuf = databuf;

          DbcAuto dbc(dbp, 0, 0);
          cout << "\tSymbol\tPrice" << endl
               << "\t======\t=====" << endl;

          int no_records = 0;
          for (int ret = dbc->get(&key, &data, DB_FIRST);
               ret == 0;
               ret = dbc->get(&key, &data, DB_NEXT)) {
               key.get_string(&kbuf, MAXKEYSIZE);
               data.get_string(&dbuf, MAXDATASIZE);
               no_records++;
               cout << "\t" << keybuf << "\t" << databuf << endl;
          }
          cout << "********************** NO Records " << no_records << endl;
          cout << endl << flush;
          dbc.close();
     }
    static void usage() {
         cerr << "usage: " << progname << " -h home -l host:port [-CM]"
         << "[-r host:port][-R host:port]" << endl
         << " [-a all|quorum][-b][-n nsites][-p priority][-v]" << endl;
         cerr << "\t -h home (required; h stands for home directory)" << endl
         << "\t -l host:port (required; l stands for local)" << endl
         << "\t -C or -M (optional; start up as client or master)" << endl
         << "\t -r host:port (optional; r stands for remote; any "
         << "number of these" << endl
         << "\t may be specified)" << endl
         << "\t -R host:port (optional; R stands for remote peer; only "
         << "one of" << endl
         << "\t these may be specified)" << endl
         << "\t -a all|quorum (optional; a stands for ack policy)" << endl
         << "\t -b (optional; b stands for bulk)" << endl
         << "\t -n nsites (optional; number of sites in replication "
         << "group; defaults " << endl
         << "\t     to 0 to try to dynamically compute nsites)" << endl
         << "\t -p priority (optional; defaults to 100)" << endl
         << "\t -v (optional; v stands for verbose)" << endl;
          exit(EXIT_FAILURE);
     }
    int main(int argc, char **argv) {
         RepConfigInfo config;
          char ch, *portstr, *tmphost;     // portstr and tmphost hold strtok() results, so they must be pointers
         int tmpport;
         bool tmppeer;
         config.no_dummy_wr = false;
         // Extract the command line parameters
         while ((ch = getopt(argc, argv, "E:a:bCh:l:Mn:p:R:r:vw")) != EOF) {
              tmppeer = false;
              switch (ch) {
              case 'a':
                   if (strncmp(optarg, "all", 3) == 0)
                        config.ack_policy = DB_REPMGR_ACKS_ALL;
                   else if (strncmp(optarg, "quorum", 6) != 0)
                        usage();
                   break;
              case 'b':
                   config.bulk = true;
                   break;
              case 'C':
                   config.start_policy = DB_REP_CLIENT;
                   break;
              case 'E':
    config.start_policy = DB_REP_ELECTION;
    break;
              case 'h':
                   config.home = optarg;
                   break;
              case 'l':
                   config.this_host.host = strtok(optarg, ":");
                    if ((portstr = strtok(NULL, ":")) == NULL) {
                         cerr << "Bad host specification." << endl;
                         usage();
                    }
                    config.this_host.port = (unsigned short)atoi(portstr);
                   config.got_listen_address = true;
                   break;
              case 'M':
                   config.start_policy = DB_REP_MASTER;
                   break;
              case 'n':
                   config.totalsites = atoi(optarg);
                   break;
              case 'p':
                   config.priority = atoi(optarg);
                   break;
              case 'R':
                   tmppeer = true; // FALLTHROUGH
              case 'r':
                   tmphost = strtok(optarg, ":");
                    if ((portstr = strtok(NULL, ":")) == NULL) {
                         cerr << "Bad host specification." << endl;
                         usage();
                    }
                    tmpport = (unsigned short)atoi(portstr);
                   config.addOtherHost(tmphost, tmpport, tmppeer);
                   break;
              case 'v':
                   config.verbose = true;
                   break;
              case 'w':
                   config.no_dummy_wr = true;
                   //config.priority = 2;
                   break;
               case '?':
               default:
                    usage();
               }
          }

          // Error check command line.
          if ((!config.got_listen_address) || config.home == NULL)
               usage();

          RepQuoteExample runner;
          g_runner = &runner;
          g_config = &config;
          try {
               runner.init(&config);
               runner.doloop();
          } catch (DbException dbe) {
               cerr << "Caught an exception during initialization or"
                    << " processing: " << dbe.what() << endl;
          }
          runner.terminate();
          return 0;
     }
    // This is a very simple thread that performs checkpoints at a fixed
    // time interval. For a master site, the time interval is one minute
    // plus the duration of the checkpoint_delay timeout (30 seconds by
    // default.) For a client site, the time interval is one minute.
     void *checkpoint_thread(void *args)
     {
          DbEnv *env;
          APP_DATA *app;
          int i, ret;

          env = (DbEnv *)args;
          app = (APP_DATA *)env->get_app_private();

          for (;;) {
               // Wait for one minute, polling once per second to see if
               // application has finished. When application has finished,
               // terminate this thread.
               for (i = 0; i < 60; i++) {
                    sleep(1);
                    if (app->app_finished == 1)
                         return ((void *)EXIT_SUCCESS);
               }

               // Perform a checkpoint.
               // original line
               if ((ret = env->txn_checkpoint(0, 0, 0)) != 0) {
               //if ((ret = env->txn_checkpoint(0, 0, DB_FORCE)) != 0) {
                    env->err(ret, "Could not perform checkpoint.\n");
                    return ((void *)EXIT_FAILURE);
               }
          }
     }
    // This is a simple log archive thread. Once per minute, it removes all but
    // the most recent 3 logs that are safe to remove according to a call to
    // DBENV->log_archive().
    // Log cleanup is needed to conserve disk space, but aggressive log cleanup
    // can cause more frequent client initializations if a client lags too far
    // behind the current master. This can happen in the event of a slow client,
    // a network partition, or a new master that has not kept as many logs as the
    // previous master.
    // The approach in this routine balances the need to mitigate against a
    // lagging client by keeping a few more of the most recent unneeded logs
    // with the need to conserve disk space by regularly cleaning up log files.
    // Use of automatic log removal (DBENV->log_set_config() DB_LOG_AUTO_REMOVE
    // flag) is not recommended for replication due to the risk of frequent
    // client initializations.
     void *log_archive_thread(void *args)
     {
          DbEnv *env;
          APP_DATA *app;
          char **begin, **list;
          int i, listlen, logs_to_keep, minlog, ret;

          env = (DbEnv *)args;
          app = (APP_DATA *)env->get_app_private();
          logs_to_keep = 3;

          for (;;) {
               // Wait for one minute, polling once per second to see if
               // application has finished. When application has finished,
               // terminate this thread.
               for (i = 0; i < 60; i++) {
                    sleep(1);
                    if (app->app_finished == 1)
                         return ((void *)EXIT_SUCCESS);
               }

               // Get the list of unneeded log files.
               if ((ret = env->log_archive(&list, DB_ARCH_ABS)) != 0) {
                    env->err(ret, "Could not get log archive list.");
                    return ((void *)EXIT_FAILURE);
               }
               if (list != NULL) {
                    listlen = 0;
                    // Get the number of logs in the list.
                    for (begin = list; *begin != NULL; begin++, listlen++)
                         ;
                    // Remove all but the logs_to_keep most recent
                    // unneeded log files.
                    minlog = listlen - logs_to_keep;
                    for (begin = list, i = 0; i < minlog; list++, i++) {
                         if ((ret = unlink(*list)) != 0) {
                              env->err(ret,
                                  "logclean: remove %s", *list);
                              env->errx(
                                  "logclean: Error remove %s", *list);
                              free(begin);
                              return ((void *)EXIT_FAILURE);
                         }
                    }
                    free(begin);
               }
          }
     }
    #define DATABASE_DUMMY "dummy.db"
     void create_dummy_db(DB_ENV *dbenv, DB **dbp)
     {
          int ret;
          u_int32_t db_flags;

          if ((ret = db_create(dbp, dbenv, 0)) != 0)
               dbenv->err(dbenv, ret, "create_dummy_db: db_create");
          db_flags = DB_AUTO_COMMIT | DB_CREATE;
          //if ((ret = (*dbp)->open(*dbp, NULL, DATABASE, NULL, DB_BTREE, db_flags, 0)) != 0)
          if ((ret = (*dbp)->open(*dbp, NULL, NULL, DATABASE_DUMMY, DB_BTREE, db_flags, 0)) != 0)
               dbenv->err(dbenv, ret, "create_dummy_db: DB->open");
     }

     void reopen_dummy_db(DB_ENV *dbenv, DB **dbp)
     {
          int ret;
          u_int32_t db_flags;

          if ((ret = db_create(dbp, dbenv, 0)) != 0)
               dbenv->err(dbenv, ret, "reopen_dummy_db: db_create");
          db_flags = DB_AUTO_COMMIT | DB_CREATE;
          //if ((ret = (*dbp)->open(*dbp, NULL, DATABASE, NULL, DB_BTREE, db_flags, 0)) != 0)
          if ((ret = (*dbp)->open(*dbp, NULL, NULL, DATABASE_DUMMY, DB_BTREE, db_flags, 0)) != 0)
               dbenv->err(dbenv, ret, "reopen_dummy_db: DB->open");
     }
     // Write (or read) one dummy record from the main loop.
     void perform_db_operation(DB_ENV *dbenv, DB **dbp, bool bRead)
     {
          int ret;
          DBT key, data;
          char buf[20] = "dummy", *rbuf;

          rbuf = buf;
          if (*dbp == NULL)
               create_dummy_db(dbenv, dbp);

          if (!bRead) {
               memset(&key, 0, sizeof(key));
               memset(&data, 0, sizeof(data));
               key.data = buf;
               key.size = (u_int32_t)strlen(buf);
               data.data = rbuf;
               data.size = (u_int32_t)strlen(rbuf);
               if ((ret = (*dbp)->put(*dbp, NULL, &key, &data, 0)) != 0) {
                    if (ret == DB_REP_HANDLE_DEAD) {
                         //create_dummy_db(dbenv, dbp);
                         reopen_dummy_db(dbenv, dbp);
                         (*dbp)->err(*dbp, ret, "DB->put :");
                    } else if (ret != DB_KEYEXIST)
                         (*dbp)->err(*dbp, ret, "perform_db_operation: DB->put");
               }
          } else {
               DB_BTREE_STAT *statp;
               (*dbp)->stat(*dbp, NULL, &statp, 0);
               std::cout << "dbp read stats: key#" << statp->bt_nkeys << std::endl;
               free(statp);
          }
     }
     void *dummy_write_thread(void *args)
     {
          DbEnv *env;
          APP_DATA *app;
          DB *m_dbp = NULL;     // lazily opened by perform_db_operation()

          env = (DbEnv *)args;
          app = (APP_DATA *)env->get_app_private();

          for (;;) {
               if (!app->no_dummy_wr) {
                    if (app->is_master)
                         perform_db_operation(env->get_DB_ENV(), &m_dbp, false);
                         //env->txn_checkpoint(0, 0, DB_FORCE);
                    usleep(1 * 1000 * 1000);
               } else {
                    if (app->is_master) {
                         //DB *db_quote=g_repquote->get_DB();
                         //perform_db_operation(env->get_DB_ENV(),&db_quote,true);
                         //if (g_repquote)
                         //     g_runner->print_stocks_size(g_repquote);
                         //env->txn_checkpoint(0, 0, DB_FORCE);
                         //perform_db_operation(env->get_DB_ENV(),&m_dbp,false);
                         env->rep_flush();
                    }
                    usleep(4 * 1000 * 1000);
               }
          }
     }
     My script to simulate the split brain:

     #!/bin/sh

     [ -z "$node1" ] && node1=10.10.32.121
     [ -z "$node2" ] && node2=10.10.32.91

     trap myend 0 1 2 3 6 9 14 15

     myend()
     {
          echo "Receive signal to stop test..."
          un_split_brain
          echo "done"
          exit 1
     }

     split_brain()
     {
          echo -n "Split-Brain at node $node..."
          snmpset -m ALL -v 2c -c svil 10.10.0.100 ifAdminStatus.41 i 2 >/dev/null 2>&1
          echo "done"
     }

     un_split_brain()
     {
          echo -n "Undo Split-Brain at node $node..."
          snmpset -m ALL -v 2c -c svil 10.10.0.100 ifAdminStatus.41 i 1 >/dev/null 2>&1
          echo "done"
     }

     is_slave()
     {
          local r=$(ssh root@$1 "tail -2 /tmp/BDB.log" | grep -c CLIENT)
          [ $r -gt 1 ] && ret=1 || ret=0
          return $ret
     }

     is_master()
     {
          local r=$(ssh root@$1 "tail -2 /tmp/BDB.log" | grep -c MASTER)
          [ $r -gt 1 ] && ret=1 || ret=0
          return $ret
     }

     wait_for_master()
     {
          echo -n "Waiting for MASTER at node $node ... "
          is_master $node
          r=$?
          while [ ! $r -eq 1 ]
          do
               usleep 500000
               is_master $node
               r=$?
               echo -n "."
          done
          echo "done"
     }

     wait_for_slave()
     {
          local r
          local tm
          tm=0
          echo -n "Waiting for SLAVE at node $node ... "
          is_slave $node
          r=$?
          while [ ! $r -eq 1 ]
          do
               usleep 500000
               is_slave $node
               r=$?
               echo -n "."
               tm=$((tm+1))
               [ $tm -gt 120 ] && break
          done
          [ $tm -gt 120 ] && ret=0 || ret=1
          echo "done"
          return $ret
     }

     run_test_split_brain()
     {
          local nt
          nt=1
          nfails=0
          x=4
          [ -z "$1" ] && node=$node2
          while true
          do
               printf "*************** TEST [%02d] ********************\n" $nt
               split_brain
               wait_for_master
               x=$((RANDOM%9))
               echo -n " waiting $x sec ..."
               sleep $x
               echo "done"
               un_split_brain
               wait_for_slave
               r=$?
               [ ! $r -eq 1 ] && echo "`date` - test [$nt] - fails ..." || echo "`date` - test [$nt] - OK ."
               [ ! $r -eq 1 ] && nfails=$((nfails+1))
               perc_failure=$(echo "100.0 - $nfails / $nt * 100.0" | bc -l)
               echo "************************************************ [% Success test $perc_failure % ]"
               nt=$((nt+1))
               x=$((RANDOM%9))
               echo -n " waiting $x sec ..."
               sleep $x
          done
     }

     run_test_split_brain
     Here is the Makefile used to run the two environments.
     I run:
     - make run
     and in another window: sh test_split_brain.sh
    node1?=10.10.32.121
    node2?=10.10.32.91
    nsite?=2
    debug?=0
    all: RepQuoteExampleEric install
    RepConfigInfo.o: RepConfigInfo.cpp RepConfigInfo.h
         g++ -I/usr/local/BerkeleyDB.5.1/include/ -g -O0 -c RepConfigInfo.cpp -o RepConfigInfo.o
    RepQuoteExampleEric: RepQuoteExampleEric.cpp RepConfigInfo.o
         g++ -I/usr/local/BerkeleyDB.5.1/include/ -g -O0 RepQuoteExampleEric.cpp RepConfigInfo.o -o RepQuoteExampleEric -L /usr/local/BerkeleyDB.5.1/lib/ -lreadline -lcurses -ldb_cxx
    kill:
         -ssh -X root@$(node1) "killall -9 /root/RepQuoteExampleEric"
         -ssh -X root@$(node2) "killall -9 /root/RepQuoteExampleEric"
    run: RepQuoteExampleEric kill install clean_env
         ssh -X root@$(node1) "xterm -geom 100x20+100+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/RepQuoteExampleEric -h /opt/bdb/ -l 2.0.0.110:12345 -r 2.0.0.210:12345 -a quorum -b -n $(nsite) -v | tee /tmp/BDB.log\"" &
         ssh -X root@$(node2) "xterm -geom 100x20+800+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/RepQuoteExampleEric -h /opt/bdb/ -l 2.0.0.210:12345 -r 2.0.0.110:12345 -a quorum -b -n $(nsite) -v -w | tee /tmp/BDB.log\"" &
    run_node2: clean_env2
         ssh -X root@$(node2) "xterm -geom 100x20+800+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/RepQuoteExampleEric -h /opt/bdb/ -l 2.0.0.210:12345 -r 2.0.0.110:12345 -a quorum -b -n $(nsite) -v -w | tee /tmp/BDB.log\"" &
    debug_node2: clean_env2
         ssh -X root@$(node2) "xterm -geom 100x20+800+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/RepQuoteExampleEric -h /opt/bdb/ -l 2.0.0.210:12345 -r 2.0.0.110:12345 -a quorum -b -n $(nsite) -v -w | tee /tmp/BDB.log\"" &
         sleep 3
         ssh -X root@$(node2) /sbin/pidof RepQuoteExampleEric >/tmp/pid
         ssh -X root@$(node2) ~/kdbg /root/db-5.1.19/examples/cxx/excxx_repquote/RepQuoteExampleEric -p `cat /tmp/pid`
    run_debug_node1: RepQuoteExampleEric kill install clean_env
         ssh -X root@$(node1) "xterm -geom 100x20+100+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/kdbg /root/RepQuoteExampleEric\" " &
         ssh -X root@$(node2) "xterm -geom 100x20+800+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/RepQuoteExampleEric -h /opt/bdb/ -l 2.0.0.210:12345 -r 2.0.0.110:12345 -a quorum -b -n $(nsite) -v\"" &
    run_debug_node2: RepQuoteExampleEric kill install clean_env
         ssh -X root@$(node1) "xterm -geom 100x20+100+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/RepQuoteExampleEric -h /opt/bdb/ -l 2.0.0.110:12345 -r 2.0.0.210:12345 -a quorum -b -n $(nsite) -v\" " &
         ssh -X root@$(node2) "xterm -geom 100x20+800+100 -e \"LD_LIBRARY_PATH=/usr/local/BerkeleyDB.5.1/lib/ /root/kdbg /root/RepQuoteExampleEric\"" &
    install: RepQuoteExampleEric
         scp RepQuoteExampleEric root@$(node1):~
         scp RepQuoteExampleEric root@$(node2):~
    clean_env: clean_env1 clean_env2
    clean_env1:
         ssh -X root@$(node1) rm -rf /opt/bdb/*
    clean_env2:
         ssh -X root@$(node2) rm -rf /opt/bdb/*

  • Private interconnect oracle 10g RAC configuration

    Can you please answer below questions?
    Why does a private interconnect need a switch, and why are straight-through cables not supported?

    Hi,
    Why do you need a switch between the nodes? When the network plugs are pulled out of one node in a two-node cluster, a split-brain scenario occurs (that alone is reason enough).
    If you use a crossover cable and you shut down node (A), you lose the private network link on node (B) (this happens on some servers). Oracle RAC will not work with the private network link down; both nodes will go down and will not start until the private network link is back. (Goodbye, high availability.)
    You will be very unhappy with the error ORA-29740. Hence this note: Troubleshooting ORA-29740 in a RAC Environment [ID 219361.1]
    There is no "why": Oracle RAC does not support a crossover cable because RAC depends on a switch (it is a hardware requirement), and when problems occur in your environment Oracle will require you to implement a supported configuration.
    You will have implemented a poor environment if you do not use a gigabit switch for the private network.
    Oracle Words:
    *Physical Layout of the Private Interconnect*
    The basic requirements are described in the Installation Guide for each platform. Additional information about certification can be found on Metalink Certify.
    The interconnect as identified by both subnet number and interface name must be configured on all clustered nodes.
    *A switch between the clustered nodes is an absolute requirement.*
    *Cluster Interconnect in Oracle 10g and 11g [ID 787420.1]*
    Regards,
    Levi Pereira

  • Oracle 10g rac installation on IBM power server

    Dear Gurus,
    I am installing Oracle 10g RAC on an IBM Power server, but while running CRS I am getting the following errors. Please help to resolve this issue.
    OUTPUT from Installation log:
    INFO: Start output from spawned process:
    INFO: INFO:
    INFO: /oracle/app/oracle/product/crs/bin/genclntsh
    INFO: /usr/bin/ld: cannot find -lxl
    INFO: collect2: ld returned 1 exit status
    INFO: genclntsh: Failed to link libclntsh.so.10.1
    INFO: make:
    INFO: *** [client_sharedlib] Error 1
    INFO: End output from spawned process.
    INFO: INFO: Exception thrown from action: make
    Exception Name: MakefileException
    Exception String: Error in invoking target 'client_sharedlib' of makefile '/oracle/app/oracle/product/crs/network/lib/ins_net_client.mk'. See '/oracle/app/oracle/oraInventory/logs/installActions2011-12-26_04-40-48PM.log' for details.
    Exception Severity: 1
    INFO: Exception handling set to prompt user with options to Retry Ignore
    SYSTEM:
    IBM power server
    Linux 5.3
    I have already tried changing ORACLE_HOME,CRS_HOME,LD_LIBRARY_PATH but nothing worked.
    Regards,
    Prajash

    Those errors remind me of errors I encountered when I was trying to do an installation that was not certified,
    so check whether the installation you are attempting is actually supported.
    In the meantime, visit Metalink and read document [ID 460969.1], */usr/bin/ld: Cannot Find -lxml10, Genclntsh: Failed To Link Libclntsh.so.10.1*

  • Oracle 10G RAC - Public & Heartbeat on 1 NIC

    Hello all,
    Actually I am installing Oracle 10g RAC on RHEL 4 (a 4-node cluster), but the Cluster Verification Utility aborts with errors. I checked configToolAllCommands and tried to run the failed commands manually:
    #/opt/oracle/crs/bin/oifcfg setif -global eth0.100/172.18.0.0:cluster_interconnect eth0.728/172.16.128.0:public eth0.498/172.17.1.0:cluster_interconnect
    PRIF-50: duplicate interface is given in the input
    PRIF-50: duplicate interface is given in the input
    PRIF-50: duplicate interface is given in the input
    Question:
    Is it possible to put Public & Heartbeat on one NIC (eth0.728 & eth0.498)?
    If not, is there any workaround for that issue?
    Output /etc/hosts
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    172.18.253.48 eu0266.[company].net cfmaster
    172.16.128.11 eu0200.[company].net eu0200
    172.16.128.12 eu0201.[company].net eu0201
    172.16.128.13 eu0202.[company].net eu0202
    172.16.128.14 eu0203.[company].net eu0203
    172.18.13.11 eu0200m.[company].net eu0200m
    172.18.13.12 eu0201m.[company].net eu0201m
    172.18.13.13 eu0202m.[company].net eu0202m
    172.18.13.14 eu0203m.[company].net eu0203m
    # Private section
    172.17.1.11 eu0200-priv.[company].net eu0200-priv
    172.17.1.12 eu0201-priv.[company].net eu0201-priv
    172.17.1.13 eu0202-priv.[company].net eu0202-priv
    172.17.1.14 eu0203-priv.[company].net eu0203-priv
    # Virtual section
    172.16.128.16 eu0200-vip.[company].net eu0200-vip
    172.16.128.17 eu0201-vip.l[company].net eu0201-vip
    172.16.128.18 eu0202-vip.[company].net eu0202-vip
    172.16.128.19 eu0203-vip.[company].net eu0203-vip
    Output install log:
    Checking existence of VIP node application (required)
    Check failed.
    Check failed on nodes:
    eu0203,eu0202,eu0201,eu0200
    Checking existence of ONS node application (optional)
    Check ignored.
    Checking existence of GSD node application (optional)
    Check ignored.
    Post-check for cluster services setup was unsuccessful on all the nodes.
    Command = /opt/[company]/oracle/crs/bin/cluvfy has failed
    INFO: Configuration assistant "Oracle Cluster Verification Utility" failed
    *** Starting OUICA ***
    Oracle Home set to /opt/[company]/oracle/crs
    Configuration directory is set to /opt/[company]/oracle/crs/cfgtoollogs. All xml files under the directory will be processed
    INFO: The "/opt/[company]/oracle/crs/cfgtoollogs/configToolFailedCommands" script contains all commands that failed, were skipped or were cancelled. This file may be used to run these configuration assistants outside of OUI. Note that you may have to update this script with passwords (if any) before executing the same.
    SEVERE: OUI-25031:Some of the configuration assistants failed. It is strongly recommended that you retry the configuration assistants at this time. Not successfully running any "Recommended" assistants means your system will not be correctly configured.
    1. Check the Details panel on the Configuration Assistant Screen to see the errors resulting in the failures.
    2. Fix the errors causing these failures.
    3. Select the failed assistants and click the 'Retry' button to retry them.
    Thanks in advance for your help!
    Regards

    Tads wrote:
    Hi all,
    I have a question about Oracle RAC and network interface.
    We're using Oracle 10gR2 RAC with two nodes on Linux Red Hat.
    Let's assume that the public network interface goes down.
    I would like to know what happens with existing connections
    on node with network interface with problems.
    Are the connections frozen, or still active?
    Can the users continue to use these existing connections through the other node of the RAC? If the interface is down, what do you think? All connections to this node will die. How does your application handle failover: does it attempt to reconnect (a minimal retry sketch follows at the end of this thread), or does it just have a complete application failure?
    You should spend some time in a test lab where you can test this stuff for yourself. Read the documentation and there are tons of sites out there that purport to have all of your RAC/TAF/FAN/FAF questions. - I would read and trust the documentation first.
    >
    I know that the listener goes down and no other connections are allowed.
    Thank you very much!!!!
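    To make the reconnect question concrete, here is a minimal, hypothetical JDBC retry sketch. It is not taken from this thread: the connect descriptor, service name, credentials and retry policy are assumptions for illustration. The only point is that the application decides whether a failed connection is retried, and listing both VIPs (taken from the /etc/hosts output above) lets a new attempt land on the surviving node.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class ReconnectDemo {
        // Hypothetical connect descriptor: both VIPs are listed so that a new
        // connection attempt can reach the surviving node after a NIC failure.
        private static final String URL =
            "jdbc:oracle:thin:@(DESCRIPTION=" +
            "(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=eu0200-vip)(PORT=1521))" +
            "(ADDRESS=(PROTOCOL=TCP)(HOST=eu0201-vip)(PORT=1521)))" +
            "(CONNECT_DATA=(SERVICE_NAME=myservice)))";

        static Connection connectWithRetry(int attempts) throws SQLException {
            SQLException last = null;
            for (int i = 0; i < attempts; i++) {
                try {
                    // Placeholder credentials; replace with real ones.
                    return DriverManager.getConnection(URL, "scott", "tiger");
                } catch (SQLException e) {
                    last = e;               // attempt failed, pause and retry
                    try {
                        Thread.sleep(2000);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
            throw last;                     // give up after the configured attempts
        }

        public static void main(String[] args) throws SQLException {
            Connection c = connectWithRetry(5);
            System.out.println("Connected: " + !c.isClosed());
            c.close();
        }
    }

    Whether this kind of retry loop, TAF, or FAN/FCF is the right choice depends on the application; the documentation mentioned above covers the trade-offs.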

  • ORA-01722: invalid number error coming in Oracle 10g.

    Hi,
    We are getting the error "ORA-01722: invalid number" while opening a cursor using CURSOR FOR LOOP.
    This error started coming only after we migrated to Oracle 10g from Oracle 9i. Earlier the same code used to work properly. Also, on Oracle 10g it is not happening every time: sometimes it gives the error and sometimes it works.
    Does anybody know about any such bug in Oracle 10g? Our cursor is a parameterized cursor accepting a VARCHAR2 parameter, and the value we are passing to it is also a character value.
    Our database is Oracle 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production and is running on UNIX server.

    "And also on Oracle 10g, it's not happening every time. Sometimes it gives the error while sometimes it works." This is typically due to
    a) environment settings that differ from session to session
    b) or more often, data
    The actual error means that Oracle expects a number and is unable to obtain a number from the input (data or SQL or bind variables) supplied. I agree with William that it looks a lot like an implicit TO_NUMBER() conversion failing.
    Why not add a debug exception handler to the code? When that exception occurs, dump the PL/SQL call stack and values of all variables and parameters to a debug/logging table (using an autonomous transaction).
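    As a concrete illustration of the implicit TO_NUMBER() failure described above, here is a small, hypothetical JDBC sketch (the table, column and connection details are invented for illustration, not taken from the poster's code). Comparing a NUMBER column with a character bind value makes Oracle convert the bind behind the scenes; it fails with ORA-01722 only when the value is not numeric, which is why the error can look intermittent and data-dependent.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class Ora01722Demo {
        public static void main(String[] args) throws SQLException {
            // Hypothetical connection details.
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@dbhost:1521:orcl", "scott", "tiger");

            // EMPNO is assumed to be a NUMBER column. Binding a non-numeric
            // string forces an implicit TO_NUMBER('ABC') on the bind value,
            // which raises ORA-01722: invalid number at execution time.
            PreparedStatement ps =
                    conn.prepareStatement("SELECT ename FROM emp WHERE empno = ?");
            ps.setString(1, "ABC");   // should have been ps.setInt(1, 7839);

            try {
                ResultSet rs = ps.executeQuery();
                while (rs.next())
                    System.out.println(rs.getString(1));
                rs.close();
            } catch (SQLException e) {
                // Typically: ORA-01722: invalid number
                System.err.println("Query failed: " + e.getMessage());
            } finally {
                ps.close();
                conn.close();
            }
        }
    }

    The same conversion happens inside PL/SQL when a cursor predicate compares a character parameter against a NUMBER column; adding an explicit TO_NUMBER() (or aligning the datatypes) makes the conversion, and therefore the failure point, explicit.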

  • Using Clob with TopLink 9.0.4.5 and Oracle 10g RAC

    I am trying to store an XML file in a Clob type field of a DB table using TopLink 9.0.4.5 and Oracle 10g RAC and need some guidance about how to do it. I got some directions to start it with the Tip "How-To: Map Large Objects (LOBs) to Oracle Databases with OracleAS TopLink" but still need some more helps.
    When using the Oracle JDBC OCI driver, the tip gives the code block for a Clob field:
    DirectToField scriptMapping = new DirectToField();
    scriptMapping.setAttributeName("script");
    scriptMapping.setFieldName("IMAGE.SCRIPT");
    descriptor.addMapping(scriptMapping);
    As I understand, TopLink creates instances of the Descriptor class at run time for each of the descriptor files and stores them in a database session, where is the proper place (in SessionEvent of TopLinkSessionEventHandler?) for me to get a reference to such an instance of my Descriptor class in Java code so that I can add the above mentioned additional Mapping? Are the above String values of "script" and "IMAGE.SCRIPT" predefined in TopLink API? Can I accomplish the same thing just using the TopLink Workbench tool instead? If yes, please advise the detailed steps to do so.
    The tip also states to call the following code in case of using Clob:
    DatabaseLogin login = session.getLogin();
    login.useStringBinding();
    Should the above 2 lines of code be called after the following lines of code?
    SessionManager sessionManager = SessionManager.getManager();
    Server serverSession = (Server)sessionManager.getSession("MY_SESSION_NAME");
    Besides the above extra coding for the Session and Descriptor Mapping, is there any special handling in between Data model and DB table mapping? Can I map a Java String type to a DB Clob field using the Direct-to-field mapping?
    Appreciate any help.

    Never mind ....... I had already figured it out .....

  • Long-term  retention backup strategy for Oracle 10g

    Hello,
    I need design a Archivelog, Level 0 and Level 1 long-term retention backup strategy for Oracle 10g, It must be based in Rman with tapes.
    Could somebody tell me what the best option is to configure long-term retention backups in Oracle? Is "CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 1825 DAYS;" possible?
    Regards and thanks in advance

    Hello;
    RECOVERY WINDOW takes an integer, so yes, it's possible. Does it make sense, though?
    Later: My bad, the gavinsoorma link below is Oracle 11 only. It's getting harder and harder to think about 10. I don't think I would use RECOVERY WINDOW.
    I would let the tape backup system handle the retention of your rman backups.
    Using crosscheck and delete expired commands you can keep the catalog current with the available backups on tape.
    Oracle 10
    Keeping a Long-Term Backup: Example
    http://docs.oracle.com/cd/B19306_01/backup.102/b14191/rcmbackp.htm#i1006840
    http://web.njit.edu/info/oracle/DOC/backup.102/b14191/advmaint005.htm
    Oracle 11
    If you want to keep 5 years' worth of backups, I might just use KEEP with a specific date using the UNTIL TIME clause:
    keep until time 'sysdate+1825'
    More info:
    RMAN KEEP FOREVER, KEEP UNTIL TIME and FORCE commands
    http://gavinsoorma.com/2010/04/rman-keep-forever-keep-until-time-and-force-commands/
    Best Regards
    mseberg
    Edited by: mseberg on Dec 4, 2012 7:20 AM

  • Oracle 10g Installation Error on Windows 2008 server

    Hi Experts,
    I have tried to install Oracle 10g via "51036975 ORACLE 10.2.0.4 Failsafe Server 3.4.1.5 Windows 2008" as per note no. Note 1303262 - Oracle on Windows Server 2008. Server is 64 bit server.
    The Oracle service is not getting started and shows as "Starting" in services.msc. I can't stop the service. I have restarted the server as well, but no luck.
    Just before the end of the Oracle installation, a pop-up box named isqlplussvc opened up stating "Failed to start the service, Error:0, the operation completed successfully".
    Can you please suggest how to handle this error and start the service?
    Regards,
    Naresh.

    Hi All,
    Yes, I started with the Failsafe server CD only. After trying to install Oracle a couple of times, at one point the service did not go into "Starting" mode and could not be started either. I then went ahead with the SAP installation, and later found that the service had started automatically.
    I was under the impression that the service had to be started before going for the SAP installation.
    Thanks all for your replies.
    Regards,
    Naresh.

  • [oracle 10g on SDK 1.6SE] Err Msg: Exhausted Resultset

    I am having a problem obtaining a proper ResultSet from an executeQuery() method.
    I can successfully create a JDBC connection to a local Oracle database (on my computer) and execute a query on a table named "emp" which contains two records; however, when I print out my ResultSet using getInt(String columnName), the compiler error says "Exhausted Resultset".
    my source code:
    import java.sql.*;
    import oracle.jdbc.pool.OracleDataSource;

    class GetEmpDetails
    {
        public static void main(String args[])
        {
            try
            {
                // instantiate and initialize OracleDataSource
                OracleDataSource ods = new OracleDataSource();
                ods.setURL("jdbc:oracle:thin:usr/secret@Laptop:1521:oracle");
                Connection x = ods.getConnection();
                Statement stmt = x.createStatement();
                ResultSet rset = stmt.executeQuery("select empno, ename, job from emp");
                rset.next();
                System.out.println("My ID is " + rset.getInt("empno"));
                // System.out.println("My ID is " + rset.getInt(1)); // returns "invalid column index" error
            }
            catch (SQLException e)
            {
                System.err.println("error message: " + e.getMessage());
                e.printStackTrace();
                Runtime.getRuntime().exit(1);
            }
        }
    }
    I am able to query my emp table under SQL*Plus with the following output:
    EMPNO ENAME JOB
    1 Tim Student
    2 Tom Student
    The JAVA compiler compiles up to the executeQuery("select ...") statement and at that point returns an empty ResultSet object (I think).
    I copied this code from an example given in a programming book using Oracle 10g on JDK 1.5 SE.
    Can anyone make a suggestion about how to fix this error?
    ting

    Okay a few points here to clear some things up and move you forward.
    1) What you are getting are not compilation errors. They are runtime errors. You made this mistake at least twice (once in your original post and once in your follow-up), and it's worth clarifying because runtime errors and compile-time errors are not the same thing. That's worth knowing if for no other reason than that, in future, if you ask for help with "compilation" errors that are actually runtime errors, you may well find that you confuse the heck out of people and can't be helped.
    2) What the good doctor was pointing out is that you should not be throwing away the value of rs.next no matter if you feel the table should have records or not. Your code should look like this
    if(rs.next()){
      // do something
    }else{
      // do something else
    }
    Because that will handle it either way, and your program won't come to a screeching halt because you tried to read a value that doesn't exist in the ResultSet. (A corrected version of the full posted program appears at the end of this reply.)
    So no matter how you get this resolved your code should still end up looking like the above (or similar). At any rate make use of the boolean returned from the next method.
    3) On to your actual problem. Well first you need to cross off any silly mistakes that do happen from time to time. Like are you sure you are talking to the "right" database. Again this may seem silly but it does happen.
    If that checks out then almost certainly yes this is some sort of Oracle permissions related issue. This seems to happen with Oracle more than other databases, I suspect because Oracle more than others associates table statistics with user permissions, but this could be totally unwarranted and simply based on my experience here which is many people having similar types of problems. The best way to resolve this is to make sure that you are logging in with the same credentials as you use in SQLPlus.
    If all else fails talk to your DBA. This is an Oracle related issue of some sort but it's very difficult to diagnose specifically over the internet.
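    Pulling points 2 and 3 together, a minimal corrected sketch of the posted program might look like this (the URL and credentials are the poster's own placeholders; the change that matters is using the boolean returned by next()):

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import oracle.jdbc.pool.OracleDataSource;

    class GetEmpDetailsFixed {
        public static void main(String[] args) {
            try {
                OracleDataSource ods = new OracleDataSource();
                ods.setURL("jdbc:oracle:thin:usr/secret@Laptop:1521:oracle");
                Connection conn = ods.getConnection();
                Statement stmt = conn.createStatement();
                ResultSet rset = stmt.executeQuery("select empno, ename, job from emp");

                // Use the boolean returned by next() instead of throwing it away.
                int rows = 0;
                while (rset.next()) {
                    rows++;
                    System.out.println("ID " + rset.getInt("empno")
                            + ", name " + rset.getString("ename")
                            + ", job " + rset.getString("job"));
                }
                if (rows == 0)
                    System.out.println("Query returned no rows - check which database/schema you are connected to.");

                rset.close();
                stmt.close();
                conn.close();
            } catch (SQLException e) {
                System.err.println("error message: " + e.getMessage());
                e.printStackTrace();
            }
        }
    }

    If the row count printed here is 0 against the same credentials you use in SQL*Plus, that points back to the permissions or wrong-database possibilities discussed above.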
