Multi-Master replica busy

I have a multi-master LDAP setup (5.1 Service Pack 2, build number 2003.028.2338). Counting the users on the two servers, there are about 56k on the first and 45k on the second. I also have a pure consumer, which is working fine with the replica from the first server.
But when I activated replication between the two multi-master machines, the secondary server got the error messages below. The replication process is very slow, and CPU usage stays at 40% and up. Can anyone please help me with this issue? What is the replication process supposed to be doing? Why is replication slow? What happens when the replication queue holds about 11k entries? How can I speed up replication?
Thanks in advance for your help.
===error log====
[29/Sep/2003:12:52:05 -0400] NSMMReplicationPlugin - Warning: timed out waiting 600 seconds for add operation result from replica 10.30.250.146:389
[29/Sep/2003:12:52:05 -0400] NSMMReplicationPlugin - Failed to replay change (uniqueid e1181801-1dd111b2-8003d8a4-aa1f9c2b, CSN 3f729278000300030000) to replica "cn=PRI-2-SEC-USER-01, cn=replica, cn="o=sso", cn=mapping tree, cn=config (host 10.30.250.146, port 389)": Connection lost. Will retry later.
[29/Sep/2003:12:52:05 -0400] NSMMReplicationPlugin - Warning: unable to send endReplication extended operation to consumer "cn=PRI-2-SEC-USER-01, cn=replica, cn="o=sso", cn=mapping tree, cn=config (host 10.30.250.146, port 389)" - error 2
[29/Sep/2003:13:07:18 -0400] NSMMReplicationPlugin - Warning: timed out waiting 600 seconds for add operation result from replica 10.30.250.146:389
[29/Sep/2003:13:07:18 -0400] NSMMReplicationPlugin - Failed to replay change (uniqueid e1181804-1dd111b2-8003d8a4-aa1f9c2b, CSN 3f729293000200030000) to replica "cn=PRI-2-SEC-USER-01, cn=replica, cn="o=sso", cn=mapping tree, cn=config (host 10.30.250.146, port 389)": Connection lost. Will retry later.
[29/Sep/2003:13:07:19 -0400] NSMMReplicationPlugin - Warning: unable to send endReplication extended operation to consumer "cn=PRI-2-SEC-USER-01, cn=replica, cn="o=sso", cn=mapping tree, cn=config (host 10.30.250.146, port 389)" - error 2
[29/Sep/2003:13:27:32 -0400] NSMMReplicationPlugin - Warning: timed out waiting 600 seconds for add operation result from replica 10.30.250.146:389
===access log===
[29/Sep/2003:13:59:42 -0400] conn=292 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[29/Sep/2003:13:59:43 -0400] conn=292 op=2 SRCH base="cn=config" scope=0 filter="(|(objectClass=*)(objectClass=ldapsubentry))" attrs="nsslapd-instancedir nsslapd-security"
[29/Sep/2003:13:59:43 -0400] conn=292 op=2 RESULT err=0 tag=101 nentries=1 etime=0
[29/Sep/2003:13:59:43 -0400] conn=292 op=3 SRCH base="cn=options,cn=features,cn=config" scope=1 filter="(objectClass=directoryServerFeature)" attrs=ALL
[29/Sep/2003:13:59:43 -0400] conn=292 op=3 RESULT err=0 tag=101 nentries=0 etime=0
[29/Sep/2003:13:59:48 -0400] conn=292 op=4 SRCH base="cn=config" scope=0 filter="(|(objectClass=*)(objectClass=ldapsubentry))" attrs="nsslapd-security"
[29/Sep/2003:13:59:48 -0400] conn=292 op=4 RESULT err=0 tag=101 nentries=1 etime=0
[29/Sep/2003:13:59:48 -0400] conn=292 op=5 SRCH base="cn=config" scope=0 filter="(|(objectClass=*)(objectClass=ldapsubentry))" attrs="nsslapd-port nsslapd-secureport nsslapd-lastmod nsslapd-readonly nsslapd-schemacheck nsslapd-referral"
[29/Sep/2003:13:59:48 -0400] conn=292 op=5 RESULT err=0 tag=101 nentries=1 etime=0
[29/Sep/2003:13:59:51 -0400] conn=292 op=6 SRCH base="cn=replication, cn=config" scope=0 filter="(|(objectClass=*)(objectClass=ldapsubentry))" attrs=ALL
[29/Sep/2003:13:59:51 -0400] conn=292 op=6 RESULT err=0 tag=101 nentries=1 etime=0
[29/Sep/2003:13:59:51 -0400] conn=292 op=7 SRCH base="cn=replication,cn=config" scope=2 filter="(objectClass=*)" attrs=ALL
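
A quick way to gauge the backlog from the supplier side is to read the replication agreement entry under cn=config and watch its status attributes (the same nsds5replicaLastUpdate* attributes shown in the MMR thread further down); a minimal sketch, assuming Directory Manager credentials and the agreement DN from the error log above:

# hedged sketch: check the agreement's status and whether an update is running
# (host, credentials and agreement DN are placeholders taken from the logs above)
ldapsearch -h pri-master -p 389 -D "cn=Directory Manager" -w password \
  -s base -b 'cn=PRI-2-SEC-USER-01,cn=replica,cn="o=sso",cn=mapping tree,cn=config' \
  '(objectclass=*)' nsds5replicaLastUpdateStart nsds5replicaLastUpdateEnd \
  nsds5replicaLastUpdateStatus nsds5replicaUpdateInProgress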

I have had a similar problem, and it was because I had recreated a replication agreement that pointed to a different server with the same replica ID, or to the same server with a different ID.
The solution provided by Sun was to remove the references to the old replicas in dse.ldif and restart, but it didn't work out, not even with ldapmodify on the replica DNs.
I had to reinstall...
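
Before resorting to editing dse.ldif (or reinstalling), it can help to list every replica definition and its ID on both masters so the stale one stands out; a minimal sketch, with host and credentials as placeholders:

# hedged sketch: list replica definitions and their IDs on each master
ldapsearch -h host -p 389 -D "cn=Directory Manager" -w password \
  -b "cn=mapping tree,cn=config" "(objectclass=nsDS5Replica)" \
  nsDS5ReplicaId nsDS5ReplicaRoot nsDS5ReplicaType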

Similar Messages

  • Multi-master replicated environment

    hi,
    what does a multi-master replicated environment mean? How could I benefit from it?
    thanks in advance.

    Hello,
    This is an environment in which inserts, updates and deletes on objects
    at any node included in the environment are replicated to the remaining
    nodes defined as part of the replicated environment.
    Since changes at any node are replicated to all other nodes, they all work
    as masters, which gives it the name master-to-master (also called advanced replication).
    Tahir.

  • Problem with Multi Master Replication

    Hello All,
    I've set up multi-master replication with no consumers, i.e. I have two suppliers that should update each other. The setup seems to be fine, since initializing one supplier from the other works very well, but I couldn't get synchronization between the suppliers to work. I noticed in the error log that the sync-scan request arrived but was ignored. What are the possible causes of this error?
    Please help me with this regard.
    Thanks in advance,
    Rajesh

    Hello All,
    Rich, you have been a support to most of us in the group (indeed a great help to me)... It's splendid work...
    My problems disappeared after applying the service pack... the service pack is in fact mainly meant to sort out the replication issues.
    Advice from my experience: the patch may be more than enough for most of the replication issues.
    One observation: I had the replica busy error, but I didn't have to restart the replica as suggested by some of the previous threads. It seems the service pack included a fix for it.
    Thank you all,
    Best Regards,
    Rajesh
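
    If you need to confirm which build or service pack a server is actually running before and after patching, the root DSE usually reports it; a hedged sketch (vendorName/vendorVersion support depends on the build, so verify on yours):

    # hedged sketch: read the server's build string from the root DSE
    ldapsearch -h host -p 389 -s base -b "" "(objectclass=*)" vendorName vendorVersion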

  • Multi Master Replication - Only works for some tables?

    I have a multi-master replication between a 9i and an 8.1.6 database.
    All the tables are in the same tablespace and have the same owner. The replication user has full privileges on all the tables.
    When setting up the replication, some tables are created properly, but others fail with a "table not found" error.
    Ideas, anyone?
    Andrew

    You said that you have a 9i replicated with an 8.1.6.
    I tried the same thing, but with two 9i Enterprise Edition databases, downloaded free from www.oracle.com.
    When I ran
    exec dbms_repcat.add_master_database(gname=>'groupname', master=>'replica_link')
    this error appeared:
    ERROR at line 1:
    ORA-23375: feature is incompatible with database version at replica_link
    ORA-06512: at "SYS.DBMS_SYS_ERROR", line 86
    ORA-06512: at "SYS.DBMS_REPCAT_MAS", line 2159
    ORA-06512: at "SYS.DBMS_REPCAT", line 146
    ORA-06512: at line 1
    Please help me if you have any idea.

  • Multi-master Replication and Metadirectory 5.0

    The Metadirectory 5.0 documentation states that it cannot work with a directory server configured for multi-master replication. We need to use Metadirectory since we are integrating the Directory Server with other systems. Does this mean that we'll be forced away from MMR configuration? What are some of the alternatives? Does iPlanet have any plans for supporting MMR in future versions of Metadirectory?

    I think you can enable the retro changelog on a consumer replica. I'm pretty sure that works.
    You might be able to enable it on a hub. You also might get it to work on a master, but the changelog on the master will contain some replication housekeeping changes that may confuse Meta Directory. I'm not sure what they are but they are obvious if you look at the changelog. If you can figure out some way to screen those changes out from Meta Directory, you might be able to use it.
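
    If it helps, a minimal sketch of turning the plugin on with ldapmodify (host and credentials are placeholders; the server needs a restart afterwards):

    # hedged sketch: retro-cl.ldif -- enable the retro changelog plugin
    dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
    changetype: modify
    replace: nsslapd-pluginEnabled
    nsslapd-pluginEnabled: on

    # apply it, then restart the server:
    ldapmodify -h consumer -p 389 -D "cn=Directory Manager" -w password -f retro-cl.ldif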

  • Multi-master access from separate DSCC views

    Hi,
    I have two DS6.2 installations - one in test, one in production. They are both multi-master (two masters) environments.
    In test I can run DSCC on both masters, and can manage both from each DSCC.
    However, in production I can only access both masters from one DSCC at a time. If I enable access on the second DSCC, then the first DSCC shows them in red as Inaccessible until I "Enable Access" which works ok, but then the second DSCC shows them as inaccessible.
    I have been over the configs, and they appear the same in test & production.
    Does anyone know what could be stopping the production instances from being managed by more than one DSCC?
    Thanks.
    Terry.

    Replicating the ADS instance (i.e. cn=dscc) is not supported and not supposed to work, so what you are trying to do is futile.

  • Multi-Master Requires Cold Backup?

    Since the restore of a multi-master instance requires restoring both the backup and the changelog directories, and since the changelog cannot be backed up with anything other than an OS copy command, and since an OS copy command will usually not create a valid backup of an open database file, is it not true that multi-master backups must therefore be cold backups?

    By "In flight changes" I mean changes which have occurred on one master replica, but have not been replicated to the other. My mindset is on multi-master, but I suppose much of this applies to any supplier / consumer backup/restore (not just multi-master).
    Your cold backup procedure is what I have implemented until 5.2. I still have not had time to test restores fully at this point with this backup procedure. I am somewhat concerned that after restoring both instances from their own backups, will replication work and will they synch up correctly.
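
    For reference, the cold procedure amounts to roughly this; a minimal sketch, assuming a 5.x instance directory and the default changelogdb location (both are assumptions, adjust to your layout):

    # hedged sketch: cold backup of one master (paths are examples)
    cd /usr/iplanet/servers/slapd-master1
    ./stop-slapd                                  # cold: nothing writes during the copy
    ./db2bak /backups/db.20031001                 # back up the databases
    tar cf /backups/changelogdb.tar changelogdb   # OS-level copy of the changelog
    ./start-slapd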

  • Multi Master Site replication with Snapshot features

    Dear,
    I am implementing a multi-master site replication. I want to replicate Table A at MASTER SITE A to Table A at MASTER SITE B.
    However, I don't want all the data from MASTER SITE A to be propagated to MASTER SITE B. I may want only data that meets certain criteria to be replicated over.
    I know I can achieve this in a snapshot site replication. But can this be done in a multi-master site replication too?

    Hi,
    As far as I have observed, the tables marked for replication from the MASTER SITE to a child site are an exact replica of the data. Your case could be achieved only through SNAPSHOT VIEWS/GROUPS.
    - [email protected]

  • Multi master replication between 5.2 and 6.3.1

    I have a setup in which I have a master running version 5.2 and about 15 consumers (slaves), all of which have been upgraded to 6.3.1. I now want to create a multi-master topology by promoting one of these consumers to be a master, while still keeping the 5.2 in use, as we have a bunch of other applications that depend on the 5.2 instance. Our master has two suffixes. The master server is also the CA certificate authority for all the consumers. After reading the docs, I narrowed the procedure down to:
    1. Promote one of the 6.3.1 consumers to hub and then to master using the dsconf promote-repl commands. The problem here is that I am not sure how I can create a single consumer that can slave both the suffixes. We currently have them being slaved to different consumers.
    Also, do I need to stop the existing replication between the 5.2 master and the would-be 6.3.1 master to promote it to hub and master?
    2. Set the replication manager manually or using dsconf set-server-prop on the new 6.3.1 master .
    3. Create a new replication agreement from 5.2 to 6.3.1 master without initializing. (using java console)
    4. Create new replication agreement from 6.3.1 to 5.2 (using command line)
    5. Create new repl agreements between the new 6.3.1 master and all the other consumers. For this do I need to first disable all the agreements between 5.2 and 6.3 or can I create new agreements without disabling the old ones?
    6. Initialize 6.3.1 from the 5.2 master.
    My biggest concern at this point is the SSL certs and the existing trust the consumers have with the 5.2 master. Currently my 5.2 server acts as the CA authority for our certificate management with the LDAP slaves. How can I migrate this functionality to the new server, and will this affect how the slaves communicate with the new master server?
    Thanks in advance.

    Thanks Marco and Chris for the replies.
    I was able to get around the message by first manually initializing the new slave using an LDIF of the OU from the master, using DSCC to change the default replication manager account to connect, and finally editing dse.ldif to enter the correct crypt hash for the new replication manager password. After these steps I was able to successfully set up replication to the second OU and also promote it to hub and master (I had to repeat the steps after promoting the slave to master, as that somehow reset the replication manager settings).
    So right now, I have a 5.2 master with two ou's replicating to about 15 consumers.
    I promoted one of these to be a second master (from consumer to hub to master). Replication is set up from the 5.2 master to the 6.3 master, but not the other way round.
    I am a little nervous about setting up replication the other way round, as this is our production environment and I do not want to end up blowing up my production instance. The steps I plan on taking, from the new master server, are:
    1. dsconf create-repl-agmt -p 389 dc=xxxxx,dc=com <5.2-master>:389
    2. dsconf set-repl-agmt-prop -p 389 dc=xxxxx,dc=com <5.2-master>:389 auth-pwd-file:<passwd_file.txt>
    I am assuming I can do all of this while the instances are up. Also, in the above, does create-repl-agmt just create the agreement, or does it also initialize the consumer with the data? I want to ensure I do not initialize my 5.2 master with my 6.3 data.
    Thanks again
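
    On the initialization question: as far as I know, create-repl-agmt only creates the agreement, and pushing data is a separate, explicit step. A hedged sketch (suffix and hosts are placeholders; verify the subcommand names against your dsconf version before relying on them):

    # hedged sketch: create the agreement, then check it -- no data is pushed yet
    dsconf create-repl-agmt -p 389 dc=example,dc=com master52.example.com:389
    dsconf show-repl-agmt-status -p 389 dc=example,dc=com master52.example.com:389
    # initialization would be a separate command -- do NOT run it toward the 5.2 master:
    # dsconf init-repl-dest dc=example,dc=com master52.example.com:389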

  • Multi-master replication questions for iPlanet 5.0, gurus out there?

    hi:
    I'm using iPlanet Dir Server 5.0 and I note that many gurus out there have been able to get this to work, that's good, but I have yet to. I have several questions, maybe someone can spend a few minutes and save me hours...
    I have a suffix called dc=calient,dc=net. I followed the suggestions in the iPlanet install guide and created 2 directory servers:
    a) suffix o=NetscapeRoot, at some arbitrary port, 4601
    b) suffix dc=calient,dc=net, at the usual port 389.
    All my searches/creates/deletes work fine. However, when I try to replicate with multi-master between 2 machines, I keep getting into problems. Here's one set of questions...
    Q1: do people out there really split their tree from the o=NetscapeRoot tree?
    Q2: The admin guide says that the unit of replication is a database, and that each replication can only have 1 suffix. Is this true? Can a replicated db have more than 1 suffix?
    Q3: If I also want to replicate the o=NetscapeRoot tree, I have to set up yet 2 more replication agreements. Isn't this more work? If I just lump the 2 suffixes together, wouldn't it be easier? But would it work?
    Q4: I followed the instructions to enable replicas on the masters. But then I tried to create this cn=Replication Manager, cn=config object. But what is the object class of this entry? An iPlanet user has uid as its RDN... I tried a person object class, and I added a password. But then I keep getting error code 32, object not found, in the error log, such as:
    WARNING: 'get_entry' can't find entry 'cn=replication manager,cn=config', err 32
    What gives?
    Q5: Also, are there any access control issues with this cn=Replication Manager,cn=config object? By this I mean, I cannot seem to see this object using ldapsearch, I can only see cn=SNMP,cn=config. Also, do I have to give all access via aci to my suffix dc=calient,dc=net? Also, given the fact that my o=NetscapeRoot tree is at a different port (say 4601), not 389, could this be an issue?
    Q6: when replication fails, should the Dir Server still come up? Mine does not anymore, which is strange. I keep getting things like this in my log file:
    [08/Nov/2001:21:49:13 -0800] NSMMReplicationPlugin - Could not send consumer mufasa.chromisys.com:389 the bind request
    [08/Nov/2001:21:49:13 -0800] NSMMReplicationPlugin - Failed to connect to replication consumer mufasa.chromisys.com:389
    But why shouldn't the dir server itself come up even if replication fails?
    steve

    Hi Steve,
    First, please read the 'Deployment Guide'. I think that is easier to understand when you want to set up multi-master replication. The 'Administrator's Guide' gives you step-by-step instructions, but it may not help you to understand how to design your directory services.
    Stephen Tsun wrote:
    > I have a suffix called dc=calient,dc=net. I followed the suggestions in the iPlanet install guide and created 2 directory servers: a) suffix o=NetscapeRoot, at some arbitrary port, 4601; b) suffix dc=calient,dc=net, at the usual port 389. All my searches/creates/deletes work fine. However, when I try to replicate with multi-master between 2 machines, I keep getting into problems.
    I don't understand something: which backend do you want to replicate? The one holding 'o=NetscapeRoot' or the one holding 'dc=calient,dc=net'? Do you want to set up replication between these two instances of the directory server (i.e. between port 4601 and 389 in your example)?
    > Q1: do people out there really split their tree from the o=NetscapeRoot tree?
    If you have multiple directory servers installed in your environment, it is probably worth dedicating (at least) one directory server to the o=netscaperoot tree.
    > Q2: The admin guide says that the unit of replication is a database, and that each replication can only have 1 suffix. Is this true? Can a replicated db have more than 1 suffix?
    Well, it is normal, since in iDS 5.x you have 1 suffix per database. You can, however, replicate multiple databases.
    > Q3: If I also want to replicate the o=NetscapeRoot tree, I have to set up yet 2 more replication agreements. Isn't this more work? If I just lump the 2 suffixes together, wouldn't it be easier? But would it work?
    You can't lump the 2 suffixes together, because each backend has 1 suffix associated with it.
    > Q4: I followed the instructions to enable replicas on the masters. But then I tried to create this cn=Replication Manager, cn=config object. But what is the object class of this entry?
    Usually, it is organizationalperson or inetorgperson. In most cases you want an objectclass which can have the userPassword attribute.
    > An iPlanet user has uid as its RDN... I tried a person object class, and I added a password. But then I keep getting error code 32, object not found, in the error log. What gives?
    You must have misconfigured something. Or perhaps it is not cn=replication manager,cn=config, but 'uid=replication manager,cn=config'.
    > Q5: Also, are there any access control issues with this cn=Replication Manager,cn=config object? By this I mean, I cannot seem to see this object using ldapsearch, I can only see cn=SNMP,cn=config.
    The configuration tree is protected by ACIs, so you cannot see them using anonymous binds. Try binding as 'directory manager' and you will find your entry.
    > Also, do I have to give all access via aci to my suffix dc=calient,dc=net?
    For what purpose? For replication, it is enough to set the user DN in the replication agreement, and this user can update the replicated backend.
    > Q6: when replication fails, should the Dir Server still come up?
    Yes.
    Bertold
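
    Following up on Q4, a minimal sketch of an entry that can serve as the replication manager, using the objectclasses from Bertold's answer (all values here are examples only):

    # hedged sketch: repl-mgr.ldif -- a replication manager entry under cn=config
    dn: cn=replication manager,cn=config
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: inetOrgPerson
    cn: replication manager
    sn: manager
    userPassword: secret

    # add it as directory manager (host and credentials are placeholders):
    ldapadd -h master -p 389 -D "cn=Directory Manager" -w password -f repl-mgr.ldif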

  • Partial transaction for multi-master asynchronous replication

    I have a fundamental question about multi-master asynchronous replication.
    Let's consider a situation where we have 2 servers participating in a multi-master asynchronous mode of replication.
    3 tables are part of an Oracle transaction. Now suppose I mark one of these tables for replication, while the other 2 are not part of any replication group.
    Say, as part of the Oracle transaction, one record is inserted into each of the 3 tables.
    Now if I start replicating, will the change made in the table marked for replication be replicated to the other server, given that the changes made to the other 2 tables are not propagated by the deferred queue?
    Please reply.

    Mr. Bradd Piontek is very much correct. If the tables involved are interdependent, you have to place them in a group, and all of them should exist at all sites in a multi-master replication.
    If the data is updated (pushed) from a snapshot to a table at a master site, it may get updated if it is not a child table in a relationship.
    But in a multi-master replication environment even this is not possible.

  • MMR Replica Busy Error on consumers

    hi,
    MMR environment: Solaris 2.9, iDS 5.1 SP2
    servers -
    aq001pd (M) aq002pd (M)
    aq003pd (S) aq004pd (S)
    I frequently see the following error for both consumers from the second master, aq002pd:
    cn=replicate-isp-to-aq004pd-slave-ro, cn=replica, cn="o=isp", cn=mapping tree, cn=config
    cn=replicate-isp-to-aq004pd-slave-ro
    nsds5replicaLastUpdateStart=20030522001505Z
    nsds5replicaLastUpdateEnd=20030522001504Z
    nsds5replicaLastUpdateStatus=1 Replication error acquiring replica: replica busy
    Now, I understand that this can sometimes happen if both masters attempt a sync on a consumer at the same time, but it is almost always in this state, only occasionally not being in it... replication is set to always keep replicas in sync.
    Any ideas?
    Also, are there any hotfixes for iDS 5.1 SP2, and where can I get them (apart from Sun tech support)?
    thanks

    Is it a bug?
    Iris
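
    One thing that sometimes reduces this contention is switching the agreements on one master from always-in-sync to scheduled update windows, so the two masters stop fighting over the same consumer; a hedged sketch using ldapmodify (agreement DN taken from above; the schedule value is a start-end time plus days of the week, verify the format against your docs):

    # hedged sketch: sched.ldif -- put one agreement on a schedule
    dn: cn=replicate-isp-to-aq004pd-slave-ro,cn=replica,cn="o=isp",cn=mapping tree,cn=config
    changetype: modify
    replace: nsDS5ReplicaUpdateSchedule
    nsDS5ReplicaUpdateSchedule: 0800-2200 0123456

    # apply it (host and credentials are placeholders):
    ldapmodify -h aq002pd -p 389 -D "cn=Directory Manager" -w password -f sched.ldif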

  • Using attribute uniqueness with multi-master replication?

    Hi,
    I'm trying to use attribute uniqueness in an iDS 5.1 multi-master replication environment. I have created a plug-in instance for the attribute (memberID) on each directory instance (same installation on NT) and tested it (if I try to create a duplicate value under the same instance, I get a constraint error as expected). However, if I create an entry under one instance and then create a second entry (different DN) with the same attribute value on the second instance, the entry is written with no complaints. If I create the entries with an identical DN, the directory automatically adds nsuniqueID to the RDN of the second entry to maintain DN uniqueness, but it doesn't seem to mind duplicate values within the entry, despite the plug-in.
    BTW I've tested MMR and it is working and I'm using a subtree to enforce uniqueness.
    Regards
    Simon

    The attribute uniqueness plugin only ensures uniqueness on a single master, before the entry is added. It doesn't check replicated operations, since they have already been accepted and a positive result was returned to the client. So in a multi-mastered environment it is still possible to add 2 identical attribute values if, as you say, you add the entries at the same time on both master servers.
    We're working on a solution to have attribute uniqueness work in a multi-mastered environment, but we're worried about the performance impact it may have.
    Regards,
    Ludovic.

  • Two problems after reinitializing multi-master replication.

    Hi,
    I recently set up multi-master replication between two machines, A and B, running iDS 5.1. Everything works well until I disable replication and re-enable replication. Here's the procedure that I follow:
    1. Delete agreement on machine A and B.
    2. Disable the replica on machine A and B.
    3. Disable the changelog on machine A and B.
    4. Enable the changelog on machine A and B.
    5. Enable the replica on machine A and B.
    6. Create replication agreement on A and B.
    7. Initialize B from A.
    Problem #1:
    After I follow this procedure, updates to A result in the error:
    Data required to update replica "cn=rep1, cn=replica, cn="o=sylantro.com", cn=mapping tree, cn=config (host rosemary.sylantro.com, port 389)" has been purged. The replica must be reinitialized.
    [07/Feb/2002:10:43:28 -0800] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    At this point if I re-initialize B from A, the problem ceases and everything is fine.
    This problem still occurs even if I restart the server before Step 4.
    Problem #2.
    When I assign machine A the old replicaId of machine B and assign machine B the old replicaID of machine A in Step 5, I get the following error.
    [07/Feb/2002:10:06:02 -0800] NSMMReplicationPlugin - Unable to aquire replica "cn=rep1, cn=replica, cn="o=sylantro.com", cn=mapping tree, cn=config (host wasrocky.sylantro.com, port 389)": the replica has the same Replica ID as this one. Replication is aborting.
    The server looks like it is remembering its old replicaID.
    This problem still occurs even if I restart the server before Step 4.
    Any insight would be greatly appreciated.
    Thanks!
    Gil

    For problem #1, the error message is normal, since you have actually removed the changelog. However, once you have done an initialization, replication should work.
    For problem #2, the message is correct because, although you have changed the replica ID on both servers, your data still contains references to the previous replica IDs on both Server A and Server B.
    In general, a replica ID uniquely identifies a master in a replication topology. It should not change. The replica ID is also used in the timestamps associated with each change; these guarantee that replication can always distinguish 2 changes, and can also detect time synchronization issues between masters.
    One way to avoid the replica ID issue when changing replica IDs is to reload the data on one server and then re-initialize the second one from the first one.
    Regards,
    Ludovic.
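
    To make the replica ID's role concrete: the CSNs in the error logs (e.g. at the top of this page) appear to pack a 32-bit timestamp, a sequence number, the replica ID and a sub-sequence number into 20 hex digits; a hedged sketch that decodes one in bash (the layout is inferred from the logs, not documented here):

    # hedged sketch (bash): decode a CSN -- assumed layout is
    # <8 hex timestamp><4 hex seq><4 hex replica id><4 hex subseq>
    CSN=3f729278000300030000
    echo "ts=$((16#${CSN:0:8})) seq=$((16#${CSN:8:4})) rid=$((16#${CSN:12:4}))"
    # rid here is 3; changes stamped with an old replica ID persist in the data and
    # changelog, which is why a reused or swapped ID keeps causing conflicts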

  • DS 5 SP1 multi-master replication

    I am running DS5 SP1 on Solaris 8 with the recommended patch set. I am using replica id 1 for master1 and replica id 2 for master2.
    I am replicating 7 nodes in a multi-master scenario. I start with about 1000 base entries in each db node, and add approximately 1000 to each node via a meta directory interface to master1. I run 9 passes, so I am adding approximately 10,000 entries to each node over the course of the passes.
    Occasionally, inconsistently, and without explanation, slapd stops on my master1 (sometimes master2). I have successfully run through all passes at differing times, and then it happens. There is no specific error message indicating it has died, but a ps reveals that it is no longer running. Once I restart it, the error log indicates that the directory had a disorderly shutdown. It has the nasty effect of resetting my changelogs to 0 and requiring me to reinitialize my replication agreements. Below is a relevant excerpt from my error log; hopefully someone can give me some insight as to why this is happening . . . . .
    [10/Dec/2001:17:13:55 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:17:13:57 +0000] NSMMReplicationPlugin - Could not send consumer master2.domain.com:636 the bind request
    [10/Dec/2001:17:13:57 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:17:14:02 +0000] NSMMReplicationPlugin - Could not send consumer master2.domain.com:636 the bind request
    [10/Dec/2001:17:14:02 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:17:14:08 +0000] NSMMReplicationPlugin - Could not send consumer master2.domain.com:636 the bind request
    [10/Dec/2001:17:14:08 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:20:29:03 +0000] NSMMReplicationPlugin - Could not send consumer master2.domain.com:636 the bind request
    [10/Dec/2001:20:29:03 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:20:29:03 +0000] NSMMReplicationPlugin - received error 81: NULL from replica master2.domain.com:636 for add operation
    [10/Dec/2001:20:29:03 +0000] NSMMReplicationPlugin - Failed to replay change (uniqueid 45e39dd9-1dd211b2-80a8c58f-66391b25, CSN 3c151ad0000100010000) to replica "cn=to master2, cn=replica, cn="ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)": Local error. Will retry later.
    [10/Dec/2001:20:29:03 +0000] NSMMReplicationPlugin - failed to send extended operation to replica master2.domain.com:636; LDAP error - 81
    [10/Dec/2001:20:29:03 +0000] NSMMReplicationPlugin - Warning: unable to send endReplication extended operation to consumer "cn=to master2, cn=replica, cn="ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" - error 1
    [10/Dec/2001:20:29:13 +0000] NSMMReplicationPlugin - failed to send extended operation to replica master2.domain.com:636; LDAP error - 81
    [10/Dec/2001:20:29:13 +0000] NSMMReplicationPlugin - Unable to send a startReplication extended operation to consumer "cn=to master2, cn=replica, cn="ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)". Will retry later.
    [10/Dec/2001:20:29:13 +0000] NSMMReplicationPlugin - Could not send consumer master2.domain.com:636 the bind request
    [10/Dec/2001:20:29:13 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:20:29:17 +0000] NSMMReplicationPlugin - Could not send consumer master2.domain.com:636 the bind request
    [10/Dec/2001:20:29:17 +0000] NSMMReplicationPlugin - Failed to connect to replication consumer master2.domain.com:636
    [10/Dec/2001:20:29:33 +0000] NSMMReplicationPlugin - failed to send extended operation to replica master2.domain.com:636; LDAP error - 81
    [10/Dec/2001:20:29:33 +0000] NSMMReplicationPlugin - Unable to send a startReplication extended operation to consumer "cn=to master2, cn=replica, cn="ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)". Will retry later.
    [10/Dec/2001:20:30:13 +0000] NSMMReplicationPlugin - failed to send extended operation to replica master2.domain.com:636; LDAP error - 81
    [10/Dec/2001:20:30:13 +0000] NSMMReplicationPlugin - Unable to send a startReplication extended operation to consumer "cn=to master2, cn=replica, cn="ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)". Will retry later.
    [10/Dec/2001:20:31:54 +0000] - iPlanet-Directory/5.0 ServicePack 1 B2001.264.1425 starting up
    [10/Dec/2001:20:31:56 +0000] - Detected Disorderly Shutdown last time Directory Server was running, recovering database.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=node3,ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=node1,ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=node4,ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=node2,ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=node5,ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:38 +0000] NSMMReplicationPlugin - replica_reload_ruv: Warning: data for replica ou=node6,ou=group,o=company,c=us was reloaded and it no longer matches the data in the changelog. Recreating the changlog file. This could effect replication with replica's consumers in which case the consumers should be reinitialized.
    [10/Dec/2001:20:34:39 +0000] - slapd started. Listening on all interfaces port 389 for LDAP requests
    [10/Dec/2001:20:34:39 +0000] - Listening on all interfaces port 636 for LDAPS requests
    [10/Dec/2001:20:34:39 +0000] - cos_cache_getref: no cos cache created
    [10/Dec/2001:20:34:44 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=node1,ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:34:44 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    [10/Dec/2001:20:35:02 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=node2,ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:35:03 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    [10/Dec/2001:20:36:50 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=node3,ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:36:51 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    [10/Dec/2001:20:38:34 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=node4,ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:38:34 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    [10/Dec/2001:20:42:31 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=node5,ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:42:31 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    [10/Dec/2001:20:44:18 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=node6,ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:44:19 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action
    [10/Dec/2001:20:46:06 +0000] NSMMReplicationPlugin - Data required to update replica "cn=to master2, cn=replica, cn="ou=group,o=company,c=us", cn=mapping tree, cn=config (host master2.domain.com, port 636)" has been purged. The replica must be reinitialized.
    [10/Dec/2001:20:46:06 +0000] NSMMReplicationPlugin - Replication: Incremental update failed and requires administrator action

    Hi Jennifer,
    There's nothing in the logs that explains why your server is disappearing.
    I think the resetting of the changelog is a known problem that should be fixed in 5.0 SP2 and is fixed in iDS 5.1, which is now available.
    Just one quick question: when you say that you have 7 nodes, do you mean 7 container entries, or 7 suffixes that are separate replication areas?
    Regards,
    Ludovic.
