Regarding the number of nodes in an Endeca cluster

Hi,
I have a question regarding the number of nodes in an Endeca Server cluster.
Our solution contains one data domain running in an Endeca cluster with two nodes.
The Endeca Server documentation recommends running the cluster with at least 3 nodes; however, our solution can't accommodate another server straight away.
Can anyone please suggest what the implications of running the cluster with two nodes are? For example:
1. Can the cluster still serve requests if one node goes down?
2. How does leader promotion work if a node goes down?
Thank you,
regards,
rp

Hi rp,
You can definitely start with two nodes and then add another Endeca Server node later, if needed. A cluster of three is recommended for increased availability.
Here are some answers to your questions about the cluster behavior:
Q: Can the cluster still serve requests if one node goes down?
A: Quoting from the Endeca Server Cluster Guide > How enhanced availability is achieved:
Availability of Endeca Server nodes
In an Endeca Server cluster with more than one Endeca Server instance, an ensemble of the Cluster Coordinator services running on a subset of nodes in the Endeca Server cluster ensures enhanced availability of the Endeca Server nodes in the Endeca Server cluster.
When an Endeca Server node in an Endeca Server cluster goes down, all Dgraph nodes hosted on it, and the Cluster Coordinator service (which may also be running on this node) also go down. As long as the Endeca Server cluster consists of more than one node, this does not disrupt the processing of non-updating user requests for the data domains. (It may negatively affect the Cluster Coordinator services. For information on this, see Availability of Cluster Coordinator services.)
If an Endeca Server node fails, the Endeca Server cluster is notified and stops routing all requests to the data domain nodes hosted on that Endeca Server node, until you restart the Endeca Server node.
Let's consider an example that helps illustrate this case. Consider a three-node single data domain cluster hosted on the Endeca Server cluster consisting of three nodes, where each Endeca Server node hosts one Dgraph node for the data domain. In this case:
If one Endeca Server node fails, incoming requests will be routed to the remaining nodes.
If the Endeca Server node that fails happens to be the node that hosts the leader node for the data domain cluster, the Endeca Server cluster selects a new leader node for the data domain from the remaining Endeca Server nodes and routes subsequent requests accordingly. This ensures availability of the leader node for a data domain.
If the Endeca Server node goes down, the data domain nodes (Dgraphs) it is hosting are not moved to another Endeca Server node. If your data domain has more than two nodes dedicated to processing queries, the data domain continues to function. Otherwise, query processing for this data domain may stop until you restart the Endeca Server node.
When you restart the failed Endeca Server node, its processes are restarted by the Endeca Server cluster. Once the node rejoins the cluster, it will rejoin any data domain clusters for the data domains it hosts. Additionally, if the node hosts a Cluster Coordinator, it will also rejoin the ensemble of Cluster Coordinators.
Q: How does leader promotion work if a node goes down?
A: See part of the answer above, plus this, from later in the same topic:
Failure of the leader node. When the leader node goes offline, the Endeca Server cluster elects a new leader node and starts sending updates to it. During this stage, follower nodes continue maintaining a consistent view of the data and answering queries. When the node that was the leader node is restarted and joins the cluster, it becomes one of the follower nodes. Note that it is also possible that the leader node is restarted and joins the cluster before the Endeca Server cluster needs to appoint a new leader node; in this case, the node continues to serve as the leader node.
If the leader node in the data domain changes, the Endeca Server continues routing those requests that require the leader node to the Endeca Server cluster node hosting the newly appointed leader node.
Note: If the leader node in the data domain cluster fails, and if an outer transaction has been in progress, the outer transaction is not applied and is automatically rolled back. In this case, a new outer transaction must be started. For information on outer transactions, see the section about the Transaction Web Service in the Oracle Endeca Server Developer's Guide.
Failure of a follower node. When one of the follower nodes goes offline, the Endeca Server cluster starts routing requests to other available nodes, and attempts to restart the Dgraph process for this follower node. Once the follower node rejoins the cluster, the Endeca Server adjusts its routing information accordingly.
You may ask: why do you need three nodes, then? This is to achieve high availability of the cluster services themselves.
Quoting:
If you do not configure at least three Endeca Server nodes to run the Cluster Coordinator service, the Cluster Coordinator service will be a single point of failure. Should the Cluster Coordinator service fail, access to the data domain clusters hosted in the Endeca Server cluster becomes read-only. This means that it is not possible to change the data domains in any way. You cannot create, resize, start, stop, or change data domains; you also cannot define data domain profiles. You can send read queries to the data domains and perform read operations with the Cluster and Manage Web Services, such as listing data domains or listing nodes. No updates, writes, or changes of any kind are possible while the Cluster Coordinator service in the Endeca Server cluster is down — this applies to both the Endeca Server cluster and data domain clusters. To recover from this situation, the Endeca Server instance that was running a failed Cluster Coordinator must be restarted or replaced (the action required depends on the nature of the failure).
Julia
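To see why two coordinator nodes do not buy extra availability, here is a minimal sketch of the majority-quorum arithmetic behind an ensemble of coordination services (generic quorum math in Java, not an Endeca API):

    // Majority-quorum arithmetic for a coordinator ensemble: an ensemble of
    // n members stays writable only while floor(n/2) + 1 of them are alive.
    public class QuorumMath {
        public static void main(String[] args) {
            for (int n = 1; n <= 5; n++) {
                int quorum = n / 2 + 1;      // smallest majority that must be up
                int tolerated = n - quorum;  // coordinator failures survivable
                System.out.printf("ensemble=%d quorum=%d tolerated failures=%d%n",
                        n, quorum, tolerated);
            }
        }
    }

With a two-node ensemble the quorum is 2, so losing either node makes the coordination service unavailable and the cluster falls into the read-only mode described in the quote above; three nodes are the smallest ensemble that survives one failure.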

Similar Messages

  • Maximum number of nodes in a Weblogic cluster on RedHat Linux?

    Is there a limitation on the number of nodes in a WebLogic cluster running under RedHat Linux?
    Can I start with 5 nodes and in a year scale up to 500 or 5000 nodes?
    Thanks!
    Ralf.
              

    Ralf,
    > Is there a limitation on the number of nodes in a WebLogic cluster
    > running under RedHat Linux?
    There is a realistic limit, of course.
    > Can I start with 5 nodes and in a year scale up to 500 or 5000 nodes?
    If your app is completely stateless, then it can scale to 40, maybe 80, servers.
    The problem is that stateless apps typically manage state that sits behind them, and there's basically no database in the world that can handle the load that 40 servers can put on it. Depending on the app, you can easily saturate 4 database CPUs per 1 app server CPU, but usually the factor is closer to 1:1, and with aggressive caching in the app tier even less.
    Things like stateful session bean replication and HTTP session replication in a cluster ... well, YMMV ... but I would hypothesize that it won't scale up anywhere close to 40 servers under load.
    Peace,
    Cameron Purdy
    Tangosol, Inc.
    http://www.tangosol.com/coherence.jsp
    Tangosol Coherence: Clustered Replicated Cache for Weblogic
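    To put rough numbers on the database-saturation argument above, here is a back-of-envelope sketch (the CPU ratios are the illustrative figures from the post, not measurements, and the server counts are placeholders):

        // Back-of-envelope: database CPUs needed to keep up with an app tier,
        // using the illustrative saturation ratios quoted above.
        public class ScalingEstimate {
            public static void main(String[] args) {
                int appServerCpus = 40 * 2;                  // e.g. 40 two-CPU app servers
                double[] dbCpusPerAppCpu = {4.0, 1.0, 0.5};  // worst case, typical, cached
                for (double ratio : dbCpusPerAppCpu) {
                    System.out.printf("ratio %.1f:1 -> %.0f database CPUs needed%n",
                            ratio, appServerCpus * ratio);
                }
            }
        }

    Even at the 1:1 ratio, 40 two-CPU app servers imply roughly 80 database CPUs, which is exactly the bottleneck the answer points at.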
              "Ralf Reddin" <[email protected]> wrote in message
              news:[email protected]..
              >
              

  • Adding node back into cluster after removal...

    Hi,
    I removed a cluster node using "scconf -r -h <node>" (carried out all the other usual removal steps before getting this command to work).
    Because this is a pair+1 cluster and the node I was trying to remove was physically attached to the quorum device (SCSI), I had to create a dummy node before the removal command above would work.
    I reinstalled Solaris, the SC3.1u4 framework, patches etc. and then tried to run scinstall again on the node (I reintroduced the node to the cluster first using scconf -a -T node=<node>).
    However, during the scinstall I got the following problem:
    Updating file ("ntp.conf.cluster") on node n20-2-sup ... done
    Updating file ("hosts") on node n20-2-sup ... done
    Updating file ("ntp.conf.cluster") on node n20-3-sup ... done
    Updating file ("hosts") on node n20-3-sup ... done
    scrconf: RPC: Unknown host
    scinstall:  Failed communications with "bogusnode"
    scinstall: scinstall did NOT complete successfully!
    Press Enter to continue:
    I was not sure what to do at this point, but since the other cluster nodes could now see my 'new' node again, I removed the dummy node, rebooted the new node and said a little prayer...
    Now, my node will not boot as part of the cluster:
    Rebooting with command: boot
    Boot device: /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w21000004cfa3e691,0:a File and args:
    SunOS Release 5.10 Version Generic_127111-06 64-bit
    Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hostname: n20-1-sup
    /usr/cluster/bin/scdidadm: Could not load DID instance list.
    Cannot open /etc/cluster/ccr/did_instances.
    Booting as part of a cluster
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) with votecount = 0 added.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) with votecount = 2 added.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) with votecount = 1 added.
    NOTICE: CMM: Node bogusnode (nodeid = 4) with votecount = 0 added.
    NOTICE: clcomm: Adapter qfe5 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being constructed
    NOTICE: clcomm: Adapter qfe1 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being constructed
    NOTICE: CMM: Node n20-1-sup: attempting to join cluster.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being initiated
    NOTICE: CMM: Node n20-2-sup (nodeid: 2, incarnation #: 1205318308) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being initiated
    NOTICE: CMM: Node n20-3-sup (nodeid: 3, incarnation #: 1205265086) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 online
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 online
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) is up; new incarnation number = 1205346037.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) is up; new incarnation number = 1205318308.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) is up; new incarnation number = 1205265086.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #18 completed.
    NOTICE: CMM: Node n20-1-sup: joined cluster.
    NOTICE: CMM: Node (nodeid = 4) with votecount = 0 removed.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #19 completed.
    WARNING: clcomm: per node IP config clprivnet0:-1 (349): 172.16.193.1 failed with 19
    WARNING: clcomm: per node IP config clprivnet0:-1 (349): 172.16.193.1 failed with 19
    cladm: CLCLUSTER_ENABLE: No such device
    UNRECOVERABLE ERROR: Sun Cluster boot: Could not initialize cluster framework
    Please reboot in non cluster mode(boot -x) and Repair
    syncing file systems... done
    WARNING: CMM: Node being shut down.
    Program terminated
    {1} ok
    Any ideas how I can recover this situation without having to reinstall the node again?
    (I have a flash archive with the OS, SC3.1u4 framework etc., so it's not the end of the world, but...)
    Thanks a mil if you can help here!
    - headwrecked

    Hi - I got this problem sorted...
    I basically just removed (scinstall -r) the SC3.1u4 software from the node which was not booting, and then re-installed the software (this time the dummy node had been removed, so scinstall did not try to contact it and completed without any errors).
    I think the only problem with the procedure I used to remove and re-add the node was that I forgot to remove the dummy node before re-adding the actual cluster node...
    If anyone can confirm this to be the case then great - if not... well, it's working now, so this thread can be closed.
    root@n20-1-sup # /usr/cluster/bin/scinstall -r
    Verifying that no unexpected global mounts remain in /etc/vfstab ... done
    Verifying that no device services still reference this node ... done
    Archiving the following to /var/cluster/uninstall/uninstall.1036/archive:
    /etc/cluster ...
    /etc/path_to_inst ...
    /etc/vfstab ...
    /etc/nsswitch.conf ...
    Updating vfstab ... done
    The /etc/vfstab file was updated successfully.
    The original entry for /global/.devices/node@1 has been commented out.
    And, a new entry has been added for /globaldevices.
    Mounting /dev/dsk/c3t0d0s6 on /globaldevices ... done
    Attempting to contact the cluster ...
    Trying "n20-2-sup" ... okay
    Trying "n20-3-sup" ... okay
    Attempting to unconfigure n20-1-sup from the cluster ... failed
    Please consider the following warnings:
    scrconf: Failed to remove node (n20-1-sup).
    scrconf: All two-node clusters must have at least one shared quorum device.
    Additional housekeeping may be required to unconfigure
    n20-1-sup from the active cluster.
    Removing the "cluster" switch from "hosts" in /etc/nsswitch.conf ... done
    Removing the "cluster" switch from "netmasks" in /etc/nsswitch.conf ... done
    ** Removing Sun Cluster framework packages **
    Removing SUNWkscspmu.done
    Removing SUNWkscspm..done
    Removing SUNWksc.....done
    Removing SUNWjscspmu.done
    Removing SUNWjscspm..done
    Removing SUNWjscman..done
    Removing SUNWjsc.....done
    Removing SUNWhscspmu.done
    Removing SUNWhscspm..done
    Removing SUNWhsc.....done
    Removing SUNWfscspmu.done
    Removing SUNWfscspm..done
    Removing SUNWfsc.....done
    Removing SUNWescspmu.done
    Removing SUNWescspm..done
    Removing SUNWesc.....done
    Removing SUNWdscspmu.done
    Removing SUNWdscspm..done
    Removing SUNWdsc.....done
    Removing SUNWcscspmu.done
    Removing SUNWcscspm..done
    Removing SUNWcsc.....done
    Removing SUNWscrsm...done
    Removing SUNWscspmr..done
    Removing SUNWscspmu..done
    Removing SUNWscspm...done
    Removing SUNWscva....done
    Removing SUNWscmasau.done
    Removing SUNWscmasar.done
    Removing SUNWmdmu....done
    Removing SUNWmdmr....done
    Removing SUNWscvm....done
    Removing SUNWscsam...done
    Removing SUNWscsal...done
    Removing SUNWscman...done
    Removing SUNWscgds...done
    Removing SUNWscdev...done
    Removing SUNWscnmu...done
    Removing SUNWscnmr...done
    Removing SUNWscscku..done
    Removing SUNWscsckr..done
    Removing SUNWscu.....done
    Removing SUNWscr.....done
    Removing the following:
    /etc/cluster ...
    /dev/did ...
    /devices/pseudo/did@0:* ...
    The /etc/inet/ntp.conf file has not been updated.
    You may want to remove it or update it after uninstall has completed.
    The /var/cluster directory has not been removed.
    Among other things, this directory contains
    uninstall logs and the uninstall archive.
    You may remove this directory once you are satisfied
    that the logs and archive are no longer needed.
    Log file - /var/cluster/uninstall/uninstall.1036/log
    root@n20-1-sup #
    Ran the scinstall again:
    >>> Confirmation <<<
    Your responses indicate the following options to scinstall:
    scinstall -ik \
    -C N20_Cluster \
    -N n20-2-sup \
    -M patchdir=/var/cluster/patches \
    -A trtype=dlpi,name=qfe1 -A trtype=dlpi,name=qfe5 \
    -m endpoint=:qfe1,endpoint=switch1 \
    -m endpoint=:qfe5,endpoint=switch2
    Are these the options you want to use (yes/no) [yes]?
    Do you want to continue with the install (yes/no) [yes]?
    Checking device to use for global devices file system ... done
    Installing patches ... failed
    scinstall: Problems detected during extraction or installation of patches.
    Adding node "n20-1-sup" to the cluster configuration ... skipped
    Skipped node "n20-1-sup" - already configured
    Adding adapter "qfe1" to the cluster configuration ... skipped
    Skipped adapter "qfe1" - already configured
    Adding adapter "qfe5" to the cluster configuration ... skipped
    Skipped adapter "qfe5" - already configured
    Adding cable to the cluster configuration ... skipped
    Skipped cable - already configured
    Adding cable to the cluster configuration ... skipped
    Skipped cable - already configured
    Copying the config from "n20-2-sup" ... done
    Copying the postconfig file from "n20-2-sup" if it exists ... done
    Copying the Common Agent Container keys from "n20-2-sup" ... done
    Setting the node ID for "n20-1-sup" ... done (id=1)
    Verifying the major number for the "did" driver with "n20-2-sup" ... done
    Checking for global devices global file system ... done
    Updating vfstab ... done
    Verifying that NTP is configured ... done
    Initializing NTP configuration ... done
    Updating nsswitch.conf ...
    done
    Adding clusternode entries to /etc/inet/hosts ... done
    Configuring IP Multipathing groups in "/etc/hostname.<adapter>" files
    IP Multipathing already configured in "/etc/hostname.qfe2".
    Verifying that power management is NOT configured ... done
    Ensure that the EEPROM parameter "local-mac-address?" is set to "true" ... done
    Ensure network routing is disabled ... done
    Updating file ("ntp.conf.cluster") on node n20-2-sup ... done
    Updating file ("hosts") on node n20-2-sup ... done
    Updating file ("ntp.conf.cluster") on node n20-3-sup ... done
    Updating file ("hosts") on node n20-3-sup ... done
    Log file - /var/cluster/logs/install/scinstall.log.938
    Rebooting ...
    Mar 13 13:59:13 n20-1-sup reboot: rebooted by root
    Terminated
    root@n20-1-sup # syncing file systems... done
    rebooting...
    R
    LOM event: +103d+20h44m26s host reset
    screen not found.
    keyboard not found.
    Keyboard not present. Using lom-console for input and output.
    Sun Netra T4 (2 X UltraSPARC-III+) , No Keyboard
    Copyright 1998-2003 Sun Microsystems, Inc. All rights reserved.
    OpenBoot 4.10.1, 4096 MB memory installed, Serial #52960491.
    Ethernet address 0:3:ba:28:1c:eb, Host ID: 83281ceb.
    Initializing 15MB Rebooting with command: boot
    Boot device: /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w21000004cfa3e691,0:a File and args:
    SunOS Release 5.10 Version Generic_127111-06 64-bit
    Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hostname: n20-1-sup
    Configuring devices.
    devfsadm: minor_init failed for module /usr/lib/devfsadm/linkmod/SUNW_scmd_link.so
    Loading smf(5) service descriptions: 24/24
    /usr/cluster/bin/scdidadm: Could not load DID instance list.
    Cannot open /etc/cluster/ccr/did_instances.
    Booting as part of a cluster
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) with votecount = 0 added.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) with votecount = 2 added.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) with votecount = 1 added.
    NOTICE: clcomm: Adapter qfe5 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being constructed
    NOTICE: clcomm: Adapter qfe1 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being constructed
    NOTICE: CMM: Node n20-1-sup: attempting to join cluster.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being initiated
    NOTICE: CMM: Node n20-2-sup (nodeid: 2, incarnation #: 1205318308) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being initiated
    NOTICE: CMM: Node n20-3-sup (nodeid: 3, incarnation #: 1205265086) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 online
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 online
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) is up; new incarnation number = 1205416931.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) is up; new incarnation number = 1205318308.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) is up; new incarnation number = 1205265086.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #23 completed.
    NOTICE: CMM: Node n20-1-sup: joined cluster.
    ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
    NOTICE: CMM: Votecount changed from 0 to 1 for node n20-1-sup.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #24 completed.
    Mar 13 14:02:23 in.ndpd[351]: solicit_event: giving up on qfe1
    Mar 13 14:02:23 in.ndpd[351]: solicit_event: giving up on qfe5
    did subpath /dev/rdsk/c1t3d0s2 created for instance 2.
    did subpath /dev/rdsk/c2t3d0s2 created for instance 12.
    did subpath /dev/rdsk/c1t3d1s2 created for instance 3.
    did subpath /dev/rdsk/c1t3d2s2 created for instance 6.
    did subpath /dev/rdsk/c1t3d3s2 created for instance 7.
    did subpath /dev/rdsk/c1t3d4s2 created for instance 8.
    did subpath /dev/rdsk/c1t3d5s2 created for instance 9.
    did subpath /dev/rdsk/c1t3d6s2 created for instance 10.
    did subpath /dev/rdsk/c1t3d7s2 created for instance 11.
    did subpath /dev/rdsk/c2t3d1s2 created for instance 13.
    did subpath /dev/rdsk/c2t3d2s2 created for instance 14.
    did subpath /dev/rdsk/c2t3d3s2 created for instance 15.
    did subpath /dev/rdsk/c2t3d4s2 created for instance 16.
    did subpath /dev/rdsk/c2t3d5s2 created for instance 17.
    did subpath /dev/rdsk/c2t3d6s2 created for instance 18.
    did subpath /dev/rdsk/c2t3d7s2 created for instance 19.
    did instance 20 created.
    did subpath n20-1-sup:/dev/rdsk/c0t6d0 created for instance 20.
    did instance 21 created.
    did subpath n20-1-sup:/dev/rdsk/c3t0d0 created for instance 21.
    did instance 22 created.
    did subpath n20-1-sup:/dev/rdsk/c3t1d0 created for instance 22.
    Configuring DID devices
    t_optmgmt: System error: Cannot assign requested address
    obtaining access to all attached disks
    n20-1-sup console login:

  • Leaf nodes in Flex cluster

    Hi,
    Can somebody explain what leaf nodes in a Flex Cluster are? From the documentation I see that they are nodes which don't have access to storage and communicate with hub nodes.
    Can they have Oracle DB instances? If so, how is data transferred between hub and leaf nodes? Through the interconnect? Doesn't that overload the interconnect?
    Thanks
    Sekar

    Sekar_BLUE4EVER wrote:
    Thanks Aman...Still confused about this...Consider the following scenario
    H1 <------> H2 <------> H3
    |           |           |
    L1 L2 L3    L1 L2 L3    L1 L2 L3
    H depicts the hub nodes and L depicts the leaf nodes. Assume each hub node has 3 leaf nodes attached to it.
    Suppose L1 connected to H1 needs a block and modifies it, and after some time L1 connected to H2 needs the same block; must it then follow the same 2-way/3-way grant as in normal Cache Fusion?
    Does this actually increase the number of hops, since the leaf nodes are not directly connected?
    Do we have any control over the leaf-node-to-hub-node mapping, or is it all managed automatically?
    Thanks
    The blocks are going to be accessed and modified at the Hub nodes only, AFAIK, as the Hub nodes are considered the DB nodes. The Leaf nodes are considered the application nodes. That's the reason it's better to set up the instances on the Hub nodes rather than on the Leaf nodes. Even if an instance runs on a Leaf node, the communication is between the Hub and Leaf node only, and it won't do any harm, as both nodes, Hub and Leaf (and the other nodes in the Leaf group), talk to each other directly. There is no VIP required on the Leaf nodes, so database user connections would be made only to the Hub nodes, I guess, and that means the block movement would remain essentially the same.
    The number of network hops is reduced because you won't need many Hub nodes, since each Hub node can connect to 64(?) Leaf nodes. So essentially, in your case, you would need only 4 interconnects (2 on one Hub node and 1 each on the remaining two) for the private interconnect, and just 3 network links to the storage, one for each Hub node.
    I am not sure that I understood the last question of yours.
    HTH
    Aman....

  • Migrating the DB Tier (DB and CM) to a two-node non-RAC cluster

    Hi,
    The current set-up of our E-Business Suite is a two-node install:
    The DB Tier (Database and Concurrent Manager) on one node
    The Apps Tier (Forms/Web Server) on another node.
    For the HA solution (non-Oracle-RAC) we are planning to:
    Move the DB Tier (Database and Concurrent Manager) to a three-node hardware Sun cluster managed by Veritas Cluster Manager (NOT Oracle RAC). We need to know whether the DB Tier (Database and Concurrent Manager) will work on a hardware cluster node and whether it will support COLD FAILOVER from one node to another. We know the database on its own would be fine with a cold failover, because we have tested database cold failover on the three-node cluster for a non-E-Business-Suite database. But here we have the added complication of the Concurrent Manager sitting on the node along with the database on the DB Tier.
    The Apps Tier (Forms/Web server) will be put on a separate set of servers using load balancers etc.
    Has anybody implemented a similar HA set-up? Will this planned set-up work, or are there any issues with it?
    Any help / info would be appreciated.
    Thanks

    Hi,
    Yes, you can do a cold failover of the database and all the 11i services.
    1. For the Concurrent Manager service:
    == When you fail over the Concurrent Manager, do the following things before the failover.
    Create listener/tnsnames files which include the new hostname, and keep them with the Veritas failover service. I mean, when you fail over, these files should replace the existing files (the existing files should be backed up first). Also create a script to change the hostname, logfile hostname and outfile hostname in the FND_CONCURRENT_PROCESSES table.
    Add the nodes in the Install -> Nodes navigation.
    Do the failover manually, check that the listeners respond properly using tnsping, and start the Concurrent Manager.
    So in total, you have to prepare two SQL scripts:
    one to change the hostname from node B to node A,
    and one for node A to node B,
    plus 2 listener + 2 tnsnames files which contain the separate hostnames accordingly.
    Use adcmctl.sh to stop and start.
    Finally, create a shell script to kill the sysmgr processes for the concurrent managers when a manager takes a long time to shut down. Before you run the kill script, wait for at least 5 minutes.
    I have done this same scenario many times, with a Veritas failover service on 2 Sun V880 servers.
    regards,
    Pandian
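    As a concrete illustration of the hostname-swap script described above, here is a minimal sketch, assuming the FND_CONCURRENT_PROCESSES columns NODE_NAME, LOGFILE_NODE_NAME and OUTFILE_NODE_NAME and an Oracle JDBC driver on the classpath; the connect string, credentials and node names are placeholders, so verify all of it against your EBS release before using anything like this:

        // Minimal sketch of the "node B -> node A" hostname swap via JDBC.
        // Column names, connect string and node names are assumptions.
        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.PreparedStatement;

        public class CmFailoverHostSwap {
            public static void main(String[] args) throws Exception {
                try (Connection conn = DriverManager.getConnection(
                        "jdbc:oracle:thin:@nodea:1521:PROD", "apps", "appspwd")) {
                    conn.setAutoCommit(false);
                    try (PreparedStatement ps = conn.prepareStatement(
                            "UPDATE fnd_concurrent_processes"
                          + " SET node_name = ?, logfile_node_name = ?, outfile_node_name = ?"
                          + " WHERE node_name = ?")) {
                        ps.setString(1, "NODEA");   // surviving node
                        ps.setString(2, "NODEA");
                        ps.setString(3, "NODEA");
                        ps.setString(4, "NODEB");   // node being failed away from
                        System.out.println(ps.executeUpdate() + " rows repointed");
                    }
                    conn.commit();
                }
            }
        }

    The reverse (node A to node B) script is symmetric, with the two node names swapped.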

  • Maximum number of nodes reached

    Hello all,
    I am currently trying to explain a problem with one of our WF runs, and I just found out, under the "Step History" tab of a work item, that the error message "Maximum number of nodes reached" was logged.  That certainly explains the problem, but I have some questions regarding such a message:
    1) If WF-A calls a subworkflow WF-B, does this limit of 10,000 nodes apply to the whole WF run (i.e. WF-A nodes + WF-B nodes), or is it 10,000 nodes for WF-A and 10,000 nodes for WF-B?
    2) In our situation, the error was logged within the WF-B (subworkflow) work item, which I believe caused WF-A to fall into an error status.  Some people have tried to re-execute the WF-B work item without any success... Always the same error was reported.  Is there any way to restart a WF that is in error if it has reached this limit of nodes?
    Thanks.
    José

    Hello Rick,
    You are right, this workflow run was started back in May 2010, and yes, it is for the creation of a new point of sale.  I understand when you say that, obviously, the business managed to survive and we should cancel and restart, and this is exactly what we are investigating at the moment. But this customer wanted us to try to restart it considering, I suppose, the amount of work already completed in this workflow (which contains many subworkflows that are also in progress and stopped because of this problem).
    To your question "How come this error didn't affect other workflow instances?", I would say that the customer has been lucky so far, because there was a design issue in this workflow definition but the "context" required to create this error has happened only once since its implementation.  However, they are still at risk, and this WF definition requires a modification for sure to prevent this from happening again in the future.
    Concerning this 10,000 value, I've asked myself the same question, since the change in the customizing made no difference; it has to be kept somewhere inside the workflow instance, but where? I have checked all the elements of the WF container but it's not there!
    I will investigate with SWI2_DIAG, thanks for the hint, and I will let you know through this thread.
    Merci ! (Thanks)
    José

  • Maximum Number of Nodes 10000 reached

    Hi, I'm using a loop condition for deadline handling with system time, and the workflow shows an error with the message
    "Maximum Number of Nodes 10000 reached". Where am I going wrong?
    I'm checking my deadline requirement every 3 minutes using a loop condition; the original requirement is to check every 3 days.
    I'm confused about where I'm going wrong here.
    Thanks

    Hi Prasad,
    As suggested by our friends, there is no termination condition in the loop, and it has become an infinite loop. Note also that polling every 3 minutes instead of every 3 days multiplies the number of loop passes by 1440 (3 days = 4320 minutes), so the node budget is consumed very quickly; see the sketch below.
    To elaborate on this error description, here is an extract from the help:
    Maximum number of nodes that can be processed at runtime before the workflow runtime system assumes an endless loop and cancels the current workflow.
    The preset for this value is 10000.
    Regards,
    Akshay
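    To make the arithmetic concrete, a rough sketch of how fast a 3-minute polling loop consumes the 10,000-node budget (the nodes-per-pass figure is an assumption; actual consumption depends on the steps inside the loop):

        // Rough arithmetic: loop passes per day at a 3-minute polling interval
        // versus the 10,000-node workflow ceiling. Nodes per pass is assumed.
        public class NodeBudget {
            public static void main(String[] args) {
                int maxNodes = 10_000;            // default runtime ceiling
                int nodesPerPass = 5;             // assumed nodes consumed per loop pass
                int passesPerDay = 24 * 60 / 3;   // one pass every 3 minutes = 480/day
                double daysToCancel = (double) maxNodes / (nodesPerPass * passesPerDay);
                System.out.printf("%d passes/day -> workflow cancelled after ~%.1f days%n",
                        passesPerDay, daysToCancel);
            }
        }

    At the intended 3-day interval the same loop would need decades to reach the ceiling, which is why the error shows up only with the 3-minute version.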

  • Workflow error: maximum number of nodes for a session reached

    Hi
    A number of workflows went into error with the message "maximum number of nodes for the session reached 11000". What can be the reason? Are there some settings that need to be made in production? It is working in DR3 and QR3.

    Hi,
    Check the condition on the loop in your workflow. It seems it is getting into an infinite loop.
    Also check for the possibility that the FM which triggers your workflow is being executed in an infinite loop, so that it triggers multiple instances of the same workflow.
    Regards,
    Niraj

  • Create secondary node on BOBJ cluster environment

    Hi All,
    I have done a migration from an old BOBJ server to a new BOBJ server. The primary node completed successfully and I manage to view reports. I am currently having an issue proceeding with the installation of the second node in the BOBJ cluster environment: I am stuck at the repository database phase, as no database entry was found.
    Please help. I have also attached my steps to install the second node.
    Thanks.
    Regards
    Aiman

    Hi,
    I'm guessing you are doing an Expand installation on the second node and want to enter the credentials of your existing CMS DB, the one you configured during the installation of Node 1. Correct?
    What kind of CMS DB is it? MS SQL? Oracle?
    It looks like the ODBC DSN entries are missing on Node 2.
    Regards
    -Seb.

  • Replacing a node in a cluster

    Replacing a node in a 2008 R2 SQL cluster. Do you recommend storage validation? If so, any idea how long that validation takes? How do I validate disks that are online?

    The amount of time required to complete a validation run is directly proportional to the number of nodes and shared disks. If you have two nodes and two disks, a complete validation should take under five minutes. If you have 4 nodes and 4 disks, you could be looking at a 10-15 minute validation run.
    Validating disks that are online will cause brief outages. What the validation is doing is testing failover to ensure it works correctly. Therefore, if you have an application such as SQL running when you run a full validation against its disks, that disk will appear to SQL to have failed, so SQL will also go through its failure mechanisms. This is likely to cause an outage to the clients.
    If you have a maintenance window available, it would be better to ensure nobody is accessing SQL at the time, so the outage does not impact any application.
    .:|:.:|:. tim

  • Disk Management shows 'unallocated' and 'online' basic disk on a 4 node Windows 2003 Cluster - Is this disk reclaimable?

    We have a 4-node Windows 2003 File Share Cluster. I logged onto one of the nodes and found there are a lot of SAN-connected disks that show 'Unallocated' in Disk Management, as below.
    Could someone please advise whether these disks are unused and reclaimable? What I heard from the administrator is that this is the default behavior of the cluster, and they will be in use by a different node in the same cluster. If so, is there an easier way to identify which nodes are using these disks? It appears as though these disks are mapped to the server but not being used. Many thanks.

    As expected.
    Things are a bit clearer in current versions of Windows Server, but back in the 2003 days, that was how a shared disk was shown on the nodes that did not own it. If you go to each node in the cluster and look at the same thing on each node, you will see the same number of disks. On the node that owns the disk, you will see it represented as you would expect. On the nodes that do not own the disk, you will see it displayed as you have shown in your screenshot.
    . : | : . : | : . tim

  • Can I use one transport adapter on the nodes of the cluster?

    Hi
    I am new to Sun Cluster. In the cluster documentation they mention that each node should have 2 network cards, one for public connections and one for the private connection. What if I do not want the nodes to have public connections, except for one node? In other words, I want to use one network card on each node except for the first node in the cluster; users can access the rest of the nodes through the first node. Is that possible? If yes, what should be the name of the second transport adapter while installing the cluster software on the nodes?
    Thank You for the help

    Dear
    We use a cluster for HA under failover conditions. If you have only one network adapter, how would failover work? You can't assign one adapter to two nodes at the same time; you need a minimum of 2 network adapters for a 2-node cluster.
    :)GooDLucK
    Mohammed Tanvir

  • How to get total number of nodes in a JTree?

    Hi,
    I am trying to get the total number of nodes in a JTree and cannot find a way to do it.
    The getRowCount() method returns only the number of rows that are currently being displayed.
    Is there a way to do this, or am I missing something?
    thanks,

    How many nodes does this tree have?
    import java.awt.EventQueue;
    import javax.swing.*;
    import javax.swing.event.TreeModelListener;
    import javax.swing.tree.*;
    public class BigTree {
        public static void main(String[] args) {
            EventQueue.invokeLater(new Runnable() {
                public void run() {
                    // An unbounded TreeModel: every node reports one child and
                    // no node is a leaf, so the tree is infinitely deep.
                    TreeModel model = new TreeModel() {
                        private String node = "Node!";
                        @Override
                        public void valueForPathChanged(TreePath path, Object newValue) {
                            // not mutable
                        }
                        @Override
                        public void removeTreeModelListener(TreeModelListener l) {
                            // not mutable
                        }
                        @Override
                        public boolean isLeaf(Object node) {
                            return false;
                        }
                        @Override
                        public Object getRoot() {
                            return node;
                        }
                        @Override
                        public int getIndexOfChild(Object parent, Object child) {
                            return child == node ? 0 : -1;
                        }
                        @Override
                        public int getChildCount(Object parent) {
                            return 1;
                        }
                        @Override
                        public Object getChild(Object parent, int index) {
                            return node;
                        }
                        @Override
                        public void addTreeModelListener(TreeModelListener l) {
                            // not mutable
                        }
                    };
                    JFrame frame = new JFrame("Test");
                    frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
                    frame.getContentPane().add(new JScrollPane(new JTree(model)));
                    frame.pack();
                    frame.setLocationRelativeTo(null);
                    frame.setVisible(true);
                }
            });
        }
    }
    But for a bounded tree model using DefaultMutableTreeNode, look at the breadth/depth/preorder enumeration methods to walk the entire tree. Or look at the source code for those and adapt them to work with the TreeModel interface.
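    For a finite tree, here is a minimal counting sketch over the TreeModel interface (running it against the infinite BigTree model above would, of course, recurse forever):

    import javax.swing.tree.DefaultMutableTreeNode;
    import javax.swing.tree.DefaultTreeModel;
    import javax.swing.tree.TreeModel;

    public class TreeNodeCount {
        // Count every node reachable from 'node' in a finite TreeModel.
        static int countNodes(TreeModel model, Object node) {
            int count = 1; // the node itself
            for (int i = 0; i < model.getChildCount(node); i++) {
                count += countNodes(model, model.getChild(node, i));
            }
            return count;
        }

        public static void main(String[] args) {
            DefaultMutableTreeNode root = new DefaultMutableTreeNode("root");
            DefaultMutableTreeNode a = new DefaultMutableTreeNode("a");
            root.add(a);
            root.add(new DefaultMutableTreeNode("b"));
            a.add(new DefaultMutableTreeNode("a1"));
            System.out.println(countNodes(new DefaultTreeModel(root), root)); // prints 4
        }
    }

    With a DefaultMutableTreeNode root you can get the same number by counting the elements returned by root.preorderEnumeration().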

  • What will happen if I add a new node to the current cluster when the new node's CPUs are slower?

    Hello,
    Say I have a 3-node RAC and I want to add a new node to the current cluster, but the new node's CPUs are slower than the others. What will happen?
    (My concern is: can I add this new node successfully? If yes, can it improve the whole cluster's performance in any way, or not?)
    Thank you
    s9225

    You can also refer to MOS note "RAC: Frequently Asked Questions" (Doc ID 220970.1), in particular:
    Can I have different servers in my Oracle RAC? Can they be from different vendors? Can they be different sizes?

  • How to find out the IP addresses of all nodes in a cluster?

    Is there any way to retrieve the IP addresses of all nodes in a cluster?
    The problem is the following. We intend to write an administration program that administers all nodes of a cluster using RMI (e.g. telling all singletons in the cluster to reload configuration values etc.). My understanding is that RMI only talks to a single node in a cluster. It would be a convenient feature if the administration program could figure out all the nodes in a cluster by itself and then administer each node sequentially. So far we're planning to pass all IP addresses to the administration program, e.g. as command line arguments, but what if a node gets left out due to human error?
    Thanks for your help.
    Bernie

    There is no public interface to inquire about the IP addresses of the servers in a cluster. If you use WLS 6.0, there is an administrative console that uses JMX to manage the cluster. Perhaps that would be of use to you?
