Memory sharing among nodes in a cluster

Hi,
Let's say I have two nodes A and B in the cluster. Node A has only cacheA defined in its config file and uses just cacheA, while node B has only cacheB defined and uses just cacheB.
Does the fact that those two nodes are in the same cluster mean that memory on node A is used for both cacheA and cacheB? Or is memory on a node used only for the caches that are locally defined in its config file and actually used by that node?
The reason I am asking is the following: I have many nodes, each with some caches defined and in use, but I expect some of those caches to be fairly small and others big. It would be great if being in the same cluster meant that each node contributes its resources to all caches in the cluster.
Best regards
Jarek

Marie,
it is not possible to dedicate a node of its own to specific caches within a cache service that also holds other caches.
To dedicate certain nodes to certain caches, you have to separate those caches into their own distributed cache service, which would be storage-enabled only on the dedicated nodes. For this you also have to explicitly configure, per node, whether that cache service is storage-enabled or storage-disabled, independently of the cluster-wide tangosol.coherence.distributed.localstorage override.
Also, high-units does not tell you when you have run out of space. It tells you when you have exceeded a configured value, but that configured value may not correspond to the actual free capacity in the JVM.
Also, high-units does not define cache capacity, which is service-wide (all storage-enabled nodes in the service contribute to it). It defines backing map capacity, which is per-node, and unfortunately the behaviour is not deterministic (unless you have measured the backing map usage and defined the capacity per partition, which I believe is possible with partition-split backing maps only). Because of this, high-units does not directly correspond to the capacity of the entire cache service, or to the point at which that capacity is exceeded.
Also, high-units paired with an eviction-policy is a dangerous beast on a backing map, as it can lead to data loss.
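As a rough illustration of the dedicated-service approach (a sketch only: the dedicated.localstorage property name is made up and would have to be declared via the system-property attribute on the dedicated scheme's local-storage element in your cache configuration), the dedicated and non-dedicated nodes could then be started along these lines:
java -cp coherence.jar:config -Ddedicated.localstorage=true  com.tangosol.net.DefaultCacheServer   # node dedicated to those caches
java -cp coherence.jar:config -Ddedicated.localstorage=false com.tangosol.net.DefaultCacheServer   # every other node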
Best regards,
Robert

Similar Messages

  • RAC Installation Problem (shared across all the nodes in the cluster)

    All experts
    I am trying to install Oracle 10.2.0 RAC on Red Hat 4.7.
    ref: http://www.oracle-base.com/articles/10g/OracleDB10gR2RACInstallationOnLinux
    All steps completed successfully on all nodes (rac1, rac2); everything is okay on each node.
    A single-node RAC installation is successful.
    When I try to install on two nodes, the "Specify Oracle Cluster Registry (OCR) Location" screen shows the error:
    the location /nfsmounta/crs.configuration is not shared across all the nodes in the cluster. Specify a shared raw partition or cluster file system file that is visible by the same name on all nodes of the cluster.
    I created the shared disks on all nodes as follows:
    1. First we need to set up some NFS shares. Create shared disks on a NAS or a third server if you have one available. Otherwise create the following directories on the RAC1 node.
    mkdir /nfssharea
    mkdir /nfsshareb
    2. Add the following lines to the /etc/exports file. (edit /etc/exports)
    /nfssharea *(rw,sync,no_wdelay,insecure_locks,no_root_squash)
    /nfsshareb *(rw,sync,no_wdelay,insecure_locks,no_root_squash)
    3. Run the following command to export the NFS shares.
    chkconfig nfs on
    service nfs restart
    4. On both RAC1 and RAC2 create some mount points to mount the NFS shares to.
    mkdir /nfsmounta
    mkdir /nfsmountb
    5. Add the following lines to the "/etc/fstab" file. The mount options are suggestions from Kevin Closson.
    nas:/nfssharea /nfsmounta nfs rw,bg,hard,nointr,tcp,vers=3,timeo=300,rsize=32768,wsize=32768,actimeo=0 0 0
    nas:/nfsshareb /nfsmountb nfs rw,bg,hard,nointr,tcp,vers=3,timeo=300,rsize=32768,wsize=32768,actimeo=0 0 0
    6. Mount the NFS shares on both servers.
    mount /mount1
    mount /mount2
    7. Create the shared CRS Configuration and Voting Disk files.
    touch /nfsmounta/crs.configuration
    touch /nfsmountb/voting.disk
    Please guide me as to what is wrong.

    I think you did not really mount it on the second server. What is the output of 'ls /nfsmounta'?
    Step 6 should be 'mount /nfsmounta' (and 'mount /nfsmountb'), not 'mount /mount1'. I also don't know whether simply creating a zero-size file is sufficient for the OCR (I have always used raw devices, not NFS, for this).
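    In other words, after the /etc/fstab entries from step 5 are in place, something along these lines on both RAC1 and RAC2 should make the OCR location visible under the same path everywhere (just a sketch using the names from the post):
    mount /nfsmounta
    mount /nfsmountb
    ls /nfsmounta        # should list crs.configuration on both nodes before re-running the installer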

  • Adding node back into cluster after removal...

    Hi,
    I removed a cluster node using "scconf -r -h <node>" (I carried out all the other usual removal steps before getting this command to work).
    Because this is a pair+1 cluster and the node I was trying to remove was physically attached to the quorum device (SCSI), I had to create a dummy node before the removal command above would work.
    I reinstalled Solaris, the SC3.1u4 framework, patches etc. and then tried to run scinstall again on the node (having first reintroduced the node to the cluster with scconf -a -T node=<node>).
    However, during the scinstall I got the following problem:
    Updating file ("ntp.conf.cluster") on node n20-2-sup ... done
    Updating file ("hosts") on node n20-2-sup ... done
    Updating file ("ntp.conf.cluster") on node n20-3-sup ... done
    Updating file ("hosts") on node n20-3-sup ... done
    scrconf: RPC: Unknown host
    scinstall:  Failed communications with "bogusnode"
    scinstall: scinstall did NOT complete successfully!
    Press Enter to continue:
    I was not sure what to do at this point, but since the other cluster nodes could now see my 'new' node again, I removed the dummy node, rebooted the new node and said a little prayer...
    Now, my node will not boot as part of the cluster:
    Rebooting with command: boot
    Boot device: /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w21000004cfa3e691,0:a File and args:
    SunOS Release 5.10 Version Generic_127111-06 64-bit
    Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hostname: n20-1-sup
    /usr/cluster/bin/scdidadm: Could not load DID instance list.
    Cannot open /etc/cluster/ccr/did_instances.
    Booting as part of a cluster
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) with votecount = 0 added.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) with votecount = 2 added.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) with votecount = 1 added.
    NOTICE: CMM: Node bogusnode (nodeid = 4) with votecount = 0 added.
    NOTICE: clcomm: Adapter qfe5 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being constructed
    NOTICE: clcomm: Adapter qfe1 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being constructed
    NOTICE: CMM: Node n20-1-sup: attempting to join cluster.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being initiated
    NOTICE: CMM: Node n20-2-sup (nodeid: 2, incarnation #: 1205318308) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being initiated
    NOTICE: CMM: Node n20-3-sup (nodeid: 3, incarnation #: 1205265086) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 online
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 online
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) is up; new incarnation number = 1205346037.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) is up; new incarnation number = 1205318308.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) is up; new incarnation number = 1205265086.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #18 completed.
    NOTICE: CMM: Node n20-1-sup: joined cluster.
    NOTICE: CMM: Node (nodeid = 4) with votecount = 0 removed.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #19 completed.
    WARNING: clcomm: per node IP config clprivnet0:-1 (349): 172.16.193.1 failed with 19
    WARNING: clcomm: per node IP config clprivnet0:-1 (349): 172.16.193.1 failed with 19
    cladm: CLCLUSTER_ENABLE: No such device
    UNRECOVERABLE ERROR: Sun Cluster boot: Could not initialize cluster framework
    Please reboot in non cluster mode(boot -x) and Repair
    syncing file systems... done
    WARNING: CMM: Node being shut down.
    Program terminated
    {1} ok
    Any ideas how I can recover from this situation without having to reinstall the node again?
    (I have a flash archive with the OS, SC3.1u4 framework etc., so it's not the end of the world, but...)
    Thanks a mil if you can help here!
    - headwrecked

    Hi - I got this problem sorted...
    Basically I just removed (scinstall -r) the SC3.1u4 software from the node which was not booting, and then re-installed the software (this time the dummy node had already been removed, so scinstall did not try to contact it and completed without any errors).
    I think the only problem with the procedure I used to remove and re-add the node was that I forgot to remove the dummy node before re-adding the actual cluster node again...
    If anyone can confirm this to be the case then great - if not... well, it's working now, so this thread can be closed.
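    For anyone hitting the same thing, the order that seems to matter - pieced together from the commands already used in this thread, so treat it as a sketch rather than a verified procedure - is:
    scconf -r -h bogusnode            # remove the dummy/placeholder node from the cluster first
    scconf -a -T node=n20-1-sup       # re-authorize the real node to join
    # then run scinstall on the rebuilt node itself so it rejoins the cluster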
    root@n20-1-sup # /usr/cluster/bin/scinstall -r
    Verifying that no unexpected global mounts remain in /etc/vfstab ... done
    Verifying that no device services still reference this node ... done
    Archiving the following to /var/cluster/uninstall/uninstall.1036/archive:
    /etc/cluster ...
    /etc/path_to_inst ...
    /etc/vfstab ...
    /etc/nsswitch.conf ...
    Updating vfstab ... done
    The /etc/vfstab file was updated successfully.
    The original entry for /global/.devices/node@1 has been commented out.
    And, a new entry has been added for /globaldevices.
    Mounting /dev/dsk/c3t0d0s6 on /globaldevices ... done
    Attempting to contact the cluster ...
    Trying "n20-2-sup" ... okay
    Trying "n20-3-sup" ... okay
    Attempting to unconfigure n20-1-sup from the cluster ... failed
    Please consider the following warnings:
    scrconf: Failed to remove node (n20-1-sup).
    scrconf: All two-node clusters must have at least one shared quorum device.
    Additional housekeeping may be required to unconfigure
    n20-1-sup from the active cluster.
    Removing the "cluster" switch from "hosts" in /etc/nsswitch.conf ... done
    Removing the "cluster" switch from "netmasks" in /etc/nsswitch.conf ... done
    ** Removing Sun Cluster framework packages **
    Removing SUNWkscspmu.done
    Removing SUNWkscspm..done
    Removing SUNWksc.....done
    Removing SUNWjscspmu.done
    Removing SUNWjscspm..done
    Removing SUNWjscman..done
    Removing SUNWjsc.....done
    Removing SUNWhscspmu.done
    Removing SUNWhscspm..done
    Removing SUNWhsc.....done
    Removing SUNWfscspmu.done
    Removing SUNWfscspm..done
    Removing SUNWfsc.....done
    Removing SUNWescspmu.done
    Removing SUNWescspm..done
    Removing SUNWesc.....done
    Removing SUNWdscspmu.done
    Removing SUNWdscspm..done
    Removing SUNWdsc.....done
    Removing SUNWcscspmu.done
    Removing SUNWcscspm..done
    Removing SUNWcsc.....done
    Removing SUNWscrsm...done
    Removing SUNWscspmr..done
    Removing SUNWscspmu..done
    Removing SUNWscspm...done
    Removing SUNWscva....done
    Removing SUNWscmasau.done
    Removing SUNWscmasar.done
    Removing SUNWmdmu....done
    Removing SUNWmdmr....done
    Removing SUNWscvm....done
    Removing SUNWscsam...done
    Removing SUNWscsal...done
    Removing SUNWscman...done
    Removing SUNWscgds...done
    Removing SUNWscdev...done
    Removing SUNWscnmu...done
    Removing SUNWscnmr...done
    Removing SUNWscscku..done
    Removing SUNWscsckr..done
    Removing SUNWscu.....done
    Removing SUNWscr.....done
    Removing the following:
    /etc/cluster ...
    /dev/did ...
    /devices/pseudo/did@0:* ...
    The /etc/inet/ntp.conf file has not been updated.
    You may want to remove it or update it after uninstall has completed.
    The /var/cluster directory has not been removed.
    Among other things, this directory contains
    uninstall logs and the uninstall archive.
    You may remove this directory once you are satisfied
    that the logs and archive are no longer needed.
    Log file - /var/cluster/uninstall/uninstall.1036/log
    root@n20-1-sup #
    Ran the scinstall again:
    >>> Confirmation <<<
    Your responses indicate the following options to scinstall:
    scinstall -ik \
    -C N20_Cluster \
    -N n20-2-sup \
    -M patchdir=/var/cluster/patches \
    -A trtype=dlpi,name=qfe1 -A trtype=dlpi,name=qfe5 \
    -m endpoint=:qfe1,endpoint=switch1 \
    -m endpoint=:qfe5,endpoint=switch2
    Are these the options you want to use (yes/no) [yes]?
    Do you want to continue with the install (yes/no) [yes]?
    Checking device to use for global devices file system ... done
    Installing patches ... failed
    scinstall: Problems detected during extraction or installation of patches.
    Adding node "n20-1-sup" to the cluster configuration ... skipped
    Skipped node "n20-1-sup" - already configured
    Adding adapter "qfe1" to the cluster configuration ... skipped
    Skipped adapter "qfe1" - already configured
    Adding adapter "qfe5" to the cluster configuration ... skipped
    Skipped adapter "qfe5" - already configured
    Adding cable to the cluster configuration ... skipped
    Skipped cable - already configured
    Adding cable to the cluster configuration ... skipped
    Skipped cable - already configured
    Copying the config from "n20-2-sup" ... done
    Copying the postconfig file from "n20-2-sup" if it exists ... done
    Copying the Common Agent Container keys from "n20-2-sup" ... done
    Setting the node ID for "n20-1-sup" ... done (id=1)
    Verifying the major number for the "did" driver with "n20-2-sup" ... done
    Checking for global devices global file system ... done
    Updating vfstab ... done
    Verifying that NTP is configured ... done
    Initializing NTP configuration ... done
    Updating nsswitch.conf ...
    done
    Adding clusternode entries to /etc/inet/hosts ... done
    Configuring IP Multipathing groups in "/etc/hostname.<adapter>" files
    IP Multipathing already configured in "/etc/hostname.qfe2".
    Verifying that power management is NOT configured ... done
    Ensure that the EEPROM parameter "local-mac-address?" is set to "true" ... done
    Ensure network routing is disabled ... done
    Updating file ("ntp.conf.cluster") on node n20-2-sup ... done
    Updating file ("hosts") on node n20-2-sup ... done
    Updating file ("ntp.conf.cluster") on node n20-3-sup ... done
    Updating file ("hosts") on node n20-3-sup ... done
    Log file - /var/cluster/logs/install/scinstall.log.938
    Rebooting ...
    Mar 13 13:59:13 n20-1-sup reboot: rebooted by root
    Terminated
    root@n20-1-sup # syncing file systems... done
    rebooting...
    R
    LOM event: +103d+20h44m26s host reset
    screen not found.
    keyboard not found.
    Keyboard not present. Using lom-console for input and output.
    Sun Netra T4 (2 X UltraSPARC-III+) , No Keyboard
    Copyright 1998-2003 Sun Microsystems, Inc. All rights reserved.
    OpenBoot 4.10.1, 4096 MB memory installed, Serial #52960491.
    Ethernet address 0:3:ba:28:1c:eb, Host ID: 83281ceb.
    Initializing 15MB Rebooting with command: boot
    Boot device: /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w21000004cfa3e691,0:a File and args:
    SunOS Release 5.10 Version Generic_127111-06 64-bit
    Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hostname: n20-1-sup
    Configuring devices.
    devfsadm: minor_init failed for module /usr/lib/devfsadm/linkmod/SUNW_scmd_link.so
    Loading smf(5) service descriptions: 24/24
    /usr/cluster/bin/scdidadm: Could not load DID instance list.
    Cannot open /etc/cluster/ccr/did_instances.
    Booting as part of a cluster
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) with votecount = 0 added.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) with votecount = 2 added.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) with votecount = 1 added.
    NOTICE: clcomm: Adapter qfe5 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being constructed
    NOTICE: clcomm: Adapter qfe1 constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being constructed
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being constructed
    NOTICE: CMM: Node n20-1-sup: attempting to join cluster.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 being initiated
    NOTICE: CMM: Node n20-2-sup (nodeid: 2, incarnation #: 1205318308) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-2-sup:qfe1 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 being initiated
    NOTICE: CMM: Node n20-3-sup (nodeid: 3, incarnation #: 1205265086) has become reachable.
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-3-sup:qfe5 online
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe5 - n20-2-sup:qfe5 online
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 being initiated
    NOTICE: clcomm: Path n20-1-sup:qfe1 - n20-3-sup:qfe1 online
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node n20-1-sup (nodeid = 1) is up; new incarnation number = 1205416931.
    NOTICE: CMM: Node n20-2-sup (nodeid = 2) is up; new incarnation number = 1205318308.
    NOTICE: CMM: Node n20-3-sup (nodeid = 3) is up; new incarnation number = 1205265086.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #23 completed.
    NOTICE: CMM: Node n20-1-sup: joined cluster.
    ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
    NOTICE: CMM: Votecount changed from 0 to 1 for node n20-1-sup.
    NOTICE: CMM: Cluster members: n20-1-sup n20-2-sup n20-3-sup.
    NOTICE: CMM: node reconfiguration #24 completed.
    Mar 13 14:02:23 in.ndpd[351]: solicit_event: giving up on qfe1
    Mar 13 14:02:23 in.ndpd[351]: solicit_event: giving up on qfe5
    did subpath /dev/rdsk/c1t3d0s2 created for instance 2.
    did subpath /dev/rdsk/c2t3d0s2 created for instance 12.
    did subpath /dev/rdsk/c1t3d1s2 created for instance 3.
    did subpath /dev/rdsk/c1t3d2s2 created for instance 6.
    did subpath /dev/rdsk/c1t3d3s2 created for instance 7.
    did subpath /dev/rdsk/c1t3d4s2 created for instance 8.
    did subpath /dev/rdsk/c1t3d5s2 created for instance 9.
    did subpath /dev/rdsk/c1t3d6s2 created for instance 10.
    did subpath /dev/rdsk/c1t3d7s2 created for instance 11.
    did subpath /dev/rdsk/c2t3d1s2 created for instance 13.
    did subpath /dev/rdsk/c2t3d2s2 created for instance 14.
    did subpath /dev/rdsk/c2t3d3s2 created for instance 15.
    did subpath /dev/rdsk/c2t3d4s2 created for instance 16.
    did subpath /dev/rdsk/c2t3d5s2 created for instance 17.
    did subpath /dev/rdsk/c2t3d6s2 created for instance 18.
    did subpath /dev/rdsk/c2t3d7s2 created for instance 19.
    did instance 20 created.
    did subpath n20-1-sup:/dev/rdsk/c0t6d0 created for instance 20.
    did instance 21 created.
    did subpath n20-1-sup:/dev/rdsk/c3t0d0 created for instance 21.
    did instance 22 created.
    did subpath n20-1-sup:/dev/rdsk/c3t1d0 created for instance 22.
    Configuring DID devices
    t_optmgmt: System error: Cannot assign requested address
    obtaining access to all attached disks
    n20-1-sup console login:

  • Add node to the cluster

    Dear Experts,
    Please help me; I am new to RAC and I have a question.
    We have a 6-node cluster on version 11.2.0.2.3 and my manager wants me to add a new node to the cluster.
    Could you please send me the steps 1, 2, 3...?
    We have non-shared Oracle homes (oracle user) and non-shared Grid homes (grid user), all patched with patch set update 3 (11.2.0.2.3), all on Red Hat Enterprise Linux 5.5.
    Please also tell me with which system user (oracle, root or grid) I have to run each command...
    Could you help me?
    S

    I have the following errors, could you help me please? (Oracle and grid homes are not shared)
    -bash-3.2$ ./addNode.sh "CLUSTER_NEW_NODES={db07}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={db07-vip}"
    Performing pre-checks for node addition
    Checking node reachability...
    Node reachability check passed from node "db01"
    Checking user equivalence...
    User equivalence check passed for user "grid"
    Checking node connectivity...
    Checking hosts config file...
    Verification of the hosts config file successful
    Check: Node connectivity for interface "bond1"
    Node connectivity passed for interface "bond1"
    Node connectivity check passed
    Checking CRS integrity...
    CRS integrity check passed
    Checking shared resources...
    Checking CRS home location...
    ERROR:
    PRVF-4864 : Path location check failed for: "/opt/11.2.0/grid"
    Shared resources check for node addition failed
    Check failed on nodes:
            db07
    Checking node connectivity...
    Checking hosts config file...
    Verification of the hosts config file successful
    Check: Node connectivity for interface "bond0"
    Node connectivity passed for interface "bond0"
    Check: Node connectivity for interface "bond1"
    Node connectivity passed for interface "bond1"
    Node connectivity check passed
    Total memory check passed
    Available memory check passed
    Swap space check failed
    Check failed on nodes:
            db01,db07
    Free disk space check failed for "db01:/tmp"
    Check failed on nodes:
            db01
    Free disk space check failed for "db07:/tmp"
    Check failed on nodes:
            db07
    Check for multiple users with UID value 501 passed
    User existence check passed for "grid"
    Run level check passed
    Hard limits check passed for "maximum open file descriptors"
    Soft limits check passed for "maximum open file descriptors"
    Hard limits check passed for "maximum user processes"
    Soft limits check passed for "maximum user processes"
    System architecture check passed
    Kernel version check passed
    Kernel parameter check passed for "semmsl"
    Kernel parameter check passed for "semmns"
    Kernel parameter check passed for "semopm"
    Kernel parameter check passed for "semmni"
    Kernel parameter check passed for "shmmax"
    Kernel parameter check passed for "shmmni"
    Kernel parameter check passed for "shmall"
    Kernel parameter check passed for "file-max"
    Kernel parameter check passed for "ip_local_port_range"
    Kernel parameter check passed for "rmem_default"
    Kernel parameter check passed for "rmem_max"
    Kernel parameter check passed for "wmem_default"
    Kernel parameter check passed for "wmem_max"
    Kernel parameter check passed for "aio-max-nr"
    Package existence check passed for "make-3.81( x86_64)"
    Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
    Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
    Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
    Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
    Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
    Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
    Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
    Package existence check passed for "glibc-common-2.5( x86_64)"
    Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
    Package existence check passed for "glibc-headers-2.5( x86_64)"
    Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
    Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
    Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
    Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
    Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
    Package existence check passed for "sysstat-7.0.2( x86_64)"
    Package existence check passed for "ksh-20060214( x86_64)"
    Check for multiple users with UID value 0 passed
    Current group ID check passed
    Checking OCR integrity...
    OCR integrity check passed
    Checking Oracle Cluster Voting Disk configuration...
    Oracle Cluster Voting Disk configuration check passed
    Time zone consistency check passed
    Starting Clock synchronization checks using Network Time Protocol(NTP)...
    NTP Configuration file check started...
    NTP Configuration file check passed
    Checking daemon liveness...
    Liveness check passed for "ntpd"
    Check for NTP daemon or service alive passed on all nodes
    NTP daemon slewing option check passed
    NTP daemon's boot time configuration check for slewing option passed
    NTP common Time Server Check started...
    Check of common NTP Time Server passed
    Clock time offset check from NTP Time Server started...
    PRVF-5413 : Node "db01" has a time offset of -165613.0 that is beyond permissible limit of 1000.0 from NTP Time Server "192.168.248.253"
    PRVF-5413 : Node "db07" has a time offset of -161593.0 that is beyond permissible limit of 1000.0 from NTP Time Server "192.168.248.253"
    PRVF-5424 : Clock time offset check failed
    Clock synchronization check using Network Time Protocol(NTP) failed
    User "grid" is not part of "root" group. Check passed
    Checking consistency of file "/etc/resolv.conf" across nodes
    File "/etc/resolv.conf" does not have both domain and search entries defined
    domain entry in file "/etc/resolv.conf" is consistent across nodes
    search entry in file "/etc/resolv.conf" is consistent across nodes
    All nodes have one search entry defined in file "/etc/resolv.conf"
    The DNS response time for an unreachable node is within acceptable limit on all nodes
    File "/etc/resolv.conf" is consistent across nodes
    Checking VIP configuration.
    Checking VIP Subnet configuration.
    Check for VIP Subnet configuration passed.
    Checking VIP reachability
    Check for VIP reachability passed.
    Pre-check for node addition was unsuccessful on all the nodes.
    -bash-3.2$
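    For what it's worth, once the failures reported above are fixed (add swap, free space in /tmp on db01 and db07, and correct the large NTP offsets so the clocks are actually synchronized), the usual 11.2 flow for adding a node with non-shared homes looks roughly like this. The paths are illustrative, so check the exact steps against the 11.2.0.2 documentation for your setup:
    # as grid, from an existing node:
    cluvfy stage -pre nodeadd -n db07 -fixup -verbose
    $GRID_HOME/oui/bin/addNode.sh "CLUSTER_NEW_NODES={db07}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={db07-vip}"
    # as root, on db07, when the installer prompts:
    /u01/app/oraInventory/orainstRoot.sh    # inventory location varies
    $GRID_HOME/root.sh
    # as oracle, from an existing node, to extend the database home:
    $ORACLE_HOME/oui/bin/addNode.sh "CLUSTER_NEW_NODES={db07}"
    # as root, on db07:
    $ORACLE_HOME/root.sh
    # as oracle: add the instance on db07 with dbca (or srvctl for a policy-managed database)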

  • Enable Shadow Copies for Shared Folders in a Failover Cluster (2012 R2)

    I cannot find a reference on how to enable Shadow Copies for Shared Folders in a failover cluster
    using separate disks for data and shadow copies on Windows Server 2012 R2. I found an article for Server 2003 on enabling
    Shadow Copies for Shared Folders in a cluster, but that doesn't solve my issue. Can someone point me to documentation on how to set this up properly?
    I have a 3-server failover cluster with a single file server role, a 2 TB data storage drive and a 200 GB disk for the shadow copies. The 2 TB disk is ReFS and is used by home directories. The 200 GB disk is also ReFS; it doesn't have a drive letter assigned, but it is associated with the file server role.
    Any help is appreciated.

    As mentioned above, I have run into Shadow Copy configuration problems in a clustered environment on Windows 2012 R2 with the latest updates.
    Enabling Shadow Copies via Cluster Manager on a disk will enable the schedule and snapshots, but the schedule is not cluster-aware (it is created locally on the node) and "Previous Versions" are visible only locally, not via the shared folders (the UNC path accessible by clients).
    When I spotted this behavior I tried a different route, going via Computer Management and the cluster resource name, but the console always hangs for me when using "Configure Shadow Copies". After many minutes the configuration window opens (hurray!) but any action there hangs the whole console again. Enabling Shadow Copies fails with "0x80004005: Unspecified error" when creating the scheduled task, yet the "Next Run Time" column still shows "Disabled". Checking the disk resource via Cluster Manager shows it is enabled and a "local" scheduled task is created.
    The Computer Management console still shows DISABLED, and any action freezes the whole console (only when the cluster resource name is used).
    The fix I found is, unfortunately, not applicable to the 2012 R2 OS: https://support.microsoft.com/en-us/kb/2894464
    To sum up: is this a major bug in the MS implementation of Shadow Copies for Shared Folders when clustering is used, or is it simply still not supported in 2012 R2? Doing the same thing outside a cluster works as it should (no freezes, and Previous Versions are available via UNC).
    There is unfortunately no documentation about file server clustering on 2012 (R2).
    Could someone from MS clarify it?
    Many thanks
    Filip
    EDIT:
    OK, one correction:
    The scheduled task is not created in the "Failover Clustering" hive of Scheduled Tasks as a user might expect; failover of the file share to the other node will unregister/re-register it on the active node only.

  • Create directory on a two-node Oracle server cluster

    Hi,
    How can I create a directory on a two-node Oracle server cluster (2 identical servers running the same Oracle database instances with shared disks)? Both of them run Windows 2008.
    I know this works for Oracle running on a single server. How about in a failover cluster environment?
    CREATE OR REPLACE DIRECTORY g_vid_lib AS 'X:\video\library\g_rated';
    Thanks.
    Andy

    Using 11.2.0.? Using ASM? use ACFS - it is a SHARED file system.
    http://docs.oracle.com/cd/E16338_01/server.112/e10500/asmfilesystem.htm
    create a (big, empty) disk group
    create an ACFS volume (a container file that lives in that disk group)
    create an ACFS file system
    mount the file system.
    Now, all nodes in the cluster can access that shared device at the OS level just as if it were any other device.
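    As a rough command-level sketch of those steps (the disk group, volume and mount path names are made up, and these are the Linux-style commands; the Windows 2008 tooling is different, so check the ACFS documentation linked above):
    asmcmd volcreate -G DATA -s 10G vidlib      # as grid: create an ADVM volume in an existing disk group
    asmcmd volinfo -G DATA vidlib               # note the volume device it reports
    mkfs -t acfs /dev/asm/vidlib-123            # as root: format that volume device as ACFS
    mkdir -p /u01/video/library
    mount -t acfs /dev/asm/vidlib-123 /u01/video/library
    Once the same ACFS file system is mounted on every node, the CREATE OR REPLACE DIRECTORY statement above can point at a path under that mount and will resolve on whichever node the instance runs.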

  • SQLAlwaysOn 2012-Query server name shows NODE name not Cluster name

    Hi guys,
    I just finished setting up my SQL AlwaysOn lab for our Development team. I'm having an issue understanding the AlwaysOn HA feature. When I run "SELECT @@SERVERNAME AS 'Server Name'", the result shows the local server name and not the cluster name or the Availability Group listener name. We need it to show the actual cluster name or AG listener name. In our 2008 R2 / 2005 clustering we have a shared-storage cluster with the services and application configured, so when I run the query in that environment it shows "NTS-PROD" (not the cluster name), which is what we want. Thanks for any input at all! My dev config is below:
    Cluster name: ADM034SQLC050
    3 nodes in the cluster below: ADM034SQL051, ADM034SQL052, ADM034SQL053

    Hello,
    Edwin is correct, this is the expected and correct behavior as it's returning the current server that it is executing on.
    What you could do is create a function, inside the databases involved in the AGs, that returns the value you want when executed. That would give you what you want, but the built-in functions are working correctly.
    Edit:
    I wanted to expand on what I wrote and qualify it a little more so that it adds some extra information and understanding.
    With a clustered instance, @@SERVERNAME comes back with the clustered instance name. This is because, when using clustering, the resource for this is set up both at the SQL Server level (by choosing setup as a clustered instance) and at the Windows level (resources in the cluster). In a cluster, each instance has a VCO (virtual computer object) created for it in AD, and that is actually what is used, so @@SERVERNAME comes back with the VCO as we would expect.
    This differs with AGs. AG instances are locally installed instances and can be connected to without using a listener (in fact a listener isn't even needed for AGs). Listeners exist ONLY at the Windows level as a resource; there is no VCO associated with a listener or any other AG resource. Each instance can be connected to just like any other stand-alone instance. In this situation @@SERVERNAME returns the name of the server it is currently executing on, as there is nothing special about these servers.
    The clustering is done only at the Windows layer, and the SQL Server installations are simply stand-alone.
    Sean Gallardy | Blog |
    Twitter

  • Disk Management shows 'unallocated' and 'online' basic disk on a 4 node Windows 2003 Cluster - Is this disk reclaimable?

    We have a 4-node Windows 2003 file share cluster. I logged onto one of the nodes and found a lot of SAN-connected disks that show 'Unallocated' in Disk Management, as below.
    Could someone please advise whether these disks are unused and reclaimable? What I heard from the administrator is that this is the default behavior of the cluster and that the disks will be in use by a different node in the same cluster. If so, is there an easier way to identify
    which nodes are using these disks? It appears as though these disks are mapped to the server but not being used. Many thanks.

    As expected.
    Things are a bit clearer in current versions of Windows Server, but back in the 2003 days, that was how a shared disk was shown on the nodes that did not own it. If you go to each node in the cluster and look at the same thing on each node, you
    will see the same number of disks. On the node that owns a disk, you will see it represented as you would expect. On the nodes that do not own the disk, you will see it displayed as you have shown in your screenshot.
    . : | : . : | : . tim
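    If it helps, the 2003-era cluster.exe command line can show ownership without clicking through each node (run it on any node; a sketch, so check the exact options on your build):
    cluster res        # lists every resource, including Physical Disk resources, with its owner node and state
    cluster group      # lists each resource group together with the node that currently owns it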

  • Cluster Node Joining other cluster

    Because of a network problem, two of our clusters shared their interconnects. This of course led to duplicated IPs on the interconnects and reboots of the nodes. Now one of the cluster nodes of ClusterA has tried to join ClusterB:
    cluster.name   
    cluster.state   enabled
    cluster.properties.cluster_id   0x48FDxxxx  [ Different from Node A ]
    cluster.properties.installmode  disabled
    cluster.properties.auth_joinlist_type   sys
    cluster.properties.auth_joinlist_hostslist      ,
    cluster.properties.cmm_version  1
    cluster.nodes.1.name
    I am really surprised this node tried to join the other cluster; it seems it got the CCR from there during one of the reboots.
    The real question I have now is how to get out of this once we have fixed the network problem: how can we bring this node back into ClusterA - must we reinstall it?
    Fritz

    I'm surprised that an invalid node picked up a CCR update from a cluster that it wasn't part of. I would have expected the cluster ids to be different and thus prevent this, but to be honest I haven't checked to see how much prevention there is against this.
    Anyway, to get out of it you could hack the CCRs on the other clusters and try and put them onto a different subnet with different private addresses. You'll need to use ccradm. Messy though.
    Tim
    ---

  • Can I use one transport adapter on the nodes of the cluster?

    Hi
    I am new to Sun Cluster. The cluster documentation mentions that each node should have two network cards: one for public connections and one for the private interconnect. What if I do not want any node except one to have a public connection? In other words, I want to use one network card on each node except the first node in the cluster, and users would access the rest of the nodes through the first node. Is that possible? If yes, what should I give as the name of the second transport adapter while installing the cluster software on the nodes?
    Thank You for the help

    Dear
    We use a cluster for HA in failover situations. If you have only one network adapter, how would failover work? You cannot assign the same adapter to two nodes, and you need a minimum of two network adapters for a two-node cluster.
    :)GooDLucK
    Mohammed Tanvir

  • What will happen if I add a new node to the current cluster when the new node's CPUs are slower?

    Hello,
    Say I have a 3-node RAC and I want to add a new node to the current cluster, but the new node's CPUs are slower than the others. What will happen?
    (My concern is: can I add this new node successfully? If yes, will it still improve overall cluster performance or not?)
    Thank you
    s9225

    You can also refer to the MOS note: RAC: Frequently Asked Questions (Doc ID 220970.1)
    Can I have different servers in my Oracle RAC? Can they be from different vendors? Can they be different sizes?

  • How to find out the IP addresses of all nodes in a cluster?

    Is there any way to retrieve the IP addresses of all nodes in a cluster?
    The problem is the following. We intend to write an administration program
    that administers all nodes of a cluster using rmi (e.g. tell all singletons
    in the cluster to reload configuration values etc.). My understanding is
    that rmi only talks to a single node in a cluster. It would be a convenient
    feature if the administration program could figure out all nodes in a
    cluster by itself and then administer each node sequentially. So far we're
    planning to pass all IP addresses to the administration program e.g. as
    command line arguments but what if a node gets left out due to human error?
    Thanks for your help.
    Bernie

    There is no public interface to inquire about the IP addresses of the servers in a cluster. If you use WLS 6.0, there is an administrative console that uses JMX to manage the cluster. Perhaps that would be of use to you?

  • Regarding number of nodes in endeca cluster

    Hi,
    I have a question regarding number of nodes in the endeca server cluster.
    Our solution contains one data domain running in an Endeca cluster with two nodes.
    The Endeca Server documentation recommends running the cluster with at least 3 nodes; however, our solution can't accommodate another server straight away.
    Can anyone please suggest what the implications of running the cluster with two nodes are, e.g.:
    1. Can the cluster still serve requests if one node goes down?
    2. How does leader promotion work if a node goes down?
    Thank you,
    regards,
    rp

    Hi rp,
    You can definitely start with two nodes and then add another Endeca Server node later, if needed. It is recommended to run a cluster of three, for increased availability.
    Here are some answers to your questions about the cluster behavior:
    Q: Can the cluster still serve the request if one node goes down?
    A: Quoting from this portion of the Endeca Server Cluster Guide > How enhanced availability is achieved:
    Availability of Endeca Server nodes
    In an Endeca Server cluster with more than one Endeca Server instance, an ensemble of the Cluster Coordinator services running on a subset of nodes in the Endeca Server cluster ensures enhanced availability of the Endeca Server nodes in the Endeca Server cluster.
    When an Endeca Server node in an Endeca Server cluster goes down, all Dgraph nodes hosted on it, and the Cluster Coordinator service (which may also be running on this node) also go down. As long as the Endeca Server cluster consists of more than one node, this does not disrupt the processing of non-updating user requests for the data domains. (It may negatively affect the Cluster Coordinator services. For information on this, see Availability of Cluster Coordinator services.)
    If an Endeca Server node fails, the Endeca Server cluster is notified and stops routing all requests to the data domain nodes hosted on that Endeca Server node, until you restart the Endeca Server node.
    Let's consider an example that helps illustrate this case. Consider a three-node single data domain cluster hosted on the Endeca Server cluster consisting of three nodes, where each Endeca Server node hosts one Dgraph node for the data domain. In this case:
    If one Endeca Server node fails, incoming requests will be routed to the remaining nodes.
    If the Endeca Server node that fails happens to be the node that hosts the leader node for the data domain cluster, the Endeca Server cluster selects a new leader node for the data domain from the remaining Endeca Server nodes and routes subsequent requests accordingly. This ensures availability of the leader node for a data domain.
    If the Endeca Server node goes down, the data domain nodes (Dgraphs) it is hosting are not moved to another Endeca Server node. If your data domain has more than two nodes dedicated to processing queries, the data domain continues to function. Otherwise, query processing for this data domain may stop until you restart the Endeca Server node.
    When you restart the failed Endeca Server node, its processes are restarted by the Endeca Server cluster. Once the node rejoins the cluster, it will rejoin any data domain clusters for the data domains it hosts. Additionally, if the node hosts a Cluster Coordinator, it will also rejoin the ensemble of Cluster Coordinators.
    Q: How does leader promotion work if a node goes down? A: See part of the answer above, plus this (from the same topic, later in the text):
    Failure of the leader node. When the leader node goes offline, the Endeca Server cluster elects a new leader node and starts sending updates to it. During this stage, follower nodes continue maintaining a consistent view of the data and answering queries. When the node that was the leader node is restarted and joins the cluster, it becomes one of the follower nodes. Note that it is also possible that the leader node is restarted and joins the cluster before the Endeca Server cluster needs to appoint a new leader node; in this case, the node continues to serve as the leader node. If the leader node in the data domain changes, the Endeca Server continues routing those requests that require the leader node to the Endeca Server cluster node hosting the newly appointed leader node.
    Note: If the leader node in the data domain cluster fails, and if an outer transaction has been in progress, the outer transaction is not applied and is automatically rolled back. In this case, a new outer transaction must be started. For information on outer transactions, see the section about the Transaction Web Service in the Oracle Endeca Server Developer's Guide.
    Failure of a follower node. When one of the follower nodes goes offline, the Endeca Server cluster starts routing requests to other available nodes, and attempts to restart the Dgraph process for this follower node. Once the follower node rejoins the cluster, the Endeca Server adjusts its routing information accordingly.
    You may ask, why do you need three nodes then? This is to achieve the high availability of the cluster services themselves.
    Quoting:
    If you do not configure at least three Endeca Server nodes to run the Cluster Coordinator service, the Cluster Coordinator service will be a single point of failure. Should the Cluster Coordinator service fail, access to the data domain clusters hosted in the Endeca Server cluster becomes read-only. This means that it is not possible to change the data domains in any way. You cannot create, resize, start, stop, or change data domains; you also cannot define data domain profiles. You can send read queries to the data domains and perform read operations with the Cluster and Manage Web Services, such as listing data domains or listing nodes. No updates, writes, or changes of any kind are possible while the Cluster Coordinator service in the Endeca Server cluster is down — this applies to both the Endeca Server cluster and data domain clusters. To recover from this situation, the Endeca Server instance that was running a failed Cluster Coordinator must be restarted or replaced (the action required depends on the nature of the failure).
    Julia

  • Find nodes in a cluster

    How can I find all the nodes in a cluster?
    olsnodes -n gives the info for the nodes which are up. If any of the nodes is down, it will not be shown. Is there any command or file from which I can check all the nodes of a cluster, even a node that is down?

    In 11.2.0.3 I use:
    crsctl stat res -t
    Hope it helps.
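    A couple of related commands (11.2-era syntax, offered as a sketch to verify on your version):
    olsnodes -n -s        # lists every configured node with its node number and Active/Inactive status, even if the node is down
    crsctl stat res -t    # tabular view of cluster resources and the nodes they are running on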

  • Cloned two vm cluster nodes from development cluster to act as template to create production cluster

    Morning,
    There was so much done to set up the development cluster that I thought it would be easy to have the two nodes in the cluster cloned. To my surprise, the development cluster was up and happily running on the new VM servers. Stopping resources verified
    that it was stopping and starting resources on the original cluster. I am not sure how to safely stop the two new servers from managing the development cluster and create a new production cluster on them.
    I am hesitant to destroy the cluster as I suspect it would destroy the real development cluster. How do I do this? How do I delete the Windows cluster software and re-install it without affecting the development cluster?
    Note that I tried to create a new cluster in Failover Cluster Manager and specify the new VM cluster servers, but it says they are part of a cluster already. I do not see them listed as nodes. I am not sure how to see what cluster it thinks
    the new servers are part of, or how to make them not part of the development cluster. That might be the path to my solution.

    This actually has worked out okay.  I found these steps and did them on both of the nodes that were claiming to be in a cluster already:
    powershell
    Import-Module FailoverClusters
    Clear-ClusterNode
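    For anyone else in the same spot: run with no parameters, Clear-ClusterNode scrubs the cluster configuration from the node it is run on, so it has to be run on each cloned server. If it refuses because the node still thinks it belongs to a running cluster, the -Force switch skips the confirmation prompt (a sketch; verify against your OS build):
    Import-Module FailoverClusters
    Clear-ClusterNode -Force    # wipes the local node's cluster membership state without prompting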
