Quorum Server Bug?

I have a 2 node cluster using a quorum server running on a third server. I turned off one node this morning, and then used svcadm to disable the quorum server on the third server. This was an attempt to panic the remaining node for testing purposes.
It's been about 20 minutes now, and the remaining node has not panicked from losing quorum. clq status and scstat -q show essentially the same things, though in different formats. There are 2 votes needed, 2 present, and 3 possible. Under the details, the first node has 0 votes present, the up node has 1, and the disabled quorum server has 0.
cletus# clq status
Cluster Quorum ===
--- Quorum Votes Summary ---
            Needed   Present   Possible
            2        2         3
--- Quorum Votes by Node ---
Node Name       Present       Possible       Status
brandine        0             1              Offline
cletus          1             1              Online
--- Quorum Votes by Device ---
Device Name       Present      Possible      Status
hwi_san_qs        0            1             Offline
cletus# scstat -q
-- Quorum Summary --
  Quorum votes possible:      3
  Quorum votes needed:        2
  Quorum votes present:       2
-- Quorum Votes by Node --
                    Node Name           Present Possible Status
  Node votes:       brandine            0        1       Offline
  Node votes:       cletus              1        1       Online
-- Quorum Votes by Device --
                    Device Name         Present Possible Status
  Device votes:     hwi_san_qs          0        1       Offline
Could this be a bug with quorum math when used with quorum server? Is it related to me using svcadm to shut down the quorum server gracefully?

To add to what Tim has described: the improved quorum monitoring feature will detect that you shut down the quorum server, but it will not panic the remaining node. Why should it? It knows it is the only node left and can safely continue to run. When the second node failed, it had enough votes to reconfigure!
Regards
hartmut
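
The vote arithmetic behind this answer can be sketched in a few lines. This is a minimal illustration only, assuming the usual majority rule (a partition needs more than half of the possible votes); the function names are hypothetical and are not part of any Sun Cluster tool.

```python
def votes_needed(possible: int) -> int:
    """Majority rule (assumed): strictly more than half of the possible votes."""
    return possible // 2 + 1


def has_quorum(present: int, possible: int) -> bool:
    """True if the votes present reach the majority threshold."""
    return present >= votes_needed(possible)


# Figures from the clq status summary above: 3 possible votes, 2 present.
possible = 3
print(votes_needed(possible))   # 2
print(has_quorum(2, possible))  # True
print(has_quorum(1, possible))  # False
```

With 3 possible votes the threshold is 2, which agrees with the "Needed" column in the output above. That the check only matters at reconfiguration time is Hartmut's point, not something this sketch demonstrates.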

Similar Messages

  • SC3.2 Quorum Server in S10 container?

    Hi,
    Is it possible to have a single server act as a quorum device for several (test / development) clusters? If it's not supported by the software, perhaps by installing multiple copies of the software within S10 zones?
    TIA. Tom

    Hi Tom,
    It is supported and documented in http://docs.sun.com/app/docs/doc/819-5360
    you just have to add separate lines in /etc/scqsd/scqsd.conf and differentiate by instance name and port.
    You only have to be aware that a single quorum server is a single point of failure, and that's it.
    Cheers
    Detlef

  • Quorum Server Redundancy Question

    Hi All,
    I'm just investigating my options for a new cluster configuration and was trying to find out about multiple quorum server hosts. All the examples I have come across in the documentation have 1 physical host acting as a quorum server for an n+1 node cluster. I'm assuming that there will be quorum issues if the physical host running the quorum server is down and the cluster nodes perform a reconfiguration/switch while the quorum server is unavailable.
    Is it possible to have 2 physical hosts, with a quorum server defined on each, that can then be configured into the cluster, effectively pointing at two different quorum servers for votes?

    Correct, the QS is only used if the cluster changes state, i.e. nodes leave or join. However, having more than 1 QS for a single cluster does not help. You simply lower your overall availability because there are more failure scenarios where one of these is down, leading to insufficient votes for the remaining cluster node to obtain.
    Active monitoring and prompt repair of the QS (or QD) is the right approach.
    Tim
    ---
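
Tim's point about extra quorum servers can be checked with a small enumeration. This is a rough sketch under stated assumptions, not a model of the real cluster software: a two-node cluster, one vote per node, one vote per quorum server, and a majority threshold. All function names are hypothetical.

```python
from itertools import product


def votes_needed(possible: int) -> int:
    """Majority rule (assumed): strictly more than half of the possible votes."""
    return possible // 2 + 1


def survivor_keeps_quorum(qs_up: tuple) -> bool:
    """One node of a two-node cluster is left; qs_up flags which quorum servers it can reach."""
    possible = 2 + len(qs_up)   # two node votes plus one vote per quorum server (assumed)
    present = 1 + sum(qs_up)    # survivor's own vote plus the reachable quorum servers
    return present >= votes_needed(possible)


def fatal_outages(num_qs: int) -> list:
    """Quorum-server up/down combinations that leave the survivor without quorum."""
    states = product([True, False], repeat=num_qs)
    return [s for s in states if not survivor_keeps_quorum(s)]


print(len(fatal_outages(1)))  # 1
print(len(fatal_outages(2)))  # 3
```

Under these assumptions, a surviving node with one quorum server loses quorum only if that server is down. With two quorum servers it needs both of them, so either one being down is enough to lose quorum.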

  • Lync 2013 FE quorum server

    Hi,
    1. How can I check which is the primary quorum server in the Lync 2013 FE pool?
    2. How can I change the quorum server before restarting the FE server?
    Thanks,

    1. Get-CsPoolFabricState will get you the node order. 2. You can't, or at least not that I've been able to work out. The only option is to reset the fabric, which you can do if you run into issues with services not starting as a result of fabric issues.

  • Clq status shows quorum server offline even though the clq service is runni

    Hi,
    In a 2 node + 1 QS Sun Cluster 3.2 cluster, clq status is showing the quorum server offline even though the clq process is running on the quorum server. To bring the quorum server online, I have to either remove it from the cluster and add it again, or, if one of the nodes fails, both nodes reboot, and once both have rejoined the cluster, clq status shows the quorum server online!
    Why is the quorum server going offline automatically?
    Any help would be highly appreciated
    Many thanks in advance
    Ushas Symon

    Hi,
    I assume you mean the scqsd process is running on the QS, right?
    A QS is shown as offline if the monitor could not reach it when it last tried. This is usually due to a networking problem.
    If you issue a clq status, the monitor checks again, and if it can reach the QS it will change its status back to online.
    If this does not happen, check your logs to see what kind of error message showed up.
    Does clqs show on the QS show the correct information?
    It is obvious that if a node dies and the QS was already offline before the node died, the other node will die as well due to lack of quorum, i.e. it has fewer votes than needed. You seem to have a basic networking problem, or something is really wrong with your QS.
    Regards
    Hartmut
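
Since the usual cause is a networking problem, a quick first check from each cluster node is whether the quorum server's TCP port can be reached at all. The sketch below is an assumption-laden illustration: the function name is hypothetical, port 9000 is taken from the clqs show output quoted elsewhere on this page, and a successful TCP connect says nothing about whether the quorum protocol itself is healthy.

```python
import socket


def qs_port_reachable(host: str, port: int = 9000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `qs_port_reachable("qs-host", 9000)` with your own quorum server host name (the name here is a placeholder) from both nodes shows whether only one of them has a path to the server.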

  • Lost quorum using quorum server

    I have a two node cluster with a third node being used as a quorum server. The quorum server is on a different VLAN than the cluster nodes.
    If I shut down or reboot node 1, node 2 dies. I can reboot node 2 without any impact on node 1; that works as it's supposed to. In the failing scenario the nodes seem to be unable to talk to the quorum server, so the cluster thinks it has lost quorum and dies. After it is back up, the quorum server is corrupt and shown as offline, so the only way to restore it is to remove it from the cluster and re-add it.
    Node 2 never seems to be able to see the quorum server, it always shows it as offline, even when I add it from node 2. Node 3 is not a cluster configured server, it only has the quorum server installed on it to support the cluster of node 1 & 2. Node 3 is on another vlan, therefore there is a router in the path.
    Does the quorum server have to be on the same VLAN so that there's no router in the loop?
    Does anybody have any ideas on what could be wrong here? Does the quorum server really work or do I have to add a third server to the cluster just to have a stable cluster?
    --- Further info, the clqs show command on node 3 shows disabled false, and keys for node 1, but nothing for node 2 and node 2 can't open the qs.
    Thanks,
    Terry
    Edited by: taccooper on Jan 21, 2010 4:00 AM
    Edited by: taccooper on Jan 26, 2010 3:58 AM
    Edited by: taccooper on Jan 26, 2010 6:19 AM

    I may have answered my own question by applying patch 127406-04. With this patch applied both cluster nodes are now seeing the quorum server. The patch among other things solves an issue with simultaneous opens failing. I will have to do some failure testing to confirm this.
    Update: failure testing complete, this patch solved the issue.
    Terry
    Edited by: taccooper on Jan 27, 2010 5:20 AM

  • Clq status shows quorum server offline even though it is online

    Hi,
    I am facing a problem in my cluster: when I bring up all three machines (2 app nodes and one quorum server) at almost the same time, clq status on either cluster node shows the quorum server as offline. When I run clqs show on the quorum server, I get the output below.
    clqs show
    === Quorum Server on port 9000 ===
    --- Cluster beacluster (id 0x4916625B) Reservation ---
    Node ID: 1
    Reservation key: 0x4916625b00000001
    --- Cluster beacluster (id 0x4916625B) Registrations ---
    Node ID: 1
    Registration key: 0x4916625b00000001
    Node ID: 2
    Registration key: 0x4916625b00000002
    This is Cluster 3.2.
    Any input will be appreciated.
    Thanks in advance
    Ushas Symon

    Hi, this is Solaris Cluster 3.2u1.
    I got the quorum server online as follows:
    # clq status
    --- Quorum Votes by Device ---
    Device Name       Present      Possible      Status
    rac1              1            1             Offline
    # clq add -t quorum_server -p qshost=xxxxxxxxxx -p port=9001 rac2
    # clq status
    --- Quorum Votes by Device ---
    Device Name       Present      Possible      Status
    rac1              1            1             Online
    rac2              1            1             Online
    Just adding another QS brought both quorum servers online!
    I have no idea what is happening.
    Anyway, I have since deleted the second QS with clq remove rac2 followed by clq reset, and now it is fine.
    Thank you all
    Ushas Symon

  • Quorum Server Question

    Hi All,
    If using two or more clusters, can cluster nodes be quorum servers for
    other clusters?
    /Regards
    Ulf

    Yes, it is feasible. There have been discussions internally about the possibility of making an HA quorum server. Personally, I'm not sure of the value of doing this, as you need to guard against correlated failures that cause everything to fail in a cascading manner.
    Tim
    ---

  • ODBC BI Server Bug - arithmetic operation resulted in an overflow

    I am trying to write some really simple .NET code to access the Oracle BI Server ODBC driver and it's not working at all.  I can connect fine, but anything I try to do to get database information raises the error "arithmetic operation resulted in an overflow".
    Here is the code:
    Dim ConnectString As String
    Dim FactoryType As String
    Dim Factory As System.Data.Common.DbProviderFactory
    Dim Connection As System.Data.Common.DbConnection = Nothing
    Dim TablesData As System.Data.DataTable = Nothing
    Dim err As String = ""
    Dim nl As String = Chr(13) + Chr(10)
    Try
        ' Connect to the database via ODBC
        ConnectString = "DSN=BSODBC_7;uid=TheUser10;pwd=************"
        FactoryType = "System.Data.Odbc"
        Factory = System.Data.Common.DbProviderFactories.GetFactory(FactoryType)
        Connection = Factory.CreateConnection
        Connection.ConnectionString = ConnectString
        Connection.Open()
        ' Request a list of tables from the database
        ' ** Tried both with restrictions and without
        ' ERROR on this line:
        ' “Arithmetic operation resulted in an overflow.”
        TablesData = Connection.GetSchema("Tables")
        ' Show the list of tables on the screen in a grid
        ' If it was successful.
        OnScreenGrid.AutoGenerateColumns = True
        OnScreenGrid.DataSource = TablesData
    Catch ex As Exception
        ' Report the error
        err = ex.Message
        If Not (ex.InnerException Is Nothing) Then
            If Not (ex.InnerException.Message Is Nothing) Then
                err = err + nl + nl + ex.InnerException.Message
            End If
        End If
        MsgBox(err, MsgBoxStyle.OkOnly + MsgBoxStyle.Exclamation, "Error")
    Finally
        ' Clean up and Close the DB Connection
        If Not (Connection Is Nothing) Then
            Connection.Close()
            Connection.Dispose()
            Connection = Nothing
        End If
    End Try
    Any Thoughts?  Is this a known bug?  Is there a fix?

    I suspect the line
    OnScreenGrid.DataSource = TablesData
    Instead of assigning TablesData directly, try converting it to a List object and assigning that to OnScreenGrid.DataSource.
    Just in case, check the documentation for
    DataGridView.AutoGenerateColumns Property (System.Windows.Forms)
    I might be wrong, but it is worth checking.

  • BPEL Server bug?

    Hi all,
    I experienced a curious problem using the test Console of BPEL Server 10.1.2. I created a process where, besides other activities, there are 2 receives (the first has createInstance="yes") with a correlationSet, on the same partnerLink, portType, and operation, and using the same BPEL variable. These 2 receives are in a sequence (not concurrent). The problem is that when I send the second message to the second receive through the BPEL Console, the Server initiates a new instance of the same process. I went over the BPEL Specification and I think this should be possible according to that document.
    Note that if I try the same thing with 2 non-initial receives (without createInstance="yes"), it works properly. It also works with 2 invokes on the same partnerLink, portType, and operation (not concurrent).
    I suppose this is a bug of the BPEL Server.
    Could you check please?
    Thanks a lot

    While if I use, for the 2nd receive, another 'operation', declared into the same portType, it works properly. I think maybe the BPEL Server maps every process with the operation name used by the first receive activity, in order to create process instances.
    Thanks

  • Where could I report SAPWAS server bug

    Hi All,
    I have found a very interesting bug in the SAP WAS server. I would like to confirm it with, or report it to, the SAP team. What is the procedure for that? Could anyone please guide me?
    Thanks & Regards
    -AW

    Hi Adam,
    The user ID and pwd is available to customers and partners.
    Also for SAP certified developers and consultant.
    You have to be one of them to get the access.
    Any way !
    You can share the bug on the forum. Probably somebody from SAP WAS team will pick it up.
    Regards,
    Ashwani Kr Sharma

  • SWF load a XML on an other server = BUG

    Hi AS3 fellaz,
    All sources available here : http://www.tapiocadesign.com/prods/xstrata/_CROSS_DOMAIN_EXEMPLE.zip
    MY GOAL :
    from a flash(swf) on server A, I want to download an XML file on server B.
    MY PROBLEM : it doesn't work
    1 - I use the URLLoader class
    2 - I allowed both domains with the Security.allowDomain() method
    3 - I uploaded a "crossdomain.xml" file (to both the folder and the server root) that allows cross-domain communication (as advised by Adobe)
    Here is the comparison :
    LOCAL WORKS (swf and xml are on the same server) :
    http://www.tapiocadesign.com/prods/xstrata/_CROSS_DOMAIN_EXEMPLE/URLLoader_localXML.html
    var request:URLRequest = new URLRequest("XML_exemple.xml");
    DISTANT DOESN'T WORK (swf on server A, xml on server B) :
    http://www.tapiocadesign.com/prods/xstrata/_CROSS_DOMAIN_EXEMPLE/URLLoader_distantXML.html
    var request:URLRequest = new URLRequest("http://www.nullepart.com/prods/xstrata/_CROSS_DOMAIN_EXEMPLE/XML_exemple.xml");
    This is a very important project, and I would worship your brain if you can solve this. And maybe a little gift... so desperate...

    i used var request:URLRequest = new URLRequest("http://www.xstrata.com/operations/xstrata_map.xml"); and had no problem.

  • Microsoft Lync Server bug

    Dear Sirs!
    One of our customers has a problem with MS Lync Server.
    They use Windows Server 2008 R2 SP1 and have configured Windows Active Directory, Exchange Server 2013 Standard, and Lync Server 2013 Standard.
    They have 100 clients on MS Lync Server 2013 Standard, and these 100 users are divided into 7 groups. The group named
    Бүгд (meaning "all") contains all users.
    When they start a group conversation with the Бүгд group, some of the users see a blank conversation window after joining and cannot see the first message of the group conversation.
    The thread linked below describes the same problem, but no solution was found there. How can we fix this problem?
    http://social.technet.microsoft.com/Forums/lync/en-US/33c0135e-3862-42e5-885a-873583bd79b9/group-conversation-missing-first-message?forum=ocspresenceim
    Best regards, Azbayar Bat-Ochir

    Hi,
    As mentioned in the other post, in Lync Server 2010 this was by Design. I don't know if this changed in Lync Server 2013.
    You can ask users to remove off-line contacts but I doubt that they want this solution.
    David

  • Mail not delivered to outside recipients with same name on server - Bug?

    Hi,
    I'm testing the mail server before telling my boss it's OK to move from our old Sun "appliance". I have set up all user names and passwords on our new Mac mini the same as they are on that old server.
    The server will be thisdomain.com and the recipients will be at thatdomain.com.
    I am off site and testing from home using mail.thisdomain.com as my outgoing server. I have all spam controls off for testing. In this example I am scott @ thisdomain.com trying to send an email to carol @ thatdomain.com.
    What's happening is that carol @ thatdomain.com is not getting the message. Instead it is being delivered to the carol on the server, at carol @ thisdomain.com.
    Does anybody know why this is happening and how to fix it? The log even says "orig_to=<carol @ thatdomain.com>" Here is the complete session:
    Sep 24 13:18:08 thisdomain.com postfix/postscreen[33490]: CONNECT from [50.174.118.164]:50680 to [173.164.166.20]:25
    Sep 24 13:18:14 thisdomain.com postfix/postscreen[33490]: PASS OLD [50.174.118.164]:50680
    Sep 24 13:18:14 thisdomain.com postfix/smtpd[33493]: connect from c-50-174-118-164.hsd1.ca.comcast.net[50.174.118.164]
    Sep 24 13:18:15 thisdomain.com postfix/smtpd[33493]: 9069B18C6CA: client=c-50-174-118-164.hsd1.ca.comcast.net[50.174.118.164], sasl_method=CRAM-MD5, sasl_username=scott
    Sep 24 13:18:15 thisdomain.com postfix/cleanup[33502]: sacl_check: mbr_user_name_to_uuid(carol @ thisdomain.com) failed: No such file or directory
    Sep 24 13:18:15 thisdomain.com postfix/cleanup[33502]: 9069B18C6CA: message-id=<CE674190.1E707%[email protected]>
    Sep 24 13:18:15 thisdomain.com postfix/qmgr[33388]: 9069B18C6CA: from=<scott @ thisdomain.com>, size=723, nrcpt=1 (queue active)
    Sep 24 13:18:15 thisdomain.com postfix/pipe[33504]: 9069B18C6CA: to=<carol @ thisdomain.com>, orig_to=<carol @ thatdomain.com>, relay=dovecot, delay=0.59, delays=0.43/0.03/0/0.14, dsn=2.0.0, status=sent (delivered via dovecot service)
    Sep 24 13:18:15 thisdomain.com postfix/qmgr[33388]: 9069B18C6CA: removed
    Sep 24 13:18:21 thisdomain.com postfix/smtpd[33493]: disconnect from c-50-174-118-164.hsd1.ca.comcast.net[50.174.118.164]
    I also worry about the line:
    sacl_check: mbr_user_name_to_uuid(carol @ thisdomain.com) failed: No such file or directory
    This is driving me batty, so any help appreciated.
    Scott
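
The telltale sign in the session above is a delivery line whose to= address differs from its orig_to= address. A minimal sketch for spotting such lines in a larger log follows. The function name and the sample addresses are hypothetical, and it assumes the to=<...> / orig_to=<...> field layout seen in the log above.

```python
import re

# Field patterns for a Postfix delivery log line. The \b keeps the "to="
# pattern from matching inside "orig_to=", because "_" is a word character.
_TO = re.compile(r"\bto=<([^>]*)>")
_ORIG_TO = re.compile(r"\borig_to=<([^>]*)>")


def rewritten_recipient(line: str):
    """Return (orig_to, to) if the line shows a rewritten recipient, else None."""
    orig = _ORIG_TO.search(line)
    final = _TO.search(line)
    if orig and final and orig.group(1).lower() != final.group(1).lower():
        return orig.group(1), final.group(1)
    return None


# Hypothetical sample line modelled on the session above.
sample = ("postfix/pipe[1]: ABC123: to=<carol@a.example>, "
          "orig_to=<carol@b.example>, relay=dovecot, status=sent")
print(rewritten_recipient(sample))  # ('carol@b.example', 'carol@a.example')
```

This only reports that a rewrite happened; it does not say which Postfix setting caused it.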

    This is supposed to be the server for thisdomain.com and I have not entered anything for thatdomain.com, so that is really puzzling. Is there anything you can help me with in the postconf output below?  Thanks!
    postconf -n
    biff = no
    command_directory = /usr/sbin
    config_directory = /Library/Server/Mail/Config/postfix
    daemon_directory = /usr/libexec/postfix
    data_directory = /Library/Server/Mail/Data/mta
    debug_peer_level = 2
    debugger_command = PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin xxgdb $daemon_directory/$process_name $process_id & sleep 5
    dovecot_destination_recipient_limit = 1
    html_directory = /usr/share/doc/postfix/html
    imap_submit_cred_file = /Library/Server/Mail/Config/postfix/submit.cred
    inet_interfaces = loopback-only
    inet_protocols = all
    mail_owner = _postfix
    mailbox_size_limit = 0
    mailq_path = /usr/bin/mailq
    manpage_directory = /usr/share/man
    message_size_limit = 10485760
    mydomain_fallback = localhost
    mynetworks = 127.0.0.0/8, [::1]/128
    newaliases_path = /usr/bin/newaliases
    queue_directory = /Library/Server/Mail/Data/spool
    readme_directory = /usr/share/doc/postfix
    recipient_delimiter = +
    sample_directory = /usr/share/doc/postfix/examples
    sendmail_path = /usr/sbin/sendmail
    setgid_group = _postdrop
    smtpd_client_restrictions = permit_mynetworks permit_sasl_authenticated permit
    smtpd_tls_ciphers = medium
    smtpd_tls_exclude_ciphers = SSLv2, aNULL, ADH, eNULL
    tls_random_source = dev:/dev/urandom
    unknown_local_recipient_reject_code = 550
    use_sacl_cache = yes

  • 10.5.4 now available - fixes file saving on server bug

    Title says it all. Downloading now.
    PS
    Here's Apple's description: "Resolves an issue with saving and reopening Adobe Creative Suite 3 files on a remote server."

    >What I have always assumed to be happening is that files with damaged headers are partially over-written or masked by other damaged files. Only after you have cleared-up the overlying files, can DW reach and repair the underlying files.
    Ann,
    DiskWarrior only deals with the directory, not files.
    Disk directories are the area of a hard disk that the Mac OS uses to "map" all the information stored on the drive so that the Finder can find it. The directory records the number, names, locations, and sizes of all files and folders stored on the disk. If any of this information becomes corrupted - incorrectly updated or not updated - the directory is considered to be damaged.
    Quite a lot of cumulative directory damage can occur without being immediately noticeable to the user, and this is especially likely to happen if the computer crashes, suffers a kernel panic, or encounters another problem requiring a force restart or reset without properly shutting down. Poorly written programs can also write data erroneously into the portion of the disk reserved for directories.
    If directory damage is left unresolved, it tends to become worse and could eventually result in permanent data loss and/or system instability.
    DiskWarrior doesn't only rebuild the directory; it also optimizes it for performance by defragmenting it and packing nodes, making the physical order equal to the linked (logical) order. Packing combines nodes that are not full so you end up with fewer nodes.
    DiskWarrior doesn't do anything at all to files on your hard drive. All it deals with is the directory.
