Probe skipping

Hello,
I am running into a rather interesting issue and I was curious whether anyone has seen it before or has any insight into what the problem could be.  On one of my ACE 4710s (running software A5(1.2)), I am running a fairly large number of layer 7 probes (71) across both ports 80 and 443.  At seemingly random points in the day, the system reports that the probes are being skipped due to an internal error.  I have seen this before when the system runs out of sockets for the probes, but I am not seeing any indication that this is the case.
Here is an example probe config:
probe https CHECK-SOME-SITE
  port 443
  interval 10
  faildetect 2
  passdetect interval 30
  receive 5
  ssl version all
  request method get url /some/url
  header Host header-value "www.somesite.com"
  expect regex "SOMEREGEX"
Here is the relevant output from 'show probe detail':
     real      : some-rserver
                          x.x.x.x  443 PROBE   3093610 1749563 1344047 SUCCESS
   Socket state        : CLOSED
   No. Passed states   : 49         No. Failed states : 49
   No. Probes skipped  : 479         Last status code  : 200
   No. Out of Sockets  : 0         No. Internal error: 0
   Last disconnect err :  -
   Last probe time     : Tue Mar  4 16:45:03 2014
   Last fail time      : Fri Feb 28 13:30:37 2014
   Last active time    : Mon Mar  3 22:08:53 2014
Here are the log messages that are popping up:
Mar  4 2014 14:36:41 : %ACE-3-251014: Could not probe server x.x.x.x on port 443 for 4 consecutive tries - Internal error
The log messages appear for all rservers being probed for about 30 seconds, then they go away until the next event.  Considering the probes are skipped, I do not believe this is actually causing failures at the moment.  I have read that the ACE platform can only run 200 concurrent scripted probes, but I am at a loss as to how to check whether that is what I am running into here.  The really confusing thing is that the internal error and out-of-sockets counters are both zero.
Any help or insight would be very appreciated.  Thanks in advance.
-Ed

Hi Ed,
Two things:
Number of skipped probes. A skipped probe occurs when the ACE does not send out a probe because the scheduled interval to send a probe is shorter than it takes to complete the execution of the probe; the send interval is shorter than the open timeout or receive timeout interval.
In your case the interval is 10, which is a little aggressive but still greater than the receive timeout of 5. However, if probe execution takes longer than 10 seconds, you may see probes being skipped. Increasing the interval by another 10 seconds would be a useful test to see whether it mitigates the issue.
If you have UDP probes then you need to check this as well:
For UDP probes or UDP-based probes, we recommend a time interval value of 30 seconds. The reason for this recommendation is that the ACE data plane has a management connection limit of 100,000. Management connections are used by all probes as well as Telnet, SSH, SNMP, and other management applications. In addition, the ACE has a default timeout for UDP connections of 120 (ACE module) or 15 (ACE appliance) seconds. This means that the ACE does not remove the UDP connections even though the UDP probe has been closed for two minutes. Using a time interval less than 30 seconds may limit the number of UDP probes that can be configured to run without exceeding the management connection limit, which may result in skipped probes.
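To get a feel for the numbers in that recommendation, here is a rough back-of-the-envelope sketch in Python. It assumes every UDP probe run leaves one connection lingering for the full UDP timeout, which is a simplification, and it ignores the share of the limit used by Telnet, SSH, SNMP and other management traffic. The function names are made up for illustration.

```python
import math

MGMT_CONN_LIMIT = 100_000  # management connection limit quoted above


def lingering_udp_connections(probe_instances: int, interval_s: int, udp_timeout_s: int) -> int:
    """Rough upper bound on UDP probe connections held at the same time.

    Assumption: each probe run opens one connection that stays in the
    table for the whole UDP timeout after the probe has finished.
    """
    per_instance = math.ceil(udp_timeout_s / interval_s)
    return probe_instances * per_instance


def max_probe_instances(interval_s: int, udp_timeout_s: int, limit: int = MGMT_CONN_LIMIT) -> int:
    """Largest number of probe instances that stays within the limit."""
    return limit // math.ceil(udp_timeout_s / interval_s)


# Compare a few intervals against the 120 second timeout of the ACE module.
for interval in (5, 10, 30):
    print(interval, max_probe_instances(interval, udp_timeout_s=120))
```

A probe instance here means one probe applied to one real server, so a single probe on a large serverfarm counts many times.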
Are you running any scripted probes?
It could simply be a bug as well, but I would suggest increasing the interval and seeing how it goes.
You can also try debug hm errors/events/all etc. and see whether you get any detailed output that can be sent to TAC for further investigation.
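The skip condition described above (send interval shorter than the open or receive timeout) can be checked offline with a small Python sketch like the one below. The 10 second defaults for the receive and open timeouts are taken from the show probe output seen elsewhere in this thread for probes that do not set them. Treating open plus receive as the worst-case execution time is my own assumption, not something the ACE documentation states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeTimers:
    name: str
    interval: int            # "interval", in seconds
    receive: int = 10        # "receive" timeout (assumed default of 10)
    open_timeout: int = 10   # "open" timeout (assumed default of 10)


def skip_warnings(p: ProbeTimers) -> List[str]:
    """Return a warning for each timer relation that could lead to skipped probes."""
    warnings = []
    if p.interval < p.open_timeout:
        warnings.append(f"{p.name}: interval {p.interval}s < open timeout {p.open_timeout}s")
    if p.interval < p.receive:
        warnings.append(f"{p.name}: interval {p.interval}s < receive timeout {p.receive}s")
    # Assumption: a slow server can use up both timeouts back to back.
    worst_case = p.open_timeout + p.receive
    if p.interval < worst_case:
        warnings.append(f"{p.name}: interval {p.interval}s < worst-case execution {worst_case}s")
    return warnings


# Timers from the probe in the question; the open timeout is not shown there.
ed = ProbeTimers("CHECK-SOME-SITE", interval=10, receive=5)
for line in skip_warnings(ed):
    print(line)
```

With those values only the worst-case check fires, which fits the suggestion to try an interval of 20.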
Regards,
Kanwal

Similar Messages

  • ACE 4710 HTTP Probes

    Using the ACE 4710 for loadbalancing a Sharepoint site.
    We currently have a HTTP probe setup to check the port 80 status of the rserver.
    Is there anyway to get the HTTP probe to check a DNS entry for each of the application sites? For instance http://info vs http://site are two different web sites running on the same IP. One site could have a problem but the actual port 80 for the IP may be still alive.
    Thanks for any information.

    Has anyone figured this out?  I am trying to get health checks/probes set up in this same fashion.  I have 2 servers with 1 IP but many sites.  I want to probe each site and ensure I get a 200 code.  I also have to provide credentials to the site.  If I open IE I can log in to the site just fine with the credentials.  However, there is an ActiveX control box that wants to be installed.  When I set this up on my ACE I get an HTTP 401 Unauthorized error.  I did a Wireshark capture while browsing and I see the 401, but it also reports a 200 code after that.  Do you think this is a problem because of the ActiveX control wanting to be downloaded?  Or is the issue that the first HTTP code received by the probe is the 401, followed by the 200?  Below is my config (cleaned, of course).
    probe http HTTP-80-OUR.DOMAIN.COM
      interval 15
      passdetect interval 60
      credentials
      request method get url http://our.domain.com/default.aspx
      expect status 200 200
      header Host header-value "our.domain.com"
      open 1
    rserver host SERVER-A
      ip address X.X.X.47
      inservice
    rserver host SERVER-B
      ip address X.X.X.48
      inservice
    serverfarm host FARM-AB
      predictor leastconns
      probe HTTP-80-OUR.DOMAIN.COM
      rserver SERVER-A
        inservice
      rserver SERVER-B
        inservice
    ACE4710# show probe HTTP-80-OUR.DOMAIN.COM detail
    probe       : HTTP-80-OUR.DOMAIN.COM
    type        : HTTP
    state       : ACTIVE
    description :
       port      : 80      address     : 0.0.0.0         addr type  : -
       interval  : 15      pass intvl  : 60              pass count : 3
       fail count: 3       recv timeout: 10
       http method      : GET
       http url         : http://our.domain.com
       conn termination : GRACEFUL
       expect offset    : 0         , open timeout     : 1
       expect regex     : -
       send data        : -
                    ------------------ probe results ------------------
       associations ip-address      port  porttype probes   failed   passed   health
       ------------ ---------------+-----+--------+--------+--------+--------+------
       serverfarm  : OUR.DOMAIN.COM-10.25.4.12-L3-FARM
         real      : SERVER-A[0]
                    X.X.X.47      80    DEFAULT  414      406      8        FAILED
       Socket state        : CLOSED
       No. Passed states   : 1         No. Failed states : 2
       No. Probes skipped  : 0         Last status code  : 401
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : Received invalid status code
       Last probe time     : Wed Jun  2 17:44:18 2010
       Last fail time      : Wed Jun  2 13:37:04 2010
       Last active time    : Wed Jun  2 13:34:19 2010
         real      : SERVER-B[0]
                    X.X.X.48      80    DEFAULT  414      406      8        FAILED
       Socket state        : CLOSED
       No. Passed states   : 1         No. Failed states : 2
       No. Probes skipped  : 0         Last status code  : 401
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : Received invalid status code
       Last probe time     : Wed Jun  2 17:44:20 2010
       Last fail time      : Wed Jun  2 13:37:06 2010
       Last active time    : Wed Jun  2 13:34:21 2010

  • Issue with Scripted Probe for LDAP

    I have the script LDAP_PROBE loaded into memory on my ACE 4710 (A4(2.0)) and the probe is configured for the LDAP port the servers are listening on. Here is the configuration.
    probe scripted LDAP_PROBE_3389
      port 3389
      interval 5
      passdetect interval 5
      passdetect count 2
      receive 5
      script LDAP_PROBE 3389
    I have tried removing the argument of 3389 at the bottom as well but I continue to get the result:
    real      : LDAP02[3389]
                    10.220.31.81    3389  PROBE    2491     2491     0        FAILED
       Socket state        : RESET
       No. Passed states   : 0         No. Failed states : 1
       No. Probes skipped  : 0         Last status code  : 30002
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : Probe error: Server did not respond as expected
       Last probe time     : Thu Jul 12 16:24:41 2012
       Last fail time      : Thu Jul 12 12:56:59 2012
       Last active time    : Never
    The server log states this was successful however...
    Admin Acct Status: Not Locked
    AuditV3--2012-07-11-14:18:21.428+00:00DST--V3 anonymous Bind--bindDN: <*CN=NULLDN*>--client: 10.220.31.217:56908--connectionID: 8--received: 2012-07-11-14:18:21.428+00:00DST--Success
    name: <*CN=NULLDN*>
    authenticationChoice: simple
    Admin Acct Status: Not Locked
    Am I missing an argument? I have run debug on LDAP but really don't know what I am looking at...

    To update the script
    ==============
    Extract the Cisco-supplied LDAP script from the tar.gz or zip file. Rename it to something unique. Update it to use the
    new length and offset.
    Import the script into the LDAP contexts on both ACEs. Remember, scripts are not replicated and having mismatched scripts will cause replication to fail.
    ACE1/ldap# copy tftp: disk0:
    Enter source filename[]? UoN-LDAP_PROBE-iLDAP2
    Enter the destination filename[]? [UoN-LDAP_PROBE-iLDAP2]
    Address of remote host[]? [redacted]
    Trying to connect to tftp server......
    TFTP get operation was successful
    ACE2/ldap# copy tftp: disk0:
    Enter source filename[]? UoN-LDAP_PROBE-iLDAP2
    Enter the destination filename[]? [UoN-LDAP_PROBE-iLDAP2]
    Address of remote host[]? [redacted]
    Trying to connect to tftp server......
    TFTP get operation was successful
    script file 13 UoN-LDAP_PROBE-iLDAP2
    If you look at (for example) packet 651 in the capture in wireshark you'll see a
    successful bind response. You will need to tell wireshark to decode the packet as LDAP.
    The payload is:
    30 84 00 00 00 10 02 01 01 61 84 00 00 00 07 0a 01 00 04 00 04 00
    You need to have a basic understanding of ASN.1 and something called Basic Encoding Rules (BER), which comes down to TLV format structures.
    The key to understanding this output is that there are three ways of specifying a length in ASN.1. The first way, which we have already seen in the Cisco script, is to use a single byte. This is known as the "definite" form and can be used for lengths of 127 bytes or less. Otherwise, if the high bit is set to one, the low seven bits define the length of the length. The length is then encoded in that many bytes. This is the "length of the length field" form. It looks like Microsoft Active Directory uses this "length of the length" form for all length encoding. The third form (for completeness) is "indefinite", where the length is coded as x'80' and the end of the content is marked by x'0000'. Deconstructing the data:
    0x30    The start of a universal constructed sequence
    0x84    The length of the sequence in "length of the length" format. The next 4 bytes give the length.
    0x00000010    sequence length of 16 bytes
    0x02    Integer
    0x01    The length of the next field (1 byte)
    0x01    Value (this is the message ID which agrees with the ID in the BIND Request)
    0x61    Application, number 1, use RFC2251 to decode. This is a Bind Response
    0x84    The length of the sequence in "length of the length" format. The next 4 bytes give the length.
    0x00000007    bind response length of 7 bytes   
    0x0a    Enumeration
    0x01    Length 1
    0x00    0 - Success
    0x04    String
    0x00    Length 0 (null string)
    0x04    String
    0x00    Length 0 (null string)
    The patch given takes in 20 bytes from the bitstream, converts them into a hexadecimal string and finds the 6 hexadecimal characters from the 16th byte onwards (Tcl uses zero-based arrays, so this is index 15). These three bytes (0a 01 00) hold the response code.
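For anyone who wants to check the breakdown above without Wireshark, here is a minimal Python sketch that walks the same payload as TLV structures. It is not the Cisco Tcl script or the patch, only an illustration of the short and "length of the length" forms. It does not handle the indefinite form, and the function names are my own.

```python
from typing import Tuple


def read_length(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (length, next_pos) for a BER length field starting at pos."""
    first = data[pos]
    if first < 0x80:                      # single byte, lengths up to 127
        return first, pos + 1
    if first == 0x80:
        raise ValueError("indefinite length is not handled in this sketch")
    count = first & 0x7F                  # low seven bits: number of length bytes
    start = pos + 1
    return int.from_bytes(data[start:start + count], "big"), start + count


def read_tlv(data: bytes, pos: int) -> Tuple[int, bytes, int]:
    """Return (tag, value, next_pos) for the TLV starting at pos."""
    tag = data[pos]
    length, pos = read_length(data, pos + 1)
    return tag, data[pos:pos + length], pos + length


def bind_result_code(message: bytes) -> int:
    """Extract resultCode from an LDAP Bind Response message."""
    tag, body, _ = read_tlv(message, 0)       # 0x30 sequence
    if tag != 0x30:
        raise ValueError("not a sequence")
    _, _message_id, pos = read_tlv(body, 0)   # 0x02 integer, the message ID
    tag, response, _ = read_tlv(body, pos)    # 0x61 Bind Response
    if tag != 0x61:
        raise ValueError("not a Bind Response")
    tag, code, _ = read_tlv(response, 0)      # 0x0a enumeration, the result code
    if tag != 0x0A:
        raise ValueError("result code missing")
    return int.from_bytes(code, "big")


PAYLOAD = bytes.fromhex("30 84 00 00 00 10 02 01 01 61 84 00 00 00 07 0a 01 00 04 00 04 00")

print(bind_result_code(PAYLOAD))   # result code, 0 means success
print(PAYLOAD[15:18].hex())        # the fixed-offset view: bytes at index 15 to 17
```

The fixed-offset slice only works while every length keeps the same encoding, which is why a script written for single-byte lengths looks in the wrong place for this server's reply.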
    Kind Regards
    Cathy

  • HTTP probe issue with expect regex string

    Hello,
    We have a simple cgi status page setup to poll a background service and return a "PASS" or "FAIL" as output.  I've setup an HTTP probe to look for the "PASS" to determine application health.  The issue appears to be that the expect regex is searching the HEADER but not the BODY of the web page.  I can successfully match on any string in the header, but never on anything in the body.
    Here is what the web page returns if you telnet to it:
    HTTP/1.1 200 OK
    Date: Thu, 22 Sep 2011 22:45:07 GMT
    Server: Apache/2.0.59  HP-UX_Apache-based_Web_Server (Unix) DAV/2
    Content-Length: 4
    Connection: close
    Content-Type: text/plain; charset=iso-8859-1
    PASS
    Here is my probe:
    probe http JOE-TEST-CS
      interval 45
      passdetect interval 30
      receive 30
      request method get url /cgi-bin/ERMS-PREP-statusRepo.cgi
      expect status 0 999
      open 20
      expect regex "PASS"
    Here is the output of the show probe:
    ACE1/euhr-test-ace2# sh probe JOE-TEST-CS detail
    probe       : JOE-TEST-CS
    type        : HTTP
    state       : ACTIVE
    description :
       port      : 80      address     : 0.0.0.0         addr type  : -
       interval  : 45      pass intvl  : 30              pass count : 3
       fail count: 3       recv timeout: 30
       http method      : GET
       http url         : /cgi-bin/ERMS-PREP-statusRepo.cgi
       conn termination : GRACEFUL
       expect offset    : 0         , open timeout     : 20
       expect regex     : PASS
       send data        : -
                           --------------------- probe results --------------------
       probe association   probed-address  probes     failed     passed     health
       ------------------- ---------------+----------+----------+----------+-------
       serverfarm  : JOE-TEST-PROBE-CS
         real      : EUHRTDM50.APP[0]
                           192.168.73.71   2          2          0          FAILED
       Socket state        : CLOSED
       No. Passed states   : 0         No. Failed states : 1
       No. Probes skipped  : 0         Last status code  : 200
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : User defined Reg-Exp was not found in Host Response
      Last probe time     : Thu Sep 22 15:00:36 2011
       Last fail time      : Thu Sep 22 15:00:36 2011
       Last active time    : Thu Sep 22 09:40:19 2011
    If I replace the expect regex "PASS" with anything from the HEADER it succeeds!
    Any thoughts?

    Sorry, I missed it.  The content-length in your response is 4.  I think this may be the issue.  I created a basic HTML page that says PASS in the body and my server is returning a content-length of 224 when I fetch the page.  Here is my HTML request:
    GET /index.html
    http-equiv="Content-Type">
      Probe
    PASS
    Here are my headers that I received:
    (Status-Line)    HTTP/1.1 200 OK
    Content-Length    224
    Content-Type    text/html
    Last-Modified    Tue, 27 Sep 2011 12:05:00 GMT
    Accept-Ranges    bytes
    Etag    "8cca60aed7dcc1:41f"
    Server    Microsoft-IIS/6.0
    Date    Tue, 27 Sep 2011 12:25:59 GMT
    What version of code are you running on your ACE?  I can also look to see if there are any known issues.
    Kris
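To confirm from a workstation where a pattern actually sits in the reply, a small Python sketch like the following may help. It says nothing about how the ACE itself applies expect regex. It only splits a raw response at the blank line and reports whether the pattern is found in the header, the body, or both. The raw text below is rebuilt from the telnet output in the question, with the blank line between header and body added as an assumption.

```python
import re
from typing import Dict

RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Thu, 22 Sep 2011 22:45:07 GMT\r\n"
    "Server: Apache/2.0.59  HP-UX_Apache-based_Web_Server (Unix) DAV/2\r\n"
    "Content-Length: 4\r\n"
    "Connection: close\r\n"
    "Content-Type: text/plain; charset=iso-8859-1\r\n"
    "\r\n"
    "PASS"
)


def where_matches(raw: str, pattern: str) -> Dict[str, bool]:
    """Report whether pattern is found in the header block and in the body."""
    header, _, body = raw.partition("\r\n\r\n")
    return {
        "header": re.search(pattern, header) is not None,
        "body": re.search(pattern, body) is not None,
    }


print(where_matches(RAW_RESPONSE, "PASS"))
print(where_matches(RAW_RESPONSE, "Apache"))
```

If the same check against the live server shows the string only in the body, the probe result points at the ACE side rather than at the CGI page.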

  • ACE - TCP probe goes into INVALID state

    Hello,
    I have a problem with the following configuration of a sticky serverfarm with a backup serverfarm
    (this setup is of course used only for failover purposes, not load balancing):
    probe tcp tcp-8888-probe
      port 8888
      interval 5
      faildetect 2
      passdetect interval 3
      passdetect count 1
    rserver host rsrv1
      ip address 10.1.2.10
      inservice
    rserver host rsrv2
      ip address 10.1.2.11
      inservice
    serverfarm host rfarm-primary
      predictor leastconns
      probe tcp-8888-probe
      rserver rsrv1 8888
        inservice
    serverfarm host rfarm-backup
      predictor leastconns
      probe tcp-8888-probe
      rserver rsrv2 8888
       inservice
    sticky http-cookie RFARM-COOKIE sticky-rfarm-1
      cookie insert browser-expire
      serverfarm rfarm-primary backup rfarm-backup
    etc....
    The problem is that every time probe state changes (from SUCCESS to FAIL or otherwise), the tcp-8888-probe on the server that changed
    the state of service, goes into INVALID state:
    #show probe tcp-8888-probe detail
    probe       : tcp-8888-probe
    type        : TCP
    state       : ACTIVE
    description :
       port      : 8888    address     : 0.0.0.0         addr type  : -
       interval  : 5       pass intvl  : 3               pass count : 1
       fail count: 2       recv timeout: 10
       conn termination : GRACEFUL
       expect offset    : 0         , open timeout     : 10
       expect regex     : -
       send data        : -
                           --------------------- probe results --------------------
       probe association   probed-address  probes     failed     passed     health
       ------------------- ---------------+----------+----------+----------+-------
       serverfarm  : rfarm-backup
         real      : rsrv2[8888]
                           10.1.2.11    291        0          291        SUCCESS
       Socket state        : CLOSED
       No. Passed states   : 1         No. Failed states : 0
       No. Probes skipped  : 0         Last status code  : 0
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err :  -
       Last probe time     : Thu Jun 17 22:12:31 2010
       Last fail time      : Never
       Last active time    : Thu Jun 17 21:48:21 2010
       serverfarm  : rfarm-primary
         real      : rsrv1[8888]
                           10.1.2.10    0          0          0          INVALID
       Socket state        : CLOSED
       No. Passed states   : 0         No. Failed states : 0
       No. Probes skipped  : 0         Last status code  : 0
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err :  -
       Last probe time     : Never
       Last fail time      : Never
       Last active time    : Never
    I have managed to get the probe into the FAIL state again for a moment by removing it from the serverfarm and then reapplying it, but within a few seconds it goes from FAIL to INVALID again and stays in this state regardless of the availability of the probed TCP port. Only when I reapply it while the port is available/up does it stay in the SUCCESS state, and it works until the service fails, at which point the INVALID state reappears.
    What can be the cause of such behavior ?
    thanks,
    WM

    Hello,
    It looks very similar to this bug: CSCsh74871
    You may need to collect a #show tech-support and do the following:
    -remove the serverfarm in question
    -reboot the ace module under a maintenance window.
    You may upgrade to a higher version since your version is kind of old.
    Jorge

  • ACE Health probe for SIP

    I've setup a SIP probe to check the health of a Microsoft OCS. The health of this server is always failed. What am I missing? I also tried it with a telnet probe on port 5061, but got the same result. A telnet from ACE to the server on port 5061 works fine.
    See below a show probe SIP detail and the relevant configuration.
    ACE21_Secondary/MOCS# sh probe SIP det
    probe : SIP
    type : SIP
    state : ACTIVE
    description :
    port : 5061 address : 0.0.0.0 addr type : -
    interval : 10 pass intvl : 10 pass count : 3
    fail count: 3 recv timeout: 4
    request-method : OPTIONS
    conn termination : GRACEFUL
    expect offset : 0 , open timeout : 2
    expect regex : -
    ------------------ probe results ------------------
    associations ip-address port porttype probes failed passed health
    ------------ ---------------+-----+--------+--------+--------+--------+------
    rserver : OCS_11
    10.105.11.70 5061 -- 7566 7566 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 0
    No. Probes skipped : 0 Last status code : 0
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Server reply timeout (no reply)
    Last probe time : Thu Oct 30 14:18:42 2008
    Last fail time : Tue Oct 28 16:31:30 2008
    Last active time : Never
    ACE21_Secondary/MOCS# sh run
    probe sip tcp SIP
    port 5061
    interval 10
    passdetect interval 10
    receive 4
    expect status 200 200
    open 2
    rserver host OCS_11
    ip address 10.105.11.70
    probe SSL
    probe PING
    probe SIP
    probe SIP_TELNET
    inservice
    Cheers
    Peter

    Peter,
    make sure to NOT run version A2(1.1a) as SIP probes are broken in that specific release.
    If your version is something else, get a sniffer trace on the server to see what is going on.
    Seems like we don't get a reply according to the line :
    "Last disconnect err : Server reply timeout (no reply) "
    Gilles.

  • ACE: Problem configuring probe snmp

    Hi,
    I have a problem configuring an SNMP probe. My server is a dual-core W2K3 machine with SNMP community public, and the CPU OID is .1.3.6.1.2.1.25.3.3.1.2. The output is:
    access-list anyone line 8 extended permit ip any any
    probe snmp was
    interval 4
    faildetect 2
    passdetect interval 10
    receive 2
    community public
    oid .1.3.6.1.2.1.25.3.3.1.2
    threshold 70
    rserver host was1
    ip address 10.24.8.200
    probe was
    inservice
    rserver host was2
    ip address 10.24.8.201
    probe was
    inservice
    serverfarm host servers
    rserver was1
    inservice
    rserver was2
    inservice
    class-map type management match-any ADM-CONTEX-SERV1
    4 match protocol icmp any
    5 match protocol snmp any
    class-map type http loadbalance match-all Check-Headers
    2 match http url .*
    3 match http header Host header-value "10.24.16.*"
    4 match http header User-Agent header-value ".*MSIE.*"
    class-map match-all VIP-10-HTTP
    2 match virtual-address 10.24.16.10 tcp eq www
    class-map type http loadbalance match-all other-HTTP
    2 match http url .*
    policy-map type management first-match ADM-CTX-SERV1
    class ADM-CONTEX-SERV1
    permit
    policy-map type loadbalance first-match L7-logic
    class Check-Headers
    serverfarm servers
    class other-HTTP
    serverfarm servers
    policy-map type loadbalance first-match lb-logic
    class class-default
    serverfarm servers
    policy-map multi-match client-vips
    class VIP-10-HTTP
    loadbalance vip inservice
    loadbalance policy L7-logic
    loadbalance vip icmp-reply active
    interface vlan 60
    ip address 10.24.8.5 255.255.255.0
    access-group input anyone
    access-group output anyone
    service-policy input ADM-CTX-SERV1
    no shutdown
    interface vlan 233
    ip address 10.24.16.5 255.255.255.0
    access-group input anyone
    access-group output anyone
    service-policy input ADM-CTX-SERV1
    service-policy input client-vips
    no shutdown
    ip route 0.0.0.0 0.0.0.0 10.24.16.1
    sh probe was detail
    probe : was
    type : SNMP
    state : ACTIVE
    description :
    port : 161 address : 0.0.0.0 addr type : TRANSPARENT
    interval : 4 pass intvl : 10 pass count : 3
    fail count: 2 recv timeout: 2
    version : 1 community : public
    oid string #1 : .1.3.6.1.2.1.25.3.3.1.2
    type : PERCENTILE max value : 100
    weight : 16000 threshold : 70
    --------------------- probe results --------------------
    probe association probed-address probes failed passed health
    ------------------- ---------------+----------+----------+----------+-------
    rserver : was1
    10.24.8.201 13 13 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 0
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Server reply - bad SNMP OID
    Last probe time : Tue Feb 24 23:22:41 2009
    Last fail time : Tue Feb 24 23:20:47 2009
    Last active time : Never
    Server load : 16000
    rserver : was2
    10.24.8.200 12 12 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 0
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Server reply timeout (no reply)
    Last probe time : Tue Feb 24 23:22:34 2009
    Last fail time : Tue Feb 24 23:20:52 2009
    Last active time : Never
    Server load : 16000

    Hi,
    For a multicore processor you need to make a few changes to get the load on each core/processor. You need to have an instance for each core.
    Try adding .1 or .2 to the OID to get the load on each core.
    Also try doing an snmpwalk on the OID to see what the real structure is.
    HTH
    Cathy
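As a small illustration of the per-core suggestion, the Python sketch below builds one instance OID per core by appending an index to the column OID from the probe. The index values 1 and 2 are only the ones suggested above. The real instance numbers on a given server have to come from an snmpwalk, so treat them as placeholders.

```python
from typing import Iterable, List

# Column OID used in the probe configuration above.
CPU_LOAD_COLUMN = ".1.3.6.1.2.1.25.3.3.1.2"


def instance_oid(column_oid: str, index: int) -> str:
    """Append an instance index to a table column OID."""
    return f"{column_oid.rstrip('.')}.{index}"


def per_core_oids(column_oid: str, indexes: Iterable[int]) -> List[str]:
    """Build one OID per core; indexes are placeholders until confirmed by snmpwalk."""
    return [instance_oid(column_oid, i) for i in indexes]


for oid in per_core_oids(CPU_LOAD_COLUMN, [1, 2]):
    print(oid)
```

Each resulting OID would then be configured as its own oid entry in the probe.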

  • ACE: probe failing

    Hi,
    I have the following probe configured:
    probe http probe1.test.com:10114
      port 10114
      interval 34
      faildetect 17
      passdetect interval 60
      expect status 200 200
      header Host header-value "hcmfincrp1.test.com"
      open 1
    It is applied to the serverfarm, but the health check is failing. I see the following when I do "sh probe probe1.test.com:10114 detail":
    sh probe probe1.test.com:10114 deta
    probe       : probe1.test.com:10114
    type        : HTTP
    state       : ACTIVE
    description :
       port      : 10114   address     : 0.0.0.0         addr type  : -
       interval  : 34      pass intvl  : 60              pass count : 3
       fail count: 17      recv timeout: 10
       http method      : GET
       http url         : /
       conn termination : GRACEFUL
       expect offset    : 0         , open timeout     : 1
       expect regex     : -
       send data        : -
                    ------------------ probe results ------------------
       associations ip-address      port  porttype probes   failed   passed   health
       ------------ ---------------+-----+--------+--------+--------+--------+------
       serverfarm  : probe1.test.com:443
         real      : server1.test.com[10114]
                    192.168.1.110114 PROBE    41531    19556    21975    FAILED
       Socket state        : CLOSED
       No. Passed states   : 5         No. Failed states : 6
       No. Probes skipped  : 0         Last status code  : 0
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : Unrecognized or invalid response
       Last probe time     : Wed Oct 12 17:43:30 2011
       Last fail time      : Tue Oct 11 02:33:52 2011
       Last active time    : Sun Oct  9 20:24:02 2011
    May I know why the health check is failing? Why am I seeing the message "Last disconnect err : Unrecognized or invalid response"?

    Hi ,
    This error means that the ACE is not receiving a 200 OK response from the server. This happens when the server is not responding, when it does not answer requests carrying the Host header value hcmfincrp1.test.com that you have defined, or when the page has been modified. Please check that your HTTP server is working correctly.
    Regards
    Abijith

  • Ace HTTP Probe expect regex

    Hi,
    I have a question about the config of the ACE probe.
    I have the following probe defined :
    probe http P_HTTP_TEST
    interval 5
    passdetect interval 2
    passdetect count 2
    request method get url /test
    expect status 200 200
    expect regex trululu
    I would like to use the regex just like the expect string on the CSM probe...
    The regex doesn't seem to work: the probe passes even though the string trululu is not on the page tested.
    I guess the expect status overrides the regex, but without the expect status it doesn't work either.
    Does anyone know exactly how the probe expect works for HTTP?
    Another question: on the CSM module, the TCP probe by default uses the real server's port for the probe, not the default port of the probe type. Is it possible to change the ACE so that it mimics the CSM way of working?
    Thanks a lot ;-)

    This seems to be a bug in some versions of the ACE software, where the HTTP return code overrides a missing regex. The bug is definitely present in:
    system:    Version A2(2.0) [build 3.0(0)A2(2.0)]
    Notice the difference between 192.168.1.1 (regex missing from the HTTP response) and 192.168.1.2 (regex present in the HTTP response). Both are successful. In addition, 192.168.1.1 (missing regex) shows a last status code of 200, which seems to be sufficient for the probe to pass, while 192.168.1.2 (which sends the expected regex) doesn't show a last status code.
    probe       : tw2_http_81
    type        : HTTP
    state       : ACTIVE
    description :
       port      : 81      address     : 0.0.0.0         addr type  : -
       interval  : 30      pass intvl  : 30              pass count : 1
       fail count: 1       recv timeout: 10
       http method      : GET
       http url         : /knowtw2-f/livelink.exe?func=ll&objtype=142&bypass
       conn termination : GRACEFUL
       expect offset    : 0         , open timeout     : 10
       expect regex     : lbmonitor
       send data        : -
                           --------------------- probe results --------------------
       probe association   probed-address  probes     failed     passed     health
       ------------------- ---------------+----------+----------+----------+-------
         real      : 192.168.1.1[81]
                           192.168.1.1    2          0          2          SUCCESS
       Socket state        : CLOSED
       No. Passed states   : 1         No. Failed states : 0
       No. Probes skipped  : 0         Last status code  : 200
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err :  -
       Last probe time     : Mon Nov  7 12:38:42 2011
       Last fail time      : Never
       Last active time    : Mon Nov  7 12:38:22 2011
         real      : 192.168.1.2[81]
                           192.168.1.2    2          0          2          SUCCESS
       Socket state        : CLOSED
       No. Passed states   : 1         No. Failed states : 0
       No. Probes skipped  : 0         Last status code  : 0
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err :  -
       Last probe time     : Mon Nov  7 12:38:27 2011
       Last fail time      : Never
       Last active time    : Mon Nov  7 12:37:58 2011

  • HTTP probe in ACE

    We have a simple layer 3-4 port 80 app that is being load balanced by the ACE, and we created an HTTP probe that actually acts more like a TCP probe, since we took the default for just about all the attributes:
    probe http WEB_SERVERS
    expect status 200 200
    Unfortunately, when we activated this probe, we saw the following:
    probe : WEB_SERVERS
    type : HTTP
    state : ACTIVE
    description :
    port : 80 address : 0.0.0.0 addr type : -
    interval : 120 pass intvl : 300 pass count : 3
    fail count: 3 recv timeout: 10
    http method : GET
    http url : /
    conn termination : GRACEFUL
    expect offset : 0 , open timeout : 10
    expect regex : -
    send data : -
    --------------------- probe results --------------------
    probe association probed-address probes failed passed health
    ------------------- ---------------+----------+----------+----------+-------
    real : Planview_136.39[0]
    167.238.136.39 1 1 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 302
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Received invalid status code
    Last probe time : Wed Jul 22 15:07:20 2009
    Last fail time : Wed Jul 22 15:07:21 2009
    Last active time : Never
    real : Planview_136.40[0]
    167.238.136.40 1 1 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 302
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Received invalid status code
    Last probe time : Wed Jul 22 15:07:20 2009
    Last fail time : Wed Jul 22 15:07:21 2009
    Last active time : Never
    The obvious culprit here is the return code. How do we assign the correct return code here?
    Thanks...

    Hi,
    I wouldn't just let it default. It is better to probe for a particular page if that is possible. If this is a page you create, then it offers the possibility of being able to take a server out of rotation simply by renaming the page. E.g.
    probe http PROBE-iamhere
    interval 30
    passdetect interval 10
    request method head url /serverhere.html
    expect status 200 200
    Alternatively, since it looks like you are getting a 302 response code (a redirect), you could just change the line in the probe to expect that:
    probe http WEB_SERVERS
    expect status 302 302
    HTH
    Cathy
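
    The effect of the `expect status` range can be modelled in a few lines. This is only a sketch of the comparison described in the reply, assuming each `expect status MIN MAX` line defines an inclusive range; the function names are made up for illustration.

```python
def parse_expect_status(config_lines):
    """Collect inclusive (min, max) ranges from 'expect status MIN MAX' lines."""
    ranges = []
    for line in config_lines:
        words = line.split()
        if words[:2] == ["expect", "status"] and len(words) == 4:
            ranges.append((int(words[2]), int(words[3])))
    return ranges


def probe_passes(status, ranges):
    """A probe passes only if the status falls inside a configured range."""
    return any(low <= status <= high for low, high in ranges)


original = parse_expect_status(["probe http WEB_SERVERS", "expect status 200 200"])
adjusted = parse_expect_status(["probe http WEB_SERVERS", "expect status 302 302"])

print(probe_passes(302, original))  # the redirect is outside 200-200
print(probe_passes(302, adjusted))  # the redirect is inside 302-302
```

    With the original range, the 302 seen in "Last status code" falls outside 200-200, which matches the "Received invalid status code" failure in the output.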

  • ACE ping probe

    Hi,
    I have a strange problem on my ACE in one-arm design.
    I have a real server which I can ping from the ACE, but a ping probe always fails:
    server : APACHE4
    10.144.131.6 28 28 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 4 Last status code : 0
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Server reply timeout (no reply)
    Last probe time : Sat Dec 9 11:42:57 2006
    Last fail time : Sat Dec 9 11:29:57 2006
    Last active time : Never
    ace/INTRANET# ping 10.144.131.6
    Pinging 10.144.131.6 with timeout = 2, count = 5, size = 100 ....
    Response from 10.144.131.6 : seq 1 time 0.335 ms
    Response from 10.144.131.6 : seq 2 time 0.181 ms
    Response from 10.144.131.6 : seq 3 time 0.340 ms
    Response from 10.144.131.6 : seq 4 time 0.266 ms
    Response from 10.144.131.6 : seq 5 time 0.341 ms
    5 packet sent, 5 responses received, 0% packet loss
    I have a couple of other real servers which do not have this problem.
    Any ideas?
    According to netflow on the 6500 the server answers correctly.
    There are no syslog messages.
    interface vlan 552
    ip address 10.144.130.3 255.255.255.0
    alias 10.144.130.1 255.255.255.0
    peer ip address 10.144.130.2 255.255.255.0
    no normalization
    no icmp-guard
    access-group input PERMIT
    service-policy input MANAGEMENT
    service-policy input SLB
    no shutdown
    probe icmp PING
    interval 2
    faildetect 5
    passdetect interval 30
    passdetect count 2
    rserver host APACHE1
    ip address 10.144.131.131
    probe PING
    inservice
    rserver host APACHE2
    ip address 10.144.131.132
    probe PING
    inservice
    rserver host APACHE3
    ip address 10.144.131.133
    probe PING
    inservice
    rserver host APACHE4
    ip address 10.144.131.6
    probe TEST
    probe PING
    inservice
    probe tcp TEST
    port 22
    interval 2
    faildetect 5
    passdetect interval 30
    passdetect count 2
    ace/INTRANET# sh probe
    probe : PING
    type : ICMP, state : ACTIVE
    port : 0 address : 0.0.0.0 addr type : -
    interval : 2 pass intvl : 30 pass count : 2
    fail count: 5 recv timeout: 10
    --------------------- probe results --------------------
    probe association probed-address probes failed passed health
    ------------------- ---------------+----------+----------+----------+-------
    rserver : APACHE1
    10.144.131.131 2312 0 2312 SUCCESS
    rserver : APACHE2
    10.144.131.132 2311 0 2311 SUCCESS
    rserver : APACHE3
    10.144.131.133 2311 0 2311 SUCCESS
    rserver : APACHE4
    10.144.131.6 38 38 0 FAILED
    rserver : IIS1
    10.144.131.129 2311 0 2311 SUCCESS
    rserver : IIS2
    10.144.131.130 2311 0 2311 SUCCESS
    probe : TEST
    type : TCP, state : ACTIVE
    port : 22 address : 0.0.0.0 addr type : -
    interval : 2 pass intvl : 30 pass count : 2
    fail count: 5 recv timeout: 10
    --------------------- probe results --------------------
    probe association probed-address probes failed passed health
    ------------------- ---------------+----------+----------+----------+-------
    rserver : APACHE4
    10.144.131.6 557 0 557 SUCCESS
    I have 3.0(0)A1(3b)

    Hi,
    unfortunately your URL did not help me.
    I found out that the sup720-3b adds 23 bytes of zero padding to exactly the frames corresponding to the failing ping probe. I saw this by spanning the internal te4/1 port from the switch to the ACE to a sniffer.
    The strange thing is that the frame is padded although it's larger than the minimum frame size of 64 bytes.
    When I configure a log-input ACL on the sup720-3b to force the traffic to be routed by the MSFC3 instead of the PFC3 then the ping probe works and the same frames are not padded any more!!
    We run IOS modularity on the sups and according to the 12.2SX release notes they do not support the ACE. I suppose that's the root cause. We will change the sup sw ASAP.
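
    The padding observation can be checked on a captured frame by comparing the frame length with the IPv4 total-length field. The sketch below assumes an untagged Ethernet II frame carrying IPv4, captured without the 4-byte FCS; the sample frame is synthetic, not taken from the capture described above.

```python
ETH_HEADER_LEN = 14      # destination MAC + source MAC + EtherType
MIN_FRAME_NO_FCS = 60    # 64-byte Ethernet minimum minus the 4-byte FCS


def expected_padding(frame_len_no_fcs):
    """Padding a sender needs to reach the Ethernet minimum frame size."""
    return max(0, MIN_FRAME_NO_FCS - frame_len_no_fcs)


def trailer_len(frame):
    """Bytes present in the frame after the end of the IPv4 packet."""
    ip_total_len = int.from_bytes(frame[16:18], "big")  # IPv4 total length
    return len(frame) - ETH_HEADER_LEN - ip_total_len


# Synthetic frame: Ethernet header, 20-byte IPv4 header announcing a
# 128-byte packet, 108 bytes of payload, then 23 trailing zero bytes.
eth = bytes(12) + b"\x08\x00"
ip_header = bytearray(20)
ip_header[0] = 0x45
ip_header[2:4] = (128).to_bytes(2, "big")
frame = eth + bytes(ip_header) + bytes(108) + bytes(23)

print(trailer_len(frame))                   # bytes beyond the IP packet
print(expected_padding(len(frame) - 23))    # padding the frame actually needed
```

    A non-zero trailer on a frame whose expected padding is zero is the pattern described in the post.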

  • ACE Module - HTTP Probe failure

    Hi,
    I have configured the http probe with expect status 200 202, but the probe fails despite availability of the port on rserver.
    I tried the HEAD and GET methods to see the return code, and it came back with HTTP/1.1 302. How can I configure an HTTP probe to treat the HTTP 302 code as a successful return?
    Thanks.

    I changed the expect status value as below
    probe http TEST-HTTP
    interval 30
    passdetect interval 10
    request method head
    expect status 302 302
    The probe is still failing with the log message
    Apr 20 2009 12:04:35 : %ACE-3-251010: Health probe failed for server 192.168.1.10 on port 80, received invalid status code
    On 'show probe detail' it shows the last status code as 400 which means Bad Request
    --------------------- probe results --------------------
    probe association probed-address probes failed passed health
    ------------------- ---------------+----------+----------+----------+-------
    serverfarm : TEST-APP
    real : TEST-SERVER1[80]
    192.168.1.10 27 27 0 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 400
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Received invalid status code
    Last probe time : Mon Apr 20 12:05:33 2009
    Last fail time : Mon Apr 20 12:00:53 2009
    Last active time : Never
    The HTTP page displays perfectly in the web browser. Also, using the HTTP HEAD/GET tool, I can see that 302 is returned.
    What could be the problem?
    Regards.

  • Cisco ACE probe setup

    I configured a probe to check the health of a server web page, but I am getting a status code of 400.
    probe http PROBE_80
      interval 10
      faildetect 2
      passdetect interval 10
      passdetect count 2
      receive 5
      request method get url http://<host>:<port>/eml/HealthCheckServlet
      expect status 200 202
      open 10
    I am getting the status code below. I would like to know the correct format of the request method for the above URL.
         real      : app02p[0]
                             192.168.10.6  80 VIP     161    161    0      FAILED
       Socket state        : CLOSED
       No. Passed states   : 0         No. Failed states : 1
       No. Probes skipped  : 0         Last status code  : 400
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : Received invalid status code
       Last probe time     : Tue Mar 17 02:53:58 2015
       Last fail time      : Tue Mar 17 02:27:15 2015
       Last active time    : Never

    Hi Hari,
    Does this URL return status 200 when you send the request directly from your browser?
    You should use the exact URL here. If the URL is fine, then check with your server team why the server is responding with 400. The syntax looks fine. You can also take a pcap on the server and see what ACE is sending for the probe.
    Regards,
    Kanwal
    Note: Please mark answers if they are helpful.
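
    If the pcap shows the server rejecting the absolute URL in the request line, one thing to try is sending only the path and carrying the host in a Host header, in the same form as the example probe in the opening post. The sketch below only splits a URL into those two pieces. The host name app.example.com and the port 8080 are placeholders, and whether this resolves the 400 is an assumption that the capture would have to confirm.

```python
from urllib.parse import urlsplit


def split_probe_url(url):
    """Split an absolute URL into (host[:port], path-with-query)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.netloc, path


def probe_lines(method, url):
    """Format the two probe lines in the style used in the opening post."""
    host, path = split_probe_url(url)
    return [
        f"request method {method} url {path}",
        f'header Host header-value "{host}"',
    ]


# Placeholder host and port; substitute the real values.
for line in probe_lines("get", "http://app.example.com:8080/eml/HealthCheckServlet"):
    print(line)
```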

  • HTTP GET Probe Monitoring

    I am trying to monitor our web servers from our load balancer with an HTTP probe, but the probe keeps failing. It is monitoring a Windows SharePoint server, and I can get to the test page with my credentials, but the probe seemingly can't pull it. Is there something here I am doing wrong? I have attached a screenshot of the probe for reference. I've tried a lot of different permutations of this probe config with no success. Any help from anyone who has done this before would be awesome.

    ACE-4710-DR/Admin# sh probe HTTP-GET  detail
     probe       : HTTP-GET
     type        : HTTP
     state       : ACTIVE
     description : Test for I-am-alive.html
       port      : 80      address     : 0.0.0.0         addr type  : -
       interval  : 15      pass intvl  : 60              pass count : 3
       fail count: 3       recv timeout: 10
       http method      : GET
       http url         : http://aspenintranet/PSC/Pages/I-am-alive.html
       conn termination : GRACEFUL
       expect offset    : 0         , open timeout     : 1
       regex cache-len  : 0
       expect regex     : -
       send data        : -
                    ------------------ probe results ------------------
       associations ip-address      port  porttype probes   failed   passed   health
       ------------ ---------------+-----+--------+--------+--------+--------+------
       rserver     : 10.22.5.100
                    10.22.5.100     80    --       2970     2970     0        FAILED
       Socket state        : CLOSED
       No. Passed states   : 0         No. Failed states : 1
       No. Probes skipped  : 0         Last status code  : 401
       No. Out of Sockets  : 0         No. Internal error: 0
       Last disconnect err : Received invalid status code
       Last probe time     : Fri May 23 11:33:29 2014
       Last fail time      : Wed May 21 10:04:45 2014
       Last active time    : Never
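
    The last status code of 401 means the server asked for authentication, which fits the note that the page opens with the poster's own credentials. If the server accepts HTTP Basic authentication (an assumption; SharePoint sites are often set up for other schemes), the probe would need to send a header like the one built below. This is only a sketch of what the header looks like on the wire, useful for comparing against a capture; the user name and password are the standard example pair from the Basic authentication RFC, not real values.

```python
import base64


def basic_auth_header(username, password):
    """Build the HTTP Basic Authorization header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


# Example credentials from the Basic authentication RFC.
print(basic_auth_header("Aladdin", "open sesame"))
```

    If a capture of the probe shows no Authorization header at all, credentials are not configured on the probe; check the command reference for your release for the exact syntax.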

  • Probe DNS

    Dear
    I have a DNS probe, but for some reason, even though the DNS service is up, the probe shows it as down. I tried configuring domain and expect, but the results are the same. The sequence is as follows:
    a) At first, the probe detects the service as up.
    b) The service goes down and the probe detects the failure. The rserver is down.
    c) The service is brought back up, but the probe never detects the service as up.
    See the following output:
    ACE4710-1/IIS# show probe DNS detail
    probe : DNS
    type : DNS
    state : ACTIVE
    description : "test de DNS"
    port : 53 address : 0.0.0.0 addr type : -
    interval : 30 pass intvl : 300 pass count : 3
    fail count: 3 recv timeout: 10
    dns domain : www.cisco.com
    --------------------- probe results --------------------
    probe association probed-address probes failed passed health
    ------------------- ---------------+----------+----------+----------+-------
    rserver : DNS1-G
    10.1.5.20 17 5 12 FAILED
    Socket state : CLOSED
    No. Passed states : 1 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 0
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Connection refused by server
    Last probe time : Wed Feb 11 20:17:12 2009
    Last fail time : Wed Feb 11 20:16:42 2009
    Last active time : Wed Feb 11 19:57:12 2009
    rserver : DNS1-N
    10.1.5.12 29 5 24 FAILED
    Socket state : CLOSED
    No. Passed states : 0 No. Failed states : 1
    No. Probes skipped : 0 Last status code : 0
    No. Out of Sockets : 0 No. Internal error: 0
    Last disconnect err : Connection refused by server
    Last probe time : Wed Feb 11 20:21:17 2009
    Last fail time : Wed Feb 11 20:20:47 2009
    Last active time : Tue Feb 10 22:03:17 2009
    admin dns DNS
    DOMAIN WWW.CISCO.COM
    expect 198.133.219.25
    Best Regards

    It does respond to your PC, but not to ACE.
    Or the response never makes it to ACE.
    Either because of a routing issue.
    Or because it is dropped by an ACL.
    Could even be an ACL on ACE itself.
    Again, take a sniffer trace to confirm that the response makes it to ACE.
    G.
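
    To compare what the server answers to another client with what ACE receives, a hand-built DNS query can be sent from a test host while the sniffer runs. The following is a minimal sketch, assuming a plain A-record query over UDP port 53. The helper names are illustrative, and the server address passed to `query_udp` has to be supplied by whoever runs it.

```python
import socket
import struct


def build_dns_query(name, query_id=0x1234):
    """Build a standard recursive A-record query for the given name."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # Question trailer: type A (1), class IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def query_udp(server_ip, name, timeout=2.0):
    """Send the query over UDP and return the raw reply bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((server_ip, 53))
        sock.send(build_dns_query(name))
        return sock.recv(512)


query = build_dns_query("www.cisco.com")
print(len(query), query.hex())
```

    On many systems, a closed UDP port makes `query_udp` raise `ConnectionRefusedError`, which would be worth comparing with the "Connection refused by server" text in the probe output, while a timeout would point toward the routing or ACL causes listed in the reply.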
