Probe skipping

Hello,
I am running into a rather interesting issue and am curious whether anyone has seen it before or has any insight into what the problem could be. On one of my ACE 4710s (running software A5(1.2)), I am running a fairly large number of layer 7 probes (71) across both ports 80 and 443. At seemingly random points in the day, the system reports that the probes are being skipped due to an internal error. I have seen this before when the system runs out of sockets for the probes, but I am not seeing any indication that this is the case here.
Here is an example probe config:
probe https CHECK-SOME-SITE
port 443
interval 10
faildetect 2
passdetect interval 30
receive 5
ssl version all
request method get url /some/url
header Host header-value "www.somesite.com"
expect regex "SOMEREGEX"
Here is the relevant output from 'show probe detail':
     real      : some-rserver
                          x.x.x.x 443 PROBE   3093610 1749563 1344047 SUCCESS
   Socket state        : CLOSED
   No. Passed states   : 49         No. Failed states : 49
   No. Probes skipped : 479         Last status code : 200
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : -
   Last probe time     : Tue Mar 4 16:45:03 2014
   Last fail time      : Fri Feb 28 13:30:37 2014
   Last active time    : Mon Mar 3 22:08:53 2014
Here are the log messages that are popping up:
Mar 4 2014 14:36:41 : %ACE-3-251014: Could not probe server x.x.x.x on port 443 for 4 consecutive tries - Internal error
The log messages appear for all rservers being probed for about 30 seconds, then they go away until the next event. Since the probes are skipped rather than failed, I do not believe this is actually causing failures at the moment. I have read that the ACE platform can only run 200 concurrent scripted probes, but I am at a loss as to how to check whether that is what I am running into here. The really confusing thing is that the internal error and out-of-sockets counters are both zero.
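One way to pin down when these events occur is to group the 251014 messages into bursts and see how many rservers each burst touches. The following is a minimal Python sketch, assuming the syslog lines have been exported to text in the format quoted above; the server addresses in the sample and the 60-second gap are made-up placeholders, not values from the ACE.

```python
import re
from datetime import datetime

# Matches the %ACE-3-251014 message format quoted in the post.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{4} \d{2}:\d{2}:\d{2}) : %ACE-3-251014: "
    r"Could not probe server (?P<server>\S+) on port (?P<port>\d+)"
)

def group_bursts(lines, gap_seconds=60):
    """Group 251014 events into bursts separated by more than gap_seconds."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # Collapse repeated spaces so single-digit days parse cleanly.
        ts_text = " ".join(m.group("ts").split())
        ts = datetime.strptime(ts_text, "%b %d %Y %H:%M:%S")
        events.append((ts, m.group("server"), int(m.group("port"))))
    events.sort(key=lambda e: e[0])
    bursts = []
    for ev in events:
        if bursts and (ev[0] - bursts[-1][-1][0]).total_seconds() <= gap_seconds:
            bursts[-1].append(ev)
        else:
            bursts.append([ev])
    return bursts

# Hypothetical sample; the 10.0.0.x addresses are placeholders.
SUFFIX = " on port 443 for 4 consecutive tries - Internal error"
sample = [
    "Mar 4 2014 14:36:41 : %ACE-3-251014: Could not probe server 10.0.0.1" + SUFFIX,
    "Mar 4 2014 14:36:55 : %ACE-3-251014: Could not probe server 10.0.0.2" + SUFFIX,
    "Mar 4 2014 16:02:10 : %ACE-3-251014: Could not probe server 10.0.0.1" + SUFFIX,
]

for burst in group_bursts(sample):
    servers = {ev[1] for ev in burst}
    print(burst[0][0], "to", burst[-1][0], "-", len(servers), "rserver(s)")
```

If every burst covers all probed rservers at once, that points toward something global on the ACE rather than any one server.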
Any help or insight would be very appreciated. Thanks in advance.
-Ed

Hi Ed,
Two things:
Number of skipped probes. A skipped probe occurs when the ACE does not send out a probe because the scheduled interval between probes is shorter than the time it takes to complete a probe, that is, the send interval is shorter than the open timeout or the receive timeout.
In your case the interval is 10 seconds, which is a little aggressive but still longer than the receive timeout of 5 seconds. However, if a probe takes longer than 10 seconds to execute, you may see probes getting skipped. Increasing the interval by another 10 seconds would be a useful test to see whether this mitigates the issue.
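The timing comparison above can be sketched in Python. This is only an illustration: treating the worst-case execution time as open timeout plus receive timeout is an assumption, and the open timeout of 10 used below is a placeholder because the posted probe config does not show it.

```python
def skip_risk(interval, open_timeout, receive_timeout):
    """Return reasons why a probe with these timers might be skipped."""
    reasons = []
    if interval < open_timeout:
        reasons.append("interval is shorter than the open timeout")
    if interval < receive_timeout:
        reasons.append("interval is shorter than the receive timeout")
    # Assumption: a slow probe can take up to open + receive seconds.
    worst_case = open_timeout + receive_timeout
    if not reasons and interval < worst_case:
        reasons.append("interval is shorter than the worst-case execution time")
    return reasons

# interval 10 and receive 5 come from the posted config;
# open_timeout=10 is a hypothetical value.
print(skip_risk(interval=10, open_timeout=10, receive_timeout=5))
print(skip_risk(interval=20, open_timeout=10, receive_timeout=5))
```

Under these assumed values the 10-second interval is flagged and the 20-second interval is not, which is the reasoning behind testing with a longer interval.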
If you have UDP probes then you need to check this as well:
For UDP probes or UDP-based probes, we recommend a time interval value of 30 seconds. The reason for this recommendation is that the ACE data plane has a management connection limit of 100,000. Management connections are used by all probes as well as Telnet, SSH, SNMP, and other management applications. In addition, the ACE has a default timeout for UDP connections of 120 (ACE module) or 15 (ACE appliance) seconds. This means that the ACE does not remove a UDP connection until that timeout expires (two minutes on the module), even though the UDP probe itself has already closed. Using a time interval of less than 30 seconds may limit the number of UDP probes that can be configured to run without exceeding the management connection limit, which may result in skipped probes.
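The arithmetic behind that recommendation can be estimated roughly as follows. This Python sketch uses a back-of-envelope model that is an assumption, not a documented formula: each probe instance (one probe on one rserver) opens one UDP connection per interval, and each connection lingers for the full UDP timeout. It also ignores the other management traffic that shares the same limit.

```python
import math

MGMT_CONNECTION_LIMIT = 100_000  # limit quoted in the recommendation above

def lingering_connections(probe_instances, interval, udp_timeout):
    """Estimate UDP connections still held open in steady state."""
    per_instance = math.ceil(udp_timeout / interval)
    return probe_instances * per_instance

def max_probe_instances(interval, udp_timeout, limit=MGMT_CONNECTION_LIMIT):
    """Estimate how many probe instances fit under the connection limit."""
    return limit // math.ceil(udp_timeout / interval)

# 120 s is the module UDP timeout quoted above.
print(max_probe_instances(interval=30, udp_timeout=120))
print(max_probe_instances(interval=5, udp_timeout=120))
```

Under this model a shorter interval multiplies the number of lingering connections per probe, which is why an aggressive interval reduces how many UDP probes can run.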
Are you running any scripted probes?
It could also be a bug, but I would suggest increasing the probe interval and seeing how it goes.
You can also try debug hm errors/events/all etc. and see if you get any detailed output that can be sent to TAC for further investigation.
Regards,
Kanwal

Similar Messages

ACE 4710 HTTP Probes

Using the ACE 4710 for loadbalancing a Sharepoint site.
We currently have a HTTP probe setup to check the port 80 status of the rserver.
Is there any way to get the HTTP probe to check a DNS entry for each of the application sites? For instance, http://info and http://site are two different web sites running on the same IP. One site could have a problem while port 80 on the IP itself is still alive.
Thanks for any information.

Has anyone figured this out? I am trying to get health checks/probes set up in this same fashion. I have 2 servers, each with 1 IP, but many sites. I want to probe each site and ensure I get a 200 code. I also have to provide credentials to the site. If I open IE, I can log in to the site just fine with the credentials; however, an ActiveX control prompts to be installed. When I set this up on my ACE, I get an HTTP 401 Unauthorized error. I did a Wireshark capture while browsing and I see the 401, but it is followed by a 200. Do you think the problem is the ActiveX control wanting to be downloaded? Or is the issue that the probe acts on the first HTTP code it receives, the 401, rather than the 200 that follows? Below is my config (cleaned, of course).
probe http HTTP-80-OUR.DOMAIN.COM
interval 15
passdetect interval 60
credentials
request method get url http://our.domain.com/default.aspx
expect status 200 200
header Host header-value "our.domain.com"
open 1
rserver host SERVER-A
ip address X.X.X.47
inservice
rserver host SERVER-B
ip address X.X.X.48
inservice
serverfarm host FARM-AB
predictor leastconns
probe HTTP-80-OUR.DOMAIN.COM
rserver SERVER-A
    inservice
rserver SERVER-B
    inservice
ACE4710# show probe HTTP-80-OUR.DOMAIN.COM detail
probe       : HTTP-80-OUR.DOMAIN.COM
type        : HTTP
state       : ACTIVE
description :
   port      : 80      address     : 0.0.0.0         addr type : -
   interval : 15      pass intvl : 60              pass count : 3
   fail count: 3       recv timeout: 10
   http method      : GET
   http url         : http://our.domain.com
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 1
   expect regex     : -
   send data        : -
                ------------------ probe results ------------------
   associations ip-address      port porttype probes   failed   passed   health
   ------------ ---------------+-----+--------+--------+--------+--------+------
   serverfarm : OUR.DOMAIN.COM-10.25.4.12-L3-FARM
     real      : SERVER-A[0]
                X.X.X.47      80    DEFAULT 414      406      8        FAILED
   Socket state        : CLOSED
   No. Passed states   : 1         No. Failed states : 2
   No. Probes skipped : 0         Last status code : 401
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Received invalid status code
   Last probe time     : Wed Jun 2 17:44:18 2010
   Last fail time      : Wed Jun 2 13:37:04 2010
   Last active time    : Wed Jun 2 13:34:19 2010
     real      : SERVER-B[0]
                X.X.X.48      80    DEFAULT 414      406      8        FAILED
   Socket state        : CLOSED
   No. Passed states   : 1         No. Failed states : 2
   No. Probes skipped : 0         Last status code : 401
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Received invalid status code
   Last probe time     : Wed Jun 2 17:44:20 2010
   Last fail time      : Wed Jun 2 13:37:06 2010
   Last active time    : Wed Jun 2 13:34:21 2010

Issue with Scripted Probe for LDAP

I have the script LDAP_PROBE loaded into memory on my ACE 4710 (A4(2.0)), and the probe is configured for the LDAP port the servers are listening on. Here is the configuration:
probe scripted LDAP_PROBE_3389
port 3389
interval 5
passdetect interval 5
passdetect count 2
receive 5
script LDAP_PROBE 3389
I have tried removing the argument of 3389 at the bottom as well but I continue to get the result:
real      : LDAP02[3389]
                10.220.31.81    3389 PROBE    2491     2491     0        FAILED
   Socket state        : RESET
   No. Passed states   : 0         No. Failed states : 1
   No. Probes skipped : 0         Last status code : 30002
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Probe error: Server did not respond as expected
   Last probe time     : Thu Jul 12 16:24:41 2012
   Last fail time      : Thu Jul 12 12:56:59 2012
   Last active time    : Never
The server log states this was successful however...
Admin Acct Status: Not Locked
AuditV3--2012-07-11-14:18:21.428+00:00DST--V3 anonymous Bind--bindDN: <*CN=NULLDN*>--client: 10.220.31.217:56908--connectionID: 8--received: 2012-07-11-14:18:21.428+00:00DST--Success
name: <*CN=NULLDN*>
authenticationChoice: simple
Admin Acct Status: Not Locked
Am I missing an argument? I have run debug on LDAP but really don't know what I am looking at...

To update the script
==============
Extract the Cisco-supplied LDAP script from the tar.gz or zip file. Rename it to something unique. Update it to use the new length and offset.
Import the script into the LDAP contexts on both ACEs. Remember, scripts are not replicated and having mismatched scripts will cause replication to fail.
ACE1/ldap# copy tftp: disk0:
Enter source filename[]? UoN-LDAP_PROBE-iLDAP2
Enter the destination filename[]? [UoN-LDAP_PROBE-iLDAP2]
Address of remote host[]? [redacted]
Trying to connect to tftp server......
TFTP get operation was successful
ACE2/ldap# copy tftp: disk0:
Enter source filename[]? UoN-LDAP_PROBE-iLDAP2
Enter the destination filename[]? [UoN-LDAP_PROBE-iLDAP2]
Address of remote host[]? [redacted]
Trying to connect to tftp server......
TFTP get operation was successful
script file 13 UoN-LDAP_PROBE-iLDAP2
If you look at (for example) packet 651 in the capture in Wireshark, you'll see a successful bind response. You will need to tell Wireshark to decode the packet as LDAP.
The payload is:
30 84 00 00 00 10 02 01 01 61 84 00 00 00 07 0a 01 00 04 00 04 00
You need to have a basic understanding of ASN.1 and something called Basic Encoding Rules (BER), which comes down to TLV (tag, length, value) format structures.
The key to understanding this output is that there are three ways of specifying a length in BER. The first way, which we have already seen in the Cisco script, is to use a single byte. This is known as the short form and can be used for lengths of 127 bytes or less. In the second way, the high bit of the first length byte is set to one and the low seven bits give the number of bytes that follow; the length itself is then encoded in that many bytes. This is the "length of the length" (long) form. It looks like Microsoft Active Directory uses the long form for all length encoding. The third form (for completeness) is "indefinite", where the length is coded as x'80' and the end of the content is marked by x'0000'. Deconstructing the data:
0x30    The start of a universal constructed sequence
0x84    The length of the sequence in "length of the length" format. The next 4 bytes give the length.
0x00000010    sequence length of 16 bytes
0x02    Integer
0x01    The length of the next field (1 byte)
0x01    Value (this is the message ID which agrees with the ID in the BIND Request)
0x61    Application class, tag number 1, use RFC2251 to decode. This is a Bind Response
0x84    The length of the sequence in "length of the length" format. The next 4 bytes give the length.
0x00000007    bind response length of 7 bytes
0x0a    Enumeration
0x01    Length 1
0x00    0 - Success
0x04    String
0x00    Length 0 (null string)
0x04    String
0x00    Length 0 (null string)
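The breakdown above can be checked mechanically. The following is a minimal Python sketch of a BER TLV reader, not the Cisco Tcl script: it handles only single-byte tags plus the short and long length forms, and the function names are invented for this illustration.

```python
def read_tlv(data, pos=0):
    """Decode one BER TLV at data[pos]; return (tag, value, next_pos)."""
    tag = data[pos]
    first = data[pos + 1]
    pos += 2
    if first < 0x80:
        length = first                      # short form: one length byte
    elif first == 0x80:
        raise ValueError("indefinite length is not handled in this sketch")
    else:
        n = first & 0x7F                    # long form: next n bytes hold the length
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    return tag, data[pos:pos + length], pos + length

def parse_bind_response(payload):
    """Return (message_id, result_code) from an LDAP Bind Response."""
    tag, body, _ = read_tlv(payload)
    if tag != 0x30:
        raise ValueError("expected a SEQUENCE (0x30)")
    tag, msg_id, pos = read_tlv(body)
    if tag != 0x02:
        raise ValueError("expected an INTEGER message ID (0x02)")
    tag, bind, _ = read_tlv(body, pos)
    if tag != 0x61:
        raise ValueError("expected a Bind Response (0x61)")
    tag, result, _ = read_tlv(bind)
    if tag != 0x0A:
        raise ValueError("expected an ENUMERATED result code (0x0a)")
    return int.from_bytes(msg_id, "big"), int.from_bytes(result, "big")

# Payload bytes quoted from the capture above.
payload = bytes.fromhex(
    "30 84 00 00 00 10 02 01 01 61 84 00 00 00 07 0a 01 00 04 00 04 00"
)
message_id, result_code = parse_bind_response(payload)
print("message ID:", message_id, "result code:", result_code)
```

A result code of 0 is a successful bind, so a probe script that reports failure against this payload is most likely looking at the wrong offset rather than at a real error.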
The patch given takes in 20 bytes from the bitstream, converts them into a hexadecimal string and extracts the 6 hexadecimal characters from the 16th byte onwards (Tcl uses zero-based arrays). This is the response code.
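The slicing step can be illustrated in Python. This is a re-creation of the idea only, not the Tcl patch itself, and it assumes that "the 16th byte" means zero-based byte index 15, where the enumerated result field starts in the payload quoted earlier.

```python
def extract_result_field(response, take=20, byte_offset=15, hex_chars=6):
    """Hex-encode the first `take` bytes and slice out the result field."""
    hex_string = response[:take].hex()
    start = byte_offset * 2        # two hex characters per byte
    return hex_string[start:start + hex_chars]

# Payload bytes quoted from the capture in this thread.
payload = bytes.fromhex(
    "30 84 00 00 00 10 02 01 01 61 84 00 00 00 07 0a 01 00 04 00 04 00"
)
print(extract_result_field(payload))
```

For this payload the slice covers the enumerated tag (0a), its length (01) and the value (00). A fixed offset like this only works while the server keeps using 4-byte long-form lengths, which is why the unpatched script with its single-byte length assumption reads the wrong bytes.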
Kind Regards
Cathy

HTTP probe issue with expect regex string

Hello,
We have a simple cgi status page setup to poll a background service and return a "PASS" or "FAIL" as output. I've setup an HTTP probe to look for the "PASS" to determine application health. The issue appears to be that the expect regex is searching the HEADER but not the BODY of the web page. I can successfully match on any string in the header, but never on anything in the body.
Here is what the web page returns if you telnet to it:
HTTP/1.1 200 OK
Date: Thu, 22 Sep 2011 22:45:07 GMT
Server: Apache/2.0.59 HP-UX_Apache-based_Web_Server (Unix) DAV/2
Content-Length: 4
Connection: close
Content-Type: text/plain; charset=iso-8859-1
PASS
Here is my probe:
probe http JOE-TEST-CS
interval 45
passdetect interval 30
receive 30
request method get url /cgi-bin/ERMS-PREP-statusRepo.cgi
expect status 0 999
open 20
expect regex "PASS"
Here is the output of the show probe:
ACE1/euhr-test-ace2# sh probe JOE-TEST-CS detail
probe       : JOE-TEST-CS
type        : HTTP
state       : ACTIVE
description :
   port      : 80      address     : 0.0.0.0         addr type : -
   interval : 45      pass intvl : 30              pass count : 3
   fail count: 3       recv timeout: 30
   http method      : GET
   http url         : /cgi-bin/ERMS-PREP-statusRepo.cgi
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 20
   expect regex     : PASS
   send data        : -
                       --------------------- probe results --------------------
   probe association   probed-address probes     failed     passed     health
   ------------------- ---------------+----------+----------+----------+-------
   serverfarm : JOE-TEST-PROBE-CS
     real      : EUHRTDM50.APP[0]
                       192.168.73.71   2          2          0          FAILED
   Socket state        : CLOSED
   No. Passed states   : 0         No. Failed states : 1
   No. Probes skipped : 0         Last status code : 200
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : User defined Reg-Exp was not found in Host Response
   Last probe time     : Thu Sep 22 15:00:36 2011
   Last fail time      : Thu Sep 22 15:00:36 2011
   Last active time    : Thu Sep 22 09:40:19 2011
If I replace the expect regex "PASS" with anything from the HEADER it succeeds!
Any thoughts?
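Before blaming the probe, it can help to confirm offline where the pattern actually sits in a captured response and whether the header/body separator is present. Here is a minimal Python sketch, assuming the raw response has been saved as bytes; the sample rebuilds the response shown above with CRLF line endings and a blank line before the body, which is an assumption about what the server really sent.

```python
import re

def where_matches(raw_response, pattern):
    """Report whether pattern matches in the header block, the body, or both."""
    head, sep, body = raw_response.partition(b"\r\n\r\n")
    if not sep:
        # Fall back to bare LF line endings.
        head, sep, body = raw_response.partition(b"\n\n")
    rx = re.compile(pattern.encode("ascii"))
    return {
        "header": bool(rx.search(head)),
        "body": bool(rx.search(body)),
        "has_blank_line": bool(sep),
    }

# Sample rebuilt from the response above; the line endings are assumed.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Thu, 22 Sep 2011 22:45:07 GMT\r\n"
    b"Server: Apache/2.0.59 HP-UX_Apache-based_Web_Server (Unix) DAV/2\r\n"
    b"Content-Length: 4\r\n"
    b"Connection: close\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"\r\n"
    b"PASS"
)
print(where_matches(sample, "PASS"))
print(where_matches(sample, "Apache"))
```

This says nothing about how the ACE searches the response. It only rules out a malformed reply, for example a missing blank line, as the reason the body is never matched.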

Sorry, I missed it. The Content-Length in your response is 4. I think this may be the issue. I created a basic HTML page that says PASS in the body, and my server returns a Content-Length of 224 when I fetch the page. Here is my HTML request:
GET /index.html
http-equiv="Content-Type">
Probe
PASS
Here are my headers that I received:
(Status-Line)    HTTP/1.1 200 OK
Content-Length    224
Content-Type    text/html
Last-Modified    Tue, 27 Sep 2011 12:05:00 GMT
Accept-Ranges    bytes
Etag    "8cca60aed7dcc1:41f"
Server    Microsoft-IIS/6.0
Date    Tue, 27 Sep 2011 12:25:59 GMT
What version of code are you running on your ACE? I can also look to see if there are any known issues.
Kris

ACE - TCP probe goes into INVALID state

Hello,
I have a problem with the following configuration of a sticky serverfarm with a backup serverfarm
(this setup is of course used only for failover purposes, not load balancing):
probe tcp tcp-8888-probe
port 8888
interval 5
faildetect 2
passdetect interval 3
passdetect count 1
rserver host rsrv1
ip address 10.1.2.10
inservice
rserver host rsrv2
ip address 10.1.2.11
inservice
serverfarm host rfarm-primary
predictor leastconns
probe tcp-8888-probe
rserver rsrv1 8888
    inservice
serverfarm host rfarm-backup
predictor leastconns
probe tcp-8888-probe
rserver rsrv2 8888
   inservice
sticky http-cookie RFARM-COOKIE sticky-rfarm-1
cookie insert browser-expire
serverfarm rfarm-primary backup rfarm-backup
etc....
The problem is that every time the probe state changes (from SUCCESS to FAIL or vice versa), the tcp-8888-probe on the server whose service state changed goes into the INVALID state:
#show probe tcp-8888-probe detail
probe       : tcp-8888-probe
type        : TCP
state       : ACTIVE
description :
   port      : 8888    address     : 0.0.0.0         addr type : -
   interval : 5       pass intvl : 3               pass count : 1
   fail count: 2       recv timeout: 10
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 10
   expect regex     : -
   send data        : -
                       --------------------- probe results --------------------
   probe association   probed-address probes     failed     passed     health
   ------------------- ---------------+----------+----------+----------+-------
   serverfarm : rfarm-backup
     real      : rsrv2[8888]
                       10.1.2.11    291        0          291        SUCCESS
   Socket state        : CLOSED
   No. Passed states   : 1         No. Failed states : 0
   No. Probes skipped : 0         Last status code : 0
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : -
   Last probe time     : Thu Jun 17 22:12:31 2010
   Last fail time      : Never
   Last active time    : Thu Jun 17 21:48:21 2010
   serverfarm : rfarm-primary
     real      : rsrv1[8888]
                       10.1.2.10    0          0          0          INVALID
   Socket state        : CLOSED
   No. Passed states   : 0         No. Failed states : 0
   No. Probes skipped : 0         Last status code : 0
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : -
   Last probe time     : Never
   Last fail time      : Never
   Last active time    : Never
I have managed to get the probe into the FAIL state again for a moment by removing it from the serverfarm and then reapplying it, but within a few seconds it goes from FAIL to INVALID again and stays in this state regardless of the availability of the probed TCP port. Only when I reapply it while the port is available/up does it stay in the SUCCESS state and keep working until the service fails, at which point the INVALID state reappears.
What can be the cause of such behavior?
thanks,
WM

Hello,
It looks very similar to this bug: CSCsh74871
You may need to collect a #show tech-support and do the following:
-remove the serverfarm in question
-reboot the ace module under a maintenance window.
You may upgrade to a higher version since your version is kind of old.
Jorge

ACE Health probe for SIP

I've setup a SIP probe to check the health of a Microsoft OCS. The health of this server is always failed. What am I missing? I also tried it with a telnet probe on port 5061, but got the same result. A telnet from ACE to the server on port 5061 works fine.
See below a show probe SIP detail and the relevant configuration.
ACE21_Secondary/MOCS# sh probe SIP det
probe : SIP
type : SIP
state : ACTIVE
description :
port : 5061 address : 0.0.0.0 addr type : -
interval : 10 pass intvl : 10 pass count : 3
fail count: 3 recv timeout: 4
request-method : OPTIONS
conn termination : GRACEFUL
expect offset : 0 , open timeout : 2
expect regex : -
------------------ probe results ------------------
associations ip-address port porttype probes failed passed health
------------ ---------------+-----+--------+--------+--------+--------+------
rserver : OCS_11
10.105.11.70 5061 -- 7566 7566 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 0
No. Probes skipped : 0 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Server reply timeout (no reply)
Last probe time : Thu Oct 30 14:18:42 2008
Last fail time : Tue Oct 28 16:31:30 2008
Last active time : Never
ACE21_Secondary/MOCS# sh run
probe sip tcp SIP
port 5061
interval 10
passdetect interval 10
receive 4
expect status 200 200
open 2
rserver host OCS_11
ip address 10.105.11.70
probe SSL
probe PING
probe SIP
probe SIP_TELNET
inservice
Cheers
Peter

Peter,
make sure to NOT run version A2(1.1a) as SIP probes are broken in that specific release.
If your version is something else, get a sniffer trace on the server to see what is going on.
Seems like we don't get a reply according to the line :
"Last disconnect err : Server reply timeout (no reply) "
Gilles.

ACE: Problem configuring probe snmp

Hi,
I have a problem when I configure an SNMP probe. My servers are dual-core W2K3 machines with SNMP community public, and the CPU OID is .1.3.6.1.2.1.25.3.3.1.2. The configuration and output are:
access-list anyone line 8 extended permit ip any any
probe snmp was
interval 4
faildetect 2
passdetect interval 10
receive 2
community public
oid .1.3.6.1.2.1.25.3.3.1.2
threshold 70
rserver host was1
ip address 10.24.8.200
probe was
inservice
rserver host was2
ip address 10.24.8.201
probe was
inservice
serverfarm host servers
rserver was1
inservice
rserver was2
inservice
class-map type management match-any ADM-CONTEX-SERV1
4 match protocol icmp any
5 match protocol snmp any
class-map type http loadbalance match-all Check-Headers
2 match http url .*
3 match http header Host header-value "10.24.16.*"
4 match http header User-Agent header-value ".*MSIE.*"
class-map match-all VIP-10-HTTP
2 match virtual-address 10.24.16.10 tcp eq www
class-map type http loadbalance match-all other-HTTP
2 match http url .*
policy-map type management first-match ADM-CTX-SERV1
class ADM-CONTEX-SERV1
permit
policy-map type loadbalance first-match L7-logic
class Check-Headers
serverfarm servers
class other-HTTP
serverfarm servers
policy-map type loadbalance first-match lb-logic
class class-default
serverfarm servers
policy-map multi-match client-vips
class VIP-10-HTTP
loadbalance vip inservice
loadbalance policy L7-logic
loadbalance vip icmp-reply active
interface vlan 60
ip address 10.24.8.5 255.255.255.0
access-group input anyone
access-group output anyone
service-policy input ADM-CTX-SERV1
no shutdown
interface vlan 233
ip address 10.24.16.5 255.255.255.0
access-group input anyone
access-group output anyone
service-policy input ADM-CTX-SERV1
service-policy input client-vips
no shutdown
ip route 0.0.0.0 0.0.0.0 10.24.16.1
sh probe was detail
probe : was
type : SNMP
state : ACTIVE
description :
port : 161 address : 0.0.0.0 addr type : TRANSPARENT
interval : 4 pass intvl : 10 pass count : 3
fail count: 2 recv timeout: 2
version : 1 community : public
oid string #1 : .1.3.6.1.2.1.25.3.3.1.2
type : PERCENTILE max value : 100
weight : 16000 threshold : 70
--------------------- probe results --------------------
probe association probed-address probes failed passed health
------------------- ---------------+----------+----------+----------+-------
rserver : was1
10.24.8.201 13 13 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Server reply - bad SNMP OID
Last probe time : Tue Feb 24 23:22:41 2009
Last fail time : Tue Feb 24 23:20:47 2009
Last active time : Never
Server load : 16000
rserver : was2
10.24.8.200 12 12 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Server reply timeout (no reply)
Last probe time : Tue Feb 24 23:22:34 2009
Last fail time : Tue Feb 24 23:20:52 2009
Last active time : Never
Server load : 16000

Hi,
For a multicore processor you need to make a few changes to get the load on each core/processor. You need to have an instance for each core.
Try adding .1 or .2 to the OID to get the load on each core.
Also try doing an snmpwalk on the OID to see what the real structure is.
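As a rough illustration of the snmpwalk step, the Python sketch below pulls the per-core instance OIDs out of numeric walk output so they can be used in the probe. The walk output in the sample is made up and the instance numbers (.2 and .3) are placeholders; real instance indexes depend on the server.

```python
BASE_OID = ".1.3.6.1.2.1.25.3.3.1.2"  # OID from the probe config in this thread

def instance_oids(walk_output, base=BASE_OID):
    """Return the full instance OIDs found under base in numeric walk output."""
    oids = []
    for line in walk_output.splitlines():
        oid = line.split("=", 1)[0].strip()
        if oid.startswith(base + "."):
            oids.append(oid)
    return oids

# Hypothetical walk output, one line per core.
sample_walk = """\
.1.3.6.1.2.1.25.3.3.1.2.2 = INTEGER: 7
.1.3.6.1.2.1.25.3.3.1.2.3 = INTEGER: 12
"""
for oid in instance_oids(sample_walk):
    print("oid", oid)
```

Each OID returned would then be a candidate for its own oid line in the probe, instead of the bare table column OID that the server rejects.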
HTH
Cathy

ACE: probe failing

Hi,
I have the following probe configured:
probe http probe1.test.com:10114
port 10114
interval 34
faildetect 17
passdetect interval 60
expect status 200 200
header Host header-value "hcmfincrp1.test.com"
open 1
It is applied to the serverfarm, but the health check is failing. I see the following when I do "sh probe probe1.test.com:10114 detail":
sh probe probe1.test.com:10114 deta
probe       : probe1.test.com:10114
type        : HTTP
state       : ACTIVE
description :
   port      : 10114   address     : 0.0.0.0         addr type : -
   interval : 34      pass intvl : 60              pass count : 3
   fail count: 17      recv timeout: 10
   http method      : GET
   http url         : /
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 1
   expect regex     : -
   send data        : -
                ------------------ probe results ------------------
   associations ip-address      port porttype probes   failed   passed   health
   ------------ ---------------+-----+--------+--------+--------+--------+------
   serverfarm : probe1.test.com:443
     real      : server1.test.com[10114]
                192.168.1.1     10114 PROBE    41531    19556    21975    FAILED
   Socket state        : CLOSED
   No. Passed states   : 5         No. Failed states : 6
   No. Probes skipped : 0         Last status code : 0
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Unrecognized or invalid response
   Last probe time     : Wed Oct 12 17:43:30 2011
   Last fail time      : Tue Oct 11 02:33:52 2011
   Last active time    : Sun Oct 9 20:24:02 2011
May I know why the health check is failing? Why am I seeing the message "Last disconnect err : Unrecognized or invalid response"?

Hi ,
This error means that the ACE is not receiving a 200 OK response from the server. This happens when the server is not responding, when it does not serve a site for the Host header value hcmfincrp1.test.com that you have defined, or when the page has been modified. Please check that your HTTP server is working fine.
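One way to check the server independently of the ACE is to send a similar request by hand and look at the first line of the reply. The following Python sketch is illustrative only: the exact request the ACE sends (HTTP version, extra headers) is an assumption, and the function names are made up for this example.

```python
import socket

def build_request(url, host):
    """Build a GET request with an explicit Host header."""
    return (
        f"GET {url} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

def parse_status(raw):
    """Return the numeric status code, or None if there is no HTTP status line."""
    line = raw.split(b"\r\n", 1)[0]
    parts = line.split(None, 2)
    if len(parts) >= 2 and parts[0].startswith(b"HTTP/") and parts[1].isdigit():
        return int(parts[1])
    return None

def probe_once(ip, port, url, host, timeout=10.0):
    """Send one request to ip:port and return the parsed status (or None)."""
    with socket.create_connection((ip, port), timeout=timeout) as conn:
        conn.sendall(build_request(url, host))
        raw = conn.recv(4096)
    return parse_status(raw)

# Usage against a real server (not run here):
#   probe_once("192.0.2.10", 10114, "/", "hcmfincrp1.test.com")
print(parse_status(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))
```

A None result means the reply had no parsable HTTP status line at all, which would be consistent with the "Last status code : 0" shown in the probe output, for example if the port answers with something other than plain HTTP.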
Regards
Abijith

Ace HTTP Probe expect regex

Hi,
I have a question about the config of the ACE probe.
I have the following probe defined :
probe http P_HTTP_TEST
interval 5
passdetect interval 2
passdetect count 2
request method get url /test
expect status 200 200
expect regex trululu
I would like to use the regex just like the expect string on the CSM probe...
The regex doesn't seem to work: the probe passes even though the string trululu is not on the page tested.
I guess the expect status overrides the regex, but without the expect status it doesn't work either.
Does anyone know exactly how the probe expect works for HTTP?
Another question: on the CSM module, the TCP probe by default uses the real server's port for the probe, not the default port of the probe type. Is it possible to change the ACE so that it mimics the CSM way of working?
Thanks a lot ;-)

This seems to be a bug in some versions of the ACE software, where the HTTP return code overrides a missing regex match. The bug is certainly present in:
system:    Version A2(2.0) [build 3.0(0)A2(2.0)]
Notice the difference between 192.168.1.1 (whose HTTP response does not contain the regex) and 192.168.1.2 (whose HTTP response does contain it). Both are successful. In addition, 192.168.1.1 (missing regex) shows a last status code of 200, which seems to be sufficient for the probe to pass, while 192.168.1.2 (which sends the expected regex) doesn't show a last status code.
probe       : tw2_http_81
type        : HTTP
state       : ACTIVE
description :
   port      : 81      address     : 0.0.0.0         addr type : -
   interval : 30      pass intvl : 30              pass count : 1
   fail count: 1       recv timeout: 10
   http method      : GET
   http url         : /knowtw2-f/livelink.exe?func=ll&objtype=142&bypass
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 10
   expect regex     : lbmonitor
   send data        : -
                       --------------------- probe results --------------------
   probe association   probed-address probes     failed     passed     health
   ------------------- ---------------+----------+----------+----------+-------
     real      : 192.168.1.1[81]
                       192.168.1.1    2          0          2          SUCCESS
   Socket state        : CLOSED
   No. Passed states   : 1         No. Failed states : 0
   No. Probes skipped : 0         Last status code : 200
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : -
   Last probe time     : Mon Nov 7 12:38:42 2011
   Last fail time      : Never
   Last active time    : Mon Nov 7 12:38:22 2011
     real      : 192.168.1.2[81]
                       192.168.1.2    2          0          2          SUCCESS
   Socket state        : CLOSED
   No. Passed states   : 1         No. Failed states : 0
   No. Probes skipped : 0         Last status code : 0
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : -
   Last probe time     : Mon Nov 7 12:38:27 2011
   Last fail time      : Never
   Last active time    : Mon Nov 7 12:37:58 2011

HTTP probe in ACE

We have a simple layer 3-4 port 80 app that is being load balanced by the ACE, and we created an HTTP probe that actually acts more like a TCP probe, since we took the default on just about all the attributes:
probe http WEB_SERVERS
expect status 200 200
Unfortunately, when we activated this probe, we saw the following:
probe : WEB_SERVERS
type : HTTP
state : ACTIVE
description :
port : 80 address : 0.0.0.0 addr type : -
interval : 120 pass intvl : 300 pass count : 3
fail count: 3 recv timeout: 10
http method : GET
http url : /
conn termination : GRACEFUL
expect offset : 0 , open timeout : 10
expect regex : -
send data : -
--------------------- probe results --------------------
probe association probed-address probes failed passed health
------------------- ---------------+----------+----------+----------+-------
real : Planview_136.39[0]
167.238.136.39 1 1 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 302
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Received invalid status code
Last probe time : Wed Jul 22 15:07:20 2009
Last fail time : Wed Jul 22 15:07:21 2009
Last active time : Never
real : Planview_136.40[0]
167.238.136.40 1 1 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 302
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Received invalid status code
Last probe time : Wed Jul 22 15:07:20 2009
Last fail time : Wed Jul 22 15:07:21 2009
Last active time : Never
The obvious culprit here is the return code. How do we assign the correct return code here?
Thanks...

Hi,
I wouldn't just let it default. It is better to probe for a particular page if that is possible. If this is a page you create, then it offers the possibility of being able to take a server out of rotation simply by renaming the page. E.g.
probe http PROBE-iamhere
interval 30
passdetect interval 10
request method head url /serverhere.html
expect status 200 200
Alternatively, since it looks like you are getting a 302 response code (a redirect), you could just change the line in the probe to expect that:
probe http WEB_SERVERS
expect status 302 302
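The two numbers after expect status are a minimum and a maximum, so the check amounts to a range test. Here is a minimal Python sketch of that logic; the function name is made up, and accepting several ranges at once is an assumption about how multiple expect status lines combine.

```python
def status_passes(code, ranges):
    """Return True if code falls inside any (minimum, maximum) range."""
    return any(low <= code <= high for low, high in ranges)

print(status_passes(302, [(200, 200)]))   # probe as originally configured
print(status_passes(302, [(302, 302)]))   # probe changed to expect the redirect
print(status_passes(302, [(200, 200), (302, 302)]))
```

With the range 200 200, a 302 falls outside the range, which matches the "Received invalid status code" error in the output above.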
HTH
Cathy

ACE ping probe

Hi,
I have a strange problem on my ACE in one-arm design.
I have a real server which I can ping from the ACE, but a ping probe always fails:
server : APACHE4
10.144.131.6 28 28 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 4 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Server reply timeout (no reply)
Last probe time : Sat Dec 9 11:42:57 2006
Last fail time : Sat Dec 9 11:29:57 2006
Last active time : Never
ace/INTRANET# ping 10.144.131.6
Pinging 10.144.131.6 with timeout = 2, count = 5, size = 100 ....
Response from 10.144.131.6 : seq 1 time 0.335 ms
Response from 10.144.131.6 : seq 2 time 0.181 ms
Response from 10.144.131.6 : seq 3 time 0.340 ms
Response from 10.144.131.6 : seq 4 time 0.266 ms
Response from 10.144.131.6 : seq 5 time 0.341 ms
5 packet sent, 5 responses received, 0% packet loss
I have a couple of other real servers which do not have this problem.
Any ideas?
According to netflow on the 6500 the server answers correctly.
There are no syslog messages.
interface vlan 552
ip address 10.144.130.3 255.255.255.0
alias 10.144.130.1 255.255.255.0
peer ip address 10.144.130.2 255.255.255.0
no normalization
no icmp-guard
access-group input PERMIT
service-policy input MANAGEMENT
service-policy input SLB
no shutdown
probe icmp PING
interval 2
faildetect 5
passdetect interval 30
passdetect count 2
rserver host APACHE1
ip address 10.144.131.131
probe PING
inservice
rserver host APACHE2
ip address 10.144.131.132
probe PING
inservice
rserver host APACHE3
ip address 10.144.131.133
probe PING
inservice
rserver host APACHE4
ip address 10.144.131.6
probe TEST
probe PING
inservice
probe tcp TEST
port 22
interval 2
faildetect 5
passdetect interval 30
passdetect count 2
ace/INTRANET# sh probe
probe : PING
type : ICMP, state : ACTIVE
port : 0 address : 0.0.0.0 addr type : -
interval : 2 pass intvl : 30 pass count : 2
fail count: 5 recv timeout: 10
--------------------- probe results --------------------
probe association probed-address probes failed passed health
------------------- ---------------+----------+----------+----------+-------
rserver : APACHE1
10.144.131.131 2312 0 2312 SUCCESS
rserver : APACHE2
10.144.131.132 2311 0 2311 SUCCESS
rserver : APACHE3
10.144.131.133 2311 0 2311 SUCCESS
rserver : APACHE4
10.144.131.6 38 38 0 FAILED
rserver : IIS1
10.144.131.129 2311 0 2311 SUCCESS
rserver : IIS2
10.144.131.130 2311 0 2311 SUCCESS
probe : TEST
type : TCP, state : ACTIVE
port : 22 address : 0.0.0.0 addr type : -
interval : 2 pass intvl : 30 pass count : 2
fail count: 5 recv timeout: 10
--------------------- probe results --------------------
probe association probed-address probes failed passed health
------------------- ---------------+----------+----------+----------+-------
rserver : APACHE4
10.144.131.6 557 0 557 SUCCESS
I have 3.0(0)A1(3b)

Hi,
unfortunately your URL did not help me.
I found out that the sup720-3b adds 23 bytes of zero padding to exactly the frames that correspond to the failing ping probe. I saw this by spanning the internal te4/1 port from the switch to the ACE to a sniffer.
The strange thing is that the frame is padded although it's larger than the minimum frame size of 64 bytes.
When I configure a log-input ACL on the sup720-3b to force the traffic to be routed by the MSFC3 instead of the PFC3 then the ping probe works and the same frames are not padded any more!!
We run IOS modularity on the sups and according to the 12.2SX release notes they do not support the ACE. I suppose that's the root cause. We will change the sup sw ASAP.

ACE Module - HTTP Probe failure

Hi,
I have configured the http probe with expect status 200 202, but the probe fails despite availability of the port on rserver.
I tried the head/get methods to see the return code, and it came back with HTTP/1.1 302. How can I configure an http probe to treat an HTTP 302 code as a successful return?
Thanks.

I changed the expect status value as below
probe http TEST-HTTP
interval 30
passdetect interval 10
request method head
expect status 302 302
The probe is still failing with the log message
Apr 20 2009 12:04:35 : %ACE-3-251010: Health probe failed for server 192.168.1.10 on port 80, received invalid status code
On 'show probe detail' it shows the last status code as 400 which means Bad Request
--------------------- probe results --------------------
probe association probed-address probes failed passed health
------------------- ---------------+----------+----------+----------+-------
serverfarm : TEST-APP
real : TEST-SERVER1[80]
192.168.1.10 27 27 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 400
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Received invalid status code
Last probe time : Mon Apr 20 12:05:33 2009
Last fail time : Mon Apr 20 12:00:53 2009
Last active time : Never
The http page is showing perfectly on the web browser. Also, using the http head/get tool, I can see that 302 is returned.
What could be the problem?
Regards.
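A 400 (Bad Request) in reply to a bare HEAD request often means the server rejected the request itself rather than the page, for example because it expects a Host header. Below is a minimal sketch of a probe that names the URL and sends a Host header. It assumes that the site answers to www.example.com (a placeholder, not a value from this thread) and that / is the page returning the 302:
probe http TEST-HTTP
interval 30
passdetect interval 10
request method head url /
header Host header-value "www.example.com"
expect status 302 302
If the last status code stays at 400, a capture on the server would show exactly what the probe is sending.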

Cisco ACE probe setup

I configured a probe to check the health of a server web page, but I am getting a status code of 400.
probe http PROBE_80
interval 10
faildetect 2
passdetect interval 10
passdetect count 2
receive 5
request method get url http://<host>:<port>/eml/HealthCheckServlet
expect status 200 202
open 10
I am getting the status code below. I would like to know the correct format of the request method command for the above URL.
     real      : app02p[0]
                         192.168.10.6 80 VIP     161    161    0      FAILED
   Socket state        : CLOSED
   No. Passed states   : 0         No. Failed states : 1
   No. Probes skipped : 0         Last status code : 400
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Received invalid status code
   Last probe time     : Tue Mar 17 02:53:58 2015
   Last fail time      : Tue Mar 17 02:27:15 2015
   Last active time    : Never

Hi Hari,
Does this URL return status 200 when you send the request directly from your browser?
You should use the exact URL here. If the URL is fine, then check with your server team why the server is responding with 400. The syntax looks fine. You can also take a pcap on the server and see what the ACE is sending for the probe.
Regards,
Kanwal
Note: Please mark answers if they are helpful.
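If the server keeps answering 400, one thing that may be worth trying is to give the probe only the path and to carry the host name in a separate Host header. This is a sketch under assumptions, not a confirmed fix: app.example.com is a placeholder for the <host> value elided above, and the probe is left on its default port 80, which matches the port shown in the probe output. If the servlet listens on a different port, a port command in the probe would be needed as well.
probe http PROBE_80
interval 10
faildetect 2
passdetect interval 10
passdetect count 2
receive 5
request method get url /eml/HealthCheckServlet
header Host header-value "app.example.com"
expect status 200 202
open 10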

HTTP GET Probe Monitoring

I am trying to monitor our web servers from our load balancer with an HTTP probe. This probe keeps failing. It's monitoring a Windows SharePoint server, and I can get to the test page with my credentials, but the probe seemingly can't pull it. Is there something in here I am doing wrong? I have attached a screen shot of the probe for reference. I keep getting probe failed. I've tried a lot of different permutations of this probe config with no success. Any help from anyone who has done this before would be awesome.

ACE-4710-DR/Admin# sh probe HTTP-GET detail
probe       : HTTP-GET
type        : HTTP
state       : ACTIVE
description : Test for I-am-alive.html
   port      : 80      address     : 0.0.0.0         addr type : -
   interval : 15      pass intvl : 60              pass count : 3
   fail count: 3       recv timeout: 10
   http method      : GET
   http url         : http://aspenintranet/PSC/Pages/I-am-alive.html
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 1
   regex cache-len : 0
   expect regex     : -
   send data        : -
                ------------------ probe results ------------------
   associations ip-address      port porttype probes   failed   passed   health
   ------------ ---------------+-----+--------+--------+--------+--------+------
   rserver     : 10.22.5.100
                10.22.5.100     80    --       2970     2970     0        FAILED
   Socket state        : CLOSED
   No. Passed states   : 0         No. Failed states : 1
   No. Probes skipped : 0         Last status code : 401
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Received invalid status code
   Last probe time     : Fri May 23 11:33:29 2014
   Last fail time      : Wed May 21 10:04:45 2014
   Last active time    : Never
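The last status code of 401 means the server is asking for authentication, which the probe does not supply. There are two possible directions, and both are sketches resting on assumptions rather than a tested SharePoint configuration. The first treats the 401 challenge itself as proof that the web service is answering. The second sends credentials, which only helps if the site accepts HTTP basic authentication (an NTLM-only site would still reject them). The user name and password are placeholders, and splitting the URL into a path plus a Host header is an assumption about what the server expects.
probe http HTTP-GET
interval 15
passdetect interval 60
passdetect count 3
request method get url /PSC/Pages/I-am-alive.html
header Host header-value "aspenintranet"
expect status 401 401
Or, with basic authentication:
probe http HTTP-GET
interval 15
passdetect interval 60
passdetect count 3
credentials probeuser probepassword
request method get url /PSC/Pages/I-am-alive.html
header Host header-value "aspenintranet"
expect status 200 200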

Probe DNS

Dear
I have a DNS probe, but for some reason, even though the DNS service is up, the probe shows it as down. I tried configuring the domain and expect commands, but the results are the same. The sequence is as follows:
a) The first time, the probe detects the service as up.
b) The service goes down and the probe detects the failure. The rserver is marked down.
c) The service is brought back up, but the probe never detects that it is up again.
See the following output:
ACE4710-1/IIS# show probe DNS detail
probe : DNS
type : DNS
state : ACTIVE
description : "test de DNS"
port : 53 address : 0.0.0.0 addr type : -
interval : 30 pass intvl : 300 pass count : 3
fail count: 3 recv timeout: 10
dns domain : www.cisco.com
--------------------- probe results --------------------
probe association probed-address probes failed passed health
------------------- ---------------+----------+----------+----------+-------
rserver : DNS1-G
10.1.5.20 17 5 12 FAILED
Socket state : CLOSED
No. Passed states : 1 No. Failed states : 1
No. Probes skipped : 0 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Connection refused by server
Last probe time : Wed Feb 11 20:17:12 2009
Last fail time : Wed Feb 11 20:16:42 2009
Last active time : Wed Feb 11 19:57:12 2009
rserver : DNS1-N
10.1.5.12 29 5 24 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Connection refused by server
Last probe time : Wed Feb 11 20:21:17 2009
Last fail time : Wed Feb 11 20:20:47 2009
Last active time : Tue Feb 10 22:03:17 2009
admin dns DNS
DOMAIN WWW.CISCO.COM
expect 198.133.219.25
Best Regards

It does respond to your PC, but not to the ACE.
Or the response never makes it to the ACE.
Either because of a routing issue.
Or because it is dropped by an ACL.
It could even be an ACL on the ACE itself.
Again, take a sniffer trace to confirm that the response makes it to the ACE.
G.
