IP SLA Probe Failing

Hi,
I have a router configured with IP SLA ICMP echo probes to track some IP addresses in order to make routing decisions.
However, although I can ping the IP addresses just fine from the router with the correct source interface, their corresponding IP SLA probes sometimes fail. At first everything usually seems to work fine, then after some time the probes for some IP addresses fail and stay that way. Usually the only way I can fix this is to track new IP addresses.
My router is a C2811 with Advanced IP Services 12.4(22)T.
Has anyone faced a similar problem before?
Thanks,

What error do you see when you run "show ip sla stat" for these collectors? When they are failing can you manually ping the same addresses from the device? If you enable "debug ip sla error" and "debug ip sla trace" what messages do you see?

Similar Messages

ACE: probe failing

Hi,
I have the following probe configured:
probe http probe1.test.com:10114
port 10114
interval 34
faildetect 17
passdetect interval 60
expect status 200 200
header Host header-value "hcmfincrp1.test.com"
open 1
and it is applied to the serverfarm, but the health check is failing. I see the following when I do "sh probe probe1.test.com:10114 detail":
sh probe probe1.test.com:10114 deta
probe       : probe1.test.com:10114
type        : HTTP
state       : ACTIVE
description :
   port      : 10114   address     : 0.0.0.0         addr type : -
   interval : 34      pass intvl : 60              pass count : 3
   fail count: 17      recv timeout: 10
   http method      : GET
   http url         : /
   conn termination : GRACEFUL
   expect offset    : 0         , open timeout     : 1
   expect regex     : -
   send data        : -
                ------------------ probe results ------------------
   associations ip-address      port porttype probes   failed   passed   health
   ------------ ---------------+-----+--------+--------+--------+--------+------
   serverfarm : probe1.test.com:443
     real      : server1.test.com[10114]
                192.168.1.1     10114 PROBE    41531    19556    21975    FAILED
   Socket state        : CLOSED
   No. Passed states   : 5         No. Failed states : 6
   No. Probes skipped : 0         Last status code : 0
   No. Out of Sockets : 0         No. Internal error: 0
   Last disconnect err : Unrecognized or invalid response
   Last probe time     : Wed Oct 12 17:43:30 2011
   Last fail time      : Tue Oct 11 02:33:52 2011
   Last active time    : Sun Oct 9 20:24:02 2011
May I know why the health check is failing? Why am I seeing the message "Last disconnect err : Unrecognized or invalid response"?

Hi ,
This error means that the ACE is not receiving a 200 OK response from the server. This happens when the server is not responding, when it does not answer requests carrying the Host header value hcmfincrp1.test.com that you have defined, or when the page has been modified. Please check that your HTTP server is working fine.
Regards
Abijith
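
To check the server side independently of the ACE, the request can be reproduced from a test host. The following is a minimal Python sketch, assuming the probe amounts to a plain GET / with the configured Host header and an expected status range of 200 to 200, as in the probe configuration above. The function names are hypothetical, and this is not a description of how the ACE itself is implemented.

```python
import http.client


def status_matches(status, low=200, high=200):
    """True when the status code lies inside the expected range
    (the equivalent of 'expect status low high')."""
    return low <= status <= high


def run_probe(server_ip, port, host_header, url="/", timeout=10):
    """Send one GET with an explicit Host header.

    Returns (status, passed). Raises an OSError subclass when the
    server cannot be reached or does not answer within the timeout.
    """
    conn = http.client.HTTPConnection(server_ip, port, timeout=timeout)
    try:
        # Passing Host explicitly stops http.client from adding its own.
        conn.request("GET", url, headers={"Host": host_header})
        status = conn.getresponse().status
    finally:
        conn.close()
    return status, status_matches(status)
```

Calling run_probe with the real server's address, port 10114 and the Host value "hcmfincrp1.test.com" shows which status code the server actually returns for that Host header.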

Probe fail on Standby ACE in One-armed mode

Hi there
I'm Kilsoo.
I set up one-armed mode using the ACE.
The real servers are in a VLAN away from the ACE.
So I configured PBR with the ACE alias IP address as the next hop on the real servers' gateway interface.
The probe from the active ACE works well,
but the probe from the standby ACE fails.
At this point, my first question:
Is it normal for the probe to fail from the standby ACE?
So, as a temporary solution, I made the route-map for PBR as below.
route-map PBR deny 5
match ip address Probe_ACL
route-map PBR permit 10
match ip address L4_ACL
set ip next-hop <Alias IP address>
ip access-list extended Probe_ACL
permit ip any <Standby ACE's IP address>
ip access-list extended L4_ACL
permit tcp <Real server's IP address> eq 80 any
Second question:
Do you have any other good solutions?
Thanks

Hi Cesar
Thanks for your reply.
But I think I was confused when I wrote the message.
I used both ACEs' VLAN IP addresses as the next-hop IP address, as you advised.
Do you know whether the standby ACE is unable to run probes without a route-map in one-armed mode, as in the diagram below?
Backbone Router
         |
         |
         |
Supervisor --------------------ACE(vserver: 172.19.100.100)
         |         (vlan 200)
         |
         |
         |(vlan 110)
         |
         |
Real servers
(172.19.110.111)

ACE Module: Recover a real server probe-failed status

How does the ACE module recover a real server that has entered the probe-failed state? We are doing some testing, purposely dropping a server's interface. The ACE recognizes the server as being down and shows it in a probe-failed state. When we bring the system's interface back up, will the ACE see this and automatically bring the state back to Operational, or does someone have to do something on the ACE module?

ACE continues to probe servers that are down or probe_failed. As soon as a server starts responding again its state will switch to alive again.
Nothing to be done.
Gilles.
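
The recovery behaviour described above can be pictured as a small state machine driven by the "fail count" and "pass count" values that show probe reports. The sketch below is a simplified Python model under the assumption that both counters refer to consecutive probe results. The class and state names are hypothetical, and the real ACE logic may differ in detail.

```python
class ProbeTracker:
    """Tracks consecutive probe results for one real server (simplified model)."""

    def __init__(self, fail_count=3, pass_count=3):
        self.fail_count = fail_count  # consecutive failures before PROBE-FAILED
        self.pass_count = pass_count  # consecutive successes before recovery
        self.state = "OPERATIONAL"
        self._streak = 0              # results in a row that contradict the state

    def record(self, success):
        """Feed one probe result and return the resulting state."""
        if self.state == "OPERATIONAL":
            self._streak = 0 if success else self._streak + 1
            if self._streak >= self.fail_count:
                self.state, self._streak = "PROBE-FAILED", 0
        else:
            # Probing continues while the server is down.
            self._streak = self._streak + 1 if success else 0
            if self._streak >= self.pass_count:
                self.state, self._streak = "OPERATIONAL", 0
        return self.state
```

In this model nothing has to be done by hand: once the server answers pass_count probes in a row, the state returns to OPERATIONAL on its own.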

ACE : PROBE-FAILED and Syslog messages

Hi,
When a real server is in PROBE-FAILED status, I observe a syslog message at each attempt of the probe. This fills our syslog server. Is there a way to configure the ACE so that a syslog message is generated only when a transition occurs in the probe status?
Thank you for any hints,
Yves

Hello,
You can use the "logging trap" command and
the "logging message level" command
to achieve what you are looking for.
The "logging trap" command limits the logging messages sent to a syslog server based on severity.
If it is set to "5 - notification", all messages with a severity level of 5 or a lower number are sent to the syslog server.
You can disable a specific syslog message or change its severity level using
the "logging message level" command.
I am not sure what kind of probe you are using, but if it is an ICMP probe and
the reason for the probe failure is ARP, it generates a message for every try,
as below, with a severity level of 3 by default.
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-5-441002: Serverfarm (SF) is now back in service in policy_map (fs) -->
class_map (#class_default_slb). Number of failovers = 0, number of times back in service = 0
%ACE-4-442007: VIP in class: 'VIP' changed state from OUTOFSERVICE to INSERVICE
%ACE-5-441002: Serverfarm (SF) is now back in service in policy_map (fs) -->
class_map (#class_default_slb). Number of failovers = 0, number of times back in service = 0
%ACE-4-442004: Health probe ICMP detected rserver r1 (interface vlan31) changed state to UP
%ACE-4-442001: Health probe ICMP detected r1 (interface vlan31) in serverfarm SF changed state to UP
If your "logging trap" is set to "5 - notification" and you do not want
the message "%ACE-3-251009:xxx" to be sent to the syslog server,
you can change its severity level as below.
switch/Admin(config)# logging message 251009 level 6
switch/Admin(config)# do show logging message 251009
Message logging:
message 251009: current-level 6 default-level 3 (enabled)
You can check the ID of the message that is filling the syslog server
and change its severity level to a higher number than the "logging trap" level.
Regards,
Kimihito.
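
The filtering rule described above (a message is forwarded when its severity number is less than or equal to the trap level) can be sketched in a few lines of Python. The default levels are taken from the sample messages in this post; the function and dictionary names are hypothetical and only model the rule, not the ACE implementation.

```python
# Default severity levels as they appear in the sample messages above.
DEFAULT_LEVELS = {251009: 3, 441002: 5, 442007: 4}


def is_sent_to_syslog(message_id, trap_level, overrides=None):
    """True when the message's severity number is <= the trap level.

    'overrides' models per-message level changes, such as setting
    message 251009 to level 6.
    """
    level = (overrides or {}).get(message_id, DEFAULT_LEVELS[message_id])
    return level <= trap_level
```

With a trap level of 5, message 251009 is forwarded at its default level 3 but suppressed once its level is changed to 6, while the state-transition messages 441002 and 442007 still reach the syslog server.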

Managedavailability error about inbound proxy probe failing

I am getting a Managed Availability error about the inbound proxy probe failing.
What does the inbound proxy probe do, and why does this error reference Forefront when I do not have MS Forefront installed?
Below are the details:
The inbound proxy probe failed 3 times over 15 minutes.
The inbound proxy probe failed 3 times over 15 minutes. No connection could be made because the target machine actively refused it 127.0.0.1:25 Probe Exception: 'System.Net.Sockets.SocketException (0x80004005): No connection could be made because the target
machine actively refused it 127.0.0.1:25 at System.Net.Sockets.TcpClient..ctor(String hostname, Int32 port) at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SimpleSmtpClient.Connect(String server, Int32 port, Boolean disconnectIfConnected) at
Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.MeasureLatency(String reason, ActionWithReturn`1 cmd) at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.MeasureLatency(String reason, ActionWithReturn`1
cmd, ConnectionLostPoint connectionLostPoint) at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.TestConnection() at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.DoWork(CancellationToken cancellationToken)
at Microsoft.Office.Datacenter.WorkerTaskFramework.WorkItem.Execute(CancellationToken joinedToken) at Microsoft.Office.Datacenter.WorkerTaskFramework.WorkItem.<>c__DisplayClass2.b__0() at System.Threading.Tasks.Task.Execute()' Failure Context: 'No connection
could be made because the target machine actively refused it 127.0.0.1:25' Execution Context: '' Probe Result Name: 'OnPremisesInboundProxy' Probe Result Type: 'Failed' Monitor Total Value: '3' Monitor Total Sample Count: '3' Monitor Total Failed Count: '0'
Monitor Poisoned Count: '0' Monitor First Alert Observed Time: '1/27/2015 4:46:45 PM

Hi,
Please use the Get-ServerHealth and Get-HealthReport commands to check result.
Here is a related blog for your reference.
http://blogs.technet.com/b/exchange/archive/2013/06/26/managed-availability-and-server-health.aspx
Best regards,
Belinda Ma
TechNet Community Support

HTTPS probe failing randomly

Hello All,
I have a physical server running behind the ACE module ACE20-MOD-K9. The server hosts several virtual machines. One of those virtual machines runs a web server with several virtual HTTPS servers. For example, the server with IP address 10.0.0.20/24 has several virtual HTTPS servers such as www.virtual.local, www.virtual2.local and www.virtual3.local. If you nslookup these names, they all resolve to the 10.0.0.20 IP address. So a request to https://www.virtual.local goes to 10.0.0.20, which reads the virtual server config and replies to the request.
Now, I am trying to verify that the TCP connection (443) and the HTTPS server itself are up and running, but only for the www.virtual.local site and not for the other two.
I've got 2 probes for that:
probe tcp TCP_443
port 443
interval 20
passdetect interval 30
probe https TEST_HTTPS
interval 20
passdetect interval 30
request method head url https://www.virtual.local/default.htm
expect status 200 200
The problem that I am facing is that the HTTPS probe fails randomly. The TCP probe works fine.
Thanks

Hi Fernando,
If it is failing randomly, what do you see in "Last status code" and "Last disconnect err"? That should show you the reason and give an idea of where the problem is.
Regards,
Kanwal

ACE http redirect on probe fail & others

Hi everyone,
I have multiple HTTP-based applications running on 2 servers, and they are all referenced behind the published VIP on the load balancer.
The probes are already there and the applications are accessible, but one requirement from the business is not to fail the whole server because of one application. The apps are independent enough that if one fails, the others still need to be load balanced.
If an application fails on both servers, I would like to be able to redirect any request for that particular app/URL to another URL.
Any suggestions ?

Hi,
To avoid declaring a real server down when one of its applications fails, you should configure your probes in your serverfarm and (if not already done) create a serverfarm per application.
If you want to be able to redirect a request sent to a failed serverfarm, you can configure a backup serverfarm in your L7 policy map like this:
serverfarm name1 backup name2
The second serverfarm should then be of the type:
serverfarm redirect name2
webhost-redirection relocation_string [301 | 302]
where the relocation_string is the URL that should be used, 301 means moved permanently and 302 means moved temporarily.
For the relocation_string, you can use the following special characters:
%h Inserts the hostname from the request Host header
%p Inserts the URL path string from the request
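
As an illustration of how the two placeholders above combine, here is a small Python sketch that expands %h and %p in a relocation string. The function name is hypothetical, and the sketch assumes that %p carries the request path including its leading slash; it only models the substitution and is not the ACE's own code.

```python
import re


def expand_relocation(relocation_string, host, path):
    """Replace %h with the request's Host header value and %p with its URL path."""
    values = {"%h": host, "%p": path}
    # A single pass avoids re-expanding text that came from host or path.
    return re.sub(r"%[hp]", lambda m: values[m.group(0)], relocation_string)
```

For example, the relocation string "http://%h/sorry%p" applied to a request for /app/login on www.example.com gives "http://www.example.com/sorry/app/login".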
More info can be found in this doc:
http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/ace/v3.00_A2/configuration/slb/guide/slbgd.html
Hope this helps.
Kr,
Dario

EEM http script returning "content problem"

Hi @ll,
I finished recording an HTTP script via EEM, but I got the error message "status: [503] content problem".
Can anyone explain what this error message means?
Please see the attached screenshot.
TIA,
Haim.

Derrick,
Just noticed your query.
The script seems to do what you have described. If both IP SLA probes are down, it goes and removes the static default route.
You are right, there is no recovery script, and I guess it was intended to be done manually.
For recovery, you would need to come up with a logic you like, because if you are getting a backup internet uplink through your MPLS network, then the IP SLA probes should recover, and there is no easy way to know the ISP uplink is working again...
You could have a script that recovers automatically later in the night (just adding the static route again), and if the IP SLA probes fail again, it would fail back again, and generate the syslog.
If you are adding the default route, and the script triggers, then it means something is wrong with the IP SLA probes... Are you sure the route is working?
Take a look at the "show ip sla stat" outputs to see what is wrong...
You could disable the script (just remove it from the config) and see if Internet actually works for you when you add the static route...
Arie

ASA5510 sla monitor does not fail back

I've been down this path before and never got a resolution to this issue.
ASA5510 Security Plus
Primary ISP conn is Comcast cable
Secondary ISP conn is fract T1
I duplicated the SLA code from http://www.cisco.com/en/US/partner/products/hw/vpndevc/ps2030/products_configuration_example09186a00806e880b.shtml
When I pull the conn from primary ISP the default route to the secondary comes up
When I reconnect the primary the default route to the secondary does not go away.
I must either reload the ASA or remove/readd the two default outside routes.
Anyone have this same experience and could lend a hand?
Are there any commands I might have in my config that break SLA?
If so I would have hoped either the Configuration Guide or Command Reference for 8.2 would say so, but I don't see any mentioned.
I'm working remotely with my customer so I can't play with this except on off-hours.
ASA running 8.2(2) so as to use AnyConnect Essentials.
Thx,
Phil

Pls. read and try the workaround.
CSCtc16148 SLA monitor fails to fail back when ip verify reverse is applied
Symptom:
Route Tracking may fail to fail back to the primary link/route when restored.
Conditions:
SLA monitor must be configured along with ip verify reverse path on the tracked interface.
Workaround:
1. Remove ip verify reverse path off of the tracked interface
or
2. add a static route to the SLA target out the primary tracked interface.
Release-note: Added 09/23/2009 20:28:24 by kusankar
Enclosures:
fixed-in-broadview-8.3.1.1_interim-by-cl104097: Added 03/23/2010 11:54:08 by perforce
fixed-in-broadview-8.3.1_fcs_throttle-by-cl103850: Added 03/22/2010 15:48:05 by perforce
fixed-in-broadview-bennu-by-cl101314: Added 02/18/2010 19:06:08 by perforce
fixed-in-broadview-idfw-by-cl101317: Added 02/18/2010 19:09:07 by perforce
fixed-in-broadview-logging-ng-by-cl101311: Added 02/18/2010 19:03:08 by perforce
fixed-in-broadview-main-by-cl101300: Added 02/18/2010 18:27:07 by perforce
fixed-in-sedona-64bit-by-cl101362: Added 02/19/2010 04:52:24 by perforce
fixed-in-sedona-bv64-by-cl101426: Added 02/19/2010 11:42:41 by perforce
fixed-in-sedona-main-by-cl101297: Added 02/18/2010 18:24:15 by perforce
fixed-in-titan-8.2.2_fcs_throttle-by-cl101307: Added 02/18/2010 18:57:08 by perforce
fixed-in-titan-bennu-by-cl101294: Added 02/18/2010 18:24:08 by perforce
fixed-in-titan-main-by-cl101282: Added 02/18/2010 16:48:04 by perforce
sla-mon-sh-tech: Added 09/23/2009 20:43:52 by kusankar
static-analysis-titan-main: Added 02/18/2010 16:48:07 by perforce
-KS

ACE failing server out using TCP health probe

We have a mix of ACE20s and ACE30s currently, and I am seeing the ACE on both HW platforms failing out our servers sporadically after a successful TCP handshake. Here is the configuration:
probe tcp TCP-25
   port 25
   interval 25
   faildetect 2
   passdetect interval 90
   open 10
When I do a show probe TCP-25 detail I see the default recv timeout is 10.
I captured a trace between the ACE and the server. When the health probe passes I see a good 3-way TCP handshake, then 50 ms later the server sends an SMTP 220, followed by an ACK from the ACE, a FIN ACK from the ACE, and a graceful TCP termination. When the probe fails I see a successful TCP handshake, but the ACE sends a FIN ACK 47 ms after it sends the ACK for the TCP connection. The server then sends an ACK and the ACE sends a RST.
Shouldn't the ACE wait 10 seconds in this example for the server to respond after the TCP handshake?

TAC/Martin Nash was very helpful in explaining this. The TCP 3-way handshake was successful and the ACE sent a FIN ACK as expected, but after the server sent an ACK it did not send its own FIN ACK, so the ACE marked it down. The health check requires not only a 3-way handshake but also a clean teardown of the TCP session.
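
To see from a test host whether a server closes its side of the connection after the client sends a FIN, something like the following Python sketch can be used. It connects, half-closes the socket, and waits for the peer's FIN. The function name and the timeout value are hypothetical, and this only approximates the behaviour described above; it is not the ACE's own probe logic.

```python
import socket


def clean_teardown(host, port, timeout=10.0):
    """Connect, send our FIN, and report whether the peer closes its side too."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.shutdown(socket.SHUT_WR)   # half-close: sends our FIN
        try:
            # Drain any data (for example an SMTP 220 banner);
            # recv() returns b"" once the peer has sent its FIN.
            while sock.recv(4096):
                pass
        except socket.timeout:
            return False                # peer never sent its FIN in time
        except ConnectionResetError:
            return False                # peer answered with a RST instead
    return True
```

A False result for a server that otherwise accepts connections on port 25 would match the failing case in the capture: the handshake completes but the server does not close its half of the session.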

Why do I see "FAILED" for probes on standby ACE?

Hi there,
I am running a pair of ACEs in redundancy mode for HA and have created multiple contexts.
Here is my basic config for the serverfarm.
serverfarm host VPN_Farm
transparent
failaction purge
predictor leastconns
probe ICMP_Probe
rserver SVR_A
    probe ICMP_Probe
    inservice
rserver SVR_B
    probe ICMP_Probe
    inservice
So, on the active unit, I can see that the probes are running fine. However, if I do "show probe" on the standby unit, it appears that all my probes fail.
Result of "show probe" captured from Standby Unit.
probe       : ICMP_Probe
type        : ICMP
state       : ACTIVE
   port      : 0       address     : 0.0.0.0         addr type : -
   interval : 15      pass intvl : 60              pass count : 3
   fail count: 3       recv timeout: 10
                ------------------ probe results ------------------
   associations ip-address      port porttype probes   failed   passed   health
   ------------ ---------------+-----+--------+--------+--------+--------+------
   rserver        : SVR_A
                      1.1.1.1   0     --                       109      109      0        FAILED
Is it normal to see failed probes on the standby unit?
Thank you
Best Regards

Hi Hyeon,
Some questions here.
Is this an ACE module or an ACE 4710? What version?
Are both ACE peers connected to the same switch, or how do you have them set up? Can you describe your topology a little?
From the standby, did you try to ping/telnet the servers?
Did you try to remove the probe and re-add it? (Get a #show tech-support before and after.)
Is there any firewall or L3 device between the ACEs and the servers?
Do you use these servers for several contexts? Is the probe failing in all the contexts?
Jorge

CSM 4.2(5): Recurring failed health probes

Hi all
I've finally started to investigate an issue I have with our CSM setup. Several times a day I get the below syslog message from the 6500
10:49:11: %CSM_SLB-6-RSERVERSTATE: Module 4 server state changed: SLB-NETMGT: TCP health probe failed for server
Then a few seconds later
10:49:41: %CSM_SLB-6-RSERVERSTATE: Module 4 server state changed: SLB-NETMGT: TCP health probe re-activated server
I never seem to catch the event in action and can never verify whether the real server has indeed failed or whether this is only a probe timeout. I have both layer 2 and layer 3 server farms in operation, and this problem occurs on all of my server farms a few times a day.
There is no pattern and I have no other indications of any problems. I have most of the probes set to 1 repeat and a 30 sec timeout. Should I perhaps increase the probe timeouts?
Regards
Fredrik

Those error messages are related to probing the CSM does when determining server health. For a TCP probe, this means that the CSM either gets a TCP RST from the server or it does not see a SYN-ACK coming from the server.

ACE failed probe and established connections

Hello,
I have four ACE 4710s, one pair in each geographical location. The probes are configured to check a regex in the response to an HTTP GET.
When an rserver needs an update, we change the text in our testpage.html (e.g. from "OK" to "SUSPEND") so that the probe detects a failure.
The rservers are in fact still operational, but should not accept new connections. This works fine.
BUT I observed that established connections/sessions were not terminated after the probe failed. The ACE probably waits for open/established connections to finish, and that is what I am asking about.
What happens if the probe fails but the rserver is in fact operational? I thought that a probe failure would also cut all established connections to the rserver, but it seems that is not true. Does anybody have experience with this?
Thanks for your opinion.
Jan

Hello Jan,
If I understood correctly, what you're looking for is documented in the section on the failaction command, which makes the ACE behavior in this respect configurable:
http://www.cisco.com/en/US/docs/app_ntwk_services/data_center_app_services/ace_appliances/vA4_2_0/configuration/slb/guide/rsfarms.html#wp1117375
Indeed, the default behavior of the ACE is to take a failed real server out of load-balancing rotation for new connections and to allow existing connections to complete.
Hope it helps,
Francesco

Cisco ACE Mod 30 - HTTPS probes are failing after hardware replacement.

We recently had a hardware failure on an ACE Mod30. The replacement went in relatively painlessly (except for having to import about 100 SSL certificates and private keys).
However, on the new ACE, the HTTPS probes are failing for all contexts using them. We can work around this by using TCP-443 probe, but the customer prefers that we actually request a logon page to ensure that the application is running properly.
Here are the probe stats for one context (THIS ONE IS ACTIVE)
BRTDCSCRTR2/INTRA-DEV-TST# sho stats probe type https
+------------------------------------------+
+----------- Probe statistics -------------+
+------------------------------------------+
----- https probe ----
Total probes sent : 52422 Total send failures : 0
Total probes passed : 0 Total probes failed : 52422
Total connect errors : 0 Total conns refused : 0
Total RST received : 0 Total open timeouts : 52422
Total receive timeout : 0 Total active sockets : 0
Here are the probe stats for one context (THIS ONE IS HOT_STANDBY)
BRTDCSCRTR2/INTRA-PROD# sho stats probe type https
+------------------------------------------+
+----------- Probe statistics -------------+
+------------------------------------------+
----- https probe ----
Total probes sent : 69398 Total send failures : 0
Total probes passed : 0 Total probes failed : 69398
Total connect errors : 0 Total conns refused : 0
Total RST received : 0 Total open timeouts : 69398
Total receive timeout : 0 Total active sockets : 0
Everything else appears to be working properly, except for the HTTPS probes.

Hi,
For HTTPS probes to be successful, you don't need to have SSL certs/private keys on the ACE, unless the servers are doing client authentication. When the ACE sends HTTPS probes to servers, it acts as a client.
Here are a few things that can be tried:
- Test the HTTPS probe with only one server. Reload the server to clear any SSL cache on it.
- Check the SSL probe detail to verify the error code received.
- Take captures between the ACE and that server to find at what stage of the probe packet exchange the flow is failing.
Here is a good link to troubleshoot HTTPS probe issues:
http://docwiki.cisco.com/wiki/Cisco_Application_Control_Engine_%28ACE%29_Troubleshooting_Guide_--_Troubleshooting_ACE_Health_Monitoring#Troubleshooting_an_HTTPS_Probe_Error
Regards,
Hasham
