ACE: probe failing
Hi,
I have the following probe configured:
probe http probe1.test.com:10114
port 10114
interval 34
faildetect 17
passdetect interval 60
expect status 200 200
header Host header-value "hcmfincrp1.test.com"
open 1
It is applied to a serverfarm, but the health check is failing. I see the following when I run "sh probe probe1.test.com:10114 detail":
sh probe probe1.test.com:10114 deta
probe : probe1.test.com:10114
type : HTTP
state : ACTIVE
description :
port : 10114 address : 0.0.0.0 addr type : -
interval : 34 pass intvl : 60 pass count : 3
fail count: 17 recv timeout: 10
http method : GET
http url : /
conn termination : GRACEFUL
expect offset : 0 , open timeout : 1
expect regex : -
send data : -
------------------ probe results ------------------
associations ip-address port porttype probes failed passed health
------------ ---------------+-----+--------+--------+--------+--------+------
serverfarm : probe1.test.com:443
real : server1.test.com[10114]
192.168.1.1 10114 PROBE 41531 19556 21975 FAILED
Socket state : CLOSED
No. Passed states : 5 No. Failed states : 6
No. Probes skipped : 0 Last status code : 0
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Unrecognized or invalid response
Last probe time : Wed Oct 12 17:43:30 2011
Last fail time : Tue Oct 11 02:33:52 2011
Last active time : Sun Oct 9 20:24:02 2011
May I know why the health check is failing? Why am I seeing the message "Last disconnect err : Unrecognized or invalid response"?
Hi ,
This error means that the ACE is not receiving a 200 OK response from the server. This happens when the server is not responding, when it does not answer requests that carry the Host header value hcmfincrp1.test.com that you have defined, or when the page has been modified. Please check that your HTTP server is working correctly.
Regards
Abijith
Similar Messages
-
ACE : PROBE-FAILED and Syslog messages
Hi,
When a real server is in PROBE-FAILED status, I observe a syslog message at each attempt of the probe. This fills our syslog server. Is there a way to configure the ACE so that a syslog message is generated only when a transition occurs in the probe status?
Thank you for any hints,
Yves
Hello,
You can utilize "logging trap " command and
"logging message level " command
in order to achieve what you are seeking.
The "logging trap " command limits the logging messages sent to a syslog server based on severity.
If it is set to "5 - notification", all messages that have a severity level of 5 or lower are sent to the syslog server.
You can disable the display of a specific syslog
message or change the severity level of a specific system log message using
"logging message level " command.
I am not sure what kind of probe you are using, but if it is an ICMP probe and
the reason for the probe failure is ARP, it generates a message for every attempt,
as below, with a severity level of 3 by default.
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-3-251009: ICMP health probe failed for server 192.168.0.1, connectivity error: ARP not resolved for destination ip address
%ACE-5-441002: Serverfarm (SF) is now back in service in policy_map (fs) -->
class_map (#class_default_slb). Number of failovers = 0, number of times back in service = 0
%ACE-4-442007: VIP in class: 'VIP' changed state from OUTOFSERVICE to INSERVICE
%ACE-5-441002: Serverfarm (SF) is now back in service in policy_map (fs) -->
class_map (#class_default_slb). Number of failovers = 0, number of times back in service = 0
%ACE-4-442004: Health probe ICMP detected rserver r1 (interface vlan31) changed state to UP
%ACE-4-442001: Health probe ICMP detected r1 (interface vlan31) in serverfarm SF changed state to UP
If your "logging trap " is set to "5 - notification" and you do not want
the message "%ACE-3-251009:xxx" to be sent to the syslog server,
you can change its severity level as shown below.
switch/Admin(config)# logging message 251009 level 6
switch/Admin(config)# do show logging message 251009
Message logging:
message 251009: current-level 6 default-level 3 (enabled)
You can check the ID of the message that is filling the syslog server
and change its severity level to a higher number than the "logging trap " level.
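The filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not ACE code; the function name `sent_to_syslog` is made up for the example, and the levels are the ones quoted in this reply (trap level 5, message 251009 at default level 3, remapped to 6).

```python
def sent_to_syslog(message_level: int, trap_level: int) -> bool:
    """Return True if a message of this severity passes the trap threshold.

    Lower numbers are more severe, so a message is forwarded when its
    level is numerically less than or equal to the trap level.
    """
    return message_level <= trap_level


TRAP_LEVEL = 5  # "5 - notification", as in the example above

# Message 251009 at its default level 3 is forwarded to the syslog server.
print(sent_to_syslog(3, TRAP_LEVEL))

# After "logging message 251009 level 6" it falls outside the threshold.
print(sent_to_syslog(6, TRAP_LEVEL))
```

The state-change messages shown above (442001, 442004, 442007, 441002) are at level 4 or 5, so they would still pass a trap level of 5.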
Regards,
Kimihito. -
Probe fail on Standby ACE in One-armed mode
Hi there
I'm Kilsoo.
I set up one-armed mode using an ACE.
The real servers are in a VLAN away from the ACE.
So I configured PBR with the ACE alias IP address as the next hop on the real servers' gateway interface.
The probe from the active ACE works well.
But the probe from the standby ACE fails.
This leads to my first question:
Is it normal for the probe from the standby ACE to fail?
As a temporary solution, I created the following route-map for PBR.
route-map PBR deny 5
match ip address Probe_ACL
route-map PBR permit 10
match ip address L4_ACL
set ip next-hop <Alias IP address>
ip access-list extended Probe_ACL
permit ip any <Standby ACE's IP address>
ip access-list extended L4_ACL
permit tcp <Real server's IP address> eq 80 any
Second question...
Do you have any other good solutions???
Thanks
Hi Cesar,
Thanks for your reply.
But I think I was confused when I wrote the message.
I used both ACEs' VLAN IP addresses as the next-hop IP address, as you advised.
Do you know whether the standby ACE is unable to run probes without a route-map in one-armed mode, as in the diagram below?
Backbone Router
|
|
|
Supervisor --------------------ACE(vserver: 172.19.100.100)
| (vlan 200)
|
|
|(vlan 110)
|
|
Real servers
(172.19.110.111) -
ACE Module: Recover a real server probe-failed status
How does the ACE module recover a real server that has entered the probe-failed state? We are doing some testing, purposely dropping a server's interface. The ACE recognizes the server as being down and shows it in a probe-failed state. When we bring the system's interface back up, will the ACE see this and automatically bring the state back to operational, or does someone have to do something on the ACE module?
ACE continues to probe servers that are down or probe_failed. As soon as a server starts responding again its state will switch to alive again.
Nothing to be done.
Gilles. -
Ace probe failure after IIS app pool recycle?
Windows Server 2003 SP2
ACE Module A2(1.6a)
I suspect this is caused by an IIS6 setting, but posting here in case anyone has seen this. For this one particular site, we have 4 servers in the farm. 2 of those servers are fine. The other 2 (new) servers will generate probe failure after the site's app pool recycles. I then remove the 2 servers from service and re-activate (no inservice, then inservice) and the probe comes back as operational. It appears that the app pool recycle somehow is resetting the hash on the default page, though I'm not sure how. Any ideas are very much appreciated.
Yeah, the hash is inside the probe. Here's the config for the serverfarm and the probe. Public-007 and Public-008 are new servers...the other 6 have been in the farm for the last 2.5 years and they don't have this issue. It's only the 2 new boxes that the probe fails when the app pool is recycled.
serverfarm host PUBLIC
probe URL-DEFAULT-ASPX
rserver PUBLIC-001
inservice
rserver PUBLIC-002
inservice
rserver PUBLIC-003
inservice
rserver PUBLIC-004
inservice
rserver PUBLIC-005
inservice
rserver PUBLIC-006
inservice
rserver PUBLIC-007
inservice
rserver PUBLIC-008
inservice
probe http URL-DEFAULT-ASPX
interval 2
faildetect 2
passdetect interval 2
passdetect count 2
request method get url /default.aspx
expect status 200 200
hash -
ACE ; probe for host header-value
Hi,
We have the following probe set up. Sometimes this probe fails because the server resets the connection, but the server team claims there aren't any issues with the server.
probe https probe1.abc.com:10456
port 10456
interval 34
passdetect interval 17
ssl version all
expect status 200 200
header Host header-value "probe1.abc.com"
open 1
Is there a way to validate the above probe from a Unix/Linux server? That is, from a Unix/Linux server, is there a way to send that Host header value to the servers and see whether they respond with a 200 OK status? If not from a Unix/Linux server, is there any other way to validate it apart from validating it from the ACE?
Thanks... Or can we do it from Windows, maybe using Firefox on a Windows machine?
Please advise. -
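One way to approximate this probe from a Unix/Linux (or Windows) host is a short Python script that sends a GET with an explicit Host header and checks the status against the expected range. This is a rough sketch under assumptions, not an exact replica of what the ACE sends: the function names `check_probe` and `status_in_range` are made up for the example, certificate verification is disabled purely for testing, and the server IP in the usage note is a placeholder.

```python
import http.client
import ssl


def status_in_range(status: int, low: int = 200, high: int = 200) -> bool:
    """Mimic "expect status <low> <high>": pass only if the code is in range."""
    return low <= status <= high


def check_probe(server_ip: str, port: int, host_header: str,
                path: str = "/", timeout: float = 1.0) -> int:
    """Send GET <path> over HTTPS with an explicit Host header; return the status.

    Certificate checks are turned off because this is a connectivity test
    against a real server IP, not a trust decision (an assumption of this sketch).
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    conn = http.client.HTTPSConnection(server_ip, port, timeout=timeout, context=ctx)
    try:
        # skip_host=True stops http.client from adding its own Host header,
        # so the only Host header sent is the one configured on the probe.
        conn.putrequest("GET", path, skip_host=True)
        conn.putheader("Host", host_header)
        conn.endheaders()
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, `status_in_range(check_probe("<real server IP>", 10456, "probe1.abc.com"))` would report whether the server answers 200 for that Host header within the 1-second timeout, mirroring "open 1" in the probe. Running it in a loop at the probe interval may help catch the intermittent resets.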
Looking for ACE Probe TCL script specific for LDAPS
Hello Everyone,
I have searched the forum, and I am having difficulty finding an example of how to modify the LDAP TCL probe from port 389 to the secure LDAP port 636.
Could someone kindly point me or provide me the modified TCL script if you happen to have it.
During my search I also found a config that someone had provided, which contained the following probe:
probe tcp LDAPS_Probe
port 636
probe tcp LDAP_Probe
port 389
I was trying to figure out whether this is a modified TCL script for LDAP or a modified TCP TCL script specific to port 636.
This is how I applied the script for LDAP port 389.
script file 1 LDAP_PROBE
probe scripted LDAP_PROBE_389
interval 5
passdetect interval 30
receive 5
script LDAP_PROBE
serverfarm host SF-LDAP-389
description SF LDAP Port 389
predictor leastconns
probe LDAP_PROBE_389
rserver LDAP-RS1-389
inservice
I will be more than glad to provide you any additional information that you need.
As always thanks for your input.
Raman Azizian
SAIC/NISN Network services
Normally you would engage a TCL developer or Cisco Advanced Services to develop a custom script for anything other than what Cisco provides in its canned scripts. If you are comfortable with TCL, you can do it yourself. Here is an example of the LDAP script modified to include initiation via SSL. The default port is 389; when you implement it, you would specify 636.
#!name = LDAP_PROBE
# Description:
# LDAP_PROBE opens a TCP connection to an LDAP server, sends a bind request, and
# determines whether the bind request succeeds. LDAP_PROBE then closes the
# connection with a TCP RST.
# If a port is specified in the "probe scripted" configuration, the script probes
# each suspect on that port. If no port is specified, the default LDAP port 389
# is used.
# Success:
# The script succeeds if the server returns a bind response indicating success
# (status code 0x0a0100) to the bind request.
# The script closes the TCP connection with a RST following a successful attempt.
# Failure:
# The script fails due to timeout if the response is not returned. This
# includes a failure to receive ARP resolution, a failure to create a TCP connection
# to the port, or a failure to return a response to the LDAP bind request.
# The script also fails if the server bind response does not indicate success.
# This specific error returns the 30002 error code.
# The script closes any attempted TCP connection, successful or not, with a RST.
# PLEASE NOTE: This script expects the server LDAP bind response to specify length
# in ASN.1 short definite form. Responses using other length forms (e.g., long
# definite length form) will require script modification to achieve success.
# SCRIPT version: 1.0 April 1, 2008
# Parameters:
# [DEBUG]
# username - user login name
# password - password
# DEBUG - optional key word 'DEBUG'. default is off
# Do not enable this flag while multiple probe suspects are configured for this
# script.
# Example config :
# probe scripted USE_LDAP_PROBE
# script LDAP_PROBE
# Values configured in the "probe scripted" configuration populate the
# scriptprobe_env array. These may be accessed or manipulated if desired.
# Documentation:
# A detailed discussion of the use of scripts on the ACE is included in
# "Using Toolkit Command Language (TCL) Scripts with the ACE"
# in the "Load-Balancing Configuration Guide" section of the ACE documentation set.
# Copyright (c) 2005-2008 by Cisco Systems, Inc.
# debug procedure
# set the EXIT_MSG environment variable to help debug
# also print the debug message when debug flag is on
proc ace_debug { msg } {
    global debug ip port EXIT_MSG
    set EXIT_MSG $msg
    if { [ info exists ip ] && [ info exists port ] } {
        set EXIT_MSG "[ info script ]:$ip:$port: $EXIT_MSG "
    }
    if { [ info exists debug ] && $debug } {
        puts $EXIT_MSG
    }
}
# main
# parse cmd line args and initialize variables
## set debug value
set debug 0
if { [ regsub -nocase "DEBUG" $argv "" argv] } {
    set debug 1
}
ace_debug "initializing variable"
set EXIT_MSG "Error config: script LDAP_PROBE \[DEBUG\]"
set ip $scriptprobe_env(realIP)
set port $scriptprobe_env(realPort)
# if port is zero then use the well-known ldap port 389
if { $port == 0 } {
    set port 389
}
# PROBE START
# open connection
ace_debug "opening socket"
set sock [ socket -sslversion all -sslcipher RSA_WITH_RC4_128_MD5 $ip $port ]
fconfigure $sock -buffering line -translation binary
# send a standard anonymous bind request
ace_debug "sending ldap bind request"
puts -nonewline $sock [ binary format "H*" 300c020101600702010304008000 ]
flush $sock
# read string back from server
ace_debug "receiving ldap bind result"
set line [read $sock 14]
binary scan $line H* res
binary scan $line @7H6 code
ace_debug "received $res with code $code"
# close connection
ace_debug "closing socket"
close $sock
# make probe fail by exit with 30002 if ldap reply code != success code 0x0a0100
if { $code != "0a0100" } {
    ace_debug " probe failed : expect response code \'0a0100\' but received \'$code\'"
    exit 30002
}
## make probe success by exit with 30001
ace_debug "probe success"
exit 30001 -
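To make the success check in the script easier to follow, here is a small Python sketch of the same byte-level comparison. It is an illustration only and does not talk to a server: the request bytes and the "0a0100" success code are taken from the script above, while the two 14-byte responses are hypothetical samples built for the example (both use ASN.1 short definite lengths, as the script's note requires), and the helper names are made up.

```python
# The anonymous bind request the script sends, as hex (14 bytes on the wire).
BIND_REQUEST_HEX = "300c020101600702010304008000"

# What the script expects to find in the response for a successful bind.
SUCCESS_CODE = "0a0100"


def result_code(response: bytes) -> str:
    """Equivalent of `binary scan $line @7H6 code`: skip 7 bytes, then
    take the next 3 bytes as 6 lowercase hex digits."""
    return response[7:10].hex()


def bind_succeeded(response: bytes) -> bool:
    """True when the extracted result code matches the success code."""
    return result_code(response) == SUCCESS_CODE


# Hypothetical responses, constructed for illustration only.
ok_response = bytes.fromhex("300c02010161070a010004000400")
denied_response = bytes.fromhex("300c02010161070a013104000400")

print(len(bytes.fromhex(BIND_REQUEST_HEX)))  # length of the request in bytes
print(result_code(ok_response), bind_succeeded(ok_response))
print(result_code(denied_response), bind_succeeded(denied_response))
```

In the first sample the three bytes at offset 7 are 0a 01 00 (an enumerated value of length 1 equal to 0), which is what the script treats as success; any other value there makes the probe exit with 30002.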
Hello All,
I have a physical server running behind the ACE module ACE20-MOD-K9. The server has several virtual machines. One of those virtual machines has a web server running virtual HTTPS servers. For example, the server with IP address 10.0.0.20/24 has several virtual HTTPS servers such as www.virtual.local, www.virtual2.local and www.virtual3.local. So, if you nslookup the servers, they all resolve to the 10.0.0.20 IP address. If I browse to https://www.virtual.local, the request goes to 10.0.0.20, which reads the virtual server config and replies to the request.
Now, I am trying to verify that the TCP connection (443) and the HTTPS server itself is up and running but only for the www.virtual.local site and not for the other 2.
I've got 2 probes for that:
probe tcp TCP_443
port 443
interval 20
passdetect interval 30
probe https TEST_HTTPS
interval 20
passdetect interval 30
request method head url https://www.virtual.local/default.htm
expect status 200 200
The problem that I am facing is that the HTTPS probe fails randomly. The TCP probe works fine.
Thanks
Hi Fernando,
If it is failing randomly, what do you see in the last status code and last disconnect err fields? That should show you the reason and give an idea of where the problem is.
Regards,
Kanwal -
Hi,
I have a general question about ACE probe timers. I have the following probe set up:
probe https probe:1061
port 1061
interval 34
passdetect interval 17
open 1
ACE# sh probe probe:1061 detail
probe : probe:1061
type : HTTPS
state : ACTIVE
description :
port : 1061 address : 0.0.0.0 addr type : -
interval : 34 pass intvl : 17 pass count : 3
fail count: 3 recv timeout: 10
===
For the above probe, when will the ACE declare the server down? Will it declare it down after 85 seconds (17*3+34), or after 115 seconds (adding the 10-second recv timeout 3 times = 30 seconds)?
please help.
========
We did a test and brought the server down manually. The ACE declared the server down after 91 seconds (from the time the server was brought down).
Hi Gavin, Krishna,
The explanation for all these parameters can be found in the health monitoring section of the configuration guide (http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/ace/vA2_3_0/configuration/slb/guide/probe.html#wp1031040).
Below are the definitions quoted from the guide:
Interval:
The time interval between probes is the frequency that the ACE sends probes to a server marked as passed. You can change the time interval between probes by using the interval command
Faildetect:
Before the ACE marks a server as failed, it must detect that probes have failed a consecutive number of times. By default, when three consecutive probes have failed, the ACE marks the server as failed. You can configure this number of failed probes by using the faildetect command
Passdetect interval/count:
To configure the time interval after which the ACE sends a probe to a failed server and the number of consecutive successful probes required to mark the server as passed, use the passdetect command.
So, to summarize, taking Gavin's configuration as an example: a server failure would be detected between 78 seconds (2x34 + 10) and 112 seconds (3x34 + 10) after it occurs. Once the server is down, it will become operational again between 34 (2x17) and 51 (3x17) seconds after it comes back up.
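The arithmetic in this summary can be written out as a short Python sketch. It simply encodes the formula used in this reply (the failure can land just after a probe or just before the next one, and the last failed probe waits out the receive timeout); the function names are made up for the example, and the values are the ones from Gavin's show output (interval 34, fail count 3, recv timeout 10, pass interval 17, pass count 3).

```python
def failure_detection_window(interval: int, faildetect: int, recv_timeout: int):
    """Earliest and latest time (seconds) from server failure to FAILED state.

    Best case: the server dies just before a probe, so only
    (faildetect - 1) full intervals elapse before the last failed probe.
    Worst case: it dies just after a probe, so faildetect full intervals elapse.
    In both cases the final probe waits for the receive timeout.
    """
    earliest = (faildetect - 1) * interval + recv_timeout
    latest = faildetect * interval + recv_timeout
    return earliest, latest


def recovery_window(passdetect_interval: int, passdetect_count: int):
    """Earliest and latest time (seconds) from server recovery to operational."""
    earliest = (passdetect_count - 1) * passdetect_interval
    latest = passdetect_count * passdetect_interval
    return earliest, latest


print(failure_detection_window(interval=34, faildetect=3, recv_timeout=10))
print(recovery_window(passdetect_interval=17, passdetect_count=3))
```

The 91 seconds measured in the test falls inside the 78 to 112 second window this gives for the failure case.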
I hope this helps
Daniel -
ACE Probe regex and escaping Parenthesis
I'm trying to setup a ACE probe that expects a return of
(server.domain.com) EXISTS=TRUE,AVAILABLE=TRUE,ACTIVE=TRUE
But it doesn't appear that I can use Parenthesis inside a regex. I've tried escaping as well.
expect \(server\.domain\.com\) EXISTS=TRUE,AVAILABLE=TRUE,ACTIVE=TRUE
% invalid command detected at '^' marker. Pointing at the (
But this doesn't work either. Any ideas?
Hi,
If the ACE has accepted it, it should match the response from the server. Is it still not matching?
If you look at the regex builder below, the regex matches the response that is expected from the server. So the ACE should be able to match it.
Also, you can try putting \ before the dots, but I am not sure. In my opinion it should work fine with what we have put in already. If it doesn't, we will have to use trial and error. Let me know if you need this regex builder. You can download it from Google, though. In any case, I have just attached it. -
Managedavailability error about inbound proxy probe failing
getting a managedavailability error about inbound proxy probe failing.
what does the inbound proxy probe do, and why does this error reference forefront, when I do not have MS forefront installed?
below are the details:
The inbound proxy probe failed 3 times over 15 minutes.
The inbound proxy probe failed 3 times over 15 minutes. No connection could be made because the target machine actively refused it 127.0.0.1:25 Probe Exception: 'System.Net.Sockets.SocketException (0x80004005): No connection could be made because the target
machine actively refused it 127.0.0.1:25 at System.Net.Sockets.TcpClient..ctor(String hostname, Int32 port) at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SimpleSmtpClient.Connect(String server, Int32 port, Boolean disconnectIfConnected) at
Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.MeasureLatency(String reason, ActionWithReturn`1 cmd) at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.MeasureLatency(String reason, ActionWithReturn`1
cmd, ConnectionLostPoint connectionLostPoint) at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.TestConnection() at Microsoft.Forefront.Monitoring.ActiveMonitoring.Smtp.Probes.SmtpConnectionProbe.DoWork(CancellationToken cancellationToken)
at Microsoft.Office.Datacenter.WorkerTaskFramework.WorkItem.Execute(CancellationToken joinedToken) at Microsoft.Office.Datacenter.WorkerTaskFramework.WorkItem.<>c__DisplayClass2.b__0() at System.Threading.Tasks.Task.Execute()' Failure Context: 'No connection
could be made because the target machine actively refused it 127.0.0.1:25' Execution Context: '' Probe Result Name: 'OnPremisesInboundProxy' Probe Result Type: 'Failed' Monitor Total Value: '3' Monitor Total Sample Count: '3' Monitor Total Failed Count: '0'
Monitor Poisoned Count: '0' Monitor First Alert Observed Time: '1/27/2015 4:46:45 PM
Hi,
Please use the Get-ServerHealth and Get-HealthReport commands to check the result.
Here is a related blog for your reference.
http://blogs.technet.com/b/exchange/archive/2013/06/26/managed-availability-and-server-health.aspx
Best regards,
Belinda Ma
TechNet Community Support -
Hi,
I have a router configured with ip sla icmp echo to track some ip addresses in order to make some routing decisions.
However, although I can ping the IP addresses just fine from the router with the correct source interface, their corresponding IP SLA probes sometimes fail. At first everything usually seems to work fine, then after some time some IP addresses fail and stay that way, and usually the only way I can fix this is to track new IP addresses.
My router is a C2811 with Advanced IP Services 12.4(22)T.
Has anyone faced a similar problem before?
Thanks,
What error do you see when you run "show ip sla stat" for these collectors? When they are failing can you manually ping the same addresses from the device? If you enable "debug ip sla error" and "debug ip sla trace" what messages do you see?
-
ACE http redirect on probe fail & others
Hi everyone,
I have multiple HTTP-based applications running on 2 servers, and they are all referenced behind the published VIP on the load balancer.
The probes are already there and the applications are accessible, but one requirement from the business is not to fail the whole server for one application. The apps are independent enough that if one fails, the others still need to be load balanced.
If an application fails on both servers, I would like to be able to redirect any request for that particular app/URL to another URL.
Any suggestions?
Hi,
To avoid declaring a real server down when one of its applications fails, you should configure your probes in your serverfarm and (if not already done) create a serverfarm per application.
If you want to be able to redirect a request sent to a failed serverfarm, you can configure a backup serverfarm in your L7 policy map like this:
serverfarm name1 backup name2
The second serverfarm should then be of the type:
serverfarm redirect name2
webhost-redirection relocation_string [301 | 302]
where the relocation_string is the URL that should be used, 301 means moved permanently and 302 means moved temporarily.
For the relocation_string, you can use following special characters:
%h Inserts the hostname from the request Host header
%p Inserts the URL path string from the request
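As a rough illustration of how these two placeholders behave, here is a small Python sketch of the substitution. It is not ACE code: the function name, the sample hostnames and the relocation strings are made up for the example, and it assumes the path is inserted with its leading slash, which is worth confirming against the documentation.

```python
def expand_relocation(relocation_string: str, host: str, path: str) -> str:
    """Replace %h with the request's Host header and %p with its URL path.

    Assumption of this sketch: `path` includes its leading slash.
    """
    return relocation_string.replace("%h", host).replace("%p", path)


# Hypothetical request: Host "www.example.com", path "/app1/index.html".
host = "www.example.com"
path = "/app1/index.html"

# Keep the original hostname but send every request to a fixed sorry page.
print(expand_relocation("http://%h/sorry.html", host, path))

# Send the same path to a different (hypothetical) site.
print(expand_relocation("http://sorry.example.com%p", host, path))
```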
More info can be found in this doc:
http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/ace/v3.00_A2/configuration/slb/guide/slbgd.html
Hope this helps.
Kr,
Dario -
I configured a probe to check the health of a server web page, but I am getting a status code of 400.
probe http PROBE_80
interval 10
faildetect 2
passdetect interval 10
passdetect count 2
receive 5
request method get url http://<host>:<port>/eml/HealthCheckServlet
expect status 200 202
open 10
I am getting the status below. I would like to know the correct format for the request method with the above URL.
real : app02p[0]
192.168.10.6 80 VIP 161 161 0 FAILED
Socket state : CLOSED
No. Passed states : 0 No. Failed states : 1
No. Probes skipped : 0 Last status code : 400
No. Out of Sockets : 0 No. Internal error: 0
Last disconnect err : Received invalid status code
Last probe time : Tue Mar 17 02:53:58 2015
Last fail time : Tue Mar 17 02:27:15 2015
Last active time : Never
Hi Hari,
Does this URL return status 200 when you send the request directly from your browser?
You should use the exact URL here. If the URL is fine, then check with your server team why the server is responding with 400. The syntax looks fine. You can also take a pcap on the server and see what the ACE is sending for the probe.
Regards,
Kanwal
Note: Please mark answers if they are helpful. -
Hello everyone, okay?
I was thinking about the possibility of using my ACE to monitor a database, in this case a MySQL database. Today I use a TCP probe that monitors the port, but I would like to go one step further and try to make a connection to the database.
I would like some guidance on creating a TCL script that makes a simple connection to a database.
The idea is to connect to the database and run a query/select on any table, just to validate that it is functioning and not only that the port is responding.
I do not know how complex this is or what the prerequisites would be, but any help would be welcome.
I thought about using an HTTP probe for this validation, with a web page making the connection to the database, but that creates another layer, and if there is any problem in the web service, the database would be affected indirectly.
Thank you. All suggestions are welcome.
Hi Plinio,
I cannot see any support for testing authentication, SQL queries or connections to a database that is supported directly in TCL at this time.
Here is the TCL guide that explains the supported commands (there is an HTTP example probe at the bottom):
http://www.cisco.com/en/US/docs/app_ntwk_services/data_center_app_services/ace_appliances/vA4_2_0/configuration/slb/guide/script.html
Beyond a TCL TCP probe to the port to test that the listener is running, I believe your suggestion of an HTTP TCL script is probably the most accurate way to check the integrity of the database. You could write code that returns a specific response for each type of failure scenario, and on the ACE you could then use an HTTP TCL script to parse the response from the web server, identify exactly what has failed in your database and act accordingly.
cheers,
Chris