Fault management

Hi All,
I'm trying to write a proposal for a client involving an all-Cisco solution, and I'm stuck on one requirement for the network management software: it should be able to detect and "automatically fix" any faults in the network.
The client setup is relatively simple: a web application running in a DMZ, accessible from the outside and using a database backend on the local LAN. No audio, video or any other service is required.
The client wants the NMS to:
1.) proactively monitor *all* devices (switches, firewall appliances, the works) in a centralized view
2.) do trend analysis and identify network bottlenecks
3.) detect faults, send notifications, and resolve faults automatically
Currently I'm looking into NetIQ, NetExpert, etc. Which of the Cisco NMS products fit here? Cisco Data Center Network Manager? Cisco Prime?
TIA

In my opinion (based on almost 30 years of data networking experience and over 15 with Cisco), no tool will "automatically fix" any faults in the network. You can build in some simple "if, then" actions with things like Cisco's EEM technology but that is limited to things you anticipate and for which you have a relatively simple remedy at hand.
Any network management approach is only one leg of the "people, processes and tools" triad. Any one (or two) of those without the other(s) is not sufficient. Typically, the more a tool promises to do all of this, the higher its cost (both acquisition and ownership) and the more work it is to implement.
My advice is to focus equally on the people and processes bits and let the tool(s) play their supporting role. Tools are not a substitute for people and processes.

Similar Messages

Oracle Fault Management Framework - how to intercept first invocation

I know the Oracle Fault Management Framework is only activated on faults that occur on invokes.
Which is fine for all internal calls.
However, is there any way to get the framework to be activated on the very first invoke?
I want the framework to log all fault details and then rethrow. I'm using Oracle version 10.1.3.4.
However, if an external third party calls my process (say Process A) and an error occurs in Process A that is NOT from an invoke, e.g. an XPath error, the framework does NOT get activated.
Any ideas how to get around this, other than going into every BPEL process and adding a catchAll that logs and rethrows the fault?
Thanks.

There should be no difference between managing internal and external calls using the fault handler.
Let's just talk about process A as a BPEL process that uses fault management.
It should not matter where the services that process A calls reside. The fault management system will only get called when an error occurs on an invoke.
I don't agree with your comment that the framework will get called if a fault occurs anywhere within the domain. For example, if process A has an assign activity that fails because it can't convert a string into a date, the fault management system will not be called. So it makes no difference whether the services process A calls are internal or external.
What is your reasoning behind having a process that continues in the event of a fault, when the first fault could cause subsequent faults?
If no fault is managed either by the framework or by a catch, then the raw error will be returned to the consumer, so you will be able to debug that initial fault. Does this not fulfill your use case?
cheers
James

BPEL 10.1.3.5 Fault Management - Using Xpath in Fault Policy conditions

Hi all.
I have a requirement to use the XPath functions "contains" and "upper-case" inside a condition in a fault policy file. I've done some tests but haven't had any success so far.
My first test using the policy file was the following:
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
          xmlns:ebsv1="http://www.claro.com.br/EBS/Claro/v1"
          xmlns:ebov1="http://www.claro.com.br/EBO/Claro/v1"
          name="ebsv1:TechnicalFault">
     <condition>
     <test>$fault.TechnicalFault/ebsv1:TechnicalFault/ebov1:message = 'TE-0001'</test>
     <action ref="ora-human-intervention" />
     </condition>
</faultName>
In this first test I used a simple expression just to test the overall namespace declarations and xpath navigation. It worked as expected.
Second, I modified the test to use the "contains" function. I need this function because my message will eventually contain the value 'TE-0001' mixed with other text:
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
          xmlns:ebsv1="http://www.claro.com.br/EBS/Claro/v1"
          xmlns:ebov1="http://www.claro.com.br/EBO/Claro/v1"
          name="ebsv1:TechnicalFault">
     <condition>
     <test>contains($fault.TechnicalFault/ebsv1:TechnicalFault/ebov1:message, 'TE-0001')</test>
     <action ref="ora-human-intervention" />
     </condition>
</faultName>
In this second test I always get FALSE, which suggests the expression is not being evaluated the way I expect. I'm certain it should evaluate to TRUE, because the test scenario is the same as in the first test. Is something missing? I turned on debug-level logging in the domain but didn't find any hint about the fault management processing.
Besides the contains function, it would be nice if I could also use the upper-case function. Something like this:
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
          xmlns:ebsv1="http://www.claro.com.br/EBS/Claro/v1"
          xmlns:ebov1="http://www.claro.com.br/EBO/Claro/v1"
          xmlns:xp20="http://www.oracle.com/XSL/Transform/java/oracle.tip.pc.services.functions.Xpath20"
          name="ebsv1:TechnicalFault">
     <condition>
     <test>contains(xp20:upper-case($fault.TechnicalFault/ebsv1:TechnicalFault/ebov1:message), 'TE-0001')</test>
     <action ref="ora-human-intervention" />
     </condition>
</faultName>
Any ideas???
Thanks
Denis

Hi again.
Has anyone been able to use any kind of XPath function inside a policy file? If so, could you please share the code fragment, including the namespace declarations and the conditions?
Does anyone know whether the Fault Management Framework supports XPath functions at all?
Thanks.
Denis

10.1.3.3 fault management to simplify our complex error hospital pattern

We have designed an error hospital pattern on 10.1.3.1, this consists of :
1. A loop around key components of each process which may fail and need retry (e.g. invokes, database operations)
2. Within each error loop, a fault handler. It catches faults as they are thrown and passes them on to another BPEL error handling process (with a nested fault handler that writes the error to a file in case this invoke fails). The error handler can reply with retry (loop around again) or cancel (exit loop); if no response is received, an automated retry occurs
3. An error service process which creates a human task for an error to be dealt with
4. A web app which allows users to process the tasks created by the error service. The user can inspect data, see the fault detail and, mainly, select whether to cancel or retry the failed operation (including multi-select if, for example, hundreds of processes had hit the same error)
This has been refined and is now operating nicely in production, but the development (items 1 and 2 above) and testing effort involved is high, frequently more than for the actual business logic. In a 3GL this error handling would be simplified by having a nice reusable piece of code (a procedure or method call) that could go in the error handling loop, rather than repeating a whole set of logic (invoke, pick, handling of faults in the error handling code, etc.).
I'm aware of the fault management framework in 10.1.3.3, which covers a subset of our functionality, and we are now looking back at the error hospital we have created and wondering if we could simplify it. The main stumbling block I see with the fault management framework is that we have to choose between an automated retry after some interval and waiting for human action. We want both: wait for human action, but retry the operation if nobody acts on it. I'm thinking that we could achieve everything we have now with a slight manipulation of the fault management framework, as follows:
A. Define faults such that all faults cause human intervention.
B. Create a daemon process which scans for activities requiring human intervention and automatically triggers a retry after a configurable period (so that things like database errors will get auto-retried)
C. Adapt our web app to look for and operate on activities outstanding rather than workflow tasks.
Questions:
i) To support B/C above, is there a Java API this daemon can use, or is it a question of updating the RDBMS directly?
ii) Am I underestimating what can be done with the Java action fault policy?
iii) To make our existing approach easier, is there a way of doing an invoke and pick operation within an embedded Java routine, to minimise the amount of code we have in each error handling block?
Hope this makes sense, thanks in advance - there seem to be a lot of knowledgeable people out there in this forum.
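As a rough illustration of the selection step in item B, here is a minimal Python sketch. It assumes the daemon can already obtain a list of activities pending manual recovery together with the time each one entered that state; how that list is fetched and how a retry is triggered on the BPEL server is not shown and would depend on the API asked about in question i). The `PendingActivity` type and `due_for_auto_retry` function are hypothetical names, not part of any Oracle library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PendingActivity:
    """Hypothetical record of an activity awaiting human intervention."""
    instance_id: str
    activity_id: str
    pending_since: datetime


def due_for_auto_retry(pending, now, wait):
    """Return the activities that have waited for human action at least `wait`."""
    return [p for p in pending if now - p.pending_since >= wait]


# Example: with a one-hour grace period, only the older activity is selected.
now = datetime(2009, 5, 7, 12, 0)
pending = [
    PendingActivity("1001", "BpInv0", now - timedelta(hours=2)),
    PendingActivity("1002", "BpInv0", now - timedelta(minutes=10)),
]
due = due_for_auto_retry(pending, now, timedelta(hours=1))
```

Keeping the selection logic separate from the server calls means the grace period can be tested without a running BPEL domain.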

Hi,
The following blog posts might be interesting for you. They address your questions.
http://technology.amis.nl/blog/?p=2485
http://www.it-eye.nl/weblog/2007/09/10/oracle-bpel-10133-fault-policy-management/
Kind regards,
Harm

Fault Management Framework

Hi folks,
I am studying the Fault Management Framework that comes with BPEL.
It is a really nice feature: in a single place I can define a strategy for handling errors that occur in the invoke activity of any BPEL process.
I have a question about this: is there a similar tool or framework to handle, in a single place, errors that can occur in ANY activity of the BPEL process, not only in the invoke activity?
In other words, instead of using catchAll in all BPEL processes (and writing the same code in each of them), can I use the Fault Management Framework (or write some BPEL code) to catch errors like selectionFailure, joinFailure or others and associate a specific action with them?
If anybody has an idea, feel free to explain it!

No.
Marc
http://orasoa.blogspot.com

LMS 4.0 Fault Management Module alert doesn't show CurrentUtilization

Hi,
I would like to know if there's a way to show CurrentUtilization percentage within the messages generated by the Fault Management Module in LMS 4.0
EVENT ID                = 00008Z1
TIME                    = S
STATUS                  = Active
SEVERITY                = Critical
MANAGED OBJECT          = switch
MANAGED OBJECT TYPE     = Switches and Hubs
EVENT DESCRIPTION       = HighUtilization::Component=PORT-switch/11150 [Gi3/0/50] [---> TRUNK ];ComponentClass=Port;ComponentEventCode=1057;TrafficRate=5.2261432E7 BYPS;DuplexMode=FULLDUPLEX;UtilizationThreshold=40;MaxSpeed=1000000000;Type
NOTIFICATION ORIGINATOR = Fault Management Module
As you can see above CurrentUtilization percentage is not shown in Event Description section.
Could someone help?
Thank you!
Massimiliano.

To my knowledge, nothing can be configured here.
You have to take the traffic rate and max speed and do the calculation yourself.
FAD! functioning as designed
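The calculation can be sketched in Python as below. This assumes TrafficRate is reported in bytes per second (the "BYPS" suffix) and MaxSpeed in bits per second; the function names are illustrative and not part of LMS. Under those assumptions the sample event works out to roughly 41.8%, which is consistent with it having crossed the UtilizationThreshold of 40.

```python
def parse_event_fields(description):
    """Split a 'key=value;key=value' event description into a dict."""
    fields = {}
    for part in description.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            fields[key.strip()] = value.strip()
    return fields


def current_utilization_percent(description):
    """Compute utilization in percent from TrafficRate and MaxSpeed."""
    fields = parse_event_fields(description)
    # TrafficRate looks like "5.2261432E7 BYPS"; keep only the number.
    rate_bytes_per_sec = float(fields["TrafficRate"].split()[0])
    max_speed_bits_per_sec = float(fields["MaxSpeed"])
    # Assumption: bytes/s * 8 = bits/s, compared against MaxSpeed in bits/s.
    return rate_bytes_per_sec * 8 / max_speed_bits_per_sec * 100


SAMPLE = (
    "HighUtilization::Component=PORT-switch/11150 [Gi3/0/50] [---> TRUNK ];"
    "ComponentClass=Port;ComponentEventCode=1057;"
    "TrafficRate=5.2261432E7 BYPS;DuplexMode=FULLDUPLEX;"
    "UtilizationThreshold=40;MaxSpeed=1000000000"
)
utilization = current_utilization_percent(SAMPLE)
```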
Cheers,
Michel

LMS 4.2 Fault Manager Issue

Hi All,
We are seeing many Unidentified traps on the DFM for multiple devices.
274.              008ZETI      Active          InformAlarm                   Mumbai_6509-1          Mumbai_6509-1: Unidentified Trap Generic Trap:6 Specific Trap:1 EnterpriseOid:.1.3.6.1.4.1.9       04-Feb-2015 01:59:28                    NA
275.              008ZETG    Cleared      InformAlarm                   Mumbai_6509-1          Mumbai_6509-1: Unidentified Trap Generic Trap:6 Specific Trap:1 EnterpriseOid:.1.3.6.1.4.1.9       04-Feb-2015 01:59:03                    NA
276.              008ZETA     Active          InformAlarm                   Mumbai_6509-1          Mumbai_6509-1: Unidentified Trap Generic Trap:6 Specific Trap:2 EnterpriseOid:.1.3.6.1.4.1.9.9.109.2            04-Feb-2015 01:58:22                    NA
Regards,
Channa

Hi,
The unidentified trap message in fault manager is expected when LMS receives a trap that is not in the list of traps that the fault manager is capable of processing.
Here are the traps that fault manager can process:
http://www.cisco.com/c/en/us/td/docs/net_mgmt/ciscoworks_lan_management_solution/4-2/user/guide/lms_monitor/lms_mnt/TrapFwd.html
The SNMP traps are only processed by the fault manager and as per the document above the ones that will be identified are pre-defined and the list cannot be modified.
Clearing an Unidentified Trap
You can manually clear Unidentified Traps from LMS. To do this:
Step 1 Select the Unidentified Trap and click Clear.
A message appears prompting you to confirm the clearing.
Step 2 Enter your user ID.
This will be used as a reference to identify who cleared the Unidentified Trap.
Step 3 Click OK to confirm.
The Unidentified trap is cleared.
To retain the trap click Cancel.
- Ashok
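To see which traps are producing the unidentified entries before clearing them, the alarm text can be summarised with a small Python sketch like the one below. The regular expression is an assumption based only on the format of the lines quoted in this thread, and `count_unidentified_traps` is an illustrative helper, not an LMS tool.

```python
import re
from collections import Counter

# Pattern assumed from the alarm lines quoted above.
TRAP_PATTERN = re.compile(
    r"Generic Trap:(\d+)\s+Specific Trap:(\d+)\s+EnterpriseOid:(\S+)"
)


def count_unidentified_traps(lines):
    """Count alarms per (generic trap, specific trap, enterprise OID)."""
    counts = Counter()
    for line in lines:
        match = TRAP_PATTERN.search(line)
        if match:
            generic, specific, oid = match.groups()
            counts[(int(generic), int(specific), oid)] += 1
    return counts


sample_lines = [
    "Mumbai_6509-1: Unidentified Trap Generic Trap:6 Specific Trap:1 EnterpriseOid:.1.3.6.1.4.1.9 04-Feb-2015 01:59:28 NA",
    "Mumbai_6509-1: Unidentified Trap Generic Trap:6 Specific Trap:1 EnterpriseOid:.1.3.6.1.4.1.9 04-Feb-2015 01:59:03 NA",
    "Mumbai_6509-1: Unidentified Trap Generic Trap:6 Specific Trap:2 EnterpriseOid:.1.3.6.1.4.1.9.9.109.2 04-Feb-2015 01:58:22 NA",
]
trap_counts = count_unidentified_traps(sample_lines)
```

The resulting counts show which enterprise OIDs to look up on the device side if the traps should be stopped at the source.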

LMS 4.2.2 Fault manager does not resolve hostname for some devices

This is Cisco Prime LMS 4.2.2 on Windows 2008 R2
As far as I understand it, Fault Manager needs to be able to do a reverse lookup for IP addresses to show the correct name in the "device name" column. I have double and triple checked, and all devices that are shown only as an IP address do have a reverse record in the DNS used by the LMS server.
The device is correctly registered and inventory has been run. If I hold the mouse pointer over the crosshair on the row of the offending device, all info is shown, including the correct device name and FQDN.
The server was upgraded from 4.2.1 to 4.2.2, and we had the same problem before the upgrade.

The problem may occur if the lookup was not possible at the time the device was added to fault management.
Fault Manager, unlike the rest of LMS, does not update the display name afterwards.
If the resolver.pl in /opt/CSCOpx/bin is able to get the device name, then resolution itself is fine.
The only workarounds are to delete the device from LMS and re-add it, or to use the CLI tools on the server to remove the device from the DFM repository and re-add it.
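Independently of resolver.pl, reverse resolution from the server can be double-checked with a short Python sketch like this one. It uses the standard `socket.gethostbyaddr` call by default; `reverse_lookup_report` and its `resolver` parameter are illustrative names, and the addresses in the example are placeholders.

```python
import socket


def reverse_lookup_report(ip_addresses, resolver=None):
    """Map each IP address to its reverse-resolved name, or None on failure."""
    if resolver is None:
        # Default: ask the system resolver for the primary host name.
        resolver = lambda ip: socket.gethostbyaddr(ip)[0]
    report = {}
    for ip in ip_addresses:
        try:
            report[ip] = resolver(ip)
        except OSError:
            # socket.herror and socket.gaierror are subclasses of OSError.
            report[ip] = None
    return report


# Example with a stand-in resolver so the sketch runs without DNS access.
def fake_resolver(ip):
    names = {"192.0.2.10": "switch-a.example.net"}
    if ip not in names:
        raise socket.herror(1, "Unknown host")
    return names[ip]


report = reverse_lookup_report(["192.0.2.10", "192.0.2.11"], fake_resolver)
```

Any address that maps to None here would also be expected to show up as a bare IP address if it was added to fault management in that state.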
Cheers,
Michel

Fault management - ora-human-intervention - help

Folks,
I am playing with the Fault Management Framework and everything went fine so far. However, when I changed the action to 'ora-human-intervention', I am not seeing the desired output.
My version - 10.1.3.3
Here is my audit :
[2009/05/07 15:13:58] [FAULT RECOVERY] Marked Invoke activity as "pending manual recovery".
[2009/05/07 15:13:58] "{http://schemas.oracle.com/bpel/extension}bindingFault" has been thrown.More...
[2009/05/07 15:13:58] "BPELFault" has not been caught by a catch block.
[2009/05/07 15:13:58] BPEL process instance "3320518" cancelled
According to the audit, the invoke activity is marked as 'pending manual recovery', but the last line shows the process instance is cancelled. I believe the instance must not be cancelled at this stage, and it should appear under the Activities tab if this works correctly ...
I'm not sure if the problem is related, but I see the error below in the opmn log about the missing table 'wi_fault'.
The process domain was unable to update the fault entry for the activity "3320518-BpInv0-BpSeq0.3-2" from the datastore. The exception reported is: ORA-00942: table or view does not exist
Please check that the machine hosting the datasource is physically connected to the network. Otherwise, check that the datasource connection parameters (user/password) is currently valid.
sql statement: SELECT COUNT(*) FROM wi_fault WHERE cikey = ? AND node_id = ? AND scope_id = ? AND count_id = ?
Any guidance will be appreciated ...
Ron

The SQL error ORA-00942: table or view does not exist occurs because you haven't run the SQL scripts after performing the upgrade.
The scripts that need to be run can be found in
E:\Oracle\product\soa\10.1.3\bpel\system\database\scripts
Run the scripts that are relevant to your upgrade.
If you are running olite, the SQL prompt is found at
E:\Oracle\product\soa\10.1.3\bpel\bin\polsql.cmd
Fault management was introduced in 10.1.3.3 and requires these database changes.
Also make sure your JDev version is the same as your SOA Suite version.
cheers
James

10.1.3.3 fault management framework and catchall problem

Hi all,
I have a BPEL process with a catchAll exception block for the entire process (in the .bpel file). The error handling determines whether action needs to be taken on a given exception and, if so, notifies tech support by e-mail. This works fine.
However, it would be nice to be able to retry faulted instances if, for instance, a web service was temporarily down. Here the fault management framework comes to the rescue. I have created default policy and fault-binding files and set them up so that any error goes to human intervention. Apparently this worked: an instance pending manual recovery was created, and it could be retried from the BPEL console.
So here is my problem: the fault management framework apparently overrides the catchAll block in the BPEL process, so while a task for human intervention is created, the error block is not executed first. That means no e-mail notification is sent and no differentiation between exceptions takes place.
If I remove the default policy and fault-binding files, then the catchAll is used again.
This may be expected behaviour, but I had assumed that it is possible to use both a catchAll block in the BPEL code AND the generic fault management framework.
Does anybody have any ideas?
Your assistance will be much appreciated.
Regards,
Aagaard

Hi Aagaard,
The BPEL Error Hospital that Oracle introduced with SOA Suite 10.1.3.3 will prevent you from having to model the handling of Binding Faults or Runtime Faults in BPEL processes.
Hope in the future we will also be able to specify policies for handling custom faults that may occur anywhere in the BPEL process, i.e. not only on invoke activities.
This framework gives us the opportunity to handle all the business and runtime faults for an “invoke” activity. With the framework we can define one policy for every BPEL domain.
In bpel/domains/default/config/fault-policy-binding.xml, we can set up the policies we want to use and which processes, partner links and port types they apply to.
If there is already some fault handling (catch) defined in the BPEL process, the framework will overrule it and use a policy if possible.
Hope that answers your question!
Cheers
Anirudh Pucha

SOA 11g Fault Management Framework Issue

Hi,
I am using SOA 11.1.1.3.
I have a composite with a BPEL process. The BPEL process invokes an external web service.
I added fault-bindings.xml and fault-policies.xml to catch the remote fault.
When I turn off the web service, the BPEL process gets the remote fault, but the fault is not caught by the fault management framework.
Here are my fault-bindings.xml and fault-policies.xml, in that order:
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicyBindings version="0.0.1"
xmlns="http://schemas.oracle.com/bpel/faultpolicy"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<component faultPolicy="FusionMidFaults" >
</component>
</faultPolicyBindings>
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<faultPolicy version="0.0.1" id="FusionMidFaults"
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="http://schemas.oracle.com/bpel/faultpolicy"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Conditions>

<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:remoteFault">
<condition>
<action ref="BPELJavaAction"/>
</condition>
</faultName>
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:bindingFault">
<condition>
<action ref="BPELJavaAction"/>
</condition>
</faultName>
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:runtimeFault">
<condition>
<action ref="BPELJavaAction"/>
</condition>
</faultName>
</Conditions>
<Actions>

<Action id="default-terminate">
<abort/>
</Action>
<Action id="default-replay-scope">
<replayScope/>
</Action>
<Action id="default-rethrow-fault">
<rethrowFault/>
</Action>
<Action id="default-human-intervention">
<humanIntervention/>
</Action>

<Action id="BPELJavaAction">

<javaAction className="com.rubiconred.faultManagement.MyFaultPolicyJavaAction"
defaultAction="default-terminate">
<returnValue value="MANUAL" ref="default-human-intervention"/>
</javaAction>
</Action>
</Actions>
</faultPolicy>
</faultPolicies>

Hi, I've the same issue. I created the fault-bindings.xml and fault-policies.xml in the same directory as composite.xml. I throw a fault in the BPEL process, but it does not get caught by the fault policy mechanism.
fault-bindings.xml
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicyBindings version="2.0.1"
xmlns="http://schemas.oracle.com/bpel/faultpolicy">
<composite faultPolicy="MyCompositeFaultPolicy"/>
</faultPolicyBindings>
fault-policies.xml
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<faultPolicy version="2.0.1" id="MyCompositeFaultPolicy"
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="http://schemas.oracle.com/bpel/faultpolicy"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<Conditions>
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:assertFailure">
<condition>
<action ref="ora-terminate"/>
</condition>
</faultName>
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:runtimeFault">
<condition>
<action ref="ora-terminate"/>
</condition>
</faultName>
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:remoteFault">
<condition>
<action ref="ora-terminate"/>
</condition>
</faultName>
<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:bindingFault">
<condition>
<action ref="ora-terminate"/>
</condition>
</faultName>
</Conditions>
<Actions>
<Action id="ora-retry">
<retry>
<retryCount>3</retryCount>
<retryInterval>10800</retryInterval>
<retryFailureAction ref="ora-human-intervention"/>

</retry>
</Action>

<Action id="ora-rethrow-fault">
<rethrowFault/>
</Action>
<Action id="ora-human-intervention">
<humanIntervention/>
</Action>

<Action id="ora-terminate">
<abort/>
</Action>
</Actions>
</faultPolicy>
</faultPolicies>
I tried restarting the managed server, but it did not help. If you have any suggestions, please supply them.

How does BPEL Fault Management Framework gel with ESB Error Handling ?

I see that BPEL 10.1.3.3 has pretty neat Fault Management Framework (although I have to admit it is not very well advertised).
The next logical question is: what about ESB ? Would that help in ESB error handling ? I understand that ESB has its own Error Hospital etc.; however, we have to constantly grapple with two distinct paths for any piece of integration functionality (1. ESB 2. BPEL). I guess, all of this will be moot in the 11g timeframe. Still wondering if anyone out there has somehow unified error handling for these two distinct offerings ?

It's not available in ESB; you have to implement/extend that yourself. Of course, in the next release everything will be better :-)
But if you are able to use Oracle AIA (http://edelivery.oracle.com), you could use Oracle AIA Foundation, which has a fault 'hospital' implemented for both BPEL and ESB.
Marc
http://orasoa.blogspot.com

BPEL Fault Management and Notification

Hi peers - has anyone successfully implemented Email Notification as part of a Fault Management Framework approach to their BPEL projects - i.e implemented the necessary entries in a fault-policies.xml file to allow an email notification to be sent as part of standard procedure? I acknowledge the custom Java approach, but I don't want to reinvent the wheel if someone has done it already.
Any help much appreciated.
Dennis R

If anyone's interested - here is our solution. We've commented out some BPEL specific context handling as there is a problem with it at the moment - but this is a start.
package au.com.abcde.bpel;

// import com.collaxa.cube.engine.fp.BPELFaultRecoveryContextImpl;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;
import oracle.integration.platform.faultpolicy.IFaultRecoveryContext;
import oracle.integration.platform.faultpolicy.IFaultRecoveryJavaClass;

public class CustomFaultHandler implements IFaultRecoveryJavaClass {

    public void handleRetrySuccess(IFaultRecoveryContext iFaultRecoveryContext) {
        System.out.println("This is for retry success");
        handleFault(iFaultRecoveryContext);
    }

    public String handleFault(IFaultRecoveryContext iFaultRecoveryContext) {
        // The values come from the propertySet referenced in fault-policies.xml.
        Map faultProperties = iFaultRecoveryContext.getProperties();
        Properties properties = new Properties();
        properties.put("mail.smtp.host",
            ((List)faultProperties.get("host")).iterator().next().toString());
        properties.put("mail.smtp.port",
            ((List)faultProperties.get("port")).iterator().next().toString());
        Session session = Session.getDefaultInstance(properties, null);
        try {
            // BPELFaultRecoveryContextImpl bpelCtx =
            //     (BPELFaultRecoveryContextImpl)iFaultRecoveryContext;
            Message message = new MimeMessage(session);
            message.setFrom(new InternetAddress(((List)faultProperties.get("from")).iterator().next().toString()));
            message.setRecipient(Message.RecipientType.TO,
                new InternetAddress(((List)faultProperties.get("to")).iterator().next().toString()));
            message.setSubject(((List)faultProperties.get("subject")).iterator().next().toString());
            message.setText("\n" +
                "A BPEL Process Instance has faulted.\n" +
                "Check the Instances tab in the SOA Composite console in order to resolve the problem.\n" +
                // "BPEL Composite/Instance: " + bpelCtx.getCompositeName() +
                // "/" + bpelCtx.getComponentInstanceId() + ".\n" +
                "Fault with Partner Link: " +
                iFaultRecoveryContext.getReferenceName() + ".\n" +
                "Composite Fault Policy: " +
                iFaultRecoveryContext.getPolicyId() + ".\n" +
                "This message was automatically generated, please do not reply to it.");
            Transport.send(message);
        } catch (MessagingException e) {
            // A failed e-mail must not block the fault policy; log and continue.
            e.printStackTrace();
        }
        // "OK" maps to ora-human-intervention via returnValue in the policy.
        return "OK";
    }
}
Our fault-policies.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy">
<faultPolicy version="2.0.1" id="StandardFaults"
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="http://schemas.oracle.com/bpel/faultpolicy"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Conditions>

<faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
name="bpelx:bindingFault">
<condition>
<action ref="ora-retry"/>
</condition>
</faultName>

</Conditions>
<Actions>
<Action id="ora-retry">
<retry>
<retryCount>5</retryCount>
<retryInterval>4</retryInterval>
<exponentialBackoff/>
<retryFailureAction ref="java-fault-handler"/>
<retrySuccessAction ref="ora-terminate"/>
</retry>
</Action>

<Action id="java-fault-handler">
<javaAction className="au.com.abcde.bpel.CustomFaultHandler"
defaultAction="ora-human-intervention"
propertySet="properties">
<returnValue value="OK" ref="ora-human-intervention"/>
</javaAction>
</Action>
<Action id="ora-human-intervention">
<humanIntervention/>
</Action>
<Action id="ora-terminate">
<abort/>
</Action>
</Actions>
<Properties>
<propertySet name="properties">
<property name="from">[email protected]</property>
<property name="to">[email protected]</property>
<property name="subject">BPEL Problem Notification</property>
<property name="host">smtp.abcde.com.au</property>
<property name="port">25</property>
</propertySet>
</Properties>
</faultPolicy>
</faultPolicies>
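For a feel of how long the ora-retry action above keeps trying before it hands over to the Java handler, here is a small Python sketch of the retry schedule. It assumes that exponentialBackoff doubles the interval after each attempt, which is the commonly described behaviour but should be checked against the SOA Suite documentation for your version; `retry_delays` is an illustrative helper.

```python
def retry_delays(retry_count, retry_interval, exponential_backoff=True):
    """Return the wait in seconds before each retry attempt."""
    delays = []
    delay = retry_interval
    for _ in range(retry_count):
        delays.append(delay)
        if exponential_backoff:
            # Assumption: the interval doubles after every attempt.
            delay *= 2
    return delays


# Values taken from the policy above: retryCount=5, retryInterval=4.
schedule = retry_delays(5, 4)
total_wait = sum(schedule)
```

Under that assumption the five retries are spread over about two minutes, so the notification e-mail would go out roughly that long after the first failure.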

CiscoWorks-Device Fault Manager

I would like to install the HP-OV-Adapter from the Device Fault Manager, but then the CPU utilisation goes up to 100% and stays there. What am I doing wrong? Is there a workaround? I use HP OV 6.20 on Solaris 7/SPARC. Thanks for any help.

We are preparing to install the same software. You should check the server setup requirements for the Device Fault Manager. We found that it requires more RAM and Processor speed than previous versions. You may have to beef up your server.

Fault Manager Cause a Reboot?

Hi there. I get the idea that Predictive Self-Healing is supposed to prevent system failures, or at least gracefully handle them, but is there ever a condition where the fault manager will bring the system down as part of its fault handling?

Certainly not by design. The fault manager's role is strictly to propagate fault messages to one or more agents who receive and process them. The FMA architecture itself has no provision to act on those messages -- it just reports them.

Questioned status in Fault Management for Cisco Prime 4.2

Hi all,
I need help with Cisco Prime 4.2. My device is stuck in the Questioned state in Fault Management, even though it can be pinged from the server. I can already manage the device and have archived its configuration. The problem is that under Fault Monitoring Device Administration it stays in the Questioned state, even though I have tried to rediscover the device several times.
Do I need to configure something on the server, such as adding the IP address and hostname of the device to the hosts file of Windows Server 2008?
Thanks in advance for your help!

Hi ,
Is this happening for just one particular device or for all of them?
If it is a particular device, are you using SNMPv2 or SNMPv3 on it?
Disable the Windows Firewall and antivirus on the server and rediscover the device.
Thanks
Afroz
