Monitoring a Clustered Resource
Can you let me know if it's possible to monitor a service like SMTP on a
clustered resource? Ideally I'd like to know if SMTP has failed even if the
server is still up and running, but this resource could be on either of two
servers, even though its IP would always be the same.
I'm running a clustered environment and I'd like to check whether the GWIA is
running. I can monitor SMTP with Zen for Servers and tell it to let me know
if it stops working. The problem is that the GWIA is not on the physical IP
address; it's a virtual resource mapped to a secondary IP. I can use
the DB editor to add the secondary IP, but it asks me for a MAC as well, and
the MAC will of course change if the resource fails over to another
node.
Any ideas??
Tony,
It appears that in the past few days you have not received a response to your posting. That concerns us, and has triggered this automated reply.
Has your problem been resolved? If not, you might try one of the following options:
- Do a search of our knowledgebase at http://support.novell.com/search/kb_index.jsp
- Check all of the other support tools and options available at http://support.novell.com in both the "free product support" and "paid product support" drop down boxes.
- You could also try posting your message again. Make sure it is posted in the correct newsgroup. (http://support.novell.com/forums)
If this is a reply to a duplicate posting, please ignore and accept our apologies and rest assured we will issue a stern reprimand to our posting bot.
Good luck!
Your Novell Product Support Forums Team
http://support.novell.com/forums/
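One way to sidestep the MAC question is to probe the service on its secondary IP directly from a monitoring host, since the IP follows the resource when it fails over. The following is only a minimal Python sketch of that idea, not a ZENworks feature. It assumes the GWIA answers SMTP on port 25 of the secondary IP; `probe_smtp` is a hypothetical helper name.

```python
import socket

def probe_smtp(host, port=25, timeout=5.0):
    """Return (ok, detail); ok is True when the server greets with a 220 banner."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            # An SMTP server speaks first, so just read its greeting line.
            banner = conn.recv(512).decode("ascii", errors="replace").strip()
    except OSError as exc:  # refused, timed out, unreachable, ...
        return False, f"connect/read failed: {exc}"
    if banner.startswith("220"):
        return True, banner
    return False, f"unexpected greeting: {banner!r}"
```

Called as `probe_smtp("<secondary IP>")` from a scheduler, it reports SMTP as down even when the node itself still answers pings, and it does not care which node currently hosts the resource.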
Similar Messages
-
Server Monitoring with clustered instances
Is anyone using the server monitor or multiserver monitor with
clustered instances of ColdFusion? In CF 8.0.1 on Solaris, enabling
monitoring produces a vast number of repeated errors of the form
included below. This occurs on both clustered instances, as the
instances are set up to replicate session data using J2EE session
variables. The monitoring appears to work, but the frequency of the
errors produced in the output log of *BOTH* cluster instances
is extensive. These errors do not occur when monitoring the
"cfusion" admin instance. Is this a product issue or a
configuration issue?
MM/DD HH:MM:SS error Setup of session replication failed.
[2]java.io.StreamCorruptedException: unexpected end of block data
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1945)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1869)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at java.util.Hashtable.readObject(Hashtable.java:859)
...
Dear Jan,
I have already added the plugin, but while adding the target I get the error below. Can you please give me some idea about this?
Test Connection failed: [_WinAuthDLLToLoadDynamicProp;em_error=DLL file 'D:\12c_agent\plugins\oracle.em.smss.agent.plugin_12.1.0.2.0\scripts\emx\microsoft_sqlserver_database..\..\..\..\dependencies\oracle.em.smss\jdbcdriver\sqljdbc_auth.dll' is found missing or not was never copied manually. Please copy amd64 version of sqljdbc_auth.dll at the above location and re-try, MSSQL_NumClusterNodes;Can't resolve a non-optional query descriptor property [dllFile] (dllFile), WbemRemote_Determination_DynamicProperty;Can't resolve a non-optional query descriptor property [dllFile] (dllFile), MSSQLInstance_TestMetric_DynamicProperty;Can't resolve a non-optional query descriptor property [dllFile] (dllFile), OSType_TargetHost_DynamicProperty;Can't resolve a non-optional query descriptor property [STDINWBEM_HOST] (ms_sqlserver_host), MSSQL_NumClusterNodes;Can't resolve a non-optional query descriptor property [dllFile] (dllFile)] -
Customer would like to have details on Clustering resources usage
I have a customer (IHAC) that has implemented an application using 6 WLS instances in 6 domains (!) on 2 physical servers. That is, they have 6 admin servers with a single managed server each. They did so to implement a highly available system that works like a cluster.
When I told them that a better configuration would be 1 admin server and 6 managed servers using the standard WLS clustering features, they claimed that it would consume more resources than the current configuration.
Is there any document that describes the best way to implement clustering, a benchmark that shows the difference in system usage with and without a cluster, and the pros and cons of a clustered architecture?
I cannot find this information in the standard WLS docs.
They have WLS 9.2.
Thank you !
Chiara
(ACS Service Manager)
The following doc tells more about why one should cluster.
http://download-llnw.oracle.com/docs/cd/E13222_01/wls/docs92/cluster/overview.html#wp1011562
Scalability
High-Availability
Application Failover
Load Balancing
If your customer doesn't need the benefits mentioned in the above doc, then there's no need for them to cluster their app.
I agree with your customer that clustering adds resource overhead. Most customers in production environments choose high availability and load balancing over avoiding the additional resource consumption at runtime.
By following the performance tuning best practices and understanding the customer's application, one can decrease the overhead of clustering to some extent. -
Alert monitor PP/DS resource alert default priority error
Hi Guys,
I am seeing the DP (Default Priority) column showing the Error label in the PP/DS alert profile against:
1. Overload on single activity resource
2. Overload on Multi-activity resource
3. Characteristics mismatch on resource
Where can I set this right? I ask because this leads to capacity overloads not being displayed even though the resource is actually overloaded.
Thanks in advance
Regards
Prasanna.
Hi Prasanna,
Threshold limits for resource overload alerts are specified in the Alert Monitor profile for PPDS, not in the resource master. In the resource master you define the bucket capacity for the resource.
In the Alert Monitor profile, you specify 3 threshold values, namely information, warning and error. I suggest you take the steps below to see whether the alert profile is really working:
1. Identify a test resource
2. Define the block capacity as per your requirement.
3. Create orders on the test resource such that it becomes overloaded.
4. In the Alert Monitor, define a new PPDS alert profile, keeping the points below in mind:
a) Select "Resource Utilization in Bucket(PPDS)"
b) Define the threshold values for the above three types of alerts
c) Fill the selection of resources tab with the test resource and test location
d) DO NOT MAKE ANY OTHER SELECTION
e) Save the profile
f) Redetermine the alerts
It should display all the alerts, if any, on the above test resource. One point to note here is that if the alert type in the DP column is ERROR (i.e. red), then only the values that are above the threshold value for ERROR will be populated.
Please take note of the planning version in which you are testing the scenario; it should be the same in the Alert Monitor profile as well.
Please let me know if you are still not able to see the alerts.
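To make the three-threshold idea concrete, here is a small Python sketch of how a utilization value could map to an alert level. It illustrates the concept only and is not how the Alert Monitor is implemented; the function name, the strict "greater than" comparison and the example percentages are all assumptions.

```python
def classify_utilization(utilization_pct, info_at, warning_at, error_at):
    """Return the highest alert level whose threshold is exceeded, or None."""
    if not info_at <= warning_at <= error_at:
        raise ValueError("thresholds must be ordered: info <= warning <= error")
    if utilization_pct > error_at:
        return "error"
    if utilization_pct > warning_at:
        return "warning"
    if utilization_pct > info_at:
        return "information"
    return None  # below every threshold: no alert
```

With hypothetical thresholds of 90 / 100 / 120 percent, a bucket loaded to 105 percent would be a warning, so a profile that shows only ERROR alerts would not list it.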
Regards,
Binod -
Hi,
I am looking for a way to get a report on the number of users who opened a resource for reading.
The built-in reports I found only monitor changes to files, not accesses that don't change the resource.
Any ideas?
You can write a Namespace filter that monitors a particular repository. This filter will then write the user name and the resource they accessed to the database.
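Once such a filter has recorded (user, resource) pairs, the report itself is a count of distinct users per resource. A minimal Python sketch of that aggregation follows; the record layout and the function name are hypothetical, since the thread does not say how the filter stores its rows.

```python
from collections import defaultdict

def readers_per_resource(access_records):
    """Count distinct users who opened each resource for reading.

    access_records: iterable of (user_name, resource_path) tuples, i.e. the
    hypothetical rows a read-access filter would have logged.
    """
    users = defaultdict(set)
    for user_name, resource_path in access_records:
        users[resource_path].add(user_name)  # a set ignores repeat reads
    return {path: len(names) for path, names in users.items()}
```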
-
Transaction for monitoring memory and resources on BW server
hi experts
I have the task of determining what happened within a few minutes of a specific hour, because some process or program (we don't know which) exhausted memory and resources on the BW server. I checked in SM37 whether any job was executed, but none was. I need a transaction for monitoring / checking memory and resources.
I would appreciate any helpful information.
Regards
mgg
Hi
You can use ST03 / ST03N; also check ST04 and ST06. If you have access to the Service Marketplace, please check SAP Note 618868 (FAQ: Oracle Performance) and related notes for more help.
Hope this helps
Assign points if useful
Regards -
Linux/Unix Server Monitoring in Same Resource Pool as Windows Monitoring
Hi
We are planning to deploy OpsMgr 2012 R2 for our single-site infrastructure. We wish to monitor Windows, Linux/Unix and network devices. I know that separate resource pools are recommended for monitoring network devices, but can we have a single resource pool that monitors Windows as well as Linux/Unix systems?
If the same resource pool can be used for Windows as well as Linux/Unix systems, what are the limitations of doing this vs. using separate resource pools for them?
Thanks
Taranjeet Singh
zamn
Hi Taranjeet, In answer to your questions:
1. Having separate resource pools (in case of Windows and Linux systems) and single Management Server in each resource pool
A: Dedicated specific management servers for specific servers to be monitored from. (ie isolation of environments like test dev prod)
2. Having single resource pool with multiple Management Servers managing Windows as well as Linux systems.
A: High availability (HA).
Cheers,
Martin
Blog:
http://sustaslog.wordpress.com
LinkedIn:
Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose. -
Monitor historical session resources with sql statements
Hi,
I would like to write some queries to retrieve the following data from my database:
For a certain day, give me the total
physical read total bytes
physical write total bytes
cpu usage?
from my sessions in the database. I have specific groups I want to monitor, certain logon users that run the same batch program.
I guess the above stats are waits, although I'm not sure about CPU usage.
Can I monitor this with dba_hist_active_sess_history? How do I translate the waits named above to the dba_hist_active_sess_history columns? I know this view only holds data for about 7 days, but that is fine.
Kind regards,
Nico
I am almost there now, but this only works on Oracle 10.2.
On my Oracle 10.1 databases the wait_class and session state fields do not seem to be defined in:
- v$active_session_history
- dba_hist_active_sess_history
Any ideas where I could find these in 10.1?
regards
SELECT * FROM (
select
--ash.session_id,
u.username,
ash.program,
ash.MODULE,
ash.session_state,
Nvl(ash.wait_class,'Other') wait_class,
sum(decode(ash.session_state,'ON CPU',1,0)) "CPU",
sum(decode(ash.session_state,'WAITING',1,0)) -
sum(decode(ash.session_state,'WAITING',
decode(ash.wait_class,'User I/O',1, 0 ), 0)) "WAITING" ,
sum(decode(ash.session_state,'WAITING',
decode(ash.wait_class,'User I/O',1, 0 ), 0)) "IO"
from v$active_session_history ash,
--dba_hist_active_sess_history ash,
dba_users u
WHERE u.user_id = ash.user_id
group by username,program,MODULE,session_state,wait_class
order by cpu DESC
)
-
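The query above counts samples, and each ASH sample stands for a slice of time: v$active_session_history samples active sessions about once per second, while dba_hist_active_sess_history keeps roughly one of every ten of those samples. The Python sketch below mirrors the query's decode logic on already-fetched rows and scales the counts to approximate seconds. The tuple layout and function name are assumptions for illustration. It estimates time only, not physical read or write bytes.

```python
from collections import Counter

def summarize_ash(samples, seconds_per_sample=1):
    """Bucket ASH samples into approximate CPU / IO / WAITING seconds per user.

    samples: iterable of (username, session_state, wait_class) tuples.
    seconds_per_sample: 1 for v$active_session_history, 10 for
    dba_hist_active_sess_history.
    """
    totals = {}
    for username, session_state, wait_class in samples:
        bucket = totals.setdefault(username, Counter())
        if session_state == "ON CPU":
            bucket["CPU"] += seconds_per_sample
        elif session_state == "WAITING" and wait_class == "User I/O":
            bucket["IO"] += seconds_per_sample
        elif session_state == "WAITING":
            bucket["WAITING"] += seconds_per_sample  # every non-I/O wait
    return {user: dict(counts) for user, counts in totals.items()}
```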
Errors when adding resources to rg in zone cluster
Hi guys,
I managed to create and bring up a zone cluster, create a resource group (rg) and add a HAStoragePlus resource (zpool), but I get errors when I try to add a logical hostname (lh) resource. Here's the output I find relevant:
root@node1:~# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
rpool 24.6G 10.0G 14.6G 40% 1.00x ONLINE -
zclusterpool 187M 98.5K 187M 0% 1.00x ONLINE -
root@node1:~# clzonecluster show ztestcluster
=== Zone Clusters ===
Zone Cluster Name: ztestcluster
zonename: ztestcluster
zonepath: /zcluster/ztestcluster
autoboot: TRUE
brand: solaris
bootargs: <NULL>
pool: <NULL>
limitpriv: <NULL>
scheduling-class: <NULL>
ip-type: shared
enable_priv_net: TRUE
resource_security: SECURE
--- Solaris Resources for ztestcluster ---
Resource Name: net
address: 192.168.10.55
physical: auto
Resource Name: dataset
name: zclusterpool
--- Zone Cluster Nodes for ztestcluster ---
Node Name: node2
physical-host: node2
hostname: zclnode2
--- Solaris Resources for node2 ---
Node Name: node1
physical-host: node1
hostname: zclnode1
--- Solaris Resources for node1 ---
Now I want to add a lh (zclusterip - 192.168.10.55) to a resource group named z-test-rg.
root@zclnode2:~# cat /etc/hosts
# Copyright 2009 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
# Internet host table
::1 localhost
127.0.0.1 localhost loghost
#zone cluster
192.168.10.51 zclnode1
192.168.10.52 zclnode2
192.168.10.55 zclusterip
root@zclnode2:~# cluster status
=== Cluster Resource Groups ===
Group Name Node Name Suspended State
z-test-rg zclnode1 No Online
zclnode2 No Offline
=== Cluster Resources ===
Resource Name Node Name State Status Message
zclusterpool-rs zclnode1 Online Online
zclnode2 Offline Offline
root@zclnode2:~# clrg show
=== Resource Groups and Resources ===
Resource Group: z-test-rg
RG_description: <NULL>
RG_mode: Failover
RG_state: Managed
Failback: False
Nodelist: zclnode1 zclnode2
--- Resources for Group z-test-rg ---
Resource: zclusterpool-rs
Type: SUNW.HAStoragePlus:10
Type_version: 10
Group: z-test-rg
R_description:
Resource_project_name: default
Enabled{zclnode1}: True
Enabled{zclnode2}: True
Monitored{zclnode1}: True
Monitored{zclnode2}: True
The error, for lh resource:
root@zclnode2:~# clrslh create -g z-test-rg -h zclusterip zclusterip-rs
clrslh: No IPMP group on zclnode1 matches prefix and IP version for zclusterip
Any ideas?
Much appreciated!
Hello,
First of all, I found a mistake in my previous config: instead of an IPMP group, a "simple" NIC had been added to the cluster. I rectified that (I created the zclusteripmp0 IPMP group out of net11):
root@node1:~# ipadm
NAME CLASS/TYPE STATE UNDER ADDR
clprivnet0 ip ok -- --
clprivnet0/? static ok -- 172.16.3.66/26
clprivnet0/? static ok -- 172.16.2.2/24
lo0 loopback ok -- --
lo0/v4 static ok -- 127.0.0.1/8
lo0/v6 static ok -- ::1/128
lo0/zoneadmd-v4 static ok -- 127.0.0.1/8
lo0/zoneadmd-v6 static ok -- ::1/128
net0 ip ok sc_ipmp0 --
net1 ip ok sc_ipmp1 --
net2 ip ok -- --
net2/? static ok -- 172.16.0.66/26
net3 ip ok -- --
net3/? static ok -- 172.16.0.130/26
net4 ip ok sc_ipmp2 --
net5 ip ok sc_ipmp2 --
net11 ip ok zclusteripmp0 --
sc_ipmp0 ipmp ok -- --
sc_ipmp0/out dhcp ok -- 192.168.1.3/24
sc_ipmp1 ipmp ok -- --
sc_ipmp1/static1 static ok -- 192.168.10.11/24
sc_ipmp2 ipmp ok -- --
sc_ipmp2/static1 static ok -- 192.168.30.11/24
sc_ipmp2/static2 static ok -- 192.168.30.12/24
zclusteripmp0 ipmp ok -- --
zclusteripmp0/zoneadmd-v4 static ok -- 192.168.10.51/24
root@node1:~# clzonecluster export ztestcluster
create -b
set zonepath=/zcluster/ztestcluster
set brand=solaris
set autoboot=true
set enable_priv_net=true
set ip-type=shared
add net
set address=192.168.10.55
set physical=auto
end
add dataset
set name=zclusterpool
end
add attr
set name=cluster
set type=boolean
set value=true
end
add node
set physical-host=node2
set hostname=zclnode2
add net
set address=192.168.10.52
set physical=zclusteripmp0
end
end
add node
set physical-host=node1
set hostname=zclnode1
add net
set address=192.168.10.51
set physical=zclusteripmp0
end
end
And then I tried again to add the lh, but I get the same error:
root@node2:~# zlogin -C ztestcluster
[Connected to zone 'ztestcluster' console]
zclnode2 console login: root
Password:
Last login: Mon Jan 19 15:28:28 on console
Jan 19 19:17:24 zclnode2 login: ROOT LOGIN /dev/console
Oracle Corporation SunOS 5.11 11.2 June 2014
root@zclnode2:~# ipadm
NAME CLASS/TYPE STATE UNDER ADDR
clprivnet0 ip ok -- --
clprivnet0/? inherited ok -- 172.16.3.65/26
lo0 loopback ok -- --
lo0/? inherited ok -- 127.0.0.1/8
lo0/? inherited ok -- ::1/128
zclusteripmp0 ipmp ok -- --
zclusteripmp0/? inherited ok -- 192.168.10.52/24
root@zclnode2:~# cluster status
=== Cluster Resource Groups ===
Group Name Node Name Suspended State
z-test-rg zclnode1 No Offline
zclnode2 No Online
=== Cluster Resources ===
Resource Name Node Name State Status Message
zclusterpool-rs zclnode1 Offline Offline
zclnode2 Online Online
root@zclnode2:~# ipadm
NAME CLASS/TYPE STATE UNDER ADDR
clprivnet0 ip ok -- --
clprivnet0/? inherited ok -- 172.16.3.65/26
lo0 loopback ok -- --
lo0/? inherited ok -- 127.0.0.1/8
lo0/? inherited ok -- ::1/128
zclusteripmp0 ipmp ok -- --
zclusteripmp0/? inherited ok -- 192.168.10.52/24
root@zclnode2:~# clreslogicalhostname create -g z-test-rg -h zclusterip zclusterip-rs
clreslogicalhostname: No IPMP group on zclnode1 matches prefix and IP version for zclusterip
root@zclnode2:~#
To answer your first question, yes - all global nodes and zone cluster nodes have entries for zclusterip:
root@zclnode2:~# cat /etc/hosts
# Copyright 2009 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
# Internet host table
::1 localhost
127.0.0.1 localhost loghost
#zone cluster
192.168.10.51 zclnode1
192.168.10.52 zclnode2
192.168.10.55 zclusterip
root@zclnode2:~# ping zclnode1
zclnode1 is alive
When I tried the command you mentioned, it first gave me an error (there was a space between the interfaces); then I changed the resource group to match mine (z-test-rg) and it (partially) worked:
root@zclnode2:~# clrs create -g z-test-rg -t LogicalHostname -p Netiflist=sc_ipmp0@1,sc_ipmp0@2 -p Hostnamelist=zclusterip zclusterip-rs
root@zclnode2:~# clrg show
=== Resource Groups and Resources ===
Resource Group: z-test-rg
RG_description: <NULL>
RG_mode: Failover
RG_state: Managed
Failback: False
Nodelist: zclnode1 zclnode2
--- Resources for Group z-test-rg ---
Resource: zclusterpool-rs
Type: SUNW.HAStoragePlus:10
Type_version: 10
Group: z-test-rg
R_description:
Resource_project_name: default
Enabled{zclnode1}: True
Enabled{zclnode2}: True
Monitored{zclnode1}: True
Monitored{zclnode2}: True
Resource: zclusterip-rs
Type: SUNW.LogicalHostname:5
Type_version: 5
Group: z-test-rg
R_description:
Resource_project_name: default
Enabled{zclnode1}: True
Enabled{zclnode2}: True
Monitored{zclnode1}: True
Monitored{zclnode2}: True
root@zclnode2:~# cluster status
=== Cluster Resource Groups ===
Group Name Node Name Suspended State
z-test-rg zclnode1 No Offline
zclnode2 No Online
=== Cluster Resources ===
Resource Name Node Name State Status Message
zclusterip-rs zclnode1 Offline Offline
zclnode2 Online Online - LogicalHostname online.
zclusterpool-rs zclnode1 Offline Offline
zclnode2 Online Online
root@zclnode2:~# ipadm
NAME CLASS/TYPE STATE UNDER ADDR
clprivnet0 ip ok -- --
clprivnet0/? inherited ok -- 172.16.3.65/26
lo0 loopback ok -- --
lo0/? inherited ok -- 127.0.0.1/8
lo0/? inherited ok -- ::1/128
sc_ipmp0 ipmp ok -- --
sc_ipmp0/? inherited ok -- 192.168.10.55/24
zclusteripmp0 ipmp ok -- --
zclusteripmp0/? inherited ok -- 192.168.10.52/24
root@zclnode2:~# ping zclusterip
zclusterip is alive
root@zclnode2:~# clrg switch -n zclnode1 z-test-rg
root@zclnode2:~# cluster status
=== Cluster Resource Groups ===
Group Name Node Name Suspended State
z-test-rg zclnode1 No Online
zclnode2 No Offline
=== Cluster Resources ===
Resource Name Node Name State Status Message
zclusterip-rs zclnode1 Online Online - LogicalHostname online.
zclnode2 Offline Offline - LogicalHostname offline.
zclusterpool-rs zclnode1 Online Online
zclnode2 Offline Offline
root@zclnode2:~# ping zclusterip
no answer from zclusterip
root@zclnode2:~# ping zclusterip
no answer from zclusterip
root@zclnode2:~#
So, the lh was added and the rg can switch over to the other node, but zclusterip is pingable only from the cluster zone that holds the rg; I cannot ping zclusterip from the cluster zone that does not hold the rg, nor from any global cluster node (node1, node2)... -
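The error text "No IPMP group ... matches prefix and IP version" suggests a subnet comparison between the logical host's address and the addresses plumbed on each IPMP group. The Python sketch below shows what such a comparison looks like. It mirrors the wording of the message only, not Oracle Solaris Cluster's actual check, and the function name and input format are assumptions.

```python
import ipaddress

def matching_groups(logical_ip, group_addresses):
    """Return names of groups with an address whose subnet and IP version cover logical_ip.

    group_addresses: dict mapping a group name to a list of "addr/prefixlen"
    strings, as they appear in the ADDR column of ipadm output.
    """
    target = ipaddress.ip_address(logical_ip)
    matches = []
    for name, cidrs in group_addresses.items():
        for cidr in cidrs:
            network = ipaddress.ip_interface(cidr).network
            if network.version == target.version and target in network:
                matches.append(name)
                break  # one matching address per group is enough
    return matches
```

Fed with the addresses shown for zclnode2 above, 192.168.10.55 falls inside zclusteripmp0's 192.168.10.52/24. The error names zclnode1, though, so zclnode1's own `ipadm` output is the input worth checking the same way.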
Monitoring Team Foundation Server
Hello,
We have a two-server deployment of Team Foundation Server 2013, with the database tier on one server and the application tier on another.
My organization is looking at options to effectively manage and maintain both servers, the applications and the database.
I suggested using SCOM to monitor resources, services, etc. We are also looking at options like activebatch, a job scheduler application. activebatch is under consideration because we already have a license for the product.
Using activebatch, we would schedule jobs (written in PowerShell, SQL Server and whatever else may be required) that monitor certain parameters/resources of the servers/application/database. Please suggest which parameters should be considered for monitoring, and what needs to be monitored.
I understand that SCOM provides out-of-the-box management packs for this; I just need an answer as to why SCOM is better than scheduling hand-crafted jobs for monitoring.
I am in favour of SCOM, but the organization requires an explanation of why not activebatch.
Thanks in advance!!
Best Regards,
Yogesh
Hi Yogesh,
Based on your description, this doesn't seem to be a TFS question as such. Which option you use to monitor the TFS servers is your own decision.
You can give your organization more information about the features and benefits of SCOM. Please refer to the links below for more info:
http://en.wikipedia.org/wiki/System_Center_Operations_Manager
http://www.systemcentercentral.com/the-top-5-benefits-of-combining-operations-manager-and-sharepoint-scom-sysctr-sharepoint/
You can also consider integrating SCOM with activebatch; check this page for more information.
Best regards,
-
Error while adding oracle database resource to the fail safe group
Hi,
We are installing ERP 6.0 EHP4 with Oracle 10.2.04 on MSCS.
During the step "Adding the Oracle Database Resource to the Fail Safe
Group", I get the following error:
28 13:21:57 ** WARNING : FS-10288: Parameter file C:\oracle\BCP\102\database\init<SID>_OFS.ora is not located on a cluster disk
29 13:21:57 ** WARNING : FS-10404: The database uses a nonclustered disk in one of the system parameters. Value of parameter is C:\ORACLE\<SID>\102\RDBMS\AUDIT
30 13:21:58 ** ERROR : FS-10036: The resource uses disk SAP HDD, which is also used by cluster resource SAP VIP in another group
31 13:21:58 ** ERROR : FS-10778: The Oracle Database resource provider failed to configure the cluster resource <SID>.WORLD
32 13:21:58 ** ERROR : FS-10890: Oracle Services for MSCS failed during the add operation
33 13:21:58 ** ERROR : FS-10497: Starting clusterwide rollback of the operation
34 13:21:58 FS-10488:<primary node name> : Starting rollback of operation
35 13:21:58 > FS-10090: Rolling back Oracle Net changes on node <primary node name>
I have one local disk (C:), one shared disk (Z:) and a quorum disk (Q:).
The shared disk Z: is already used for the SAP<sid> group.
Regards,
Joel
Hi Joel,
how about following the advice given by the error message and moving the mentioned files/folder (init<sid>_OFS.ora, AUDIT folder) to a clustered resource disk?
just my 2 pence... -
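Before retrying the add operation, it can help to list which of the files named in the warnings still sit on a non-clustered drive. The Python sketch below does that by drive letter only. Which drives count as clustered for the Fail Safe group is an assumption you supply, and the helper name is hypothetical.

```python
import ntpath

def files_off_cluster_disks(paths, cluster_drives):
    """Return the paths whose drive letter is not one of the clustered drives."""
    allowed = {d.rstrip(":\\").upper() for d in cluster_drives}
    offenders = []
    for path in paths:
        drive, _ = ntpath.splitdrive(path)  # "C:\\x\\y" -> ("C:", "\\x\\y")
        if drive.rstrip(":").upper() not in allowed:
            offenders.append(path)
    return offenders
```

With the two paths from the FS-10288 and FS-10404 warnings and `["Z:"]` as the clustered drive, both paths are reported, because both are on C:.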
Issue in Global Services Monitor GSM
Dear all,
I'm facing a problem in Global Services Monitor.
The Resources pool contains 4 Management Servers; 2 old and 2 recently installed.
The GSM was installed and was working normally on the old Management Servers.
But after increasing the number of management servers from two to four, the problem appeared.
GSM is firing alerts on the new management servers and their state is critical (the 2 old servers are healthy).
The alert description is as follows:
Global Service Monitor Modules: Failed to discover Global Service Monitor locations.
Failure step: 'Couldn't get the ACS endpoint from discovery service. SubscriptionId: 'a6846da0-e5d7-4bea-ab13-836d89364b60', OutsideInServiceBaseUri: 'https://gsm-prod.systemcenter.microsoft.com/''
Message: 'Could not establish trust relationship for the SSL/TLS secure channel with authority 'gsm-prod.systemcenter.microsoft.com'.'
Details: 'System.ServiceModel.Security.SecurityNegotiationException: Could not establish trust relationship for the SSL/TLS secure channel with authority 'gsm-prod.systemcenter.microsoft.com'. ---> System.Net.WebException: The underlying connection was closed:
Could not establish trust relationship for the SSL/TLS secure channel. ---> System.Security.Authentication.AuthenticationException: The remote certificate is invalid according to the validation procedure.
at System.Net.Security.SslState.StartSendAuthResetSignal(ProtocolToken message, AsyncProtocolRequest asyncRequest, Exception exception)
at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Net.TlsStream.ProcessAuthentication(LazyAsyncResult result)
at System.Net.TlsStream.Write(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.PooledStream.Write(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.ConnectStream.WriteHeaders(Boolean async)
--- End of inner exception stack trace ---
at System.Net.HttpWebRequest.GetResponse()
at System.ServiceModel.Channels.HttpChannelFactory`1.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout)
--- End of inner exception stack trace ---
Server stack trace:
at System.ServiceModel.Channels.HttpChannelUtilities.ProcessGetResponseWebException(WebException webException, HttpWebRequest request, HttpAbortReason abortReason)
at System.ServiceModel.Channels.HttpChannelFactory`1.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout)
at System.ServiceModel.Channels.RequestChannel.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Microsoft.SystemCenter.Cloud.SharedLibrary.Discovery.IDiscovery.GetEndpoints(String subscriptionId)
at Microsoft.SystemCenter.Cloud.SharedLibrary.Discovery.DiscoveryHelper.<>c__DisplayClass1.<DiscoverAcsEndpoint>b__0(IDiscovery service)
at Microsoft.SystemCenter.Cloud.SharedLibrary.RestCallHelper.ExecuteRestCall[TContract](Uri endpointUri, WebProxy webProxy, String accessToken, RestMethod`1 method)
at Microsoft.SystemCenter.Cloud.SharedLibrary.Discovery.DiscoveryHelper.DiscoverAcsEndpoint(String subscriptionId, Uri outsideInServiceBaseUri, WebProxy proxy)
at Microsoft.SystemCenter.Cloud.OutsideInUnitModule.DiscoveryWriteActionModule.Execute()'
Any clue?
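Since only the new management servers fail, one quick comparison is to run the same TLS handshake from an old and a new server and see where it breaks. The Python sketch below classifies the attempt as a DNS failure, a certificate validation failure or a plain connection failure. It is a rough indication only: Python's trust-store handling may differ from .NET's, and `check_tls` is a hypothetical helper, not part of SCOM or GSM.

```python
import socket
import ssl

def check_tls(host, port=443, timeout=5.0):
    """Classify a TLS connection attempt: 'ok', 'dns', 'certificate' or 'connect'."""
    context = ssl.create_default_context()  # verifies chain and host name
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return "ok"
    except socket.gaierror:
        return "dns"          # the name could not be resolved
    except ssl.SSLCertVerificationError:
        return "certificate"  # chain or host name failed validation
    except OSError:
        return "connect"      # refused, timed out, or another socket/TLS failure
```

Running `check_tls("gsm-prod.systemcenter.microsoft.com")` on each server would show whether the servers behave differently for that endpoint.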
Regards,
Khaled A. Hamad
Hi
Is there any requirement to install the Microsoft Root Certificate on the server where the SCOM console is running? Do I also need to purchase a Windows Azure subscription for GSM? Please let me know.
The scenario is this: I have one SCOM server (with all the server roles on the single server) and another server where the VMM server and SCOM console are installed. I have installed the GSM management packs on the SCOM server and configured one Web Availability Monitor to be monitored from external servers (e.g. Chicago).
I am getting the error below:
Log Name: Operations Manager
Source: Health Service Modules Ex
Date: 9/11/2014 7:14:26 PM
Event ID: 10001
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: SCOMCLOUD.abc.in
Description:
Global Service Monitor Modules: Failed step: 'Couldn't get the ACS endpoint from discovery service. SubscriptionId: '1f156904-532e-416f-b570-1141438392a3', OutsideInServiceBaseUri: 'https://gsm-prod.systemcenter.microsoft.com/''. Diagnostic context: RequestId
= '0fe72d85-989c-4c1b-89c1-1f4b641c1578', New ConfigHash = '65afc4b6-c18d-5e68-56d3-482e2db1851a', '1' tests, Last ConfigHash = '00000000-0000-0000-0000-000000000000'. Exception: 'There was no endpoint listening at https://gsm-prod.systemcenter.microsoft.com/DiscoveryService/1f156904-532e-416f-b570-1141438392a3/Endpoints
that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.'
One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.Omonline.OutsideIn.Discovery.ConfigUploaderRule
Instance name: Global Service Monitor
Instance ID: {298CB0DA-4453-EFD2-A7AC-C2E8F2F7100D}
Management group: SCOMGROUP
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Health Service Modules Ex" />
<EventID Qualifiers="0">10001</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2014-09-11T13:44:26.000000000Z" />
<EventRecordID>149790</EventRecordID>
<Channel>Operations Manager</Channel>
<Computer>SCOMCLOUD.abc.in</Computer>
<Security />
</System>
<EventData>
<Data>SCOMGROUP</Data>
<Data>Microsoft.SystemCenter.Omonline.OutsideIn.Discovery.ConfigUploaderRule</Data>
<Data>Global Service Monitor</Data>
<Data>{298CB0DA-4453-EFD2-A7AC-C2E8F2F7100D}</Data>
<Data>Failed step: 'Couldn't get the ACS endpoint from discovery service. SubscriptionId: '1f156904-532e-416f-b570-1141438392a3', OutsideInServiceBaseUri: 'https://gsm-prod.systemcenter.microsoft.com/''. Diagnostic context: RequestId = '0fe72d85-989c-4c1b-89c1-1f4b641c1578',
New ConfigHash = '65afc4b6-c18d-5e68-56d3-482e2db1851a', '1' tests, Last ConfigHash = '00000000-0000-0000-0000-000000000000'. Exception: 'There was no endpoint listening at https://gsm-prod.systemcenter.microsoft.com/DiscoveryService/1f156904-532e-416f-b570-1141438392a3/Endpoints
that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.'</Data>
<Data>Global Service Monitor Modules</Data>
<Data>Couldn't get the ACS endpoint from discovery service. SubscriptionId: '1f156904-532e-416f-b570-1141438392a3', OutsideInServiceBaseUri: 'https://gsm-prod.systemcenter.microsoft.com/'</Data>
<Data>RequestId = '0fe72d85-989c-4c1b-89c1-1f4b641c1578', New ConfigHash = '65afc4b6-c18d-5e68-56d3-482e2db1851a', '1' tests, Last ConfigHash = '00000000-0000-0000-0000-000000000000'</Data>
<Data>There was no endpoint listening at https://gsm-prod.systemcenter.microsoft.com/DiscoveryService/1f156904-532e-416f-b570-1141438392a3/Endpoints that could accept the message. This is often caused by an incorrect address or SOAP action.
See InnerException, if present, for more details.</Data>
<Data>System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://gsm-prod.systemcenter.microsoft.com/DiscoveryService/1f156904-532e-416f-b570-1141438392a3/Endpoints that could accept the message. This is often
caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: The remote name could not be resolved: 'gsm-prod.systemcenter.microsoft.com'
at System.Net.HttpWebRequest.GetResponse()
at System.ServiceModel.Channels.HttpChannelFactory`1.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout)
--- End of inner exception stack trace ---
Server stack trace:
at System.ServiceModel.Channels.HttpChannelUtilities.ProcessGetResponseWebException(WebException webException, HttpWebRequest request, HttpAbortReason abortReason)
at System.ServiceModel.Channels.HttpChannelFactory`1.HttpRequestChannel.HttpChannelRequest.WaitForReply(TimeSpan timeout)
at System.ServiceModel.Channels.RequestChannel.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Microsoft.SystemCenter.Cloud.SharedLibrary.Discovery.IDiscovery.GetEndpoints(String subscriptionId)
at Microsoft.SystemCenter.Cloud.SharedLibrary.Discovery.DiscoveryHelper.<>c__DisplayClass1.<DiscoverAcsEndpoint>b__0(IDiscovery service)
at Microsoft.SystemCenter.Cloud.SharedLibrary.RestCallHelper.ExecuteRestCall[TContract](Uri endpointUri, WebProxy webProxy, String accessToken, RestMethod`1 method)
at Microsoft.SystemCenter.Cloud.SharedLibrary.Discovery.DiscoveryHelper.DiscoverAcsEndpoint(String subscriptionId, Uri outsideInServiceBaseUri, WebProxy proxy)
at Microsoft.SystemCenter.Cloud.OutsideInUnitModule.ConfigUploaderWriteActionModule.Execute()</Data>
</EventData>
</Event>
Any help would be greatly appreciated.
Thanks in advance.
Abhinav | MCTS-Server Virtualization -
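The innermost exception in the trace above is System.Net.WebException: "The remote name could not be resolved: 'gsm-prod.systemcenter.microsoft.com'", which points to a name resolution failure on the SCOM server rather than a problem with the monitor configuration itself. A minimal sketch for checking that from the server, assuming Python is available there; the helper name can_resolve and its injectable resolver parameter are hypothetical and are not part of SCOM or GSM:

```python
import socket

# Host name copied from the event text above.
GSM_HOST = "gsm-prod.systemcenter.microsoft.com"


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return (True, sorted addresses) if host resolves, else (False, error text).

    The resolver argument exists only so the lookup can be replaced in tests;
    by default the operating system's resolver is used.
    """
    try:
        results = resolver(host, 443)
    except socket.gaierror as exc:
        # Name resolution failed (unknown host, no DNS server reachable, ...).
        return False, str(exc)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    addresses = sorted({entry[4][0] for entry in results})
    return True, addresses
```

Calling can_resolve(GSM_HOST) on the SCOM server should return False with the resolver's error text if the name still cannot be resolved there. In that case the DNS and proxy settings used by the server would be the first things to look at. This only tests name resolution, not whether the service behind the name is reachable.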
Failover Cluster Core Resources question on a Windows 2008R2 three node cluster
We have a three-node Windows 2008 R2 cluster with SQL Server 2008 R2 as a clustered resource. There are three resource groups in this cluster: 1) Available Storage, 2) Cluster Group, 3) SQL Server. The Available Storage and SQL Server resource groups
reside on one node, while the Cluster Group resides on another. The only resources in the Cluster Group are the cluster name and IP. I'd like to fail over the Cluster Group so that it is on the same node as everything else.
I'm not sure what the implications of doing this are. Failing over the Cluster Group shouldn't have any impact on the SQL Server resource group, correct? Or would there be an interruption to SQL because of the failover of the Cluster Group? It's
a critical application; I'm gathering information for a change request, and I know I'm going to be asked whether this impacts the production database and everybody using it.
Thanks
RG
No, that should not impact anything. The cluster group is completely separate from the SQL group.
. : | : . : | : . tim -
Activity Monitor no longer works, but icon is there
Recently, my Activity Monitor app in the Utilities folder stopped working. I try to launch it, but it won't start up.
Is there a way to reload this app without having to reinstall OS X Panther and do all the updates again?
Open the Terminal in the /Applications/Utilities/ folder and run the following commands:
sudo chmod 4775 "/Applications/Utilities/Activity Monitor.app/Contents/Resources/pmTool"
sudo chown root:admin "/Applications/Utilities/Activity Monitor.app/Contents/Resources/pmTool"
Press Enter and type in your administrator password as necessary; when done, the Activity Monitor should open properly. Additionally, if you deleted some or all of the receipts from the /Library/Receipts/ folder, the ability of the Disk Utility to repair your permissions will be affected; you may need to reinstall Mac OS X to get the receipts back.
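To confirm that the two commands took effect, the mode and owner of pmTool can be inspected. This is a minimal sketch, assuming Python 3 is available on the machine; the function names check_pmtool and check_path are hypothetical, and the expected values (mode 4775, owner root) are simply the ones set by the chmod and chown commands above. The group (admin) is not checked here.

```python
import os
import stat

# Path taken from the chmod/chown commands above.
PM_TOOL = "/Applications/Utilities/Activity Monitor.app/Contents/Resources/pmTool"


def check_pmtool(st_mode, st_uid, expected_mode=0o4775, expected_uid=0):
    """Return a list of problems for the given mode bits and owner uid."""
    problems = []
    mode = stat.S_IMODE(st_mode)  # permission bits, including setuid
    if not mode & stat.S_ISUID:
        problems.append("setuid bit is not set")
    if mode != expected_mode:
        problems.append("mode is %o, expected %o" % (mode, expected_mode))
    if st_uid != expected_uid:
        problems.append("owner uid is %d, expected %d" % (st_uid, expected_uid))
    return problems


def check_path(path=PM_TOOL):
    """Stat the file and report problems; a missing file is reported too."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return ["%s not found (was Activity Monitor moved?)" % path]
    return check_pmtool(info.st_mode, info.st_uid)
```

An empty list from check_path() means the file has the mode and owner that the commands above set. Anything else lists what still differs.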
(10893) -
I was told that it was probably a problem with 10.4.5. Activity Monitor doesn't show any programs.
Now that I've upgraded to Leopard, the same problem persists. I've deleted the plist files and this makes no difference. With Leopard I can now look at the plist files, and there isn't any obvious reason for this.
Running ps -A in Terminal works perfectly well, but no programs at all appear in Activity Monitor!
Any suggestions gratefully received...
OK, it's to do with permissions, the location of the Activity Monitor application, and the install receipt file.
Open a Terminal window and enter
ls -l /Library/Receipts/Essentials.pkg/Contents/Archive.bom
ls -l "/Applications/Utilities/Activity Monitor.app/Contents/Resources/pmTool"
If you get an error about "pmTool", it means you moved Activity Monitor from /Applications/Utilities.
Move it back there, then run Disk Utility and Repair Permissions.
If it can't repair them, it means the receipt file has been lost somehow.