Status of cat6k secondary supervisor
Does anyone know whether the following output corresponds to the Hot standby Sup in slot 6 going into "unknown" status, when Module 2 is in fact unoccupied?
#show facility-alarm status
System Totals Critical: 1 Major: 0 Minor: 0
Source Severity Description [Index]
Physical Slot 2 CRITICAL Active Card Removed OIR Alarm [0]
Are there any syslog and/or SNMP monitoring options when Sups in SSO failover go into "simplex" mode?
#show redund
Redundant System Information :
Available system uptime = XX weeks, XX day, XX hours, 54 minutes
Switchovers system experienced = 0
Standby failures = 1
Last switchover reason = none
Hardware Mode = Simplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Maintenance Mode = Disabled
Communications = Down Reason: Simplex mode
Current Processor Information :
Active Location = slot 5
Current Software state = ACTIVE
Uptime in current state = XX weeks, XX day, XX hours, 54 minutes
Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVIPSERVICESK9_WAN-M), Version 12.2(##)SXH#, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2008 by Cisco Systems, Inc.
Compiled Thu 24-Jul-08 19:18 by prod_rel_team
BOOT = sup-bootdisk:s72033-advipservicesk9_wan-mz.122-##.SXH#.bin,1;slavesup-bootdisk:s72033-advipservicesk9_wan-mz.122-##.SXH#.bin,1;
CONFIG_FILE =
BOOTLDR =
Configuration register = 0x2102
Peer Processor Information :
Standby Location = slot 6
Current Software state = STANDBY HOT
Uptime in current state = XX weeks, XX day, XX hours, 54 minutes
Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVIPSERVICESK9_WAN-M), Version 12.2(##)SXH#, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2008 by Cisco Systems, Inc.
Compiled Thu 24-Jul-08 19:18 by prod_rel_team
Configuration register = 0x2102
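Until a syslog or SNMP option is confirmed, the check could be scripted against the CLI output itself. This is a minimal sketch, assuming the `show redundancy` text has already been collected by some other means (for example an SSH session, not shown here). The function names are hypothetical, and it only reads the `key = value` lines that appear in the output above.

```python
import re

def parse_redundancy(text):
    """Collect 'key = value' lines; a key may repeat (active and peer sections)."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^=]+?)\s*=\s*(.*)$", line)
        if match:
            fields.setdefault(match.group(1), []).append(match.group(2).strip())
    return fields

def redundancy_degraded(text):
    """True when the first Hardware Mode is Simplex or Communications is Down."""
    fields = parse_redundancy(text)
    hardware_mode = fields.get("Hardware Mode", [""])[0].lower()
    communications = fields.get("Communications", [""])[0].lower()
    return hardware_mode == "simplex" or communications.startswith("down")

sample = """\
Hardware Mode = Simplex
Configured Redundancy Mode = sso
Communications = Down Reason: Simplex mode
"""
print(redundancy_degraded(sample))
```

A scheduler could run this periodically and raise an alert whenever it returns True.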
Hi,
possibly this but am not entirely sure
Comment out the Sqlnet.Authentication_services = (NTS) in client "sqlnet.ora".
rgds
alanm
Similar Messages
-
Custom alert for ASA Secondary Status
Hello All.
Here is our dilemma.
We need a custom alert. Something that will trigger an alert if our secondary ASA goes to a "Secondary - Failed" state.
If the primary is active and secondary is in a failed state, we may never know until traffic tries to fail to the secondary and is
unable to do so because it is in a bad state.
We are not looking to see if the secondary firewall goes down nor if it becomes the primary from a failure of the primary, but if
anything changes the secondary status.
To put another way, if the line from "show standby" shows "Secondary - Failed"
we need to know about it because it means redundancy is broken.
We need to know if this line changes from this status:
Other host: Secondary - Standby Ready
I believe there is a monitor in Orion for the load balancers called something like "not
standby hot" designed for the same thing. Basically we need the same type of
monitor for the firewalls.
Any ideas on how to go about making this happen????
All of the posts I have discovered relating to this topic only cover alerts/notifications on whether a pair of devices go from
active to standby and vice versa.
Is this even possible with the OIDs on the ASAs?
cfwHardwareInformation
cfwHardwareStatusValue
cfwHardwareStatusDetail
cfwBufferStatInformation
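Whatever collects the data (SNMP or a CLI poll), the alert condition described above comes down to "the Other host line says anything except Standby Ready". A minimal sketch of that comparison, assuming the command output is already available as text; the function names are hypothetical and no particular OID or monitoring product is implied:

```python
import re

EXPECTED_STATE = "Standby Ready"

def peer_state(output):
    """Return (role, state) from the 'Other host:' line, or None if absent."""
    for line in output.splitlines():
        match = re.match(r"\s*Other host:\s*(\S+)\s*-\s*(.+?)\s*$", line)
        if match:
            return match.group(1), match.group(2)
    return None

def redundancy_broken(output):
    """Alert when the peer line is missing or is anything but Standby Ready."""
    state = peer_state(output)
    return state is None or state[1] != EXPECTED_STATE

print(redundancy_broken("Other host: Secondary - Failed"))
```

Treating a missing line as a failure is a design choice: it means a collection problem also raises the alert instead of silently passing.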
Posts we've already covered:
http://thwack.solarwinds.com/message/132423#132423
http://thwack.solarwinds.com/message/29931#29931
http://thwack.solarwinds.com/message/85319#85319
http://thwack.solarwinds.com/docs/DOC-170819
http://thwack.solarwinds.com/message/171653#171653
http://thwack.solarwinds.com/docs/DOC-118692
https://supportforums.cisco.com/docs/DOC-1295
http://thwack.solarwinds.com/message/71089#71089
Thank you in advance,
Todd
Hi,
One option is to use standard AuditTrail functionality on that field, then you'll have the entire chronological history for the field to work the periodic alert logic from.
Regards,
Gareth
Blog: http://garethroberts.blogspot.com/ -
SUPERVISOR WS-X45-SUP7-E REDUNDANCY, STATUS LED ORANGE?!?
hi all,
I have a 4510R+E with two WS-X45-SUP7-E supervisors configured in redundancy mode.
The status LEDs of both supervisors are orange and never turn green.
Can you help me?
Below you can see the redundancy configuration:
redundancy
mode sso
main-cpu
auto-sync startup-config
Thanks to all!!!
Alberto.
Here is the show module output:
FERCAM_4510#sh module
Chassis Type : WS-C4510R+E
Power consumed by backplane : 40 Watts
Mod Ports Card Type Model Serial No.
---+-----+--------------------------------------+------------------+-----------
2 48 10/100/1000BaseT EEE (RJ45) WS-X4748-RJ45-E CAT1908L9HK
3 48 10/100/1000BaseT EEE (RJ45) WS-X4748-RJ45-E CAT1903L5R5
4 48 10/100/1000BaseT EEE (RJ45) WS-X4748-RJ45-E CAT1906L6Z7
5 4 Sup 7-E 10GE (SFP+), 1000BaseX (SFP) WS-X45-SUP7-E CAT1910L326
6 4 Sup 7-E 10GE (SFP+), 1000BaseX (SFP) WS-X45-SUP7-E CAT1910L2SX
M MAC addresses Hw Fw Sw Status
--+--------------------------------+---+------------+----------------+---------
2 84b8.024b.8a10 to 84b8.024b.8a3f 1.1 Ok
3 84b8.024b.8e30 to 84b8.024b.8e5f 1.1 Ok
4 84b8.024b.8a40 to 84b8.024b.8a6f 1.1 Ok
5 74a2.e680.2dc0 to 74a2.e680.2dc3 3.0 15.0(1r)SG5 03.07.00.E Ok
6 74a2.e680.2dc4 to 74a2.e680.2dc7 3.0 15.0(1r)SG5 03.07.00.E Ok
Mod Redundancy role Operating mode Redundancy status
----+-------------------+-------------------+----------------------------------
5 Active Supervisor SSO Active
6 Standby Supervisor SSO Standby hot
System Failures:
Power Supply: bad/off (see 'show power') -
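The module table above reports both supervisors as Ok, while the last lines list a power supply failure, which may be worth ruling out before looking at the supervisors themselves. A small sketch of pulling that section out of saved `show module` text, assuming the failures follow the `System Failures:` header one per line as in this output (the function name is hypothetical):

```python
def system_failures(show_module_text):
    """Return the lines listed under 'System Failures:' until a blank line."""
    failures = []
    in_section = False
    for line in show_module_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("System Failures:"):
            in_section = True
            continue
        if in_section:
            if not stripped:
                break
            failures.append(stripped)
    return failures

sample = "System Failures:\nPower Supply: bad/off (see 'show power')\n"
print(system_failures(sample))
```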
ISE 1.1.3.124 secondary node not reachable after registration
G'day All,
I'm constantly seeing the sync and replication status for my secondary admin/monitor node reported on the primary node as "node not reachable". The secondary still thinks it is in standalone mode. When I run the ISE diag tool connectivity tests, I can successfully ping the devices from each other using both hostname and IP, and nslookup also works fine between both nodes. Ping and nslookup also work from different networks within the environment. The two nodes are in the same VLAN on a 6500 VSS pair but on different switches of the pair. I'm new to ISE, so any help is greatly appreciated.
Thanks All.
JS
Hi Saurav,
Thanks for your prompt response.
I have worked through that section of the document. The registration completes successfully, I've got NTP sync on both nodes and the system time on both nodes is identical.
I am only using the self signed certificates, but following the user guide instructions I have imported the secondary's cert into the primary node.
Just as of about 30 minutes ago, I saw an alarm on the Secondary ISE node stating that a Slow or Stuck Replication has been detected...
As I said in the original post, I can ping the FQDNs from each other, so it appears that the DNS requirements have been satisfied.
I've changed the admin account password, I am certain that the ISE DB passwords are correct and the same on both nodes, and the time zone is the same on both nodes as well.
It looks to me like registration is fine, but the first full replication isn't completing successfully.
Thanks,
JS -
Handling DatabaseException, OperationStatus.KEYEXIST, from Secondary Db
Hi,
If I insert a record into my primary db I might do something like:
OperationStatus status = db.put(null, key, data);
if (OperationStatus.SUCCESS != status) {
    // handle
}
However, if I also have a secondary database, and the put operation above results in a duplicate key on the secondary database, then a DatabaseException is thrown:
Exception in thread "main" com.sleepycat.je.DatabaseException: (JE 3.3.82) Could not insert secondary key in SecDb OperationStatus.KEYEXIST
My question is how can I handle this exception gracefully, and test for the OperationStatus from the secondary db insert? I could catch it:
OperationStatus status = null;
try {
    status = db.put(null, key, data);
    if (OperationStatus.SUCCESS != status) {
        // X: handle primary db error
    }
} catch (DatabaseException e) {
    if (OperationStatus.KEYEXIST == status) {
        // Y: handle secondary db duplicate record
    } else {
        throw e;
    }
}
But status is always null in the catch block (the exception is thrown before the assignment completes), so it does not reflect the OperationStatus code from the secondary db put operation and the line Y above never triggers. Is there another method I should be using to test for failed inserts on a secondary database due to duplicate keys?
Thanks,
Joel
Joel,
In the newest release, JE 4.0, we attempted to help with this problem by expanding the set of exception classes. In this case, you should get a UniqueConstraintException, which is a subclass of DatabaseException. See http://www.oracle.com/technology/documentation/berkeley-db/je/java/com/sleepycat/je/UniqueConstraintException.html.
Regards,
Linda -
The fact table of InfoCube 0OPA_C11 is missing 2 secondary indexes
Hello experts,
in RSRV i got the above mentioned message for my InfoCube 0OPA_C11. Before I tried to delete, repair and build up my indices in performance tab of the InfoCube.
I followed SAP Note '401242 - Problems with InfoCube or aggregate indexes'.
-> SAP_INFOCUBE_INDEXES_REPAIR repaired indexes and show no further problem, but problem still exists, in performance tab light is red and RSRV brings same message.
Does someone have any idea what further I can do?
Best regards,
Peter
Peter, refer to note 928037; it's an FAQ about MaxDB indexes. I gave it a quick read and I believe the answer to this issue is in that note. It describes ways to check the consistency of an individual index, find bad indexes, and eliminate them.
Also, refer to the below blog. Check the status of the secondary index in this case. I suspect they are bad or missing. There are ways to recreate them. Keep your basis guys in the loop.
http://wiki.sdn.sap.com/wiki/display/MaxDB/MaxDBIndex%28Secondary+Key%29
Hope this helps.
Edited by: Mann Krishna on Sep 10, 2010 3:50 PM -
Dear all,
I have a setup with one primary and two secondaries.
The primary and secondary A are running fine, but when I ran ./vda-db-status on secondary B, it showed the data node as down:
data node down
Secondary B had shut down due to a power failure. When we restarted the server it gave a boot archive error; running fsck -F ufs /dev/rdisk/.... solved that problem.
But after it booted, it still showed as down in the MySQL cluster status. This server is also a data node in the cluster.
I checked svcs: svc:/application/database/vdadb:core is down, showing offline*.
in /var/opt/SUNWvda/mysql-cluster/ndb_3.error.log
I found following
Current byte-offset of file-pointer is: 1566
Time: Tuesday 8 June 2010 - 13:08:28
Status: Ndbd file system error, restart node initial
Message: File not found (Ndbd file system inconsistency error, please report a bug)
Error: 2815
Error data: DBLQH: File system open failed. OS errno: 2
Error object: DBLQH (Line: 3083) 0x0000000a
Program: /opt/SUNWvda/mysql/bin/ndbd
Pid: 875
Version: mysql-5.1.37 ndb-7.0.8a
Trace: /var/opt/SUNWvda/mysql-cluster/ndb_3_trace.log.1
***EOM***
Time: Tuesday 8 June 2010 - 13:32:00
Status: Ndbd file system error, restart node initial
Message: File not found (Ndbd file system inconsistency error, please report a bug)
Error: 2815
Error data: DBLQH: File system open failed. OS errno: 2
Error object: DBLQH (Line: 3083) 0x0000000a
Program: /opt/SUNWvda/mysql/bin/ndbd
Pid: 5686
Version: mysql-5.1.37 ndb-7.0.8a
Trace: /var/opt/SUNWvda/mysql-cluster/ndb_3_trace.log.2
***EOM***
Time: Tuesday 8 June 2010 - 13:42:26
Status: Ndbd file system error, restart node initial
Message: File not found (Ndbd file system inconsistency error, please report a bug)
Error: 2815
Error data: DBLQH: File system open failed. OS errno: 2
Error object: DBLQH (Line: 3083) 0x0000000a
Program: /opt/SUNWvda/mysql/bin/ndbd
Pid: 764
Version: mysql-5.1.37 ndb-7.0.8a
Trace: /var/opt/SUNWvda/mysql-cluster/ndb_3_trace.log.3
***EOM***
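The three entries above are identical apart from the time, PID and trace file. A short sketch of splitting such a log into records and flagging the ones whose Status text asks for an initial restart, assuming each record ends with `***EOM***` and consists of `Key: value` lines as shown (function names are hypothetical):

```python
def parse_ndb_error_log(text):
    """Split an ndb error log into dicts, one per ***EOM***-terminated record."""
    entries, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "***EOM***":
            if current:
                entries.append(current)
            current = {}
        elif ":" in line:
            # Split on the first colon only, so times like 13:08:28 stay intact.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return entries

def needs_initial_restart(entry):
    """True when the record's Status line mentions 'restart node initial'."""
    return "restart node initial" in entry.get("Status", "")
```

This only reports what the log already says. How an initial restart should be triggered through the VDA service wrapper is not covered in this thread.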
Further, I looked at the vdadb:core log and found the following:
.........................(logs omitted)
[ Jun 8 13:08:16 Executing start method ("/opt/SUNWvda/lib/vda-db-service start") ]
Configuration:
MGMT_NODE=[0]; NDBD_NODE=[1]; SQL_NODE=[0]; MULTI_HOST_MODE=[1];
NDBD_CONNECTSTRING=[mycompnay.com]; NDBD_INITIAL_ARG=[]; NDBD_NODE_ID=[3];
MYSQL_BIN=[opt/SUNWvda/mysql/bin];
Starting the Sun Virtual Desktop Infrastructure Database service:
- Starting Data Node... 2010-06-08 13:08:18 [ndbd] INFO -- Configuration fetched from 'mycompnay.com:1186', generation: 1
Arguments: [mycompnay.com ]...
Error
[ Jun 8 13:14:46 Method "start" exited with status 95 ]
[ Jun 8 13:31:35 Leaving maintenance because disable requested. ]
[ Jun 8 13:31:35 Disabled. ]
[ Jun 8 13:31:56 Enabled. ]
[ Jun 8 13:31:56 Executing start method ("/opt/SUNWvda/lib/vda-db-service start") ]
Configuration:
MGMT_NODE=[0]; NDBD_NODE=[1]; SQL_NODE=[0]; MULTI_HOST_MODE=[1];
NDBD_CONNECTSTRING=[mycompnay.com]; NDBD_INITIAL_ARG=[]; NDBD_NODE_ID=[3];
MYSQL_BIN=[opt/SUNWvda/mysql/bin];
Starting the Sun Virtual Desktop Infrastructure Database service:
- Starting Data Node... 2010-06-08 13:31:57 [ndbd] INFO -- Configuration fetched from 'mycompnay.com:1186', generation: 1
Arguments: [mycompnay.com ]...
Error
[ Jun 8 13:38:27 Method "start" exited with status 95 ]
[ Jun 8 13:38:27 Leaving maintenance because disable requested. ]
[ Jun 8 13:38:27 Disabled. ]
[ Jun 8 13:42:21 Executing start method ("/opt/SUNWvda/lib/vda-db-service start") ]
Configuration:
MGMT_NODE=[0]; NDBD_NODE=[1]; SQL_NODE=[0]; MULTI_HOST_MODE=[1];
NDBD_CONNECTSTRING=[mycompnay.com]; NDBD_INITIAL_ARG=[]; NDBD_NODE_ID=[3];
MYSQL_BIN=[opt/SUNWvda/mysql/bin];
Starting the Sun Virtual Desktop Infrastructure Database service:
- Starting Data Node... 2010-06-08 13:42:22 [ndbd] INFO -- Configuration fetched from 'mycompnay.com:1186', generation: 1
Arguments: [mycompnay.com ]...
Error
[ Jun 8 13:48:50 Method "start" exited with status 95 ]
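Each start attempt in the service log above ends the same way. A minimal sketch for listing the failed attempts from such a log, assuming the bracketed SMF line format shown here (the function name is hypothetical). As far as I know, exit status 95 is the SMF code for a fatal method error, which would explain why the service does not keep retrying on its own.

```python
import re

# Matches lines such as: [ Jun 8 13:14:46 Method "start" exited with status 95 ]
START_EXIT = re.compile(
    r'\[\s*(\w{3}\s+\d+\s+[\d:]+)\s+Method "start" exited with status (\d+)\s*\]'
)

def failed_starts(log_text):
    """Return (timestamp, exit_status) for every non-zero start method exit."""
    return [
        (when, int(code))
        for when, code in START_EXIT.findall(log_text)
        if int(code) != 0
    ]
```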
any ideas -
Hi,
We have a 10.01.11900 CUC cluster and everything is working fine (no one is having issues with voice mail, etc.), but the cluster status reports are not consistent.
DBreplication is showing 2 on both servers.
Primary unity server cluster status shows Primary/split brain recovery.
HA Unity server cluster status shows Primary/Secondary.
utils diagnose test - everything tests fine except the tomcat_connectors test.
test - tomcat_connectors : Failed - The HTTPS port is not responding to local requests. Please collect all of the Tomcat logs for root cause analysis: file get activelog tomcat/logs/*
We've shut down the HA server and rebooted the primary, then waited a while after the primary was back up and active before bringing the HA server back up, and the result is still the same.
We reset DB replication, with the same result.
On the HA server I made the HA server primary and the cluster status flipped to Secondary/Primary. I then made the primary the primary again, but the primary server's cluster status always shows Split Brain Recovery for the secondary/HA server.
No core dumps on either server and all services are started.
Any one seen this before or have any thoughts? I have a TAC Case on this but so far in same boat.
Would the utils cuc cluster renegotiate command help? Did not replace a server so don't really want to overwrite data to publisher server. Issue seems to be with the publisher since HA shows fine but not sure. I don't want to lose messages/etc so don't want really want to run these commands.
Thanks.
Ok, thanks.
The SRM logs indicate the Connection Digital Networking Replication Agent service is not running, however when I start it it stops right away and the cuReplicator log states digital networking is not enabled.
From SRM Log:
23:47:20.100 |17755,,,SRM,7,<svcmon> checkServiceStatus: started service monitoring
23:47:20.100 |17755,,,SRM,7,<svcmon> Service Status: 1 service(s) not running. Service name(s):
23:47:20.100 |17755,,,SRM,7,<svcmon> Connection Digital Networking Replication Agent
23:47:24.674 |28471,,,SRM,11,<Timer-3> [snd] Type: Heartbeat
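For comparing clusters, the service names that SRM reports as not running can be pulled out of the log with a few lines. This is a sketch that assumes the layout seen above, where the "Service Status" line gives a count and each service name follows on its own `<svcmon>` line (the function name is hypothetical):

```python
import re

def services_not_running(srm_log):
    """List the service names that follow each 'N service(s) not running' line."""
    lines = srm_log.splitlines()
    names = []
    for index, line in enumerate(lines):
        match = re.search(r"Service Status: (\d+) service\(s\) not running", line)
        if match:
            count = int(match.group(1))
            for follow in lines[index + 1 : index + 1 + count]:
                # Keep only the text after the "<svcmon> " thread tag.
                names.append(follow.split("> ", 1)[-1].strip())
    return names
```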
From Replicator log:
admin:file tail activelog cuc/diag_CuReplicator_00000049.uc
23:42:59.208 HDR|09/14/2014 ,Significant
23:42:59.208 |28914,,,CuReplicator,0,Digital Networking is not enabled. Replicator will stop now.
There is no digital networking setup to other unity systems, and only one location.
Also, the Server Role Manager can't be restarted from the CLI or the GUI, so restarting it needs either root access or a server reboot.
I compared it to another CUC cluster and deactivated the Digital Networking service and the SRM logs seem happier now, will wait a bit and see if it clears the SBR status up. -
ConfigMgr SQL Server status unknown
Hi All,
I have a primary site server that connects to a remote SQL server. I then installed a server for a secondary site. After the installation finished, I checked Site System Status for my secondary server, and ConfigMgr SQL Server shows a status of unknown. When I choose Show Messages > All, it gives me an empty status. Where should I start troubleshooting?
Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.
I can telnet to the SQL port from the secondary server. The log you mention only exists on the primary server and shows nothing suspicious. Do I have to add the secondary server's computer account to the SQL DB with the same permissions as the primary server?
I have another environment with SQL and the SCCM primary on the same box. It has a secondary server too, but there is no ConfigMgr SQL Server entry in its site system status. Is that correct? In other words, does ConfigMgr SQL Server only appear on the secondary server if we use SQL remotely?
You don't require SQL for secondary servers, and I don't think you need to add the secondary server's system account to SQL.
Anoop C Nair - Twitter @anoopmannur
MY BLOG:
http://anoopmannur.wordpress.com
SCCM Professionals
This posting is provided AS-IS with no warranties/guarantees and confers no rights.
Yes, I know that, so why does ConfigMgr SQL Server appear? I'm confused by that.
What is the proper procedure when upgrading dual supervisor module on a 6513 switch.
1) Do you upgrade the primary / secondary supervisor first, with the secondary / primary supervisor hanging halfway out of the switch?
2) Do you disable HA?
3) Do you reboot the switch or module after you upload the new image to the primary and secondary supervisor?
4) If you're not careful, can the supervisor in slot 2 take over as the primary supervisor?
I have the document on upgrading the image, but what I'm looking for is a step-by-step procedure for upgrading a primary and secondary sup while making sure the secondary sup in slot 2 does not take over as the primary.
Thanks,
Hi Friend,
Have a look at this link; it will guide you through upgrading your redundant sups whether you are running CatOS or Native IOS.
http://www.cisco.com/warp/customer/473/161.html#maintask3
HTH, if yes please rate the post
Ankur -
Can 2 supervisor engine's ports both work at same time?
As far as I know, when one supervisor engine is active, the other engine's ports are completely down in redundancy mode.
Is there any way to have both supervisor engines' ports working at the same time?
The two supervisor engines in a redundant supervisor engine configuration have different responsibilities. The active supervisor engine is responsible for controlling the system bus and all line cards. All protocols are running on the active supervisor engine and it performs all packet forwarding. The standby supervisor engine does not communicate with the line cards. It receives packets from the network and populates its forwarding tables with this information, but does not participate in any packet forwarding. The relevant protocols on the system are initialized, but not active, on the standby supervisor engine. The Cisco Catalyst 6500 Series supervisor engines are hot-swappable and the standby supervisor engine can be installed in a running, active system. Also please note that redundant supervisor engines do not perform load sharing. The active supervisor engine is providing the entire packet-forwarding intelligence for the system (N+1 redundancy). If the active supervisor engine fails, the standby supervisor engine can still maintain the same system load.
The standby supervisor engine polls the active supervisor engine through the Ethernet out-of-band channel (EOBC) every 5 to 10 milliseconds to monitor the online status of the active supervisor engine. The active supervisor engine might go offline for a variety of reasons such as hardware failures, system overload conditions, memory corruption issues, removal from chassis, being reset by the operator, or real-time diagnostics-driven supervisor switchover (also known as Generic Online Diagnostics). The standby supervisor engine detects this type of failure and becomes the new active supervisor engine. The Catalyst OS software on the supervisor engine is responsible for restoring the protocols, line cards, and forwarding engines to normal operation. This restoration takes place through a fast switchover or a high-availability switchover.
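The polling behaviour described above can be pictured with a toy model. This is only an illustrative sketch, not how the supervisor software is implemented: the class name and the missed-poll threshold are assumptions made for the example.

```python
class StandbyMonitor:
    """Toy model of a standby that promotes itself after missed polls."""

    def __init__(self, max_missed=3):
        # The threshold is an assumption for illustration, not a Cisco value.
        self.max_missed = max_missed
        self.missed = 0
        self.role = "standby"

    def poll(self, active_responded):
        """Record one poll result and return the current role."""
        if self.role == "active":
            return self.role
        if active_responded:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.role = "active"
        return self.role
```

Once the model has taken over it stays active, which mirrors the point that the two engines never share the forwarding role.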
HTH -
UCCX:Secondary node is in partial service & testing the HA mode
Hi,
Running UCCX 8.5 in HA mode.
The primary UCCX services show INSERVICE status, but on the secondary the UCCX service status (UCCX engine and subsystem) shows PARTIAL SERVICE.
Let me know how to bring it to INSERVICE.
Also let me know of any procedure to test whether HA mode is working properly.
Help me out!!
Thanks & Regards,
Krishna
Hi,
Please expand the Engine entry using the arrow and check which subsystem is in partial service; this will help narrow it down.
- Also make sure to turn on engine debugging under UCCX Serviceability.
- Restart the servers during a maintenance window.
- Collect the logs with the RTMT tool.
- After the restart, the MIVR logs should clearly show why a subsystem is in partial service.
- Some of the most common causes are:
  - Application manager service in partial service
  - CMT telephony subsystem in partial service
Also, is the setup HA over WAN?
Please send over the CUCM version as well.
Once I know which subsystem is in partial service, I can point you in the right direction.
Thanks,
Prashanth -
[Nexus 1000v] VEM can't be added to VSM
hi all,
In my lab I have a problem with the Nexus 1000V: the VEM can't be added to the VSM.
+ The VSM is already installed on ESX 1 (standalone or HA), and it shows:
Cisco_N1KV# show module
Mod Ports Module-Type Model Status
1 0 Virtual Supervisor Module Nexus1000V active *
Mod Sw Hw
1 4.2(1)SV1(4a) 0.0
Mod MAC-Address(es) Serial-Num
1 00-19-07-6c-5a-a8 to 00-19-07-6c-62-a8 NA
Mod Server-IP Server-UUID Server-Name
1 10.4.110.123 NA NA
+ On ESX 2, where the VEM is installed:
[root@esxhoadq ~]# vem status
VEM modules are loaded
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 128 3 128 1500 vmnic0
VEM Agent (vemdpa) is running
[root@esxhoadq ~]#
Any advice on this?
Thanks so much.
Hi,
I'm having a similar issue: the VEM installed on the ESXi host is not showing up on the VSM.
Can you check the following and tell me what might be wrong?
This is the VEM status:
~ # vem status -v
Package vssnet-esx5.5.0-00000-release
Version 4.2.1.1.4.1.0-2.0.1
Build 1
Date Wed Jul 27 04:42:14 PDT 2011
Number of PassThru NICs are 0
VEM modules are loaded
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 128 4 128 1500 vmnic0
DVS Name Num Ports Used Ports Configured Ports MTU Uplinks
VSM11 256 40 256 1500 vmnic2,vmnic1
Number of PassThru NICs are 0
VEM Agent (vemdpa) is running
~ # vemcmd show port
LTL VSM Port Admin Link State PC-LTL SGID Vem Port
18 UP UP F/B* 0 vmnic1
19 DOWN UP BLK 0 vmnic2
* F/B: Port is BLOCKED on some of the vlans.
Please run "vemcmd show port vlans" to see the details.
~ # vemcmd show trunk
Trunk port 6 native_vlan 1 CBL 1
vlan(1) cbl 1, vlan(111) cbl 1, vlan(112) cbl 1, vlan(3968) cbl 1, vlan(3969) cbl 1, vlan(3970) cbl 1, vlan(3971) cbl 1,
Trunk port 16 native_vlan 1 CBL 1
vlan(1) cbl 1, vlan(111) cbl 1, vlan(112) cbl 1, vlan(3968) cbl 1, vlan(3969) cbl 1, vlan(3970) cbl 1, vlan(3971) cbl 1,
Trunk port 18 native_vlan 1 CBL 0
vlan(111) cbl 1, vlan(112) cbl 1,
~ # vemcmd show port vlans
Native VLAN Allowed
LTL VSM Port Mode VLAN State Vlans
18 T 1 FWD 111-112
19 A 1 BLK 1
~ # vemcmd show card
Card UUID type 2: ebd44e72-456b-11e0-0610-00000000108f
Card name: esx
Switch name: VSM11
Switch alias: DvsPortset-0
Switch uuid: c4 be 2c 50 36 c5 71 97-44 41 1f c0 43 8e 45 78
Card domain: 1
Card slot: 1
VEM Tunnel Mode: L2 Mode
VEM Control (AIPC) MAC: 00:02:3d:10:01:00
VEM Packet (Inband) MAC: 00:02:3d:20:01:00
VEM Control Agent (DPA) MAC: 00:02:3d:40:01:00
VEM SPAN MAC: 00:02:3d:30:01:00
Primary VSM MAC : 00:50:56:ac:00:42
Primary VSM PKT MAC : 00:50:56:ac:00:44
Primary VSM MGMT MAC : 00:50:56:ac:00:43
Standby VSM CTRL MAC : ff:ff:ff:ff:ff:ff
Management IPv4 address: 10.1.240.30
Management IPv6 address: 0000:0000:0000:0000:0000:0000:0000:0000
Secondary VSM MAC : 00:00:00:00:00:00
Secondary L3 Control IPv4 address: 0.0.0.0
Upgrade : Default
Max physical ports: 32
Max virtual ports: 216
Card control VLAN: 111
Card packet VLAN: 112
Card Headless Mode : Yes
Processors: 8
Processor Cores: 4
Processor Sockets: 1
Kernel Memory: 16712336
Port link-up delay: 5s
Global UUFB: DISABLED
Heartbeat Set: False
PC LB Algo: source-mac
Datapath portset event in progress : no
~ #
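In the output above the card reports `Card Headless Mode : Yes` and `Heartbeat Set: False`, whereas the `vemcmd show card` output in the "VEM module bouncing" thread further down shows `No` and `True` for a VEM that does register as a module. Treating those two fields as a sign that the VEM is not in contact with the VSM is my reading, not something confirmed in this thread. A sketch of checking them from saved output (function names are hypothetical):

```python
def parse_show_card(text):
    """Turn 'Key: value' lines from vemcmd show card into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon so MAC addresses stay whole in the value.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def vem_disconnected(text):
    """True when the card is headless or has no heartbeat set."""
    fields = parse_show_card(text)
    headless = fields.get("Card Headless Mode", "").lower() == "yes"
    heartbeat = fields.get("Heartbeat Set", "").lower() == "true"
    return headless or not heartbeat
```

If this returns True, the control VLAN path between the VEM uplink and the VSM would be the first thing I would look at.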
On VSM
VSM11# sh svs conn
connection vcenter:
ip address: 10.1.240.38
remote port: 80
protocol: vmware-vim https
certificate: default
datacenter name: New Datacenter
admin:
max-ports: 8192
DVS uuid: c4 be 2c 50 36 c5 71 97-44 41 1f c0 43 8e 45 78
config status: Enabled
operational status: Connected
sync status: Complete
version: VMware vCenter Server 4.1.0 build-345043
VSM11# sh svs ?
connections Show connection information
domain Domain Configuration
neighbors Svs neighbors information
upgrade Svs upgrade information
VSM11# sh svs dom
SVS domain config:
Domain id: 1
Control vlan: 111
Packet vlan: 112
L2/L3 Control mode: L2
L3 control interface: NA
Status: Config push to VC successful.
VSM11# sh port
^
% Invalid command at '^' marker.
VSM11# sh run
!Command: show running-config
!Time: Sun Nov 20 11:35:52 2011
version 4.2(1)SV1(4a)
feature telnet
username admin password 5 $1$QhO77JvX$A8ykNUSxMRgqZ0DUUIn381 role network-admin
banner motd #Nexus 1000v Switch#
ssh key rsa 2048
ip domain-lookup
ip domain-lookup
hostname VSM11
snmp-server user admin network-admin auth md5 0x389a68db6dcbd7f7887542ea6f8effa1 priv 0x389a68db6dcbd7f7887542ea6f8effa1 localizedkey
vrf context management
ip route 0.0.0.0/0 10.1.240.254
vlan 1,111-112
port-channel load-balance ethernet source-mac
port-profile default max-ports 32
port-profile type ethernet Unused_Or_Quarantine_Uplink
vmware port-group
shutdown
description Port-group created for Nexus1000V internal usage. Do not use.
state enabled
port-profile type vethernet Unused_Or_Quarantine_Veth
vmware port-group
shutdown
description Port-group created for Nexus1000V internal usage. Do not use.
state enabled
port-profile type ethernet system-uplink
vmware port-group
switchport mode trunk
switchport trunk allowed vlan 111-112
no shutdown
system vlan 111-112
description "System profile"
state enabled
port-profile type vethernet servers11
vmware port-group
switchport mode access
switchport access vlan 11
no shutdown
description "Data Profile for VM Traffic"
port-profile type ethernet vm-uplink
vmware port-group
switchport mode access
switchport access vlan 11
no shutdown
description "Uplink profile for VM traffic"
state enabled
vdc VSM11 id 1
limit-resource vlan minimum 16 maximum 2049
limit-resource monitor-session minimum 0 maximum 2
limit-resource vrf minimum 16 maximum 8192
limit-resource port-channel minimum 0 maximum 768
limit-resource u4route-mem minimum 32 maximum 32
limit-resource u6route-mem minimum 16 maximum 16
limit-resource m4route-mem minimum 58 maximum 58
limit-resource m6route-mem minimum 8 maximum 8
interface mgmt0
ip address 10.1.240.124/24
interface control0
line console
boot kickstart bootflash:/nexus-1000v-kickstart-mz.4.2.1.SV1.4a.bin sup-1
boot system bootflash:/nexus-1000v-mz.4.2.1.SV1.4a.bin sup-1
boot kickstart bootflash:/nexus-1000v-kickstart-mz.4.2.1.SV1.4a.bin sup-2
boot system bootflash:/nexus-1000v-mz.4.2.1.SV1.4a.bin sup-2
svs-domain
domain id 1
control vlan 111
packet vlan 112
svs mode L2
svs connection vcenter
protocol vmware-vim
remote ip address 10.1.240.38 port 80
vmware dvs uuid "c4 be 2c 50 36 c5 71 97-44 41 1f c0 43 8e 45 78" datacenter-name New Datacenter
max-ports 8192
connect
vsn type vsg global
tcp state-checks
vnm-policy-agent
registration-ip 0.0.0.0
shared-secret **********
log-level
thank you
Michel -
Nexus 1000v VEM module bouncing between hosts
I'm receiving these error messages on my N1KV and don't know how to fix it. I've tried removing, rebooting, reinstalling host B's VEM but that did not fix the issue. How do I debug this?
My setup,
Two physical hosts running esxi 5.1, vcenter appliance, n1kv with two system uplinks and two uplinks for iscsi for each host. Let me know if you need more output from logs or commands, thanks.
N1KV# 2013 Jun 17 18:18:07 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.52.100 detected as module 3
2013 Jun 17 18:18:07 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 17 18:18:08 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 3 (Unexpected Node Id Request)
2013 Jun 17 18:18:09 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 17 18:18:13 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.51.100 detected as module 3
2013 Jun 17 18:18:13 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 17 18:18:16 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 3 (Unexpected Node Id Request)
2013 Jun 17 18:18:17 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 17 18:18:21 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.52.100 detected as module 3
2013 Jun 17 18:18:21 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 17 18:18:22 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 3 (Unexpected Node Id Request)
2013 Jun 17 18:18:23 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 17 18:18:28 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.51.100 detected as module 3
2013 Jun 17 18:18:29 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 17 18:18:44 N1KV %PLATFORM-2-MOD_DETECT: Module 2 detected (Serial number :unavailable) Module-Type Virtual Supervisor Module Model :unavailable
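In the log above two different host addresses alternate as module 3. A quick way to confirm that pattern over a longer capture is to group the "detected" messages by module number. This is a sketch that assumes the syslog format shown here (the function name is hypothetical):

```python
import re
from collections import defaultdict

DETECTED = re.compile(r"VEM_MGR_DETECTED: Host (\S+) detected as module (\d+)")

def contested_modules(syslog_text):
    """Map each module number to its hosts when more than one host claims it."""
    hosts_by_module = defaultdict(set)
    for host, module in DETECTED.findall(syslog_text):
        hosts_by_module[int(module)].add(host)
    return {
        module: sorted(hosts)
        for module, hosts in hosts_by_module.items()
        if len(hosts) > 1
    }
```

If a module turns out to be contested, comparing the `Card UUID` line from `vemcmd show card` on each host might be a useful next step, since the pattern would be consistent with two hosts presenting the same identity to the VSM. That is a guess to verify, not a confirmed diagnosis.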
N1KV# sh module
Mod  Ports  Module-Type                Model       Status
1    0      Virtual Supervisor Module  Nexus1000V  ha-standby
2    0      Virtual Supervisor Module  Nexus1000V  active *
3    248    Virtual Ethernet Module    NA          ok

Mod  Sw               Hw
1    4.2(1)SV2(1.1a)  0.0
2    4.2(1)SV2(1.1a)  0.0
3    4.2(1)SV2(1.1a)  VMware ESXi 5.1.0 Releasebuild-838463 (3.1)

Mod  MAC-Address(es)                         Serial-Num
1    00-19-07-6c-5a-a8 to 00-19-07-6c-62-a8  NA
2    00-19-07-6c-5a-a8 to 00-19-07-6c-62-a8  NA
3    02-00-0c-00-03-00 to 02-00-0c-00-03-80  NA

Mod  Server-IP       Server-UUID                           Server-Name
1    192.168.54.2    NA                                    NA
2    192.168.54.2    NA                                    NA
3    192.168.51.100  03000200-0400-0500-0006-000700080009  NA

* this terminal session
~ # vemcmd show card
Card UUID type 2: 03000200-0400-0500-0006-000700080009
Card name:
Switch name: N1KV
Switch alias: DvsPortset-1
Switch uuid: e6 dc 36 50 c0 a9 d9 a5-0b 98 fb 90 e1 fc 99 af
Card domain: 2
Card slot: 3
VEM Tunnel Mode: L3 Mode
L3 Ctrl Index: 49
L3 Ctrl VLAN: 51
VEM Control (AIPC) MAC: 00:02:3d:10:02:02
VEM Packet (Inband) MAC: 00:02:3d:20:02:02
VEM Control Agent (DPA) MAC: 00:02:3d:40:02:02
VEM SPAN MAC: 00:02:3d:30:02:02
Primary VSM MAC : 00:50:56:b6:0c:b2
Primary VSM PKT MAC : 00:50:56:b6:35:3f
Primary VSM MGMT MAC : 00:50:56:b6:d5:12
Standby VSM CTRL MAC : 00:50:56:b6:96:f2
Management IPv4 address: 192.168.51.100
Management IPv6 address: 0000:0000:0000:0000:0000:0000:0000:0000
Primary L3 Control IPv4 address: 192.168.54.2
Secondary VSM MAC : 00:00:00:00:00:00
Secondary L3 Control IPv4 address: 0.0.0.0
Upgrade : Default
Max physical ports: 32
Max virtual ports: 216
Card control VLAN: 1
Card packet VLAN: 1
Control type multicast: No
Card Headless Mode : No
Processors: 4
Processor Cores: 4
Processor Sockets: 1
Kernel Memory: 16669760
Port link-up delay: 5s
Global UUFB: DISABLED
Heartbeat Set: True
PC LB Algo: source-mac
Datapath portset event in progress : no
Licensed: Yes
~ # vemcmd show card
Card UUID type 2: 03000200-0400-0500-0006-000700080009
Card name:
Switch name: N1KV
Switch alias: DvsPortset-0
Switch uuid: e6 dc 36 50 c0 a9 d9 a5-0b 98 fb 90 e1 fc 99 af
Card domain: 2
Card slot: 3
VEM Tunnel Mode: L3 Mode
L3 Ctrl Index: 49
L3 Ctrl VLAN: 52
VEM Control (AIPC) MAC: 00:02:3d:10:02:02
VEM Packet (Inband) MAC: 00:02:3d:20:02:02
VEM Control Agent (DPA) MAC: 00:02:3d:40:02:02
VEM SPAN MAC: 00:02:3d:30:02:02
Primary VSM MAC : 00:50:56:b6:0c:b2
Primary VSM PKT MAC : 00:50:56:b6:35:3f
Primary VSM MGMT MAC : 00:50:56:b6:d5:12
Standby VSM CTRL MAC : 00:50:56:b6:96:f2
Management IPv4 address: 192.168.52.100
Management IPv6 address: 0000:0000:0000:0000:0000:0000:0000:0000
Primary L3 Control IPv4 address: 192.168.54.2
Secondary VSM MAC : 00:00:00:00:00:00
Secondary L3 Control IPv4 address: 0.0.0.0
Upgrade : Default
Max physical ports: 32
Max virtual ports: 216
Card control VLAN: 1
Card packet VLAN: 1
Control type multicast: No
Card Headless Mode : Yes
Processors: 4
Processor Cores: 4
Processor Sockets: 1
Kernel Memory: 16669764
Port link-up delay: 5s
Global UUFB: DISABLED
Heartbeat Set: False
PC LB Algo: source-mac
Datapath portset event in progress : no
Licensed: Yes
! ports 1-6 connected to physical host A
interface GigabitEthernet1/0/1
description VMWARE ESXi Trunk
switchport trunk encapsulation dot1q
switchport mode trunk
switchport nonegotiate
spanning-tree portfast trunk
spanning-tree bpdufilter enable
spanning-tree bpduguard enable
channel-group 1 mode active
! ports 7-12 connected to phys host B
interface GigabitEthernet1/0/7
description VMWARE ESXi Trunk
switchport trunk encapsulation dot1q
switchport mode trunk
switchport nonegotiate
spanning-tree portfast trunk
spanning-tree bpdufilter enable
spanning-tree bpduguard enable
channel-group 2 mode active
OK, after deleting the N1KV VMs and vCenter and then reinstalling everything, I got the error again:
N1KV# 2013 Jun 18 17:48:12 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_STATE_CONFLICT: Removing VEM 3 due to state conflict VSM(NodeId Processed), VEM(ModIns End Rcvd)
2013 Jun 18 17:48:13 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 18 17:48:16 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.52.100 detected as module 3
2013 Jun 18 17:48:16 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 18 17:48:22 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_STATE_CONFLICT: Removing VEM 3 due to state conflict VSM(NodeId Processed), VEM(ModIns End Rcvd)
2013 Jun 18 17:48:23 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 18 17:48:34 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.52.100 detected as module 3
2013 Jun 18 17:48:34 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 18 17:48:41 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_STATE_CONFLICT: Removing VEM 3 due to state conflict VSM(NodeId Processed), VEM(ModIns End Rcvd)
2013 Jun 18 17:48:42 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 18 17:49:03 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.52.100 detected as module 3
2013 Jun 18 17:49:03 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 18 17:49:10 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_STATE_CONFLICT: Removing VEM 3 due to state conflict VSM(NodeId Processed), VEM(ModIns End Rcvd)
2013 Jun 18 17:49:11 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 18 17:49:29 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.51.100 detected as module 3
2013 Jun 18 17:49:29 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 18 17:49:35 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_STATE_CONFLICT: Removing VEM 3 due to state conflict VSM(NodeId Processed), VEM(ModIns End Rcvd)
2013 Jun 18 17:49:36 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 18 17:49:53 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.51.100 detected as module 3
2013 Jun 18 17:49:53 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
2013 Jun 18 17:49:59 N1KV %VEM_MGR-2-VEM_MGR_REMOVE_STATE_CONFLICT: Removing VEM 3 due to state conflict VSM(NodeId Processed), VEM(ModIns End Rcvd)
2013 Jun 18 17:50:00 N1KV %VEM_MGR-2-MOD_OFFLINE: Module 3 is offline
2013 Jun 18 17:50:05 N1KV %VEM_MGR-2-VEM_MGR_DETECTED: Host 192.168.52.100 detected as module 3
2013 Jun 18 17:50:05 N1KV %VEM_MGR-2-MOD_ONLINE: Module 3 is online
Host A
~ # vemcmd show card
Card UUID type 2: 03000200-0400-0500-0006-000700080009
Card name:
Switch name: N1KV
Switch alias: DvsPortset-0
Switch uuid: e6 dc 36 50 c0 a9 d9 a5-0b 98 fb 90 e1 fc 99 af
Card domain: 2
Card slot: 1
VEM Tunnel Mode: L3 Mode
L3 Ctrl Index: 49
L3 Ctrl VLAN: 52
VEM Control (AIPC) MAC: 00:02:3d:10:02:00
VEM Packet (Inband) MAC: 00:02:3d:20:02:00
VEM Control Agent (DPA) MAC: 00:02:3d:40:02:00
VEM SPAN MAC: 00:02:3d:30:02:00
Primary VSM MAC : 00:50:56:b6:96:f2
Primary VSM PKT MAC : 00:50:56:b6:11:b6
Primary VSM MGMT MAC : 00:50:56:b6:48:c6
Standby VSM CTRL MAC : ff:ff:ff:ff:ff:ff
Management IPv4 address: 192.168.52.100
Management IPv6 address: 0000:0000:0000:0000:0000:0000:0000:0000
Primary L3 Control IPv4 address: 192.168.54.2
Secondary VSM MAC : 00:00:00:00:00:00
Secondary L3 Control IPv4 address: 0.0.0.0
Upgrade : Default
Max physical ports: 32
Max virtual ports: 216
Card control VLAN: 1
Card packet VLAN: 1
Control type multicast: No
Card Headless Mode : Yes
Processors: 4
Processor Cores: 4
Processor Sockets: 1
Kernel Memory: 16669764
Port link-up delay: 5s
Global UUFB: DISABLED
Heartbeat Set: False
PC LB Algo: source-mac
Datapath portset event in progress : no
Licensed: No
Host B
~ # vemcmd show card
Card UUID type 2: 03000200-0400-0500-0006-000700080009
Card name:
Switch name: N1KV
Switch alias: DvsPortset-0
Switch uuid: bf fb 28 50 1b 26 dd ae-05 bd 4e 48 2e 37 56 f3
Card domain: 2
Card slot: 3
VEM Tunnel Mode: L3 Mode
L3 Ctrl Index: 49
L3 Ctrl VLAN: 51
VEM Control (AIPC) MAC: 00:02:3d:10:02:02
VEM Packet (Inband) MAC: 00:02:3d:20:02:02
VEM Control Agent (DPA) MAC: 00:02:3d:40:02:02
VEM SPAN MAC: 00:02:3d:30:02:02
Primary VSM MAC : 00:50:56:a8:f5:f0
Primary VSM PKT MAC : 00:50:56:a8:3c:62
Primary VSM MGMT MAC : 00:50:56:a8:b4:a4
Standby VSM CTRL MAC : 00:50:56:a8:30:d5
Management IPv4 address: 192.168.51.100
Management IPv6 address: 0000:0000:0000:0000:0000:0000:0000:0000
Primary L3 Control IPv4 address: 192.168.54.2
Secondary VSM MAC : 00:00:00:00:00:00
Secondary L3 Control IPv4 address: 0.0.0.0
Upgrade : Default
Max physical ports: 32
Max virtual ports: 216
Card control VLAN: 1
Card packet VLAN: 1
Control type multicast: No
Card Headless Mode : No
Processors: 4
Processor Cores: 4
Processor Sockets: 1
Kernel Memory: 16669760
Port link-up delay: 5s
Global UUFB: DISABLED
Heartbeat Set: True
PC LB Algo: source-mac
Datapath portset event in progress : no
Licensed: Yes
I used the Nexus 1000v Java installer, so I don't know why it keeps assigning the same UUID, nor do I know how to change it.
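To confirm that both hosts really report the same card UUID, the two `vemcmd show card` outputs can be compared field by field. This is a minimal Python sketch, assuming each output was saved as text. The function names are made up for illustration, and the excerpts are shortened copies of the Host A and Host B output above:

```python
def parse_vemcmd_card(text):
    """Parse the 'Key: value' lines of `vemcmd show card` output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def shared_fields(card_a, card_b, keys):
    """Return the subset of `keys` whose values are identical on both cards."""
    return {
        key: card_a[key]
        for key in keys
        if key in card_a and card_a.get(key) == card_b.get(key)
    }


# Fields that should normally differ from host to host.
SHOULD_BE_UNIQUE = ["Card UUID type 2", "Management IPv4 address"]

host_a = """\
Card UUID type 2: 03000200-0400-0500-0006-000700080009
Card slot: 1
VEM Control Agent (DPA) MAC: 00:02:3d:40:02:00
Management IPv4 address: 192.168.52.100
"""

host_b = """\
Card UUID type 2: 03000200-0400-0500-0006-000700080009
Card slot: 3
VEM Control Agent (DPA) MAC: 00:02:3d:40:02:02
Management IPv4 address: 192.168.51.100
"""

duplicates = shared_fields(
    parse_vemcmd_card(host_a), parse_vemcmd_card(host_b), SHOULD_BE_UNIQUE
)
print(duplicates)
# {'Card UUID type 2': '03000200-0400-0500-0006-000700080009'}
```

The management addresses differ, but the card UUID is identical on both hosts, which matches what the outputs above show.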
Here is the other output you requested,
N1KV# show vms internal info dvs
DVS INFO:
DVS name: [N1KV]
UUID: [bf fb 28 50 1b 26 dd ae-05 bd 4e 48 2e 37 56 f3]
Description: [(null)]
Config version: [1]
Max ports: [8192]
DC name: [Galaxy]
OPQ data: size [1121], data: [data-version 1.0
switch-domain 2
switch-name N1KV
cp-version 4.2(1)SV2(1.1a)
control-vlan 1
system-primary-mac 00:50:56:a8:f5:f0
active-vsm packet mac 00:50:56:a8:3c:62
active-vsm mgmt mac 00:50:56:a8:b4:a4
standby-vsm ctrl mac 0050-56a8-30d5
inband-vlan 1
svs-mode L3
l3control-ipaddr 192.168.54.2
upgrade state 0 mac 0050-56a8-30d5 l3control-ipv4 null
cntl-type-mcast 0
profile dvportgroup-26 trunk 1,51-57,110
profile dvportgroup-26 mtu 9000
profile dvportgroup-27 access 51
profile dvportgroup-27 mtu 1500
profile dvportgroup-27 capability l3control
profile dvportgroup-28 access 52
profile dvportgroup-28 mtu 1500
profile dvportgroup-28 capability l3control
profile dvportgroup-29 access 53
profile dvportgroup-29 mtu 1500
profile dvportgroup-30 access 54
profile dvportgroup-30 mtu 1500
profile dvportgroup-31 access 55
profile dvportgroup-31 mtu 1500
profile dvportgroup-32 access 56
profile dvportgroup-32 mtu 1500
profile dvportgroup-34 trunk 220
profile dvportgroup-34 mtu 9000
profile dvportgroup-35 access 220
profile dvportgroup-35 mtu 1500
profile dvportgroup-35 capability iscsi-multipath
end-version 1.0
push_opq_data flag: [1]
N1KV# show svs neighbors
Active Domain ID: 2
AIPC Interface MAC: 0050-56a8-f5f0
Inband Interface MAC: 0050-56a8-3c62
Src MAC Type Domain-id Node-id Last learnt (Sec. ago)
0050-56a8-30d5 VSM 2 0201 1020.45
0002-3d40-0202 VEM 2 0302 1.33
I cannot add Host A to the N1KV; it errors out with:
vDS operation failed on host 192.168.52.100, An error occurred during host configuration. got (vim.fault.PlatformConfigFault) exception
Host B (192.168.51.100) was added fine. I then moved a vmkernel interface to the N1KV, which brought up the VEM, and that is when I got the VEM flapping errors. -
Nexus 1K VEM module shutdown (with DELL BLADE server)
Hello, this is Vince.
I am running a PoC for an important customer.
Can anyone help explain what the problem is?
I have found a couple of strange behaviors with the Nexus 1000V on Dell blade servers.
The network layout is as follows.
I installed vSphere ESXi on two Dell blade servers.
Each server is connected to the Cisco N5K through Dell M8024 blade switches.
- Two N1KV VSM VMs are installed on the ESXi hosts (as primary and secondary).
- The N5K is connected to the M8024 in a vPC.
- The VSM and the VEMs communicate over the Layer 3 control interface.
- The uplink port-profile uses mac-pinning for port-channel load balancing.
interface control0
ip address 10.10.100.10/24
svs-domain
domain id 1
control vlan 1
packet vlan 1
svs mode L3 interface control0
port-profile type ethernet Up-Link
vmware port-group
switchport mode trunk
switchport trunk allowed vlan 1-2,10,16,30,77-78,88,100,110,120-121,130
switchport trunk allowed vlan add 140-141,150,160-161,166,266,366
service-policy type queuing output N1KV_SVC_Uplink
channel-group auto mode on mac-pinning
no shutdown
system vlan 1,10,30,100
state enabled
n1000v# show module
Mod  Ports  Module-Type                Model       Status
1    0      Virtual Supervisor Module  Nexus1000V  ha-standby
2    0      Virtual Supervisor Module  Nexus1000V  active *
3    332    Virtual Ethernet Module    NA          ok
4    332    Virtual Ethernet Module    NA          ok

Mod  Sw               Hw
1    4.2(1)SV2(2.1a)  0.0
2    4.2(1)SV2(2.1a)  0.0
3    4.2(1)SV2(2.1a)  VMware ESXi 5.5.0 Releasebuild-1331820 (3.2)
4    4.2(1)SV2(2.1a)  VMware ESXi 5.5.0 Releasebuild-1331820 (3.2)

Mod  Server-IP     Server-UUID                           Server-Name
1    10.10.10.10   NA                                    NA
2    10.10.10.10   NA                                    NA
3    10.10.10.101  4c4c4544-0038-4210-8053-b5c04f485931  10.10.10.101
4    10.10.10.102  4c4c4544-0043-5710-8053-b4c04f335731  10.10.10.102
Let me explain the strange behavior.
If I move the primary N1KV VSM from the ESXi host at module 3 to the ESXi host at module 4, the VEM suddenly goes down.
Here are the syslog messages.
2013 Dec 20 15:45:22 n1000v %VEM_MGR-2-VEM_MGR_REMOVE_NO_HB: Removing VEM 4 (heartbeats lost)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Ethernet4/7 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Ethernet4/8 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet1 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet17 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet9 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet37 is detached (module removed)
2013 Dec 20 15:46:53 n1000v %VEM_MGR-2-MOD_OFFLINE: Module 4 is offline
To make it work again, I have to do two things.
First, the vSwitch load-balancing policy has to be set to the source MAC option.
(Port ID is the default.)
Second, the vSwitch failover order is very important.
If I change this order, the VEM goes offline very soon afterward.
Here is a screen capture of these options (the labels are in Korean).
In my opinion, the main problem is the link between the ESXi hosts and the M8024.
As shown, each ESXi host is connected to two separate Dell M8024 blade switches.
I read the manual on N1K uplink load balancing.
There are 16 different port-channel load-balancing methods,
but only src-mac should be used if the upstream switches do not support port channels.
Still, I don't know exactly why this happens.
Can anyone help me make this work better?
Thanks in advance.
Best Regards,
Vince
There's not enough information to determine the cause from those two outputs alone. All those commands tell us is that the VSM is removing and re-attaching the VEM.
The usual cause of a flapping VEM is a problem with control VLAN communication. The loss of 6 consecutive heartbeats will cause the VEM to detach from the VSM. We need to isolate why that is happening.
- Which version of the 1000v and ESX are you running?
- Are multiple VEMs affected, or just one?
- Are the VSM's interfaces hosted on the DVS or on a vSwitch?
- What is the network topology between the VEM and the VSM (primarily for the control VLAN)?
- Do you have a Cisco SR number so I can take a look into it?
TAC is your best course of action for an issue like this. It will likely take live troubleshooting in your network environment to determine the cause.
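To illustrate the heartbeat rule mentioned above, here is a small Python model of "detach after 6 consecutive missed heartbeats". It is only a sketch of the counting logic, not the VSM's actual implementation. The function name and the boolean-list input are assumptions made for the example:

```python
# From the reply above: 6 consecutive lost heartbeats detach the VEM.
MISSED_HEARTBEAT_LIMIT = 6


def first_detach_index(heartbeats, limit=MISSED_HEARTBEAT_LIMIT):
    """Return the index at which `limit` consecutive heartbeats have been missed.

    `heartbeats` is an iterable of booleans (True = heartbeat received).
    Returns None if the limit is never reached.
    """
    missed = 0
    for index, received in enumerate(heartbeats):
        # Any received heartbeat resets the run of misses.
        missed = 0 if received else missed + 1
        if missed >= limit:
            return index
    return None


# Five misses followed by a recovery: the VEM stays attached.
print(first_detach_index([True] + [False] * 5 + [True]))  # None

# Six misses in a row: the limit is reached at index 6.
print(first_detach_index([True] + [False] * 6))  # 6
```

The point of the model is that brief control-path interruptions are tolerated, while a sustained loss such as the one seen during the VSM move crosses the limit and triggers the VEM_MGR_REMOVE_NO_HB message.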
Regards,
Robert