N7K keepalive issue

Dear All,
We have two Cisco Nexus 7010 switches. We originally had one link between them for the vPC peer-keepalive, and we decided to make it two links for redundancy.
We built an L3 port channel with the configuration below:
interface Ethernet1/1
  channel-group 100 mode active
  no shutdown
interface Ethernet2/1
  channel-group 100 mode active
  no shutdown
interface port-channel100
  vrf member vrf_ka
  ip address 192.168.1.2/24
vpc domain 10
  peer-keepalive destination 192.168.1.1 source 192.168.1.2 vrf vrf_ka
  peer-gateway
  reload restore
Note: module 1 is Base-T, while module 2 is SFP-based.
Since then we have been receiving the errors below, and we can't identify the reason for them:
Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100
Nexus_BK %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 10, VPC peer keep-alive receive has failed
Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100
Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 10, vPC peer keep-alive receive is successful
Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100
Nexus_BK %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 10, VPC peer keep-alive receive has failed
Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100
Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 10, vPC peer keep-alive receive is successful
We also noticed that HSRP between the two Nexus switches flaps right after the above errors.
Does anyone have any clue about this issue?
Thanks in advance,
/Ali
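
To put a number on how often the keepalive is flapping, the log lines quoted in the post can be counted with a small script. This is only a sketch: the function name `count_keepalive_flaps` is made up for illustration, and it assumes the syslog mnemonics look exactly like the ones pasted above.

```python
import re

# Matches only the two state-changing mnemonics from the post above.
EVENT_RE = re.compile(r"%VPC-\d-PEER_KEEP_ALIVE_RECV_(FAIL|SUCCESS)\b")


def count_keepalive_flaps(lines):
    """Return (fail_count, recovery_count) for an iterable of syslog lines."""
    fails = recoveries = 0
    failed = False  # True while the last state change seen was RECV_FAIL
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # RECV_INT_LATEST lines carry no state change
        if match.group(1) == "FAIL":
            fails += 1
            failed = True
        elif failed:
            recoveries += 1
            failed = False
    return fails, recoveries


# The eight lines quoted in the post.
SAMPLE = [
    "Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100",
    "Nexus_BK %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 10, VPC peer keep-alive receive has failed",
    "Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100",
    "Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 10, vPC peer keep-alive receive is successful",
    "Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100",
    "Nexus_BK %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 10, VPC peer keep-alive receive has failed",
    "Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 10, VPC peer-keepalive received on interface Po100",
    "Nexus_BK %VPC-6-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 10, vPC peer keep-alive receive is successful",
]

if __name__ == "__main__":
    print(count_keepalive_flaps(SAMPLE))  # prints (2, 2)
```

Running the same function over a full log export, with timestamps kept in the lines, would make it easier to check whether each HSRP flap really follows a keepalive failure.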

Dear Reza,
Thanks for your fast response. On page 27 of the document you posted, there is the following strong recommendation for the keepalive link:
Dedicated link(s) (1-Gigabit Ethernet port is enough) configured as L3. Port-channel with 2 X 1G port is even better.
That is what we relied on when creating the port channel.
What I am not sure about is whether there is a problem with a port channel where one link is fiber and the other is UTP, as in our situation, and what the relation is between HSRP and the keepalive, or whether it is just a coincidence.
I hope to find a solution to this strange problem.
Again, thanks for your support.
/Ali

Similar Messages

  • Cisco WISM2 started keepalive issue

    Hi all,
    I have a problem with my WiSM2 in a 6500: when I issue the command #show WISM status, the status of the WiSM is "Started KeepAlive", not operational.
    Please advise if there is a solution other than reloading the WiSM.

    Is your WiSM service-vlan configured?
    Steve

  • Keepalive Issue

    Hello all,
    Wondering if anyone can help me here. I've a Cisco CSS 11500 7.20 Build 104. I've 5 WEB services configured on a CSS whereby at the same time nearly every night we have a service transition change on 1 service only going from ALIVE state, to DYING and finally DOWN. It stays in DOWN state for a short period of time but always comes back ALIVE by itself. This issue never happens during the day.
    I've done an Ethereal capture where I think the issue might be. The HTTP/1.1 200 ok (text/html) is missing from the sessions when the issue happens.
    Many thanks for any insight you may be able to give on this issue. Is the problem on the server or on the CSS?
    If I can provide any further information, please let me know.
    Regards,
    Michael.

    Hi Michael,
    If you have the service on the CSS configured to do an HTTP keepalive, it waits for a 200 OK from the server when it sends the probe. If the server fails to send the 200 OK, the expected behavior on the CSS is to move the service to the DYING state and then to DOWN.
    Taking a sniffer trace, as you've done, is the best way to see this behavior. Have you been able to determine what is happening with the application on the server at the time this occurs?
    Given that this is happening at the same time every night, I would suspect there is a process on the server that stops the WEB application for a while, failing to respond to the HTTP keepalive from the CSS. Thanks!
    Regards,
    Jose Quesada.
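
The ALIVE, DYING, DOWN sequence described in the reply above can be modelled roughly as follows. This is a simplified sketch rather than the CSS's actual implementation: the class name `ServiceKeepalive`, the `max_failure` threshold of 3, and the rule that any non-200 result counts as a failed probe are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceKeepalive:
    """Toy model of an HTTP keepalive tracking one service (hypothetical)."""

    max_failure: int = 3   # assumed threshold of consecutive failed probes
    state: str = "ALIVE"
    failures: int = 0

    def record_probe(self, status_code):
        """Update the state from one probe; status_code is None on timeout."""
        if status_code == 200:
            # A good response resets the counter and revives the service.
            self.failures = 0
            self.state = "ALIVE"
        else:
            self.failures += 1
            if self.failures >= self.max_failure:
                self.state = "DOWN"
            else:
                self.state = "DYING"
        return self.state


def replay(status_codes, max_failure=3):
    """Feed a sequence of probe results through a fresh model."""
    service = ServiceKeepalive(max_failure=max_failure)
    return [service.record_probe(code) for code in status_codes]


if __name__ == "__main__":
    # Three missed 200 OK responses, then the server answers again.
    print(replay([200, None, None, None, 200]))
    # prints ['ALIVE', 'DYING', 'DYING', 'DOWN', 'ALIVE']
```

A short nightly outage on the server, as suspected above, would produce exactly this pattern: a few missed probes, a brief DOWN period, and a return to ALIVE without any intervention.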

  • NAS & N7K cabling issue

    I am having an issue with connectivity between NAS box (QNAP TS-EC1279U-RP) and Nexus 7K.
    I have a NAS device with 2 * 10G Copper ports which will be connected to Nexus 7K via patch panels. All the ports on my 7K are SFP+ ports and I need to find an SFP or converter or a cable which will support this setup and run at 10G without interoperability issues.
    Can someone quickly suggest a converter or anything else that would work? I have heard about Twinax cables but am not sure whether they fit this setup.
    What considerations and issues should I keep in mind?

    The link that was posted to contact Openreach is the correct contact point as it is damaged Openreach infrastructure and nothing to do with BT Retail your service provider this is the actual page you need to use faulty line
    Plant
    http://www.openreach.co.uk/orpg/home/submitFeedback.do?contactReason=complain_damage_external_line_p...
    Only Openreach can deal with this you will not be charged for the visit
    The statement applies to line faults and internal wiring faults Not line plant problems

  • Office extend 1142 and dtls keepalive failure

    Hi
    I am setting up office extend with 1142 APs on a 5508 controller.  All seems ok and I see my SSIDs on the remote AP.   However when I try to connect I don't get a dhcp address and the connection fails.  When I look at logs and some debugs I see dtls keepalive failures and the AP is actually disconnecting and re-associating with the controller.
    As a troubleshooting step I decided to disable Data encryption through the AP advanced tab and after the AP resets all is now working.
    Would anyone have an idea why data encryption would cause the issue ?  I have opened the standard 5246 and 5247 UDP ports on my firewall.  Have I missed out some other port that may need opened ?
    Many thanks, St.

    Scott
    The AP is changed from Local mode to H-REAP mode.
    In the H-REAP tab we have Enable Office Extend ticked.
    In the Advanced tab, to get this to work, the Data Encryption box is unticked, and the text below says Current Data Encryption Status is plain text.  I can't think of any other settings related to office extend other than the NAT stuff on the management interface and allowing 5246 and 5247 through the firewall.
    So if these settings are being correctly reported, the question is why I then see Data and ctrl being encrypted when I do "Show dtls connections".  If I have unticked Data encryption, I expect to see only ctrl connections being encrypted.
    I can't see any other config issue that would allow dhcp and a connection to work with Data encryption disabled and cause it to fail with Data encryption enabled. 
    The AP always joins the controller no matter what the Data encryption setting is.  However with it unticked the AP retains its connection to the controller and I can get an IP and pass data normally.  With the data encryption box ticked the AP joins the controller then soon afterwards drops off reporting a DTLS keepalive issue.  No IP address and no data passed.  In fact with data encryption ticked I see a message of the form "DTLS plumbed in" or something similar.  Then soon after I get the keepalive error and the AP drops off.
    Thanks, St.

  • Double-sided vpc Nexus7K and NExus 5K

    I have two 5548UP connected to two Nexus 7k forming double sided vPC design. The 5k is acting as access layer top of the rack and 7k as aggregate. The 7k is configured with vrf contexts and vlans with vrrp.
    I followed the best practices for double-sided vpc described in this document
    http://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/sw/design/vpc_design/vpc_best_practices_design_guide.pdf
    I'm using the same port-channel-id and vpc-id in all 4 devices. I also have different vpc domain-ids for the N5K (domain-id:20) and N7K (domain-id:10)
    All physical connectivity has been verified. I have 16-ports port-channels (8 ports on each peer)
    I have also verified all vpc status: show vpc, show vpc consistency-parameters global, and everything looks good in both N5K and N7K: keepalive is alive, peer adjacency formed ok, all consistency statuses are success, etc.
    I have also checked port-channel configurations and each individual interface member of the port-channel
    The problem I have is that the port-channel in the N7K is not coming up, and I just can't figure out why. I've checked all I can think of and tried different things, and the only way I can bring the port-channel up is to shut down all interfaces in the N5K connecting to the N7K except those between 5K-1 and 7K-1 (the primary devices) and remove the "vpc 31" command under port-channel 31.
    In the Nexus 5K, things look good. This is what I see when I check the status of the vpc:
    vPC status
    id     Port        Status Consistency Reason                     Active vlans
    31     Po31        up     success     success                    14-16,24,28
                                                                     -31        
    However, in the N7K this is what I get. There are no consistency errors, but the status of the port-channel is still down.
    vPC status
    id   Port      Status Consistency Reason                  Active vlans
    31   Po31      down   success     success                    -           
    sh int po31
    port-channel31 is down (inactive)
    admin state is up
     vPC Status: Up, vPC number: 31
    Any ideas what could be wrong? has anyone experienced similar issue?
    Any suggestions are welcome.

    Could you provide a quick diagram and the output of 'show port-channel summary' on both switches.
     

  • Workaround CSCea46385 Bug

    I need some details about workaround for CSCea46385 Bug.
    The workaround mentioned is to disable keepalive on fiber interfaces and uplink interfaces, as of the 12.2SE releases.
    I use GigaStack between 2950 switches. What is recommended for that?
    Is it possible to remove keepalive on all trunk interfaces (copper, fiber, GigaStack) between switches?
    Thanks in advance

    The err-disabled ports were fiber ports.
    We had a major issue on our LAN with 25 switches. After a GBIC change on our 6513, all gigabit trunk ports connected to this 6513 went err-disabled at the same time due to a keepalive issue. One hour later, the second gigabit port on all affected 2950s went err-disabled as well. It was very strange because the second gigabit port was plugged into my secondary 6513. I never understood exactly what happened.
    I took a look at the Bug Tool web site, and there are two bugs referenced which could be the origin of my problem: CSCea46385 & CSCeg58877
    That's why I would like to disable keepalive on all trunk ports: fiber, copper and GigaStack.
    I have tried to find a full explanation of keepalive on the Cisco website, but it is really confusing. I would appreciate it if you could explain exactly how it works.

  • Using ssh without being asked for a password.

    Hey all,
    I need to access a new network which is now protected by firewalls. These firewalls will disconnect sessions that are idle for over an hour, this is a problem for a lot of Sun protocols that don't use keepalives (another Sun only idea!!!) such as 'rlogin', 'rsh', 'telnet' etc.
    I need to use a protocol such as 'ssh' or equivalent which uses keepalives to remote login to systems inside the protected network to overcome the firewall dropping the sessions. The systems inside this network are all on Solaris 8.
    The thing is that I need to overcome 'ssh' requirement for password authentication, as the users are clicking on a menu application that automatically does a rsh and starts the application without prompting the user for any information (I know you should use ssh with authentication, but in this case I cannot use it). Has anyone been able to configure 'ssh' on a system wide basis for all users to not ask for a password, and use standard NIS authentication with the hosts.equiv instead.
    I have found plenty of examples of how to do this in Linux, but since Sun has decided not to implement ssh in the standard way like every other UNIX vendor and to use wrappers, none of those examples will work on Solaris.
    If someone has found a way of overcoming the keepalive issue with rlogin, rsh etc. I'd be really interested in knowing the hack done to get it working, as I would prefer to avoid installing anything on those systems in the new network.
    Thanks for reading,
    Mick.

    You could probably try re-installing SSH using the standard OpenSSH source, not the ones provided by sun.
    More of a pain, as you have to install on all the machines, but it would allow you to do as you said.
    It might also be possible to use a middleman Linux machine to do it, but I'm not sure how you would go about doing it that way.
    Not a solution, but some pointers. Hopefully, it helps.

  • CSS11503 Keepalive Script Issue

    I had an issue today where I sent my config via ftp to my CSS11503 (sg0810401). On several of my keepalives I have a script configured to test connectivity for the LDAP ports.  At the time that I sent my config to my CSS I had not yet loaded the script into the /script directory.  After I loaded my config I restarted my CSS and everything looked good, then I uploaded my script file (ap-kal-ldap-cto).  I checked my services and all of them said they could not find the script in the directory, but I was able to run the script against the IP of one of my services without any issues.  I verified the script by issuing the show script ap-kal-ldap-cto command, and it displayed my script just as it had been written.  Another thing I noticed was that when I tried to remove the keepalive from one of the services, I was unable to issue the command "no keepalive type script", as the "type" keyword was not available.  I did see the other keepalive keywords ("frequency", "hash", "http-rspcode", "maxfailure", "uri" and a few others), but no "type".
    I changed all my keepalives to a ping for now, but does anyone know what's going on with this thing?  I think if I reboot the issue will be resolved, but I really think it should have worked without any issue.

    Good morning,
    There are two different points to be discussed here.
    First of all, why did the CSS complain that the script couldn't be found? The answer is simple: as you said, when the configuration was applied the script was not present on the device. Even if the script is uploaded later, it will not be detected properly. Either a reload or re-applying the keepalive configuration should fix this.
    This brings me to the second point. To remove a keepalive, the command you need to use is "keepalive type none" instead of "no keepalive type script".
    Regards
    Daniel

  • VPC Compatibility issues on N7k?

    I'm trying to get VPC working on an N7K that's got a few of the N7K-M148GT-11 copper modules installed.
    Whenever I have these linecards installed, VPC complains that it's not available:
    vPC enable status: Incompatible hardware not enabling vPC
    disabling the copper linecards allows me to now enable VPC. When I re-enable the copper linecards, they show an "ok" status in the output of show module, but they're not available in the box to configure. The interfaces don't show up in any status outputs, I can't configure them, etc.
    I've made sure that I've got the latest EPLD installed on every device in these 7ks.
    n7000-s1-epld.4.1.4.img
    What is suspicious is that I have V1.0 linecards:
    2 48 10/100/1000 Mbps Ethernet Module N7K-M148GT-11 ok
    2 4.1(3) 1.0
    Is anybody aware of a hardware rev that'd make this an incompatible configuration? I can't find any documentation regarding VPC requirements anywhere.
    (I also have some N7Ks with v1.3 hardware- and that doesn't seem to have any issues with VPC. So I've got a sinking suspicion that I'm stuck.)

    Yes, as a minimum you need 1.3 h/w rev 10/100/1000 Mbps Ethernet Module (N7K-M148GT-11).
    open a TAC case, they can assist you with hardware replacements at no charge to you.

  • Issues with a http get keepalive

    We are having an issue with some keepalives that are causing our web servers to run really slowly. I have explained the environment below and provided some configs from the CSS.
    I would appreciate any insight you could provide as to why this issue is occurring and how we might avoid it.
    Thanks,
    Jim
    Environment:
    CSS 11150 with 4.01 (Build 19) code.
    We have 2 applications that connect to backend oracle databases. In order to monitor the database connections, we wrote an asp page for each application that queries the database and returns a normal status web page if a connection was established. If the database connection can not be obtained, an error web page is returned. We created 2 services with a keepalive method of get pointing to the asp page we wrote with the expectation that it would create a checksum for the normal web page.
    Issue:
    The problem we are having is that when we activate these two services, the web server that is running the applications slows down considerably. In addition to the speed issues, we also sometimes get a "page cannot be displayed" error from the application. If you get that error but hit refresh in your browser, the application comes back. If we suspend the services, the server speeds back up and we have no issues with the application.
    Configuration:
    **************************Service**********************
    service web02qi-compoint-SMDR
    ip address 172.28.1.102
    keepalive type http
    keepalive frequency 10
    keepalive maxfailure 2
    keepalive method get
    keepalive uri "/keepalive/compointkeepalive.asp"
    service web02qi-issuestrk-SMDR
    ip address 172.28.1.102
    keepalive type http
    keepalive frequency 10
    keepalive maxfailure 2
    keepalive method get
    keepalive uri "/keepalive/crmdbcheck.asp"
    ************************Owner*************************
    content compoint1
    protocol tcp
    port 80
    url "/dataproducts/*"
    add service redirect-compoint
    vip address 172.28.1.100
    balance aca
    add service web02qi-compoint-SMDR
    active
    content compoint2
    protocol tcp
    port 80
    url "/dpcompoint/*"
    add service redirect-compoint
    vip address 172.28.1.100
    balance aca
    add service web02qi-compoint-SMDR
    active
    content issuestrk
    protocol tcp
    port 80
    url "/crmdpd/*"
    add service redirect-issuestrk
    vip address 172.28.1.100
    add service web02qi-issuestrk-SMDR
    balance aca
    active

    What happens if you remove the uri from the service? Does the server speed return to normal? What is the service 'redirect-compoint'? Try removing any unnecessary commands, such as balance aca, to see if anything changes.
    What is the response time of your keepalive asp pages? What happens if you run this asp page from one computer while you access the application from a different computer? That simulates what the CSS is doing and lets you see what happens.
    If nothing else works, try upgrading to version 5.0.
    Hope this helps.
    Brad
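
As a rough back-of-the-envelope check on how much load the two keepalives in the posted config put on the server, the probe rate can be computed as below. This assumes `keepalive frequency` is the interval in seconds between probes and that each probe runs the ASP page (and its database query) once; the helper name `probes_per_minute` is made up for this sketch.

```python
def probes_per_minute(frequency_seconds, service_count=1):
    """Probes sent per minute when each service is probed every N seconds."""
    if frequency_seconds <= 0:
        raise ValueError("frequency_seconds must be positive")
    return service_count * 60 / frequency_seconds


if __name__ == "__main__":
    # Two services, both pointing at 172.28.1.102, each with frequency 10.
    print(probes_per_minute(10, service_count=2))  # prints 12.0
```

Twelve database-backed page hits a minute is not much on its own, which is why the response time of the ASP pages themselves is worth measuring.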

  • N7k and UCS weird issues

    We have some weird problems in our N7k/UCS core, and I just wanted to hear if you have any ideas off the top of your head. The setup is 2 sites with 2 N7k on each site and one UCS (two fabrics) on each site. OTV is running between the sites, and the UCS is connected via VPC to both N7k on the site. The UCSs mainly have Hyper-V hosts, although there are a few bare metal servers. We suspect that the problems began when we added a connection from our old 6k core to the UCSs. The VLANs to our old core are not propagated via OTV between the N7k sites. When the 6k link was connected, only pin groups were used, which we now know was not the correct way to do it. We have rectified that error, and the connection now follows the “Disjointed L2” recommendations. We have no problems with hosts on the newly connected link to our old 6k core. The symptoms are interesting, to say the least:
    We have intermittent total loss of communication between hosts on different VLANs routed in the 7ks, e.g. 10.1.1.1 to 10.1.2.1, despite the fact that I see ARP entries in the 7ks. I also see the MAC address on the correct uplink, and I see the MAC address in the UCS. If I do a shut/no shut on the VLAN interfaces on a site, different hosts will have the problem (or not).
    I have tried both 6.2.6 and 6.2.8(a), we have rebooted all 7k:s and all UCS:s. The UCS runs 2.2.1c.
    Is this something that rings a bell?

    Dear Prathamesh,
    If the UWL iview is a Web Dynpro based iview, then it will not work properly. Web Dynpro iviews are supported with the default framework page, not in the external facing portal (light framework page).
    You may try accessing the SAP portal through a fully qualified domain name.
    Also check that the SAP portal URL is in the SAP recommended format. Refer to SAP Note 654326 (Domain restrictions in a portal environment) and SAP Note 654982 (URL requirements due to Internet standards).
    Hope it helps
    Best Regards
    Arun Jaiswal

  • Nexus 7000 w/ 2000 FEX - Interface Status When N7K has power issues

    I have multiple Nexus 2Ks connected to two N7Ks, with my servers connecting to multiple N2Ks accordingly with dual NIC failover capability.
    What happens, or what state do the N2K interfaces go into, if one of my N7Ks loses power, given that the N2K is effectively still receiving power?

    Hi Ans,
    You are right. I have defaulted the port again and configured it with switchport mode fex-fabric, and now the FET-10G is validated:
    NX7K-1-VDC-3T-S1-L2FP(config-if)#  description FEX-101
    NX7K-1-VDC-3T-S1-L2FP(config-if)#   switchport
    NX7K-1-VDC-3T-S1-L2FP(config-if)#   switchport mode fex-fabric
    NX7K-1-VDC-3T-S1-L2FP(config-if)#   fex associate 101
    NX7K-1-VDC-3T-S1-L2FP(config-if)#   medium p2p
    NX7K-1-VDC-3T-S1-L2FP(config-if)#   channel-group 101
    NX7K-1-VDC-3T-S1-L2FP(config-if)#   no shutdown
    NX7K-1-VDC-3T-S1-L2FP(config-if)#
    NX7K-1-VDC-3T-S1-L2FP(config-if)# sh int e7/33 status
    Port             Name            Status    Vlan      Duplex  Speed   Type
    Eth7/33          FEX-101         notconnec 1         auto    auto    Fabric Exte
    NX7K-1-VDC-3T-S1-L2FP(config-if)#
    Thanks for your help, and have a nice weekend.
    Atte,
    EF

  • Dedicated VPC keepalive for specific VDC on N7K?

    Hi,
    I have heard that I did it wrong by sharing vrf mgmt0 on the Sup between various VDCs for the vPC keepalive, instead of using a dedicated one for each VDC.
    Is that true? Is there any official reference, or is that just a rumour?
    Also, how can I prove with *show* commands that my vrf mgmt0 is working well? show vpc keepalive and looking at the statistics?
    All opinions are welcome. Thanks.
    Nipat CCIE#29422

    Hi,
    You should use a dedicated mgmt0 for each VDC.  That is one of the reasons for having a separate VDC, so you can have each customer access their own VDC and not the others.
    HTH

  • WAE and N7K issue

    Issue details: the WAAS optimizers were not optimizing traffic between two locations, drastically dropping performance on FTP connections, and we were also seeing the WAE disconnect from the CM. Something pushed from the CM seems to be causing the WCCP disconnect, but I am not sure about it.
    Jun 14 01:58:46 APDC4R10-NWAE02 wccp: %WAAS-WCCP-5-500024: Removing router 0.0.0.0 from router table.
    The issue is sporadic in nature. An SR has been opened and TAC has given an action plan for when the issue comes again, but I am sure the same issue has happened somewhere before and a solution must already exist, rather than a reactive approach of waiting for the issue to recur.
    I would appreciate it if anyone who already knows the solution could share it.
    Tkx

    Hello Kiran,
    Because your WCCP tunnel is going down randomly, I believe this is either a design issue or a WCCP configuration problem.
    Because it happens randomly, it is hard to run captures at the time of failure, but we can still review captures taken while it is actually working.
    There are four WCCP V2 messages:
       * Here I AM
       * I See You
       * Redirect Assign
       * Removal Query
    Each WCCP message comprises a WCCP Message Header followed by a number of message components, for example if the length value  or any  component header is not set as expected one might expect to see WCCP errors.
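
When reviewing a capture, the header check described above can be sketched in a few lines. The layout used here (32-bit type, 16-bit version, 16-bit length, with the type codes 10 to 13 and a length that excludes the header) is recalled from the WCCP v2 Internet-Draft and is not stated in this thread, so treat it as an assumption to verify; the function name `parse_wccp2_header` is made up for this example.

```python
import struct

# Message type codes as recalled from the WCCP v2 draft (assumption).
WCCP2_MESSAGE_TYPES = {
    10: "Here I Am",
    11: "I See You",
    12: "Redirect Assign",
    13: "Removal Query",
}

# Network byte order: type (32 bits), version (16 bits), length (16 bits).
HEADER = struct.Struct("!IHH")


def parse_wccp2_header(datagram):
    """Decode the message header and check the declared length."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than the 8-byte WCCP header")
    msg_type, version, length = HEADER.unpack_from(datagram)
    payload_len = len(datagram) - HEADER.size
    return {
        "type": WCCP2_MESSAGE_TYPES.get(msg_type, f"unknown ({msg_type})"),
        "version": version,
        "declared_length": length,
        # The length field is assumed to exclude the header itself.
        "length_ok": length == payload_len,
    }


if __name__ == "__main__":
    # Synthetic "Here I Am" with a 4-byte dummy body, for illustration only.
    sample = struct.pack("!IHH", 10, 0x200, 4) + b"\x00" * 4
    print(parse_wccp2_header(sample))
```

A `length_ok` of False on packets taken from a capture would point at the kind of malformed header mentioned above.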
    Here are the Nexus WCCP compatibility notes:
    - Only mask assignment is supported as the assignment method. In other words, the Nexus and the WAAS device are L2 connected and should be properly configured to run mask assignment, not hash.
    - In addition, any packets using "bypass return" should go via L2.
    - Packet egress redirection goes via IP forwarding and negotiated L2 return as well.
    - WCCP GRE return is not supported, and WCCP GRE redirection is not supported.
    It would be good to upgrade to a newer WAAS version; you're about 30 versions away from the latest one. There have been many caveats opened and fixed for WCCP/WAAS in previous code, in addition to the enhancements you're missing.
    good luck,
