[OSPF/LDP/PW Fast convergence] ASR9k/ME3800/ME3600

Hello, Netpro:
I'm having a bit of a problem with OSPF/LDP Pseudowire fast convergence.
I have a test lab scenario with 2 ASR9000, 1 ME3800X and 1 ME3600X.
They are connected in an L3 MPLS ring, like this:
ME3800X---[a]---ASR1--[b]--ASR2---[c]---ME3600X
   |                                       |
   +------------------[d]------------------+
The MEs are connected back to back closing the loop.
The ring is configured with OSPF fast convergence timers, LDP session protection, MPLS LDP-IGP sync, and BFD for OSPF on all interfaces.
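For reference, a rough sketch of that kind of configuration (interface names, process IDs and timer values are placeholders, not the actual lab config):
! IOS side (ME3600X/ME3800X), sketch only
mpls ldp session protection
!
interface GigabitEthernet0/1
 bfd interval 50 min_rx 50 multiplier 3
!
router ospf 1
 timers throttle spf 50 200 5000
 bfd all-interfaces
 mpls ldp sync
! IOS XR side (ASR9000), sketch only
router ospf 1
 area 0
  interface TenGigE0/0/0/0
   bfd fast-detect
   bfd minimum-interval 50
   bfd multiplier 3
   mpls ldp sync
!
mpls ldp
 session protection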
For testing purposes I've configured a pseudowire between the MEs and forced the path via OSPF to go through the ASRs.
My problem is that I do not get the same results when breaking the ring in different places. For example, if I break the ring at (b) or (d) I get ~100 ms of loss (when breaking the (d) connection, I change OSPF so that the PW takes the direct route between the MEs). If I break the ring at (a) or (c) I get ~500 ms of loss.
Also for testing purposes, I ran a similar test with regular L3 interfaces (no pseudowire) between the MEs, and in all scenarios I get 60-100 ms of loss. So I'm assuming OSPF is converging as it should and the issue is within LDP.
Any ideas?
Regards,
c.

Yes, I did test traffic engineering, but there's a bug (on the 3600/3800) where a PW stops passing traffic when it changes paths, so that part is stuck.
Is there an OSPF problem on the ASR? I didn't understand your comment very well.
My description of the scenario is simplified on purpose. The real question here is: what is going on between an IOS box and an XR box such that when you cause a fault between those two, LDP recovers in ~500 ms, while a fault between same-OS boxes recovers in ~100 ms?
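One way to narrow that down (a suggestion, not something from the original thread) is to compare LDP and BFD state on the IOS and XR boxes on either side of each break point, before and after the failure, e.g.:
show mpls ldp neighbor detail   ! targeted-hello / session protection state (IOS and XR)
show mpls ldp igp sync          ! per-interface LDP-IGP sync state (IOS and XR)
show bfd neighbors              ! IOS; on XR use "show bfd session"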

Similar Messages

  • DMVPN fast convergence with ebgp and capacity planning

    Hi.
    In a dual-hub/dual-DMVPN design, eBGP is running over the GRE DMVPN tunnels (without IPsec, no encryption). When the main hub goes down, it takes a long time for the spokes to detect that the primary hub is down (BGP hold-down timer) and then converge to the secondary hub.
    Apart from tuning the BGP timers, is there any other way to achieve fast convergence without over-utilizing resources (memory/CPU)?
    The hub routers are ASR1001s and there will be ~70 DMVPNs, with ~100 spokes per DMVPN.
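    As a hedged illustration of the BGP timer tuning mentioned above (the AS number and values are placeholders; the neighbor addresses are the hub tunnel IPs seen in the outputs below, and aggressive timers have a CPU cost at this scale):
    router bgp 65000
     neighbor 178.178.178.1 timers 10 30
     neighbor 179.179.179.1 timers 10 30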
    Thanks,
    Carlos.

    Marcin,
    After deleting the command "if-state nhrp" from both tunnels, I see that one of the tunnels changes its state to up/up and I can recover reachability to the remote hub.
    R21#show ip int brief
    Interface                  IP-Address      OK? Method Status                Protocol
    FastEthernet0/0            unassigned      YES NVRAM  administratively down down
    FastEthernet1/0            172.16.254.3    YES DHCP   up                    up
    FastEthernet1/1            unassigned      YES NVRAM  administratively down down
    Loopback0                  21.21.21.21     YES manual up                    up
    Tunnel178                  178.178.178.21  YES NVRAM  up                    down
    Tunnel179                  179.179.179.21  YES manual up                    up
    R21#config t
    Enter configuration commands, one per line.  End with CNTL/Z.
    R21(config)#int tun 178
    R21(config-if)#no if
    R21(config-if)#no if-state nh
    R21(config-if)#no if-state nhrp
    R21(config-if)#int tun 179
    R21(config-if)#no if-state nhrp
    R21(config-if)#
    *Jun 27 00:18:06.511: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel178, changed state to up
    R21(config-if)#^Z
    R21#show ip
    *Jun 27 00:18:16.103: %SYS-5-CONFIG_I: Configured from console by console
    R21#show ip int brief
    Interface                  IP-Address      OK? Method Status                Protocol
    FastEthernet0/0            unassigned      YES NVRAM  administratively down down
    FastEthernet1/0            172.16.254.3    YES DHCP   up                    up
    FastEthernet1/1            unassigned      YES NVRAM  administratively down down
    Loopback0                  21.21.21.21     YES manual up                    up
    Tunnel178                  178.178.178.21  YES NVRAM  up                    up
    Tunnel179                  179.179.179.21  YES manual up                    up
    R21#show ip nhrp nhs detail
    Legend: E=Expecting replies, R=Responding, W=Waiting
    Tunnel178:
    178.178.178.1   E priority = 0 cluster = 0  req-sent 6  req-failed 0  repl-recv 0 (01:11:41 ago)
    Tunnel179:
    179.179.179.1  RE priority = 0 cluster = 0  req-sent 1020  req-failed 0  repl-recv 522 (00:00:01 ago)
    R21#
    From the last output of "show ip nhrp nhs detail" I see that only the peer on tunnel 179 is marked as RE (Responding, Expecting replies). I want to know why the peer on tunnel 178 is not also in that state.
    I can ping both the NBMA (physical) and the virtual (tunnel) IP addresses of both hubs.
    ########## Ping to hub 1 nbma (physical) address:
    R21#ping vrf dmvpn 200.0.0.178
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 200.0.0.178, timeout is 2 seconds:
    Success rate is 100 percent (5/5), round-trip min/avg/max = 200/303/436 ms
    ########## Ping to hub 2 nbma (physical) address:
    R21#ping vrf dmvpn 201.0.0.178
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 201.0.0.178, timeout is 2 seconds:
    Success rate is 100 percent (5/5), round-trip min/avg/max = 144/220/268 ms
    ########## Ping to hub 1 virtual (tunnel) address:
    R21#ping 178.178.178.1
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 178.178.178.1, timeout is 2 seconds:
    Success rate is 100 percent (5/5), round-trip min/avg/max = 164/186/228 ms
    ########## Ping to hub 2 virtual (tunnel) address:
    R21#ping 179.179.179.1
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 179.179.179.1, timeout is 2 seconds:
    Success rate is 100 percent (5/5), round-trip min/avg/max = 228/395/560 ms
    R21#
    R21#show ip nhrp nhs detail
    Legend: E=Expecting replies, R=Responding, W=Waiting
    Tunnel178:
    178.178.178.1   E priority = 0 cluster = 0  req-sent 74  req-failed 0  repl-recv 0 (01:15:08 ago)
    Tunnel179:
    179.179.179.1  RE priority = 0 cluster = 0  req-sent 1088  req-failed 0  repl-recv 590 (00:00:02 ago)
    R21#
    Thanks,
    Carlos Trujillo.

  • L3-MPLS VPN Convergence

    Perhaps someone in this group can identify the missing timers/processing delays in end-to-end client route convergence.
    Scenarios:
    a) New BGP route advertised by the client (CPE1)
    b) BGP route withdrawn by the client (CPE1)
                   PE-to-RR i-M-BGP (Logical)
              ==========----RR----==========
              "                            "
    CPE1---->PE1------->P1-------->P2---->PE2----->CPE2
              |                            |
              +-------->P3--------->P4-------+
    Routing:
    - eBGP between CPE and PE (any routing protocol within the customer site),
    - OSPF, LDP in Core,
    Timers/Steps I'm aware of:
    - Advertisement of routes from CE to PE and placement into VRF
    - Propagation of routes across the MPLS VPN backbone
    - Import process of these routes into relevant VRFs
    - Advertisement of VRF routes to attached VPN sites
    - BGP advertisement-interval: Default = 5 seconds for iBGP, 30 for eBGP
    - BGP Import Process: Default = 15 seconds
    - BGP Scanner Process Default = 60 seconds
    I would appreciate it if someone could identify any missing process delays or timers, especially w.r.t. the RR.
    Thanks
    SH

    Check the LDP/TDP timers in the core. Remember that if a link fails in the core, a reroute occurs and the LDP/TDP bindings need to be renewed; labels are bound to the routes present in the (IGP) routing table. So there is a possible delay from the core's perspective:
    mpls ldp holdtime
    mpls ldp discovery hello [holdtime | interval]
    In case you are using TE check these:
    mpls traffic-eng topology holddown
    mpls traffic-eng signalling forwarding sync
    mpls traffic-eng fast-reroute timers promotion
    I believe the latter only applies to SDH, where you use the segment-loss feature.
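    For illustration, the LDP discovery/session timers above with example values (numbers are placeholders, not recommendations):
    mpls ldp holdtime 30
    mpls ldp discovery hello holdtime 15
    mpls ldp discovery hello interval 5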
    Regards,
    Frank

  • IOS XR OSPF subsecond timer vs BFD

    Dear all,
    We recently ran into a problem with sub-second OSPF timers on the CRS.
    Previously, we could configure the OSPF fast-hello feature (dead-interval minimal with a hello multiplier of 3) on the 7606 router, which means OSPF hellos are sent every ~333 ms.
    Unfortunately, the CRS cannot be configured with a 333 ms hello timer.
    And there are some TAC cases of ours whose result is that we need to configure BFD.
    Is configuring BFD on top of OSPF recommended?
    Will we get the same behaviour as having 333 ms OSPF hellos when we configure BFD on top of OSPF on the CRS?
    Our network also has different routers: ASR9000 series, 7600 series, and ME6524 series. Will they all behave the same when we apply the same BFD configuration and parameters?
    Is configuring BFD for the routing protocol recommended as a replacement for the sub-second timer configuration?
    Thanks,
    Budi L

    Hi,
    In a general sense, for a network with very tight (sub-second) convergence requirements, choosing the 'right' methods to achieve optimum network convergence is a long exercise that depends heavily on network design and hardware/software features. So as a start, make sure you know what your current software and hardware can and cannot support. Then look at what the outcome will be of combining different fast convergence and high availability features together.
    So you have 3x blocks of different ways to enhance fast convergence and high availability:
    1- LDP-IGP sync: be careful with LDP-IGP sync, as XR behaves differently from IOS (the IGP adjacency is always built, so there is no concept of a holddown timer).
    2- OSPF process tuning: LSA and SPF throttle tuning
    3- Link failure detection tuning: BFD
    I would start with LDP-IGP sync; tuning the link failure detection can be done before or after the OSPF protocol tuning.
    Example: make sure that the link failure detection mechanism is compatible with your HA configuration. For instance, BFD is not handled in hardware on the 7600, so you can't use it if you also want SSO/NSF on that box: the neighbor will switch to a backup path before OSPF GR has a chance to start.
    You have to plan for two different kinds of convergence events here: a down convergence event (BFD and protocol timers detect this), and an up convergence event (not timer based; services re-establish based on what facilities are available). The state between the start of the up convergence and full convergence is a transient state where multiple things start to happen depending on the facilities available (first interface up, then IGP, maybe PIM, maybe BGP, etc.). At scale, this state is often very non-deterministic, rarely happens in the same order twice, and requires some amount of system analysis to make sure there are enough resources available to re-establish all the different services without the risk of flapping anything.
    The more processes you add to the router (OSPF, BGP, PIM, and now BFD) the more critical it is to pay attention to how much time and resources each process requires. Reducing protocol timers can exaggerate the conditions in the transient state, so there are tradeoffs for having low convergence vs. scale/performance. For instance if BGP is not fully reconverged and a neighbor flaps because the CPU is busy, the whole process may have to start all over again and there is potential that the router may never fully recover if that scenario repeats.
    BFD and aggressive IGP timers are mutually exclusive; it is always recommended to turn BFD on with the IGP, and be sure not to tune OSPF too aggressively. Also keep in mind that BFD is not SSO-aware (think IOS vs. XR), so when there is a SUP failover, BFD will observe micro-flaps and you will see some packet loss. That's why you have to tune the values to an optimum level, e.g. for a MAN a 200-300 ms interval for BFD control packets should be fine, but you have to test it.
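    As a rough sketch of what that looks like (interface names are placeholders and the 300 ms value is only an example in line with the MAN guidance above):
    IOS XR (e.g. CRS/ASR9000):
    router ospf 1
     area 0
      interface TenGigE0/0/0/0
       bfd fast-detect
       bfd minimum-interval 300
       bfd multiplier 3
    IOS (e.g. 7600):
    interface TenGigabitEthernet1/1
     bfd interval 300 min_rx 300 multiplier 3
     ip ospf bfd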
    Thanks,
    Ahmed

  • What are the pros and cons using ospf as the PE/CE protocol

    pros and cons using ospf as the PE/CE Protocol

    Hi,
    Well in short,
    pros:
    well deployed protocol with good user experience
    fast convergence
    no redistribution required on the CE side
    cons:
    type 1 & 2 routes (O) from one site are known as type 3 summary routes (O IA) at the other site
    you need a sham-link configured on the PE routers to keep the routes as type 1 & 2 at each remote site
    If I was deploying MPLS VPN, I would use either OSPF or BGP as the PE/CE protocol.
    http://www.cisco.com/en/US/docs/ios/12_2t/12_2t8/feature/guide/ospfshmk.html#wp1024995
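    For reference, a minimal sketch of the sham-link mentioned above (VRF name, addresses and cost are placeholders; the /32 sham-link endpoints must live in the VRF and be advertised via BGP, not OSPF):
    interface Loopback10
     ip vrf forwarding CUST-A
     ip address 10.0.0.1 255.255.255.255
    !
    router ospf 100 vrf CUST-A
     area 0 sham-link 10.0.0.1 10.0.0.2 cost 10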
    HTH
    LR

  • Disable BFD in multiple Router Sub interfaces that participates in OSPF

    Hi team,
    Please help me on this. Here is the scenario:
    We are in an enterprise setup running 100+ routers.
    We have 200 to 300+ sub-interfaces for virtual circuits.
    Our protocol is OSPF over MPLS.
    One of our providers in LA encountered link flaps on SONET, causing our LA router that is directly connected to that link to recalculate multiple times.
    Recalculation of OSPF routes caused disconnection of users on the LA VMs.
    We were advised by our provider in LA to disable BFD so minor link flaps will not affect recalculation of routes.
    We are now tasked by our design team to disable BFD on multiple router sub-interfaces that participate in OSPF.
    My questions are:
    What is the implication of disabling all BFD on the routers' interfaces and sub-interfaces?
    Will this improve recalculation of OSPF routes in case of link flaps, or will it totally ignore the link flaps?
    Will the routers only recognize a "full down" status of the interface?
    How can we disable BFD on multiple router sub-interfaces that participate in OSPF in a faster way, or do we have to do this one by one?
    Please advise before we present this to the CAB and implementation. Thank you.

    My questions are:
    What is the implication of disabling all BFD on the routers' interfaces and sub-interfaces?
    Answer: the implication is losing sub-second (millisecond-level) failure detection and convergence.
    BFD detects failures at the link layer very quickly; once a failure is detected, it informs the upper-layer protocol, causing it to converge immediately.
    Will this improve recalculation of OSPF routes in case of link flaps, or will it totally ignore the link flaps?
    Answer: if your provider is experiencing intermittent flaps, then yes, it is advisable to turn BFD off. This, however, doesn't totally ignore the link flaps; once the upper protocol detects the failure based on the OSPF dead-interval parameter, it will recalculate OSPF routes again. Keep in mind that if you have redundant or multiple links to your provider, I wouldn't recommend disabling BFD, as it should improve convergence and you shouldn't notice the failure.
    Will the routers only recognize a "full down" status of the interface?
    Answer: disabling BFD means the router recognizes a failure either when the upper protocol's dead interval expires or when the interface goes fully down, whichever occurs first.
    How can we disable BFD on multiple router sub-interfaces that participate in OSPF in a faster way, or do we have to do this one by one?
    Answer: you can disable it one by one, or, if you have configuration management software, it can do it for all nodes at a time; but that depends on whether you have it or not.
    Please consider not disabling BFD if you have multiple OSPF links towards your provider from any branch; it shouldn't impact your VMs, and it should rather improve convergence to milliseconds, which is absolutely not noticeable.
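    For reference, if BFD was enabled process-wide for OSPF, a per-interface opt-out looks roughly like this (interface numbers are placeholders; repeat per sub-interface, or push it with a configuration-management tool as noted above). If BFD was instead enabled per interface with "ip ospf bfd", the opt-out is simply "no ip ospf bfd".
    router ospf 1
     bfd all-interfaces
    !
    interface GigabitEthernet0/1.101
     ip ospf bfd disable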
    BR,
    Mohamed 

  • QOS for Routing updates, LDP, etc

    Hi.
    We need to provide QoS for our voice traffic and we were thinking of marking it with precedence 7. The traffic goes through an MPLS network with 7600s as PEs and 12000s as Ps. Can we configure the routing traffic, LDP, and other signaling generated on the routers to use a service class as if the traffic were marked with precedence 6? How can we mark this traffic as precedence 6, changing the default? Is 7 the default value?
    Thanks

    Hello Pedro,
    I did some tests years ago:
    all routing protocols, including OSPF, LDP and BGP, use precedence 6, with the exception of some RSVP-TE messages that are sent out with precedence 0.
    Precedence 7 is not used in protocol messages.
    However, the recommended value for VoIP bearer packets is precedence 5, which maps to DSCP EF (Expedited Forwarding).
    Using precedence 7 is not recommended, but you can; and you don't need to remark signalling packets.
    I would use precedence 5; I did in the past, and we still use it for VoIP now.
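    To illustrate the point, a minimal MQC sketch (class names, percentages and the egress interface are placeholders): voice is matched and queued as EF (precedence 5), and control traffic is simply matched on the precedence 6 it already carries, with no remarking needed.
    class-map match-any VOICE
     match ip dscp ef
    class-map match-any CONTROL
     match ip precedence 6 7
    !
    policy-map WAN-OUT
     class VOICE
      priority percent 30
     class CONTROL
      bandwidth percent 5
     class class-default
      fair-queue
    !
    interface GigabitEthernet1/1
     service-policy output WAN-OUT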
    Hope to help
    Giuseppe

  • MSTP / RSTP convergence

    Hi All
    I am looking at issues with convergence on a ring of 3550/3750 switches.
    What is the limit on the number of switches that can be put in a ring topology? I am not sure, but my issue seems to be related to the number of trunks: each time I add a trunk, convergence time increases.
    Any thoughts... before I segment the ring.
    As a side note, looking after this issue has earned me the nickname of Frodo Baggins.
    TIA
    Cisco Lad

    Hi Sam,
    Ok, so at least there are things to do. The ports that are in "Bound PVST" are not running MST but plain old slow legacy PVST. That would not be a big deal if it was some access switches connected to your ring, but in your particular case, this is the root port. Even when doing interaction between PVST and MST, we suggest that the root be located on the side of the MST region (for performance reasons). Here, the root is obviously on the PVST side.
    Don't be too concerned about the "Pre-STD-Rx" flag. We recently released a new version of MST that is fully IEEE compliant (our initial release was proprietary because it was delivered before the MST standard was out). The switch that is displaying this "Pre-STD-Rx" flag is running this IEEE standard version and has detected a pre-standard neighbor. As a result, it automatically starts sending pre-standard BPDUs on this port. The detection is not 100% reliable; that's why we display this flag, to encourage network administrators to hardcode the neighbor type as pre-standard on the interface. In short, to get rid of the message: -1- configure "spanning-tree mst pre-standard" on the interface, or -2- upgrade the neighbor to the latest release. -1- is meant to be a temporary solution until -2- is possible.
    So in the end, please make sure that all the switches in the ring are running MST. You will not get fast convergence if your network is only partly running MST.
    Some ports may also be running PVST because they used to be connected to a PVST neighbor that was later converted to MST. You can do a shut/no shut on the link or a "clear spanning-tree detected-protocols" on both sides to fix that.
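    For reference, the two fixes mentioned above look roughly like this (interface names are examples):
    interface GigabitEthernet0/1
     spanning-tree mst pre-standard
    and, from exec mode after a former PVST neighbor has been converted to MST:
    clear spanning-tree detected-protocols interface GigabitEthernet0/1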
    Let me know what you find out!
    Regards,
    Francois

  • MPLS TE Fast ReRoute

    Hi Experts,
    I'm just getting started with MPLS TE and wondering on how fast the "fast reroute" feature can be.
    I'm planning to create two tunnels for specific traffic in my network, and it looks like MPLS TE with FRR is the most reliable option if we are talking about a truly 0% packet loss network.
    I saw in some documentation that with MPLS TE it is possible to reroute traffic within about 50 ms and with almost no packet loss, provided the backup tunnel is as reliable as the primary.
    Is this true? I'm new on this subject so I would like to know more about what I could achieve in terms of high availability.
    Regards
    Paulo Varanda

    Hi,
    Yes, MPLS-TE with FRR gives fast convergence in the range of 50 ms (50 ms is the usual benchmark, taken from SDH/SONET protection switching). But there are some prerequisites for MPLS-TE FRR to provide that fast convergence.
    Tunnel Headend -- Router 1 --- Router 2 ---- Router 3 --- Tunnel Tailend
                           \------ Router 4 ---- Router 5 ------/
    MPLS-TE FRR protects a particular link or a particular node.
    For link protection, the concept is to have a primary tunnel protected by a backup tunnel. The backup tunnel should take a completely different, fault-tolerant physical path for use when the primary tunnel path fails, i.e. the two tunnels should not share SRLG links. In the above case, if a link along Router 1 - Router 2 - Router 3 fails, the traffic should fall back over Router 4 and Router 5.
    Detecting that the link or node has gone down requires a keepalive mechanism; usually RSVP hellos are used to detect the failure.
    Node protection provides link protection by default. So when Router 2 goes down, the traffic falls back over the backup path.
    MPLS-TE FRR works by pre-signalling LSPs over both the primary and the backup paths before any failure occurs. In normal conditions (with multiple path-options but without FRR), the LSP only gets signalled over the secondary path-option once the primary LSP on the primary path goes down.
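    A rough sketch of FRR protection configured at the point of local repair, here as an NNHOP backup around Router 2 (addresses, interface and tunnel numbers are placeholders):
    ! primary tunnel at the headend, requesting protection
    interface Tunnel1
     ip unnumbered Loopback0
     tunnel mode mpls traffic-eng
     tunnel destination 10.0.0.9
     tunnel mpls traffic-eng path-option 10 dynamic
     tunnel mpls traffic-eng fast-reroute
    !
    ! backup tunnel at the PLR (Router 1) to the next-next hop, avoiding Router 2
    ip explicit-path name AVOID-R2 enable
     exclude-address 10.0.0.2
    interface Tunnel100
     ip unnumbered Loopback0
     tunnel mode mpls traffic-eng
     tunnel destination 10.0.0.3
     tunnel mpls traffic-eng path-option 10 explicit name AVOID-R2
    !
    ! protected interface on the PLR, pointing at the backup tunnel
    interface GigabitEthernet0/1
     mpls traffic-eng backup-path Tunnel100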
    HTH
    Arun

  • BFD on interface

    Dear all,
    Can we configure BFD at the interface level so that, if the remote IP address is unreachable, the interface is brought down?
    Regards

    Correct, in XR BFD runs as part of an application. The application can be ospf, isis or static.
    Tying BFD to an interface will make it send alerts, traps, syslogs, etc., but that is about it.
    You can trigger on the syslog message and have EEM shut or unshut an interface if that is the desire, but tying it to an application is more powerful.
    For instance, advertising max-metric or bringing the routing adjacency down is more graceful in terms of fast routing convergence.
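    For example (a sketch with placeholder prefix, next hop and timers), tying BFD to a static route in XR so the route is withdrawn when BFD to the next hop fails:
    router static
     address-family ipv4 unicast
      192.0.2.0/24 10.1.1.2 bfd fast-detect minimum-interval 300 multiplier 3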
    regards
    xander

  • Thesis ideas - ip networks

    Guys!
    Please help with thesis ideas for new topics in IP networks. Are there any burning issues that need to be covered?
    Sent from Cisco Technical Support iPad App

    Forum, I have been tasked by a client to redesign their IP network. Their current network is comprised of:
    - Many Frame Relay WAN circuits that have many spokes and two hubs. The two hubs are their Headquarters and Disaster Recovery sites.
    - 3 MPLS WAN connections.
    - Several LANs located at the Headquarters and at the Disaster Recovery site.
    - Several routing protocols currently in use, including BGP towards the Internet and the MPLS circuits, EIGRP on inside networks, some RIP, and some static routes.
    The current IP addressing is what I would call "hodge podge". For example, at the Headquarters there are several LANs that are in the 192.168.X.0/24 IANA range, while at the Disaster Recovery site there are some 172.16.X.X/24 networks.
    I wanted to get a larger perspective from some of you who have designed networks in the best... I am interested in best practices, and would also like to know if there is any recommended documentation available that would assist in my readdressing scheme. Thanks for any advice available.
    Kevin

    Hi Kevin,
    The general practice for a redesign is that you will be given a block for the DC and the DR, and everything depends on how the DC is set up: how many zones there are, from where the traffic will be routed, and so on. Depending on the zone segregation, people usually subnet the given IP address space based on the number of hosts in each zone.
    For example, /22 or /23 subnets in each zone, depending on the servers residing there. The routing protocols generally preferred on the internal LAN are OSPF or EIGRP, for fast convergence and load balancing.
    Check out the link below; hope that helps!
    http://www.ciscosystems.com/application/pdf/en/us/guest/netsol/ns107/c649/ccmigration_09186a008073377d.pdf
    Remember to rate the helpful post
    Ganesh.H

  • Simplest Data Center Interconnect?

    Hi all,
    What's a simple way to implement an L2 network across 2 L3 DCs connected by 2x1Gig links, using a 6504-E with SUP720-3C? The DCs are only a few kilometres apart and our local service provider can only provide 1Gig fiber links between DCs (which I can then configure as L2 or L3). I do not want to simply configure flat L2 across both DCs - I would like to keep each DC as a separate L3 site, run OSPF for fast convergence, and therefore avoid spanning-tree altogether.
    At the moment each DC uses 3750 switches connected by L3 links and runs EIGRP. We then use separate hardware (7200) and L2TPv3 to create some shared L2 networks across that. We're moving to the 6500 platform and so it's a good opportunity to redesign things - and hopefully I can minimise the amount of hardware needed and consolidate using only the 6500 platform in each DC.
    I also have a Cisco ACE appliance to fit at each site, and to have redundancy for these they need to live in a shared network! That's what happens when the design process starts after the kit has already been bought (not my choice, btw!).
    Any ideas?

    Howdy,
    The 2x1Gig links are to connect the 2 DCs together - but the question is what's the best way to do this? For example, best practice dictates that sites should be L3 only. However, I also need some kind of L2 connectivity for certain clustered services which require L2.
    What I've ended up doing is a bit of both L2 and L3. Basically I created an L2 EtherChannel which only allows 2 things: a VLAN which is used to provide a small /30 link so that I can create SVIs on each end and run L3 on top; and VLANs which are used as pure L2 and run HSRP. Here's the config:
    DC1 switch
    interface Port-channel1
     description Link to DC2 - Po1
     switchport
     switchport trunk encapsulation dot1q
     switchport trunk native vlan 2
     switchport trunk allowed vlan 2,120
     switchport mode trunk
    !
    interface Vlan2
     ip address 10.x.x.9 255.255.255.252
     ip ospf network point-to-point
    !
    interface Vlan120
     description Shared VLAN
     ip address 10.120.0.253 255.255.255.0
     standby 120 ip 10.120.0.254
     standby 120 priority 150
     standby 120 preempt
    DC2 switch
    interface Port-channel1
     description Link to DC1 - Po1
     switchport
     switchport trunk encapsulation dot1q
     switchport trunk native vlan 2
     switchport trunk allowed vlan 2,120
     switchport mode trunk
    !
    interface Vlan2
     ip address 10.x.x.10 255.255.255.252
     ip ospf network point-to-point
     ip ospf priority 0
    !
    interface Vlan120
     description Shared VLAN
     ip address 10.120.0.252 255.255.255.0
     standby 120 ip 10.120.0.254
     standby 120 preempt
    It does seem to work OK - for example, I have different networks at each DC which I can reach independently, and I have a couple of VLANs which stretch across sites. The only problem that I can see is that there would be serious problems if the 2 Gig links between the 2 switches went down - each switch would then become HSRP active. Also, there's a trombone effect with traffic from DC2 using DC1 as its default gateway, but there's no way around this unless we use OTV or similar!
    Any thoughts are very welcome! Thank you.

  • Ask the Expert: Scaling Data Center Networks with Cisco FabricPath

    With Hatim Badr and Iqbal Syed
    Welcome to the Cisco Support Community Ask the Expert conversation. This is an opportunity to learn and ask questions about the Cisco FabricPath with Cisco technical support experts Hatim Badr and Iqbal Syed. Cisco FabricPath is a Cisco NX-OS Software innovation combining the plug-and-play simplicity of Ethernet with the reliability and scalability of Layer 3 routing. Cisco FabricPath uses many of the best characteristics of traditional Layer 2 and Layer 3 technologies, combining them into a new control-plane and data-plane implementation that combines the immediately operational "plug-and-play" deployment model of a bridged spanning-tree environment with the stability, re-convergence characteristics, and ability to use multiple parallel paths typical of a Layer 3 routed environment. The result is a scalable, flexible, and highly available Ethernet fabric suitable for even the most demanding data center environments. Using FabricPath, you can build highly scalable Layer 2 multipath networks without the Spanning Tree Protocol. Such networks are particularly suitable for large virtualization deployments, private clouds, and high-performance computing (HPC) environments.
    This event will focus on technical support questions related to the benefits of Cisco FabricPath over STP or VPC based architectures, design options with FabricPath, migration to FabricPath from STP/VPC based networks and FabricPath design and implementation best practices.
    Hatim Badr is a Solutions Architect for Cisco Advanced Services in Toronto, where he supports Cisco customers across Canada as a specialist in Data Center architecture, design, and optimization projects. He has more than 12 years of experience in the networking industry. He holds CCIE (#14847) in Routing & Switching, CCDP and Cisco Data Center certifications.
    Iqbal Syed is a Technical Marketing Engineer for the Cisco Nexus 7000 Series of switches. He is responsible for product road-mapping and marketing the Nexus 7000 line of products with a focus on L2 technologies such as VPC & Cisco FabricPath and also helps customers with DC design and training. He also focuses on SP customers worldwide and helps promote N7K business within different SP segments. Syed has been with Cisco for more than 10 years, which includes experience in Cisco Advanced Services and the Cisco Technical Assistance Center. His experience ranges from reactive technical support to proactive engineering, design, and optimization. He holds CCIE (#24192) in Routing & Switching, CCDP, Cisco Data Center, and TOGAF (v9) certifications.
    Remember to use the rating system to let Hatim and Iqbal know if you have received an adequate response.  
    They might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the Data Center sub-community Unified Computing discussion forum shortly after the event. This event lasts through Dec 7, 2012. Visit this support forum often to view responses to your questions and the questions of other Cisco Support Community members.

    Hi Sarah,
    Thank you for your question.
    Spanning Tree Protocol is used to build a loop-free topology. Although Spanning Tree Protocol serves a critical function in these Layer 2 networks, it is also frequently the cause of a variety of problems, both operational and architectural.
    One important aspect of Spanning Tree Protocol behavior is its inability to use parallel forwarding paths. Spanning Tree Protocol forms a forwarding tree, rooted at a single device, along which all data-plane traffic must flow. The addition of parallel paths serves as a redundancy mechanism, but adding more than one such path has little benefit because Spanning Tree Protocol blocks any additional paths.
    In addition, rooting the forwarding path at a single device results in suboptimal forwarding paths: although a direct connection may exist, it cannot be used because only one active forwarding path is allowed.
    Virtual PortChannel (vPC) technology partially mitigates the limitations of Spanning Tree Protocol. vPC allows a single Ethernet device to connect simultaneously to two discrete Cisco Nexus switches while treating these parallel connections as a single logical PortChannel interface. The result is active-active forwarding paths and the removal of Spanning Tree Protocol blocked links, delivering an effective way to use two parallel paths in the typical Layer 2 topologies used with Spanning Tree Protocol.
    vPC provides several benefits over standard Spanning Tree Protocol, such as the elimination of blocked ports, and both vPC switches can behave as the active default gateway for first-hop redundancy protocols such as Hot Standby Router Protocol (HSRP): that is, traffic can be routed by either vPC peer switch.
    At the same time, however, many of the overall design constraints of a Spanning Tree Protocol network remain even when you deploy vPC, such as:
    1.     Although vPC provides active-active forwarding, only two active parallel paths are possible.
    2.     vPC offers no means by which VLANs can be extended, a critical limitation of traditional Spanning Tree Protocol designs.
    With Cisco FabricPath, you can create a flexible Ethernet fabric that eliminates many of the constraints of Spanning Tree Protocol. At the control plane, Cisco FabricPath uses a Shortest-Path First (SPF) routing protocol to determine reachability and selects the best path or paths to any given destination in the Cisco FabricPath domain. In addition, the Cisco FabricPath data plane introduces capabilities that help ensure that the network remains stable, and it provides scalable, hardware-based learning and forwarding capabilities not bound by software or CPU capacity.
    Benefits of deploying an Ethernet fabric based on Cisco FabricPath include:
    • Simplicity, reducing operating expenses
    – Cisco FabricPath is extremely simple to configure. In fact, the only necessary configuration consists of distinguishing the core ports, which link the switches, from the edge ports, where end devices are attached (see the brief sketch after this list). There is no need to tune any parameter to get an optimal configuration, and switch addresses are assigned automatically.
    – A single control protocol is used for unicast forwarding, multicast forwarding, and VLAN pruning. The Cisco FabricPath solution requires less combined configuration than an equivalent Spanning Tree Protocol-based network, further reducing the overall management cost.
    – A device that does not support Cisco FabricPath can be attached redundantly to two separate Cisco FabricPath bridges with enhanced virtual PortChannel (vPC+) technology, providing an easy migration path. Just like vPC, vPC+ relies on PortChannel technology to provide multipathing and redundancy without resorting to Spanning Tree Protocol.
    Scalability based on proven technology
    – Cisco FabricPath uses a control protocol built on top of the powerful Intermediate System-to-Intermediate System (IS-IS) routing protocol, an industry standard that provides fast convergence and that has been proven to scale up to the largest service provider environments. Nevertheless, no specific knowledge of IS-IS is required in order to operate a Cisco FabricPath network.
    – Loop prevention and mitigation is available in the data plane, helping ensure safe forwarding that cannot be matched by any transparent bridging technology. The Cisco FabricPath frames include a time-to-live (TTL) field similar to the one used in IP, and a Reverse Path Forwarding (RPF) check is also applied.
    • Efficiency and high performance
    – Because equal-cost multipath (ECMP) can be used in the data plane, the network can use all the links available between any two devices. The first-generation hardware supporting Cisco FabricPath can perform 16-way ECMP, which, when combined with 16-port 10-Gbps port channels, represents a potential bandwidth of 2.56 terabits per second (Tbps) between switches.
    – Frames are forwarded along the shortest path to their destination, reducing the latency of the exchanges between end stations compared to a spanning tree-based solution.
        – MAC addresses are learned selectively at the edge, allowing the network to scale beyond the limits of the MAC address table.
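    As a minimal illustration of the core/edge-port configuration mentioned in the simplicity bullet above (VLAN and interface numbers are placeholders):
    install feature-set fabricpath
    feature-set fabricpath
    !
    vlan 10
     mode fabricpath
    !
    interface Ethernet1/1
     description FabricPath core port
     switchport mode fabricpath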

  • Importing not-just-1-best bgp route to VRF in XR in case of unique RD per PE

    I'm trying to import BGP prefixes from several different sources into a VRF for fast convergence. When the RDs on the local and remote PE match, it works right away. But if the RDs are different, then I can see many different routes in "sh bgp vpnv4 unicast rd x:x" (remote PE's RD) with the NOT-IN-VRF flag, but only the best one is present in "sh bgp vpnv4 unicast vrf YYY" or "sh bgp vpnv4 unicast rd y:y" (RD of local PE).
    As I understand, in IOS it is handled like this:
    router bgp 1
    address-family ipv4 vrf YYY
      import path selection all
      import path limit 4
    But I cannot figure out how to do it in XR. Any suggestions? I do not want to roll back to the same-RD-on-all-PEs approach, as IOS doesn't do much add-paths for VPNv4 ;(

    I don't know if this will exactly suit your needs, but you can enable PIC (Prefix Independent Convergence) with the additional-paths commands.
    The exact command depends on your XR version (additional-paths install backup or additional-paths election)
    Refer to the document:
    http://www.cisco.com/c/en/us/td/docs/routers/crs/software/crs_r4-1/routing/command/reference/b_routing_cr41crs/b_routing_cr41crs_chapter_01.html#wp2841279186
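    As a hedged sketch of that approach (the policy name and AS number are placeholders, and the exact keywords vary by XR release):
    route-policy PIC-BACKUP
     set path-selection backup 1 install
    end-policy
    !
    router bgp 1
     address-family vpnv4 unicast
      additional-paths receive
      additional-paths selection route-policy PIC-BACKUP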

  • Nexus Design Question

    Here's the scenario:
    We're working on a new DC design. Dual 6509VSS in core, dual Nexus 5k and 14 FEXs. The 2ks will be dual attached to the 5ks.
    Now, I read somewhere else that the 5k is limited to a maximum of 16 vPCs.
    So the question comes up... does that mean that the current design with a vPC to the core VSS, the peer vPC and all 14 vPCs to the FEX are topping us out?
    Also, all of the designs I've seen have only 2 5ks peering together. Can you have 3 or 4 5ks peer together?
    Thanks.

    I have to build the same design: VSS + 2x N5000. Currently I don't plan to connect the N5000s with vPC upstream to the C6500-VSS (I do plan vPC downstream). I plan to connect each N5000 individually to the VSS chassis with a 2x 10G MEC. On the N5000, this is just a local port-channel. Since the connection between the N5000s is only used for vPC and is not considered a real L2 link, there is no STP loop in this design.
    If the downstream devices alternate their active/passive links between N5-1 and N5-2, both N5Ks will carry traffic.
    PS. I have more confidence in a simple port-channel. You don't want vPC problems (crashes/early deployment problems) on the uplinks to the VSS, basically isolating your datacenter or risking L2 loops.
    PPS. One remark I still have here: on the N5K, you don't want to connect the uplinks towards the VSS on adjacent ports using the same port-group. If the port-group ASIC fails, you will be blackholing traffic in the datacenter, as the downstream interfaces will remain up. On C3750 switches this is solved by Link State Tracking; however, this is not supported on the N5K, which I really regret. But Cisco says: we don't need Link State Tracking on the N5K, as the N5K supports vPC to work around this... However, I feel this is not an excuse not to implement Link State Tracking on the N5K. vPC is indeed useful if the upstream switches are L2-connected; however, if the upstreams are VSS switches, a simple port-channel with link state tracking would suffice and have faster convergence...
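    For illustration, the per-N5K local port-channel towards the VSS described above could look like this (a sketch; interface numbers are placeholders):
    feature lacp
    !
    interface port-channel10
     description 2x10G MEC uplink to C6500 VSS
     switchport mode trunk
    !
    interface Ethernet1/1-2
     switchport mode trunk
     channel-group 10 mode active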
    regards,
    Geert
