ISR G2 and GRE fragmentation/reassembly

Hi,
We plan to use GRE tunnels between CPE (ISR G2, if we stick to Cisco routers) and LNS (ASR1006, for L2TP and GRE aggregation), running over PPP.
The PPP MTU is 1500 bytes, and the GRE tunnel will set its MTU to 1476 bytes.
Subscriber links could range from 1M SDSL lines to 16M SDSL/EFM lines.
Using ip tcp adjust-mss on the tunnel interface will prevent IP fragmentation for TCP traffic.
But we could still see IP fragmentation for non-TCP traffic (UDP, IPsec...) with packets > 1476 bytes.
For these fragmented datagrams, reassembly will be handled by the destination hosts.
We are investigating a solution where ip fragmentation/reassembly would be done only between CPE and LNS.
Usually, in the situation I have described above, the end-user IP datagrams entering the CPE from a LAN interface and sent through the GRE tunnel are fragmented; the two resulting fragments are then encapsulated into two GRE packets and sent toward the tunnel destination (the LNS). There, the two IP fragments are popped out of the GRE packets and sent toward their IP destination. The destination host has to reassemble the two fragments.
The idea would be to configure ip mtu 1500 at the GRE interface level, so that the end-user IP datagram is not fragmented. The CPE will create a 1524-byte GRE datagram and fragment the GRE datagram itself (not the end-user datagram encapsulated within). The two fragments will be sent to the GRE tunnel destination (the ASR1006), and the ASR will reassemble the initial GRE packet and pop the end-user IP datagram out of it.
=> the end-user systems won't see any fragmentation of their traffic,
=> most of the traffic is TCP and will never be fragmented thanks to mss adjust, so this mechanism will only be triggered by non-TCP packets > 1476 bytes,
=> the CPE and LNS will have to handle GRE-level IP reassembly for non-TCP packets > 1476 bytes.
On the LNS side, this process is handled on the QFP (with hardware acceleration), and we may ask for a CPOC to check ASR performance with the ESP40 and ESP100.
On the CPE side, it is more than likely done in process switching. In the worst-case scenario, 16 Mb/s full duplex needs only 2666 packets per second to fill the line both ways (1333 pps downstream, 1333 pps upstream).
Is 2666 pps (i.e. 5333 fragments per second) something that an ISR G2 CPE (Cisco 898/Lantiq, C1941 and above) can handle without CPU exhaustion?
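For reference, the figures above can be sanity-checked with a short script (a sketch of the arithmetic only; the 24-byte overhead assumes a 20-byte outer IPv4 header plus a basic 4-byte GRE header, as implied by the 1476-byte tunnel MTU in the post):

```python
# Fragmentation and pps arithmetic for GRE over a 1500-byte PPP link.
PPP_MTU = 1500
IP_HDR = 20                  # outer IPv4 header, no options
GRE_HDR = 4                  # basic GRE header (no key/sequence)
OVERHEAD = IP_HDR + GRE_HDR  # 24 bytes of encapsulation overhead

tunnel_mtu = PPP_MTU - OVERHEAD
print(tunnel_mtu)            # 1476

# With ip mtu 1500 on the tunnel, a 1500-byte inner datagram becomes a
# 1524-byte outer GRE packet, which must itself be fragmented for the link.
outer = PPP_MTU + OVERHEAD                   # 1524-byte GRE datagram
payload = outer - IP_HDR                     # 1504 bytes of outer payload
frag1_payload = (PPP_MTU - IP_HDR) // 8 * 8  # 1480 (must be multiple of 8)
frag2 = IP_HDR + (payload - frag1_payload)   # tiny second fragment
print(frag1_payload + IP_HDR, frag2)         # 1500 44

# Worst-case pps on a 16 Mb/s full-duplex link with 1500-byte packets.
pps_one_way = 16_000_000 // (1500 * 8)
print(pps_one_way, 2 * pps_one_way)          # 1333 2666
```

So each oversized non-TCP datagram yields one full-size fragment plus a 44-byte runt, which is where the 5333 fragments/s worst case comes from.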

Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
Since what you're doing is somewhat unusual, you'll probably not find performance documentation for it.
Even if you had process-switching performance values, I suspect fragmentation processing might be even worse.
About a year ago, we had a case of a pair of 2800s taking a huge jump in CPU usage.  These routers were using GRE tunnels, and were configured with mss-adjust.  However, the remote site added a few security cameras which sent their video via UDP, and as you noted, mss-adjust did not help those streams.
Our "cure" was to use jumbo Ethernet on the VPN backside, which avoided the need to fragment any 1477..1500-byte packets.  CPU utilization dropped hugely for the same volume of traffic.
So, at least on the 2800 series, fragmentation was very CPU intensive.  BTW, it didn't show as process CPU; it was part of interrupt CPU.
Unfortunately, we didn't bother trying to analyze how "costly" the fragmentation was relative to PPS, but for traffic before vs. after, with and without fragmentation, CPU hit was huge (something like 20% vs. 80%).

Similar Messages

  • What does ip fragment reassembly do

Hi, can someone please tell me the meaning of the IP reassembly mode in the global configuration, where it gives an option for operating system?
I mean, what does this option actually do?
Can someone please guide me?
Regards,
Sebastan

When a datagram is fragmented by normal methods there is never any fragment overlap or overwrite. Where one fragment ends, the next fragment begins at the very next byte, and all operating systems assemble these fragments exactly the same.
But fragments can, and sometimes do, overlap. One fragment might end at, say, byte 1400. The next fragment should begin at byte 1401, but on occasion you will have an overlap where that next fragment begins at byte 1399 or earlier. So long as both fragments have exactly the same data for the bytes that overlap, the packet will still be reassembled the same way by all operating systems.
BUT if the 2 fragments have DIFFERENT data for that same area of the reassembled datagram, then we call this an overwrite. Each operating system can deal with overwrites in a different way, choosing which data to accept.
Say, for example, that the first fragment ended at byte 1400 and had "ab" at bytes 1399 and 1400.
The next fragment is an overwrite: it begins at byte 1399 and has "xy" at bytes 1399 and 1400.
One operating system will reassemble these and end up with "ab", while another will end up with "xy".
Each operating system has its own method of determining whether it will be "ab" or "xy".
In fact there are about 8 different ways that these packets can be reassembled, depending on how they were sent, how they overlap, and their offset order.
Hackers understand this and will use it to attempt to evade the sensor.
The hacker will determine the operating system of the end host and will then try to send his attack in such a way that the end host sees it as "ab" and gets hacked, while the sensor reassembles it as "xy" and thinks there is nothing wrong.
It would be great if the sensor could reassemble the fragments and analyze them in every one of the 8 possible ways that operating systems can reassemble them.
But this is too CPU- and memory-intensive for the sensor to handle.
So instead of trying all 8 possibilities, the user chooses the operating system that is most common in their network. The sensor will then reassemble the fragments in the same way as that operating system.
Understand that this ONLY applies to fragment OverWrites.
For normal fragments, where one fragment ends and the next begins, and for fragment Overlaps where both fragments have the same data, this setting doesn't matter, because all operating systems will reassemble them the same way.
So if you are concerned about this, you need to monitor for the fragment OverWrite alarm.
The operating system configuration only comes into play when the fragments OverWrite one another, and you will see the fragment OverWrite alarm being triggered.
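The "ab" vs. "xy" behaviour can be illustrated with a toy reassembly model (a sketch only, not any vendor's actual code; the 'first' policy mimics stacks that keep the earlier fragment's data, while 'last' mimics stacks that let a later fragment overwrite):

```python
# Toy model of fragment-overwrite reassembly policies.
# Each fragment is (offset, data); overlapping fragments may conflict.

def reassemble(frags, policy="first"):
    """Rebuild a byte buffer from (offset, bytes) fragments.

    policy='first': bytes already written win (earlier fragment kept).
    policy='last' : later fragments overwrite earlier ones.
    """
    size = max(off + len(data) for off, data in frags)
    buf = bytearray(size)
    written = [False] * size
    for off, data in frags:
        for i, b in enumerate(data):
            pos = off + i
            if policy == "last" or not written[pos]:
                buf[pos] = b
            written[pos] = True
    return bytes(buf)

# Fragment 1 ends with "ab"; fragment 2 overlaps its last two bytes
# with conflicting data "xy" (an overwrite, in the terminology above).
frags = [(0, b"...ab"), (3, b"xy..")]
print(reassemble(frags, "first"))  # b'...ab..'
print(reassemble(frags, "last"))   # b'...xy..'
```

Two hosts running the two policies see two different payloads from the very same fragments, which is exactly the evasion window the sensor setting tries to close.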

  • SA520 fragment reassembly problem

I'm baffled by this one.  I found that the SA520 does not seem to be able to reassemble fragmented packets.  I have 2 sites and I set up a site-to-site IPsec link.  The problem was that small packets of less than 1409 bytes could be transmitted across the link, but not larger ones.  This caused problems and led me to do more testing.  I found that even when pinging the LAN IP from a local computer, I couldn't ping larger than 1472 bytes, which I would expect if I set the Don't Fragment bit.  But if I don't set the Don't Fragment bit, why can't it reassemble the 2 packets from a 1475-byte ping?
I did a packet trace (from the SA520's UI) and looked at the .CAP file with Wireshark.  I see the 2 fragments for each ping request (the first one, and then the 3 extra bytes, totaling 1475 bytes) and then nothing else until exactly 30 seconds later.  At that time I get a ping response of "Type: 11 (Time-to-live exceeded)" with a code of "Code: 1 (Fragment reassembly time exceeded)".
So, it seems that the SA520 doesn't think it got all the packets, or it just refused to put them back together.  I get roughly the same results pinging the SA520 on the other side of the IPsec link (which right now is a cable connecting the 2 together in my lab).
    This seems like a bug to me, but I can't believe no one else has had any problem like this.  Anyone?
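As a side note, the 1472-byte limit and the 3-byte second fragment mentioned above follow directly from the header sizes (a sketch; it assumes a standard 1500-byte Ethernet MTU and an IPv4 header without options):

```python
# Why 1472 is the largest unfragmented ping payload on a 1500-byte MTU.
MTU = 1500
IP_HDR = 20    # IPv4 header, no options
ICMP_HDR = 8   # ICMP echo header

max_ping_payload = MTU - IP_HDR - ICMP_HDR
print(max_ping_payload)   # 1472

# A 1475-byte payload needs 1475 + 8 = 1483 bytes of IP payload, so it is
# split into a 1480-byte first-fragment payload plus 3 leftover bytes --
# the "3 extra bytes" seen in the packet trace.
second_fragment_payload = (1475 + ICMP_HDR) - (MTU - IP_HDR)
print(second_fragment_payload)   # 3
```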

After a couple of calls with Cisco support, I found the reason and solution.  In the Firewall -> Attacks configuration page, there is an option for "Block Fragmented Packets" that is checked by default.  It seems that not only does this block regular WAN traffic that is fragmented, it also blocks traffic that is part of any IPsec VPN tunnel.  Now that I know it, it seems like something I should have found; however, I would have thought that the firewall would not block traffic within the tunnel.
After changing that, all the symptoms I described above went away.  I could ping successfully with any size packet I desired.
    Thanks,
    Grant

Difference between organization and GRE

    Hi All,
    I would like to know what is the difference between an Organization and GRE?
    Thanks
    Anil

Organization is a generic term and can be of any classification (Business Group, HR Organization, Legal Entity a.k.a. GRE, etc.).
Whereas a GRE is an organization with the classification GRE, a.k.a. Legal Entity / Legal Employer / Tax Unit.
    If you're referring to the organization on the Person-Assignment screen, it is the HR-Organization.
    Cheers,
    VB

  • Need a common manage bean for page and its fragments

    Hi,
I'm using JDev 11.1.2.3.0. I have a page and its fragments (I'm using <af:region> to include these fragments).
I want to create a shared bean used by that page and its fragments.
Because my page fragments are contained in other BTFs, I cannot use pageFlow bean scope, and
I don't want to use session scope or application scope.
Could someone offer any ideas?
    Thank you.

    Hi Frank,
I used a Shared DC, but now I have an issue.
I have <af:panelGroupLayout> in my page fragment and I want to bind it to a RichPanelGroupLayout in my common bean (I already created a Data Control for this bean).
I cannot do this; I can only bind attributes of RichPanelGroupLayout to my page fragment.
Do you have any ideas? Can we bind a RichPanelGroupLayout in the DC directly to a page fragment like this: <af:panelGroupLayout id="panelGroupLayout11" layout="scroll" binding="#{bindings.panelGroupLayout}" ...
    Thank you very much
    Thanh Hoang

  • OSI and IPSEC and Gre

Guys, which OSI layer do IPsec and GRE sit in? Which layer do they belong to?

    Hi,
GRE adds another layer-3 header to the existing layer-3 packet.
IPsec transport mode adds a header that sits between layer 3 and layer 4, to encrypt the data within the layer-4 PDU.
IPsec tunnel mode encapsulates the existing layer-3 packet in a new layer for encryption, then adds a new layer-3 header.
So, generally speaking, IPsec and GRE are said to be layer-3 protocols.
    Cheers:
    Istvan
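The layering Istvan describes can be visualised by stacking the headers (a sketch; the byte counts are illustrative IPv4/basic-GRE sizes, and real ESP overhead varies with cipher, padding, and authentication):

```python
# Header stacking for GRE vs. IPsec modes (illustrative).
inner_packet = ["IP(20)", "TCP(20)", "payload"]

# GRE: a new IP header plus a GRE header in front of the whole inner packet.
gre = ["IP(20)", "GRE(4)"] + inner_packet

# IPsec tunnel mode: new IP header plus ESP wrapped around the inner packet.
ipsec_tunnel = ["IP(20)", "ESP hdr"] + inner_packet + ["ESP trailer/auth"]

# IPsec transport mode: ESP sits between the original IP header and layer 4.
ipsec_transport = ["IP(20)", "ESP hdr", "TCP(20)", "payload", "ESP trailer/auth"]

print(" | ".join(gre))
print(" | ".join(ipsec_tunnel))
print(" | ".join(ipsec_transport))
```

In all three cases the outermost header is a fresh (or reused) IP header, which is why both protocols are usually filed under layer 3.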

Difference between JSF and JSF fragment

Hi all,
I am now working with a layout and menu framework.
Our page has a main menu and a side menu, and in the middle there is a main area.
So far I use a dynamic region and a JSF fragment for the main area,
because I can only use .jsff in a dynamic region. So when the user clicks on the main menu or side menu, I change the dynamic region dynamically.
So all of my functional pages become .jsff files.
Are there any disadvantages to using it like this?
Is it an anti-pattern?
    With Regards,
    WP

    Hi,
I have a UI which includes 5 tabs.
The tabs present separate functionalities and are independent of each other.
I am not sure whether to make 5 .jsff fragments and put them into one .jspx, or to make 5 different .jspx files.
Please comment on which approach to follow, or any general UI design guidelines to be looked into.
Thanks!

  • Router Switching Performance in Packets Per Second (PPS) : ISR 4431 and 4431

    Hi,
In this document, I am able to find the routing performance for all routers except the ISR 4000 series.
http://www.cisco.com/web/partners/downloads/765/tools/quickreference/routerperformance.pdf
I would like to know the router switching performance in packets per second (PPS) and Mbps for ISR 4431 and 4431 routers.
Fast/CEF switching: PPS and Mbps
Does anybody have a document or information about this?
    Regards,
    Nurul Kabir KHAN

    I've not been able to find anything beyond a bandwidth capacity rating, such as 500 Mbps upgradable to 1 Gbps for the 4431.
    I did find http://www.cisco.com/c/dam/en/us/products/collateral/routers/4000-series-integrated-services-routers-isr/enterprise-routing-portfolio-poster.pdf?mdfid=283967372
The point of interest in the foregoing is the performance listings for the 800 series routers.  Assuming their bandwidth performance ratings use a similar methodology for all the routers, we can look at whitepapers, like the attached, and presume the 4000 series bandwidths are a total aggregate for typical traffic with most typical "WAN" features enabled.  I.e., presume 500/1,000 Mbps is the maximum recommended aggregate bandwidth usage with typical "WAN" traffic and typical "WAN" features.
    PS:
Documents like http://www.cisco.com/web/partners/downloads/765/tools/quickreference/routerperformance.pdf can very easily be misunderstood when trying to predict real-world performance.  I suspect Cisco's latest bandwidth recommendations are trying to provide easy-to-understand values for sizing routers for typical usage.
    The attachment shows how feature usage, and traffic content, impacts ISR performance, which is why the older document can so easily mislead.

  • Thoughts on Stream-to-Disk Application and Memory Fragmentation

    I've been working on a LabVIEW 8.2 app on Windows NT that performs high-speed streaming to disk of data acquired by PXI modules.  I'm running with the PXI-8186 controller with 1GB of RAM, and a Seagate 5400.2 120GB HD.  My current implementation creates a separate DAQmx task for each DAQ module in the 8-slot chassis.  I was initially trying to provide semaphore-protected Write to Binary File access to a single log file to record the data from each module, but I had problems with this once I reached the upper sampling rates of my 6120's, which is 1MS/sec, 16-bit, 4-channels per board.  With the higher sampling rates, I was not able to 'start off' the file streaming without causing the DaqMX input buffers to reach their limit.  I think this might have to do with the larger initial memory allocations that are required.  I have the distinct impression that making an initial request for a bunch of large memory blocks causes a large initial delay, which doesn't work well with a real-time streaming app.
    In an effort to see if I could improve performance, I tried replacing my reentrant file writing VI with a reentrant VI that flattened each module's data record to string and added it to a named queue.  In a parallel loop on the main VI, I am extracting the elements from that queue and writing the flattened strings to the binary file.  This approach seems to give me better throughput than doing the semaphore-controlled write from each module's data acq task, which makes sense, because each task is able to get back to acquiring the data more quickly.
    I am able to achieve a streaming rate of about 25MB/sec, running 3 6120s at 1MS/sec and two 4472s at 1KS/sec.  I have the program set up where I can run multiple data collections in sequence, i.e. acquire for 5 minutes, stop, restart, acquire for 5 minutes, etc.  This keeps the file sizes to a reasonable limit.  When I run in this mode, I can perform a couple of runs, but at some point the memory in Task Manager starts running away.  I have monitored the memory use of the VIs in the profiler, and do not see any of my VIs increasing their memory requirements.  What I am seeing is that the number of elements in the queue starts creeping up, which is probably what eventually causes failure.
    Because this works for multiple iterations before the memory starts to increase, I am left with only theories as to why it happens, and am looking for suggestions for improvement.
    Here are my theories:
    1) As the streaming process continues, the disk writes are occurring on the inner portion of the disk, resulting in less throughput. If this is what is happening, there is no solution other than a HW upgrade.  But how to tell if this is the reason?
2) As the program continues to run, lots of memory is being allocated/reallocated/deallocated.  The streaming queue, for instance, is shrinking and growing.  Perhaps memory is being fragmented too much, and it's taking longer to handle the large block sizes.  My block size is 1 second of data, which can be up to a 1M x 4 x 16-bit array from each 6120's DAQmx task.  I tried adding a Request Deallocation VI when each DAQmx VI finishes, and this seemed to help between successive collections.  Before I added the VI, Task Manager would show about 7MB more memory usage than after the previous data collection.  Now it is running about the same each time (until it starts blowing up).  To complicate matters, each flattened string can be a different size, because I am able to acquire data from each DAQ board at a different rate, so I'm not sure preallocating the queue would even matter.
    3) There is a memory leak in part of the system that I cannot monitor (such as DAQmx).  I would think this would manifest itself from the very first collection, though.
    4) There is some threading/threadlocking relationship that changes over time.
    Does anyone have any other theories, or comments about one of the above theories?  If memory fragmentation appears to be the culprit, how can I collect the garbage in a predictable way?

    It sounds like the write is not keeping up with the read, as you suspect.  Your queues can grow in an unbounded fashion, which will eventually fail.  The root cause is that your disk is not keeping up.  At 24MBytes/sec, you may be pushing the hardware performance line.  However, you are not far off, so there are some things you can do to help.
    Fastest disk performance is achieved if the size of the chunks you write to disk is 65,000 bytes.  This may require you to add some double buffering code.  Note that fastest performance may also mean a 300kbyte chunk size from your data acquisition devices.  You will need to optimize and double buffer as necessary.
Defragment your disk free space before running.  Unfortunately, the native Windows disk defragmenter only defragments the files, leaving them scattered all over the disk.  Norton's disk utilities do a good job of defragmenting the free space as well.  There are probably other utilities which also do a good job of this.
    Put a monitor on your queues to check the size and alarm if they get too big.  Use the queue status primitive to get this information.  This can tell you how the queues are growing with time.
    Do you really need to flatten to string?  Unless your data acquisition types are different, use the native data array as the queue element.  You can also use multiple queues for multiple data types.  A flatten to string causes an extra memory copy and costs processing time.
    You can use a single-element queue as a semaphore.  The semaphore VIs are implemented with an old technology which causes a switch to the UI thread every time they are invoked.  This makes them somewhat slow.  A single-element queue does not have this problem.  Only use this if you need to go back to a semaphore model.
    Good luck.  Let us know if we can help more.
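The queue-monitoring advice above can be sketched outside LabVIEW (a Python sketch of the same producer/consumer pattern; a bounded queue makes the producer block or fail fast instead of growing without limit, surfacing a too-slow writer immediately rather than eating all memory):

```python
import queue
import threading

# Bounded queue: at most 8 in-flight blocks, mirroring a fixed-depth
# acquisition buffer. An unbounded queue would hide a slow consumer
# until memory runs out, as described in the original post.
q = queue.Queue(maxsize=8)

def producer(n):
    for i in range(n):
        # put() with a timeout raises queue.Full if the consumer lags
        # too far behind -- an early, explicit failure signal.
        q.put(i, timeout=1.0)
    q.put(None)  # sentinel: no more data

def consumer(out):
    while True:
        item = q.get()
        if item is None:
            break
        out.append(item)  # stands in for the disk-write stage

results = []
t = threading.Thread(target=consumer, args=(results,))
t.start()
producer(100)
t.join()
print(len(results), q.qsize())  # 100 0
```

The same idea applies in LabVIEW: create the queue with a fixed maximum size, or poll the queue status and alarm when the element count trends upward.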

  • G4 flat panel - issues with repairing and de-fragmenting using Norton

I have a 4-year-old G4 which has started to run very slowly - time for another defragmentation. The problem is that Norton SystemWorks is struggling. Running Disk Doctor, it completes the partition and file checks but only 2/3rds of the directories (it gets to the attributes structure only), then I get an error notice: Unable to continue scanning, Error 23005. I tried running Disk Doctor three times. It suggests running volume recover, which I have now done twice; on both occasions it has got virtually to the end and then I get an error message: An error has occurred while trying to Create and View Virtual Disk.
    It would seem there are issues too great for Norton to deal with. Can anyone please advise me if there is a solution to this problem? I wouldn't mind an excuse to buy a new desk top Mac but do hate to be wasteful!

    Hello ET:
First, I suggest you TRASH the Norton software at once! Norton software (with the possible exception of the AV stuff, which is unnecessary anyway) is POISON to a Mac running OS X. Many people (me included) have had Norton software clobber their systems. Symantec dropped support for Macs several years ago.
Defragmentation, in most cases, is unnecessary on a Mac running OS X. OS X "defrags" files of less than 20 MB on the fly.
A really good (pricey, at $90 US, but worth it) disk utility is DiskWarrior. DW is the "gold standard" of directory repair (I suspect you have some directory damage from the Norton software). A less expensive first step would be to run Repair Disk from your software install DVD.
    If you want a good (and inexpensive) utility to tune things up, take a look at Cocktail. Cocktail has a "pilot" function that works well (I run it once in awhile).
    Your last option (I am looking for an excuse myself) is a good one! However, you do need to get your current system in reasonable shape if you want to port data to a new system (should you take that path).
    In any event, get rid of the Norton software. Use the uninstaller that is on their website. NU hides stuff all over the system. It took me quite awhile to ferret it all out myself. I did put it on a little boat, lit it, and simulated a Viking funeral.

Integration of ISR, UWL and Interactive Forms

    Hi,
I'm pretty new in this business, so I have some questions:
- How can I integrate the ISR data nodes in the Adobe form? I only have field index, -value, -name.
- How can I integrate the Adobe form Web Dynpro into the UWL?
    Regards

Found it. The problem occurs with the combination of ECC SP8 and NetWeaver 13.
    See note 874130, but make sure to read it in German.
    Message was edited by: Hans Gmelig Meyling

  • Data distribution scheme and database fragmentation

    Hi all,
I'm working on a scenario (university) involving the fragmentation of a central database. A company has regional offices (England, Wales, Scotland) and each regional office has differing combinations of business areas. They currently have one central database in their head office, and my task is to "design a data distribution scheme". By scheme, does this mean something like horizontal / vertical fragmentation? Also, can somebody point me to an Oracle-specific example of creating a fragmented table? I've tried to search online and have found the "partition by" keyword, but not much else except for database linking - but I'm thinking this is more concerned with querying than actually creating the fragments.
    Many thanks for your time

    >
    Partitioning is what the tutor meant by "fragmentation". So if there is a current central database and I have created new databases for each regional office I could run something like the below statement on the regional databases to create a bespoke version of the employee table filtered by data relevant to them? This is all theoretical and we don't have to develop the database, I just want to get the syntax correct - Thanks!
    >
    There you go talking about 'new databases' again. You said your original task was this
    >
    my task is to "design a data distribution scheme".
    >
    Is the task to give the regions access to their own data in the ONE central DB? Or to actually create a new DB for each region that contains ONLY that regions data?
    So are we talking ACCESS to a central DB by region? Or are we talking replication of the entire central DB to multiple regions?
    Your example table is partitioned by region. But if each region has their own DB why would you put data for other regions in it?
    If you are wanting each region to have access to their own data in the central DB then you could partition the central DB tables like your example:
CREATE TABLE employees (
id NUMBER NOT NULL,
fname VARCHAR2(30),
lname VARCHAR2(30),
hired DATE DEFAULT DATE '1970-01-01' NOT NULL,
separated DATE DEFAULT DATE '9999-12-31' NOT NULL,
job_code NUMBER,
store_id NUMBER,
region_id NUMBER
)
PARTITION BY LIST (region_id) (
PARTITION wales VALUES (2)
);
But if you are creating a regional DB that includes data only for that region, there is no need to partition it.

Can one logical table have both fragmented and non-fragmented table sources?

    Hi,
I created one logical table with three logical table sources, as explained below.
1. An Inventory Item logical table source with a fragmentation clause of ITEM_TYPE='INVENTORY'; I checked the source combination feature for this LTS.
2. A Punchout logical table source with a fragmentation clause of ITEM_TYPE='PUNCHOUT'; I checked the source combination feature for this LTS.
3. A Category logical table source without any fragmentation.
The relation between category and item is one-to-many.
I am getting errors in Answers if I try to query all attributes of item for a category in a dimension-only query. Could somebody validate whether I can create one logical table with fragmented as well as non-fragmented logical table sources?

    Can you share the error messages you are getting?
    regards
    John
    http://obiee101.blogspot.com

  • Dynamic vlan on ISR 8xx and other platforms...

    Hi,
Can the 8xx and ISR platforms support dynamic VLANs? The SOHO ISRs have the integrated wireless service feature. Does anyone have the feature matrix? We want to implement dynamic host-based access for the SOHO users using ISE.
    thanks,
    Ramesh

    Hi,
Are you talking about dynamic VLAN assignment for users authenticating against the ISE?
If yes, then that can be done.
    Regards
    Dhiresh

  • CSS 11501 and GRE

    Greetings:
I have a 3550-48 EMI switch sitting behind a CSS, and I need to establish a GRE tunnel to another switch on the other side of the CSS. In the end configuration it will not be possible to bypass the CSS to establish the tunnel.
I have successfully established the GRE tunnel between the two switches around the CSS in my lab environment, so I know the basic configuration is correct.
I have a feeling that the problem lies in the layer-3 translation at the CSS (since GRE uses its own IP protocol number, 47, rather than TCP or UDP).

    I actually have been attempting to NAT. Unfortunately, in my configuration the systems on the "unauthorized" side of the CSS don't know about the internal address of the 3550.
    Can you send me the configuration you used in your lab?
    We currently use the same technique using a PIX as the edge device and it works fine (and I know that the CSS performs a different type of service and is not a firewall by nature).
