"Hung" connections

Hi.
We are trying to determine why TCP-based connections consistently hang on our Sun E450 running Solaris 8. With snoop on the server we can watch packets go back and forth even while a connection is hung, but the service daemon doesn't seem to send the right response back to the workstation. The response appears to be zero bytes in size. This happens with telnet, ftp, and ssh.
Is there a known kernel/ip interface/streams/socket bug that causes this problem?
Thanks,
Marty Schlafer

Interestingly, the problem was caused by packet fragmentation over a VPN. Changing the MTU on the VPN endpoints to be large enough for the IP header, the IPsec header, and the rest of the packet data solved the problem.
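The arithmetic behind that fix can be sketched in a few lines of Python. This is only an illustration: the 20-byte figure is an IPv4 header without options, while the 50-byte IPsec overhead is an assumed placeholder (the real value depends on the IPsec mode and ciphers in use), and the function names are made up for this example.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options


def required_tunnel_mtu(inner_packet_size, ipsec_overhead, outer_ip_header=IPV4_HEADER):
    """Smallest MTU on the VPN link that carries the encapsulated packet in one piece."""
    return inner_packet_size + ipsec_overhead + outer_ip_header


def will_fragment(inner_packet_size, tunnel_mtu, ipsec_overhead, outer_ip_header=IPV4_HEADER):
    """True if the encapsulated packet is larger than the MTU of the VPN link."""
    return required_tunnel_mtu(inner_packet_size, ipsec_overhead, outer_ip_header) > tunnel_mtu


# Assumed 50-byte IPsec overhead (hypothetical figure, not taken from the post).
print(required_tunnel_mtu(1500, ipsec_overhead=50))             # 1570
print(will_fragment(1500, tunnel_mtu=1500, ipsec_overhead=50))  # True
```

With these assumed numbers, a full 1500-byte packet no longer fits a 1500-byte link once it is encapsulated, which is the situation the poster describes.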

Similar Messages

  • Troubleshooting hung connection NM-32a

    When I dial in to 2620 terminal server through a dial up modem, the connection hangs at  "CONNECT 115200" . The connection will not progress to the banner/login. Has anyone experienced this?

    What does your config look like? Did you try lowering the speed below 115200 to see if it works?

  • Mail Fails To Connect In My Office Network, Works on Reboot

    This is baffling me and my network admin:
    Mail fails to connect to any of my accounts at work. But it recovers if I reboot my machine.
    Here is what happens:
    - Mail works just fine at home. Connects to gmail, iCloud, and a 3rd party email account no problems.
    - I get to the office
    - I log in to the office WiFi with my password - for this I have to open a browser window, and enter my credentials.
    (I kind of assume that this is the problem ^^)
    - All my internet connections work fine, I can use the browser to access Gmail etc
    - Mail refuses to connect to any of my email servers, and all accounts go offline
    - Connection Doctor says "Mail can connect to the internet" but all individual connections are red, not reachable. I checked "detail" too - it's empty.
    Then I reboot my Mac, and after that everything's happy. Mail connects to my accounts normally, and Connection Doctor marks all as green.
    Since it works on reboot, we can rule out network problems like closed ports etc. So the only thing I can think of is that Mail initially tries to connect as soon as I am on WiFi, then hangs (because I am not logged in yet), and somehow this hang is of a kind that does not recover (I've left it on for hours at a time, no difference, it never recovers).
    Anyone have any ideas about this? Is there some way I can restart networking daemons to kill the hung connections? Or any other way I can diagnose this?

    A captive portal, which is what you have, can work in any of at least three ways: by HTTP redirection, IP redirection, or DNS poisoning. I'm guessing that yours is the last type. In that case, a TTL of zero should be used. If the TTL is non-zero, then Internet applications may not work until the DNS cache is flushed. That happens when you reboot.
    You should discuss the possibility with your network admin. If the network configuration isn't going to be changed, you may be able to avoid the need to reboot by flushing the DNS cache manually, as described in the article linked below:
    OS X: How to reset the DNS cache
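Why a non-zero TTL matters here can be shown with a toy model of a DNS cache. This is a sketch for illustration only, not how the OS X resolver is implemented; the class, the host name, and the addresses are all hypothetical.

```python
class ToyDnsCache:
    """Minimal TTL-based cache used only to illustrate the reasoning above."""

    def __init__(self):
        self._entries = {}  # name -> (address, expires_at)

    def store(self, name, address, ttl, now):
        # An answer with TTL 0 is never cached, so it cannot outlive the portal login.
        if ttl > 0:
            self._entries[name] = (address, now + ttl)

    def lookup(self, name, now):
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]
        self._entries.pop(name, None)
        return None  # cache miss: a real resolver would query the DNS server again

    def flush(self):
        self._entries.clear()


cache = ToyDnsCache()
# Before login, the portal answers with its own address and a one-hour TTL.
cache.store("imap.example.com", "10.0.0.1", ttl=3600, now=0)
print(cache.lookup("imap.example.com", now=60))  # 10.0.0.1 (stale portal answer)
cache.flush()
print(cache.lookup("imap.example.com", now=60))  # None (next query goes to DNS)
```

In this model, the stale portal answer keeps being served until it expires or the cache is flushed, which matches the behaviour that a reboot appears to cure.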

  • Cannot connect to hyperion financial reporting in EPM 11.1.2.1

    Dear all,
    I cannot connect to FR in EPM 11.1.2.1 on our test server, and this problem prevents us from migrating to production.
    I successfully configured FR and installed it on our system, but we are still facing the same problem. We have already followed some steps published on OTN.
    *1- Reconfigured FR and redeploy and reconfigure web server. but nothing changed*
    *2- Create new user and provision him to all roles then refresh the security filters but nothing changed*
    The test server runs Windows 2008, 64-bit.
    The EPM version is 11.1.2.1.
    We received the following 2 errors.
    *1- in FR : "You are not authorized to use this functionality. Contact your administrator."*
    *2-In workspace : "Required application module reporting.HRImport is not configured"*
    Kindly, support.

    For the below issues, we had the same issues earlier....
    we received the following 2 errors .
    1- in FR : "You are not authorized to use this functionality. Contact your administrator."
    2-In workspace : "Required application module reporting.HRImport is not configured"
    The first error can occur in different situations:
    -> If FR Studio/BI is installed on a different server and you are trying to connect from another machine, you will get this error.
    In that case, ask your firewall/Windows team to open the required ports (e.g. 19000/8297, etc.). We asked our firewall team to open all Hyperion ports from the main server to the source and vice versa, and then it worked.
    Note: If BI/FR is installed on Unix and you are accessing it from Windows, the ports need to be open in both directions. Please make sure you are on the same network. Try to ping the hosts and telnet to each port from source to destination and vice versa.
    -> The same (first) error also occurs if the BI services are down or you cannot ping the server where the FR/Workspace services are installed. Also make sure your admin ID has all admin privileges.
    Second error:
    -> You get this error whenever the Workspace/FR Studio services are not up. Try restarting them directly from the /bin folder, and check the status after the restart (sometimes CLOSE_WAIT/hung connections cause the issue).
    Actually, the same error occurs even in version 9.3.1.
    Note: I am just sharing the issues and fixes from my environment; we faced the same errors and resolved them with the help of the Windows team. Apologies if this does not apply to your environment and issue.
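To check for the CLOSE_WAIT/hung connections mentioned in this reply, one option is to save the output of `netstat -an` and count connections per TCP state. The Python sketch below assumes the state is the last column of each line (true for plain `netstat -an` on common platforms, but not when extra columns such as a PID are appended); the sample text is made up for illustration.

```python
from collections import Counter

# State names as printed by common netstat implementations.
TCP_STATES = {
    "ESTABLISHED", "CLOSE_WAIT", "TIME_WAIT", "FIN_WAIT_1", "FIN_WAIT_2",
    "LISTEN", "LISTENING", "SYN_SENT", "SYN_RECEIVED", "SYN_RECV",
    "LAST_ACK", "CLOSING",
}


def count_tcp_states(netstat_output):
    """Count connections per TCP state, assuming the state is the last column."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


# Hypothetical sample, not real output from the poster's server.
SAMPLE_NETSTAT = """\
  TCP    10.0.0.5:19000    10.0.0.9:51234    CLOSE_WAIT
  TCP    10.0.0.5:19000    10.0.0.9:51240    CLOSE_WAIT
  TCP    10.0.0.5:8297     10.0.0.7:40112    ESTABLISHED
"""

print(count_tcp_states(SAMPLE_NETSTAT)["CLOSE_WAIT"])  # 2
```

A CLOSE_WAIT count that keeps growing on a service port suggests the service is not closing its side of finished connections.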

  • Clearing TCP Connections

    Hi,
    Is there a WAAS command that will allow me to clear a TCP connection that's been hung for 500+ hours? The WAE is running 4.0.13.b23.
    Thanks,
    Mike

    Thanks Zach,
    The app in question (the only app I've had an issue with) is a home-grown product, and I'm not quite sure what's causing the WAE to keep connections so long. I've checked for asymmetric routing etc., but traffic is routing correctly. I think it's an app issue, but the developers here don't like it when I say that :) My local SE and CCIE worked on this a little last week in their lab, but I finally had to remove the client and server from my WCCP ACL at the central location. Once I did that, it removed the hung connections at the remote WAE (it's inline at the site), but when I logged into the 7300 at the central location I noticed they never cleared on that device. It's not building any new connections, just hanging onto the old ones; for instance, some have been open for 1000+ hours. I just don't want to force a reboot to get them cleared, but it looks like my only option.
    Mike

  • Site to Site Replication only works for a few hours in the morning (each morning)

    We have been fighting an odd active directory replication issue for over a month now and I am hoping that someone can provide some insight. We have 5 AD servers in the following orientation...
    Site HQ
    - PRIME running Windows 2008 R2
    - AD2 running Windows 2008 R2
    Site COLO
    - AD3 running Windows 2008 (not R2)
    - AD3NEW running Windows 2008 R2
    Site BRANCH
    - AD4 running Windows 2008 R2
    The domain is at the Windows 2008 Functional Level.
    There are always on site to site VPNs between all 3 sites and IP Intersite Transports Site Links defined for all 3 possible connections with Cost of 100 and interval of 15. Each IP site link is configured with a schedule of available all day long.
    Every day the following sequence of events happens...
    * Somewhere between 6:30 and 7:30am all the servers start to sync with each other perfectly. We can make AD changes and they replicate across all servers without issues. During this time all the repadmin commands work well across all servers.
    * Typically somewhere in the 10:30 to 11:30am time frame we start to get errors replicating data - specifically between the HQ and COLO sites. This manifests itself as Event 1232 Call Timeout from the DC RPC Client and Event 1925 from the KCC. Additionally, repadmin commands fail when attempting to connect to the BRANCH servers.
    * For the rest of the day the intra-site replication between PRIME and AD2 works fine - and periodically the BRANCH AD server is updated as well. But the COLO sites remain unreplicated and continue to get errors for the remainder of the day. While this is down, the ability to ping and remote desktop between the servers is perfectly fine - so even if a network hiccup did happen, the network is then stable for hours without the sites recovering.
    * Magically the next morning around 6:30 and 7:30am all the servers are able to replicate without issue and we get 3-5 hours of immediate replication and then it happens again.
    As I stated above - there are always-on site-to-site VPN connections between all 3 sites that are actively monitored by PRTG. These connections remain open all day long. The Site topology has the COLO servers attempting to replicate with the HQ servers - and both sites have 100MB data connections that remain active during the entire time. Additionally PRTG bandwidth monitoring shows that these links have no spikes in traffic anywhere near the max capacity of those links during the time that the outages begin nor during the rest of the day.
    Does anyone have any insight as to why these servers would stop communicating with each other about the same time every day and report errors? Also why it would magically start to work again each day without any changes being made to the network or the AD configuration?
    This has been going on for over a month now. When it first started to happen we had 1 Windows 2008 server and 2 Windows 2003 servers in the HQ. We phased out the Windows 2003 servers and upgraded the functional level to Windows 2008 - that did not solve the problem. We tried to put a new Windows 2008 R2 server out at the COLO site hoping that if it was limited to the other server then only the one server would be impacted. But now they both appear to be having connectivity issues at the same time.
    It is as if there is one hung connection that is blocking all the other syncs to this site and then somehow each morning that bottleneck is released.
    Thank you in advance for any direction you can provide.

    As was stated above - ALL Domain Controllers have direct access to each other through Firewall to Firewall site to site VPNs and the Inter-Site Transport Links mirror that setup. So from the OS perspective any of the AD servers can directly connect to any other one.
    There are 3 IP Inter-Site Transport Links defined
    HQ < - > COLO   (Contains HQ and COLO sites) Cost 100  Replication Interval 15
    HQ < - > BRANCH  (Contains HQ and Branch sites) Cost 100 Replication Interval 15
    COLO < - > BRANCH  (Contains COLO and Branch sites) Cost 100 Replication Interval 15
    And on IP Inter-Site Transports "Bridge all site links" is enabled (although disabling it doesn't fix this problem as we have already tried that).
    Right now the servers are claiming (via Active Directory) to be unable to replicate with each other. But I am able to do direct pings as well as open stream sockets using "telnet <otherserver> <port>" on ports 3268 (gc), 88 (kerberos), 389 (ldap), 135 (replication), 636 (ldap ssl), 53 (DNS). So there is nothing that I can see between the servers that is blocking TCP connectivity.
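The manual telnet checks described here could be scripted so they can be repeated while replication is failing. The following Python sketch uses only the standard `socket` module; the port list is the one quoted in the post, and the host name in the usage comment is a placeholder.

```python
import socket

# Ports and labels exactly as listed in the post.
AD_PORTS = {
    3268: "gc",
    88: "kerberos",
    389: "ldap",
    135: "replication",
    636: "ldap ssl",
    53: "DNS",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=AD_PORTS, timeout=3.0):
    """Map each port number to whether a TCP connection succeeded."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Usage (placeholder host name):
# for port, ok in check_ports("otherserver").items():
#     print(port, AD_PORTS[port], "open" if ok else "unreachable")
```

Like the telnet test, a successful connect only shows that the TCP handshake completes on that port; it does not prove that the service behind it is answering requests.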
    I cannot seem to make this any clearer. The sites are 100% functional and responsive for several hours per day - and then mysteriously go into a state of complete denial, for lack of a better word, for the rest of the day - only to return back to normal again reliably each morning.
    It is as if the sites get into a mode where something in the RPC area is simply refusing to talk despite the servers having full access to each other at the network level.
    Another data point to add to this mystery: while it is in the state where the HQ and COLO servers are refusing to sync with each other, you can launch the AD Users and Computers snap-in, right mouse click on the domain, change the Current Directory server and all 5 servers show up as ONLINE. You can pick any of them (including the one that it is unable to replicate with) and make a direct change on that server.
    So while the servers are complaining about being unable to talk to each other - the snap in is connecting between those servers and is able to modify it without issue.
    Conversely - when the replication is failing the DNS management tool is unable to connect to the remote servers (i.e. COLO can show itself and the other COLO server. HQ can show PRIME, DC2, and DC4 without issue. But no overlap).
    Not sure that helps at all - but shows our frustration when two servers refuse to replicate but you can easily remote connect from one to the other and make the change.

  • Office 2010 & 2007 - Excel and Access File Locking Out On the Network With Multiple Users

    This is also posted in the Office 2010 - IT Pro General Discussions, but was suggested to repost here, since a definitive answer was not found.
    Hi,
    An issue that's happening is that Excel and Access files are locking on the network. We're currently using Office 2007 and 2010.
    Here are some different scenarios that are happening:
    1. When opening the file, it is reported as locked by “User X”, who is the person that actually has the file locked, and no one else can open the file.
    2. When opening the file, it is reported as locked by “User Y”, who is NOT the actual person; it is really locked by “User X”, and no one else can access the file.
    3. When opening the file, it is reported as locked by “…another user”, which is generic, and no one else can access the file.
    The two more common events are scenarios 1 and 2, with 3 being the least common.
    This message will continue until the sessions are closed through computer management on the file server.
    The file server is running Windows Server 2003.
    This does happen on both Windows XP and Windows 7 clients.
    This does happen for users using Office 2007 and Office 2010.
    There are two sets of Office 2010 Users when it comes to patches. Everyone has the most current patches with Office 2010 SP2 while anyone that has Microsoft Project 2010 is using all the current update before Office 2010 SP2.
    All users that are using Office 2007 have all the current patches and service packs.
    Another variable is that we have users that will leave a file open on the network for 3+ days and after a while it will lock the file out.
    Also, we have Shadow Copy running daily on the system, and I'm not sure whether that has any impact if a file is open during that time.
    Any ideas on how to mitigate the lock out issues would be appreciated.
    Thanks,
    Binary Process
    Edit November 12, 2013: This issue can occur whether or not another person actually has the file open. If the person doesn't have the file open, then there is a hung connection which needs to be disconnected through Computer Management on the file server.

    Hi Binary,
    I know that the description of the hotfix does not relate to the issue. The purpose of installing it is to upgrade the SMB-related files.
    I encountered a similar issue before:
    http://social.technet.microsoft.com/Forums/windowsserver/en-US/b7fcc59b-52d9-4a02-863a-1a529bcb8cb1/temp-doc-etc-files-dont-close-after-a-file-closes-this-causes-locked-files?forum=winserverfiles
    It was resolved by upgrading the SMB files, so maybe it will help in your case.
    Another hotfix which may be related:
    http://support.microsoft.com/kb/983458

  • Can't write to flash1: in a 3 switch stack

    I have a 3 switch stack at a remote office consisting of a 48 port 3750X, 24 port 3750X PoE, and a 48 port 3750v2.  Currently this stack is functioning normally and is running 12.2(55)SE3 (universal on the X's, ipservices on the 3750v2).  I want to upgrade to SE8 so that's 12.2(55)SE8 as I see in the release notes there are some vulnerabilities fixed in SSH.
    For the life of me, I cannot copy the new file to flash1: which is the 24 port 3750X PoE.  This switch has 20 IP phones and 2 Cisco Access points on it and is vital to the branch office.  I have spare non poe switches but not spare poe switches (looking for approval), so go figure I am having the issue with the flash memory on this switch.
    If I try to tftp the file c3750e-universalk9-mz.122-55.SE8.bin to flash1, I get Error opening flash1:/c3750e-universalk9-mz.122-55.SE8.bin (Bad file number).
    I can get the file tftp'd to flash2 (the 48 port 3750X) fine, but if I try to copy it from flash2 to flash1, I get the same error (bad file number).
    Cisco TAC recommended format flash1: , so I tried that and got Error formatting flash1 (Device or resource busy).   OK, so on my own I figured maybe it's corrupt, let's run fsck /test flash1: and see.  NOPE, can't run that either... Error fscking flash1: (Device or resource busy).   OK, now let's see if any other sessions are hung where something or someone has locked flash1... so I ran systat and I am the only user on the switch stack.  OK, so no one else has a hung connection that may have had a write in process frozen.
    So what should I do?  The uptime is 30 weeks, 23 hours, 3 minutes.  The last switch in the stack, the regular 3750v2 has an uptime of 15 weeks, 1 day, 20 hours, 28 minutes because it was added to the stack back then. 
    Anyway the network is fine and functioning... just looking to standardize all our 3750's to the 122-55.SE8 release because it is the one IOS that we've had the best luck with.

    Well after a few days of SILENCE from Cisco TAC, they got back to me again and said to reseat the flash again!  I said tell me how to reseat the flash when it is soldered onto the motherboard of a 3750X switch?
    So they responded with this and they are going to replace the switch:
    The error message( Error opening flash1:/c3750e-universalk9-mz.122-55.SE8.bin (Bad file number)) indicates that the flash is not visible,
    This issue can be solved by the following steps :
    Reseat the flash if you cannot,
    Format the flash if you cannot,
    Replace the device(hardware failure)
    As I can see from this case that the device is unable to format the flash, the only way to solve the issue is to replace the switch 1,
    I can tell you support has really gone downhill.  So much that I am considering Extreme Networks, Arista and Brocade for an upcoming 10gbps switching project.  Another case we had with them was due to a random switch stack crash on 12.2.58SE2.  Well, they had the entire show tech output and they still recommended moving to 15.0.2SE4.  Well, that was BAD advice.  When you have 6 3750V2's and 3 3750X's in one stack of 9, release 15 doesn't have enough memory to even run a console.  We had to break the stack and one by one go to 12.2.55SE8.  I don't know what Cisco is thinking.  How can they say "oh, run this release" when they can see it's a huge mixed-breed stack with a large configuration, and not take memory into consideration?  I even asked "are you sure?", and they said "yes, we assure you this release will solve your problems".
    So between that bad advice and this bad advice, I am really discouraged at throwing more money into "SmartNet" contracts.

  • Unable to open the Planning application on planning desktop(9.2.x)

    Hi
    I am unable to open the application on the planning desktop.
    All the possible solutions, like unlockapp.exe and restarting the services, have been tried, but I am still unable to open the application. Also, the Essbase service is unstable, i.e. whenever I restart it (or stop and then start it) the Essbase service comes up, but after a refresh of the services list it has stopped again. Even the DSN connection between Planning and Essbase does not seem to work.
    Any solution for the issue described above would be really helpful.
    Thanks and regards
    krishnatilak

    Hi Tilak,
    Maybe there is a problem with hung connections. Try killing the hung connections and then opening the app. If that does not work, try rebooting the server; sometimes that helps with failing services. After the reboot, check the services and the hung connections. This time the hung connections should not be present, so try opening the app again; it may help.
    If the problem persists, check whether the install files are correct or whether somebody deleted any files from the bin folder.
    thnx
    :-)

  • Why is it when I launch Element 9, the window says "gathering user Info"?

    The bar on this window continues to spin, but nothing EVER happens.  It never really shows me any user information.  Not sure what this is even supposed to do since it never works.

    No, I mean you can log out until you are in PSE, then log in if you want to be connected. Or stop syncing before you launch PSE. Usually once you've eliminated the hung connection attempt it will work okay until the servers go weird again.
    BTW, you seem to think I work for Adobe. I don't.

  • App hung on "waiting"; during update, I lost my wifi connection; now cannot delete to reinstall (no X), syc does not help; itunes store says it is installed and I cannot reinstall; deleting from itunes and synching does not remove "hung app"; HELP

    App hung on "waiting"; during an update, I lost my wifi connection and the app hung; now I cannot delete it to reinstall (no X), and syncing does not help; the iTunes App Store says it is installed, therefore I cannot reinstall; deleting the app from iTunes and syncing does not reinstall the app on the iPod; the app developers have not helped and told me to check with Apple (which I am doing here, since I am past my 90 days of tech support). Hoping someone can help with this hung "waiting" app problem.

    - First try resetting the iPod:
    Reset iPod touch:  Press and hold the On/Off Sleep/Wake button and the Home
    button at the same time for at least ten seconds, until the Apple logo appears.
    - Next download the app on the computer and try to sync.
    - Last, restore from backup.

  • Icloud backup - my ipad is hung with the dialogue box saying "this iPad hasn't been backed up in 4 weeks. Backup happen when this iPad is plugged in locked and connected to WiFi". Even after doing what is said, I am not able to do anything with the iPad -

    My iPad is hung with the dialogue box saying "this iPad hasn't been backed up in 4 weeks. Backup happen when this iPad is plugged in, locked and connected to WiFi". Even after doing that, I am not able to do anything with the iPad - I can neither reboot it nor get to the home screen. The dialogue box doesn't go away at all, even after pressing OK a hundred times. I tried taking the backup through iTunes and through a 3G connection, but to no avail. Please help.

    I too have been having this issue since about 12/14.  That is around the same time I got the Ipad for my kids.  I am unsure if it is related.  I have tried logging out of icloud/deleting it and trying again.  Removing most of the apps to sync.  I reset my router.  All of the attempts have produced zero results.  I will be anxiously watching this thread for some help.  Good luck to you also

  • Connecting my iphone to mac laptop hung and shows black screen

    Whenever I connect my iPhone to my Mac laptop, the laptop hangs and shows a black screen. I even tried connecting it to a Windows machine, which caused that to crash.
    Is there any problem with iOS 6? Mine is an iPhone 4S running iOS 6.
    Please suggest any solution.

    i have no idea but i will find out.
    it recognises that the livebox is switched on though and asks if i want to connect so does that mean its not the n network

  • Screen sharing connect dialog hung

    last nite i tried 2 connect 2 a m.b.pro (running 10.6.8) over wifi from my 10.7.4 m.b.pro. it was still spinning when i closed up 4 the nite, and now this morning the dialog is still spinning, and even when selected, the dialog's cancel button is disabled...see attached screengrab:
    also note that even tho the dialog is selected, the menubar still shows the previous f/g app (ffx as i type this msg;-) so i don't even know what process 2 kill;-\
    perhaps i should file a bug report, but i'll have 2 figure out again where apple hides that;-\
    also note that i am able 2 screen share no problem w/ my 10.6.8 minimac (ethernetted to the wifi router) so the only difference is wifi2wifi...nope: i just tried s.s'ing from the minimac to the problem m.b. (via s.s from my m.b.) and got the same hung dialog:
    bouncing screen sharing doesn't help, but bouncing finder allowed s.s. to finally go thru, but the dialog is still there:-( now  i got 2 machines w/zombie dialogs:-P

    No, not looking to disable Screen Sharing, just to clear the dialog box (it floats above all other windows) without restarting.
    Found this post that resolves the issue:
    "This happens to me often if I have a problem connecting a remote computer -- those connecting windows just never go away if they hang. The issue is that NetAuthAgent is hanging. To get rid of these without rebooting:
    From GUI:
    1. Launch the Activity Monitor program.
    2. In the Process Name list, look for "NetAuthAgent". It may show as "Not Responding".
    3. Select NetAuthAgent, and then click Quit Process (the red stop sign button at the top of the Activity Monitor window). Then, click Force Quit.
    From Terminal:
    1. ps ax | grep NetAuthAgent
    2. kill <pid-of-the-process-found-above>
    This should get rid of the dialog boxes and the hung process, and VNC/Screen Sharing should continue to work, as it will launch NetAuthAgent again when it needs it."
    This did just what I was looking to do!
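The two Terminal steps quoted above (find the PID, then kill it) can also be sketched in Python. The parser below assumes the five-column layout that `ps ax` prints on macOS (PID, TT, STAT, TIME, COMMAND); the sample output, including the NetAuthAgent path, is made up for illustration, and the helper names are hypothetical.

```python
import os
import signal


def find_pids(ps_output, process_name):
    """Return PIDs whose command contains process_name, skipping the grep itself."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)  # PID, TT, STAT, TIME, COMMAND
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header line or unexpected layout
        command = fields[4]
        if process_name in command and not command.startswith("grep "):
            pids.append(int(fields[0]))
    return pids


def terminate(pids):
    """Send SIGTERM, the signal a plain `kill <pid>` sends by default."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)


# Hypothetical sample of `ps ax` output; the path is illustrative only.
SAMPLE_PS_OUTPUT = """\
  PID   TT  STAT      TIME COMMAND
  312   ??  Ss     0:01.20 /System/Library/CoreServices/NetAuthAgent.app/Contents/MacOS/NetAuthAgent
  977 s000  S+     0:00.00 grep NetAuthAgent
"""

print(find_pids(SAMPLE_PS_OUTPUT, "NetAuthAgent"))  # [312]
```

In real use, the text would come from running `ps ax` (for example via `subprocess.run`), and `terminate` would be called on the result.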

  • I'm on mac OS 10.7.4 and when i try to update my FF it gets hung up saying "dowlosing ff...connecting to update server" and gets stuck

    When I try to upgrade my Firefox browser I get a pop-up window that hangs. It says "Downloading Firefox." "connecting to the update server..." and gets stuck there.

    If there are problems with updating or with the permissions then easiest is to download the full version and trash the currently installed version to do a clean install of the new version.
    Download a new copy of the Firefox program and save the disk image (dmg) file to the desktop
    *Firefox 14.0.x: http://www.mozilla.org/en-US/firefox/all.html
    *Trash the current Firefox application to do a clean (re-)install
    *Install the new version that you have downloaded
    Your profile data is stored elsewhere in the Firefox Profile Folder, so you won't lose your bookmarks and other personal data if you uninstall and (re)install Firefox.
    *http://kb.mozillazine.org/Profile_folder_-_Firefox
