Solaris 10 panic under high TCP connect timeout rate
Today I found that one of my servers had rebooted. The messages are as follows:
Dec 20 08:14:27 hosta unix: [ID 836849 kern.notice]
Dec 20 08:14:27 hosta panic[cpu3]/thread=2a1009c3cc0:
Dec 20 08:14:27 hosta unix: [ID 165833 kern.notice] CONN_DEC_REF: connp(6000656c600) has ref = 0
Dec 20 08:14:27 hosta unix: [ID 100000 kern.notice]
Dec 20 08:14:27 hosta genunix: [ID 723222 kern.notice] 000002a1009c3420 ip:squeue_enter_chain+3a8 (60001175d80, 30013e14740, 30013e14740, 2060, 0, 0)
Dec 20 08:14:27 hosta genunix: [ID 179002 kern.notice] %l0-3: 000000007beaa184 000006000656c600 0000000000000088 0000000000002069
Dec 20 08:14:27 hosta %l4-7: 0000000000002060 000000007befac00 0000000000000000 000006000656c600
Dec 20 08:14:27 hosta genunix: [ID 723222 kern.notice] 000002a1009c34d0 ip:ip_input+854 (300003bc668, 0, 0, 0, 700dbc08, 1)
Dec 20 08:14:27 hosta genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 00000000e0000000 0000030013e14740 0000060001175d80
Dec 20 08:14:27 hosta %l4-7: 0000000000000001 00000000f0000000 0000000000000006 0000030013e14740
Dec 20 08:14:27 hosta genunix: [ID 723222 kern.notice] 000002a1009c35f0 unix:putnext+218 (600012bb910, 600012bb720, 30013e14740, 100, 600012bb9b0, 0)
Dec 20 08:14:27 hosta genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000000 0000000000000000 00000000000056a0
Dec 20 08:14:27 hosta %l4-7: 000000000000010d 00000000700b6fb8 000000007be179b8 fffffd5eff642000
Dec 20 08:14:28 hosta genunix: [ID 723222 kern.notice] 000002a1009c36a0 pfil:pfilmodrput+2d8 (600012bb9b0, 30013e14740, 2a1009be000, 0, 60001897df0, 3001796d000)
Dec 20 08:14:28 hosta genunix: [ID 179002 kern.notice] %l0-3: 00000000018a89b8 0000000000000001 0000000000000000 0000000000000001
Dec 20 08:14:28 hosta %l4-7: 0000000000000000 ffffffffffffffff 000000000183d5c0 0000030000fea000
Dec 20 08:14:28 hosta genunix: [ID 723222 kern.notice] 000002a1009c3760 unix:putnext+218 (600012bbba0, 600012bb9b0, 30013e14740, 100, 600012baf70, 0)
Dec 20 08:14:28 hosta genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000000 0000000000000000 0000000000005810
Dec 20 08:14:28 hosta %l4-7: 000000000000010d 00000000703afb48 000000007bb9c33c fffffd5eff642000
Dec 20 08:14:28 hosta genunix: [ID 723222 kern.notice] 000002a1009c3810 ce:ce_drain_fifo+52c0 (1, 30013e14740, 600013b6f00, 7bb9777c, 7bb97b80, 3000ea29d02)
Dec 20 08:14:28 hosta genunix: [ID 179002 kern.notice] %l0-3: 0000000000008100 00000600012baf70 0000000000000000 00000000010b6cc8
Dec 20 08:14:28 hosta %l4-7: 0000000000000000 000000007bb98174 000000000105e320 00000600012cd1a0
Dec 20 08:14:28 hosta unix: [ID 100000 kern.notice]
Dec 20 08:14:28 hosta genunix: [ID 672855 kern.notice] syncing file systems...
Dec 20 08:14:29 hosta genunix: [ID 733762 kern.notice] 4
I found "http://sunsolve.sun.com/search/document.do?assetkey=1-26-102551-1", but I have already applied this patch. Any suggestions?
I have been suffering from this problem for two days; the server reboots three or four times a day. Sorry for my poor English.
If you have a support contract then file a bug report. Also be sure to tell them that you have already applied the corrective patch and list the patch number.
This is why the last two digits are allowed to go up sequentially.
alan
Similar Messages
-
How to set TCP connection timeout in solaris 9
Hello All,
I am new to solaris. While using oracle, sometimes I face tcp connection timeout.
The timeout happens after a long delay like more than 8 min. I want to reduce the tcp connection timeout to 2 min in solaris.
Please help me to change this setting.
My current configuration is
SunOS testmachine 5.9 Generic_122300-13 sun4u sparc SUNW,Sun-Fire-V440
Thanks
Purushoth
There's a fair number of tunables. Without knowing what is timing out (DNS, lost packets...), it's hard to say what you want to tweak. The list of parameters can be seen by using ndd:
ndd /dev/tcp \?
or
ndd /dev/ip \?
and can be set by using ndd -set (see ndd(1M) ). Note that anything you set has to be reset on reboot, so you have to stick this in a script somewhere, or know what the variable translates to to stick it into /etc/system.
-r -
Does anyone know of a way to increase the TCP connection timeout on Linux (RedHat ES 3.0, 2.4.21-9.0.3.ELsmp)? We currently always keep a "dead" server in our imqAddressList for failover. The server has nothing listening on the portmapper port, 7676. When I telnet over regular Internet it takes less than a second to get a connection refused response:
telnet: Unable to connect to remote host: Connection refused
When I telnet to this server over a low bandwidth satellite connection, I get a timeout after 3 minutes:
[root@client# time telnet server 7676
Trying X.X.X.X...
telnet: connect to address X.X.X.X: Connection timed out
real 3m15.393s
user 0m0.010s
sys 0m0.000s
We currently have 3 servers in our imqAddressList and imqReconnectAttempts is set to 0. However, since one of the 3 servers is dead, 1/3 of the time it takes over 3 minutes to get a connection. I'd imagine that the socket connection from IMQ is exhibiting the same behavior as telnet.
Is there anywhere that I can tweak this timeout?
Thanks,
Aaron
You need to call connect on a socket set to non-blocking mode with fcntl, and then use select with a timeout to limit the amount of time you will wait for the connect to complete. If select returns because you timed out, then close the socket and return an error. If select returns because of an event on the socket, you use getsockopt to determine if the connect succeeded or not.
See Stevens, Unix Network Programming Vol 1 for details. Comments in the code I'm looking at say page 411.
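The non-blocking connect described above can be sketched roughly as follows. This is only an illustration in Python rather than the C code the reply refers to: the function name `connect_with_timeout` and its parameters are made up for this example, and it assumes a POSIX-like system where a finished connect makes the socket writable.

```python
import errno
import os
import select
import socket


def connect_with_timeout(ip, port, timeout_s):
    """Return a connected TCP socket, or raise OSError (TimeoutError on timeout)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # Python's counterpart of setting O_NONBLOCK via fcntl
    try:
        # On a non-blocking socket connect normally returns EINPROGRESS at once.
        rc = sock.connect_ex((ip, port))
        if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            raise OSError(rc, os.strerror(rc))
        # The socket becomes writable once the connect finishes, whether it
        # succeeded or failed; select gives up after timeout_s seconds.
        _, writable, _ = select.select([], [sock], [], timeout_s)
        if not writable:
            raise TimeoutError(f"connect to {ip}:{port} timed out after {timeout_s}s")
        # SO_ERROR tells us whether the connect actually succeeded.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise OSError(err, os.strerror(err))
        sock.setblocking(True)
        return sock
    except BaseException:
        sock.close()
        raise
```

A call such as `connect_with_timeout("192.0.2.10", 7676, 5.0)` (the address is a placeholder) would then give up after about five seconds instead of waiting for the kernel's own connect timeout.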
Hope this helps. -
Hello,
Our customer has a problem with TCP connections not being closed correctly on the ACE. The TCP session (HTTP protocol) is closed _correctly_ (we can see it in the sniffer output), but 'sh conn' on the ACE still shows it as 'established' even though the session is already closed. The TCP timeout is set to the default (60 min).
Any new connection from the same source port (which happens because there are many connections to the service) is closed right after the TCP session is established.
When I generate 200 concurrent TCP sessions in my lab, they are closed correctly on the ACE. The customer's traffic is around 20,000-30,000 concurrent sessions, but I can't generate that much traffic.
SW version on the ACE: 3.0(0)A1(3b)
thx
martin
Thanks Gilles!
The problem occurs only with traffic from WAP nodes (too many short HTTP requests).
We will try upgrading to A1(5b), but I'm not sure whether this bug is our problem...
Bug description:
Symptom:
With L7 LB configuration, Some times connections do not close.
Conditions:
SYN sent to Real server may result in ACK coming from server. ACE TCP module was not handling this ACK correctly.
...but our traffic is only L4 LB, and we have a problem with the connection state on the ACE on both sides (client and server). On the client and server sides the connection is closed properly, but on the ACE module ('sh conn') we still see it in the 'established' state. It is only removed after the TCP timeout, which is not correct.
martin -
Tcp Connection timeout on ASA for vpn traffic
Hello All
I need an answer please.
I want to set the TCP connection timeout to unlimited for some IPs coming through the VPN.
So, I created an access-list defining the traffic for which I want this TCP timeout.
Then I created a class map and a policy map, and set the timeout to '0'.
I applied it under the default service-policy, which is applied globally (by default).
My doubt is: should I apply the service policy on the interface, or will the global one work?
Just a silly doubt
Thanks in advance.
Hi,
I think it should work just fine if you attach it to the default "policy-map" configuration that you have attached globally on the ASA.
You might want to configure the timeout value as something long rather than setting it as unlimited.
- Jouni -
Hi
I'm using utl_tcp to send some information. My problem is that it takes approx. 21 seconds for the connection to time out when the host does not exist. So no read, no write; just building up the connection takes 21 seconds to time out, and I want to set this to a lower value, if possible. I read the documentation and googled a lot, and found some tips related to sqlnet.ora, but those don't seem to work.
thanks for the answers in advance
894414 wrote:
I'm using utl_tcp to send some information. My problem is that it takes approx. 21 seconds for the connection to time out when the host does not exist. So no read, no write; just building up the connection takes 21 seconds to time out, and I want to set this to a lower value, if possible. I read the documentation and googled a lot, and found some tips related to sqlnet.ora, but those don't seem to work.
SQL*Net settings should not be applicable to a manual/programmatic TCP client created in PL/SQL code.
I doubt that you can do anything to speed up the timeout. This is a protocol stack issue, as the documentation of the connect() socket call explains. The issue with PL/SQL is that async calls cannot be made, so wrapper calls like TCP sockets in PL/SQL need to be synchronous. -
Hi all,
I am new to the world of labview and am attempting to build a VI which sends commands to a 750-881 WAGO controller at periodic intervals of 10ms.
To set each of the DO's of the WAGO at once I therefore attempt to send the Modbus fc15 command every 10ms using the standard Labview TCP write module.
When I run the VI it works for about a minute before I receive an Error 56 message telling me the TCP connection has timed out. Thinking this strange, I decided to record the number of bytes sent via the TCP connection whilst running the program. In doing so I noticed that the connection broke after exactly 113655 bytes of data had been sent each time.
Thinking that I may have been sending too many messages I increased the While-loop delay from 10ms to 20, 100 and 200 ms but the error remained. I also tried playing with the TCP connection timeout and the TCP write timeout but neither of these had any effect on the problem.
I cannot see why this error is occurring, as the program works perfectly up until the 113655 bytes mark.
I have attached a screenshot of the basic VI (simply showing a MODBUS command being sent every second) and of a more advanced VI (where I am able to control each DO of the WAGO manually by setting a frequency at which the DO should switch between ON and OFF).
If anybody has any ideas on where the problems lie, or what I could do to further debug the program this would be greatly appreciated.
Attachments:
Basic_VI.png 84 KB
Expanded_VI.png 89 KB
AvdLinden wrote:
Hi ThiCop,
Yes the error occurs after exactly 113655 bytes every time. The timeout control I would like to use is 10ms, however even increasing this to 1s or 10s does not remove the error, which leads me to believe that this is not the issue (furthermore, not adding any delay to the while loop, thus letting it run at maximum speed, has shown that the TCP connection is able to send all 113655 bytes in under 3 seconds again pointing towards the timeout control not being the issue here).
I attempted Marco's suggestion but am having difficulty translating the string returned into a readable string (right now the response given is " -# + ").
As to your second suggestion, I implemented something similar where I created a sub VI to build a TCP connection, send a message and then close the connection. I now build each message and then send the string to this subVI which successfully sends the command to my application. Whilst not being the most elegant method of solving the issue, it has resolved the timeout problem meaning I am able to send as many commands as I want. So in that sense the problem has been solved.
If you still have tips on how to correctly read the TCP read output, I would however like to see if I could not get my first program to work as it is slightly more robust in terms of timing.
Modbus TCP RTU is a binary protocol, as you show in your Basic VI, where you format the data stream using byte values. So you have to interpret the returned answer accordingly, with the Modbus RTU spec in hand. Now what is most likely happening is that the connection gets hung after a while since you do NOT read the data the device sends as response to your commands. The TCP/IP stack buffers those bytes and at some point the internal buffers overflow and the connection is blocked by the stack. So adding the TCP Read at strategic places (usually after each write) is the proper solution for this. Is there any reason that you didn't use the NI provided Modbus TCP library?
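To make the "read after each write" advice concrete outside of LabVIEW, here is a minimal Python sketch. It assumes plain Modbus TCP framing (a 7-byte MBAP header followed by the PDU) and function code 15 (Write Multiple Coils); the helper names `build_fc15_request`, `recv_exact`, `read_response` and `write_coils` are invented for this example and are not part of any NI or WAGO library.

```python
import socket
import struct


def build_fc15_request(transaction_id, unit_id, start_addr, coils):
    """Build a Modbus TCP 'Write Multiple Coils' (function 0x0F) frame."""
    data = bytearray((len(coils) + 7) // 8)
    for i, on in enumerate(coils):
        if on:
            data[i // 8] |= 1 << (i % 8)  # coils are packed LSB first
    pdu = struct.pack(">BHHB", 0x0F, start_addr, len(coils), len(data)) + bytes(data)
    # MBAP header: transaction id, protocol id (0), length of unit id + PDU, unit id
    return struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id) + pdu


def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver a response in several pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def read_response(sock):
    """Read one complete Modbus TCP response and return (transaction_id, pdu)."""
    transaction_id, _protocol_id, length, _unit_id = struct.unpack(
        ">HHHB", recv_exact(sock, 7)
    )
    return transaction_id, recv_exact(sock, length - 1)


def write_coils(sock, transaction_id, unit_id, start_addr, coils):
    """Send one FC15 request and consume its response before returning."""
    sock.sendall(build_fc15_request(transaction_id, unit_id, start_addr, coils))
    tid, pdu = read_response(sock)
    if tid != transaction_id:
        raise IOError(f"unexpected transaction id {tid}")
    if pdu[0] & 0x80:  # high bit set marks a Modbus exception response
        raise IOError(f"Modbus exception code {pdu[1]}")
    return pdu
```

Because every request is paired with a read of its response, unread replies never pile up in the receive buffers, which is the failure mode described above.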
Rolf Kalbermatter
CIT Engineering Netherlands
a division of Test & Measurement Solutions -
Router closes TCP connection after 30 minutes
I have recently replaced my D-Link DIR-100 router with a Cisco Linksys RV042, but unfortunately there seems to be a problem with it.
I have an external TCP connection coming in to a local service, and I therefore set up the router to redirect the incoming connection for the given port to the local PC hosting the service. This worked perfectly. I also opened the firewall access rules to allow all data from WAN2 to be propagated through. This also worked just fine, and I can connect from the internet to the local PC, just like I could with my old router. Unfortunately this is where the similarities stop. When there is no communication on the TCP connection for more than 30 minutes, the router closes the connection automatically. This is NOT what I want. I only communicate on the TCP connection very rarely, but I do not want it closed automatically, at least not after just 30 minutes.
I did some research online, and it appears that some routers have a TCP connection timeout, which in the router I read about defaulted to 1 day. This would be OK. I experimented and found that if there is communication every 30 minutes the router does not close the connection, but if there are 50 minutes between communications it closes the connection.
As I read that this timeout has to do with security I experimented with the firewall and found the following:
1. Disabling the entire router firewall fixes the problem !!!
2. Disabling just DoS has no effect (problem still exists)
3. Disabling SPI means I cannot connect at all !!! (new and much worse problem)
4. Disabling Block WAN Requests has no effect (problem still exists)
Is there a way to solve this problem without disabling the entire firewall, as that is not what I want to do. I have the system set up for Dual WAN (load balancing), and I only want to allow connections to a handful of ports on the one WAN, and block the other WAN entirely.
P.S. I was referred to the Cisco Small Business Support Community by the Cisco Home community, so I hope this is the right place.
Hi Ddb101,
This is a limitation (or feature, depending on how you look at it) of the iPhone/iPod touch. 30 minutes after the device locks (usually after 5 minutes of inactivity) the network turns off completely to save battery life. You can either turn autolock off globally, or some programs (mine for instance, see: http://ootunes.com/app/ ) have an option that disables sleep while the app is running so the stream will keep playing... until your battery dies. The only problem is that with the screen on the battery actually dies even faster!
Finally, if the device is connected to a constant power source, it shouldn't actually time out. So if you have a way to plug it in, it shouldn't quit after 30 minutes on you...
hope that helps,
also, since there's a link to my site up there, and I sell the app, I should tell you that I might get money if you go to the page and end up buying my app... -
TCP connection for DHCP failover frequently are broken in Solaris 10
Hi
We have two DHCP servers installed on Solaris 10 and configured as a failover pair. Currently we find that the TCP connection for the DHCP failover protocol is frequently broken. It looks like the primary DHCP server sends a FIN message to the secondary one on its own initiative, but in general this TCP connection should always stay alive. Moreover, the TCP connection is not completely closed: the FIN_WAIT_2 state on the primary and the CLOSE_WAIT state on the secondary last for a long time.
Will Solaris 10 cause this fault? Is it a known bug in OS?
OS info:
-bash-3.00$ cat /etc/release
Solaris 10 5/08 s10s_u5wos_10 SPARC
Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 24 March 2008
-bash-3.00$
-bash-3.00$
-bash-3.00$ uname -a
SunOS edns1 5.10 Generic_142900-03 sun4v sparc SUNW,Netra-T5220
TCP connection info:
Primary DHCP Server:
2012 08 29 03:41:43
PING 172.25.6.137: 56 data bytes 64 bytes from edns2 (172.25.6.137): icmp_seq=0. time=0.678 ms
remote refid st t when poll reach delay offset disp
==============================================================================
*idns1 195.26.151.151 3 u 45 1024 377 0.75 -0.071 0.05
+idns2 195.26.151.151 3 u 162 1024 377 0.93 0.169 0.08
clusternode1-pr 0.0.0.0 16 - - 1024 0 0.00 0.000 16000.0
+clusternode2-pr idns1 4 u 406 1024 376 0.49 -0.154 15.12
172.25.6.133.647 172.25.6.137.58107 49640 0 49640 0 ESTABLISHED
172.25.6.133.647 *.* 0 0 49152 0 LISTEN
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2012 08 29 03:41:47
PING 172.25.6.137: 56 data bytes 64 bytes from edns2 (172.25.6.137): icmp_seq=0. time=0.535 ms
remote refid st t when poll reach delay offset disp
==============================================================================
*idns1 195.26.151.151 3 u 49 1024 377 0.75 -0.071 0.05
+idns2 195.26.151.151 3 u 166 1024 377 0.93 0.169 0.08
clusternode1-pr 0.0.0.0 16 - - 1024 0 0.00 0.000 16000.0
+clusternode2-pr idns1 4 u 410 1024 376 0.49 -0.154 15.12
172.25.6.133.647 172.25.6.137.58107 49640 0 49640 0 FIN_WAIT_2
172.25.6.133.647 *.* 0 0 49152 0 LISTEN
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Secondary DHCP Server:
2012 08 29 03:41:41
PING 172.25.6.133: 56 data bytes 64 bytes from edns1 (172.25.6.133): icmp_seq=0. time=1.26 ms
remote refid st t when poll reach delay offset disp
==============================================================================
*idns1 195.26.151.151 3 u 450 1024 377 0.92 -0.067 0.06
+idns2 195.26.151.151 3 u 552 1024 377 0.96 0.237 0.08
+clusternode1-pr idns1 4 u 360 1024 377 1.85 -0.528 1.51
clusternode2-pr 0.0.0.0 16 - - 1024 0 0.00 0.000 16000.0
172.25.6.137.647 *.* 0 0 49152 0 LISTEN
172.25.6.137.58107 172.25.6.133.647 49640 0 49640 0 ESTABLISHED
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2012 08 29 03:41:45
PING 172.25.6.133: 56 data bytes 64 bytes from edns1 (172.25.6.133): icmp_seq=0. time=1.36 ms
remote refid st t when poll reach delay offset disp
==============================================================================
*idns1 195.26.151.151 3 u 454 1024 377 0.92 -0.067 0.06
+idns2 195.26.151.151 3 u 556 1024 377 0.96 0.237 0.08
+clusternode1-pr idns1 4 u 364 1024 377 1.85 -0.528 1.51
clusternode2-pr 0.0.0.0 16 - - 1024 0 0.00 0.000 16000.0
172.25.6.137.647 *.* 0 0 49152 0 LISTEN
172.25.6.137.58107 172.25.6.133.647 49640 0 49640 0 CLOSE_WAIT
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Thanks!
Thanks, but had found a previous discussion with this hint and applied it.
svccfg -s sendmail listprop shows config /local_only = false
Yes, I would really love to fix the fault, but what I would really like is some hints as to how to debug ports under svc control. -
TCP connection inactivity timeout
Is it true that the ACE module has an tcp inactivity timeout of 3600 seconds (1 hour)?
Yes, it is true. These are the default values:
Parameter-map : new
Description : -
Type : connection
nagle : disabled
slow start : disabled
buffer-share size : 32768
inactivity timeout (seconds) : TCP: 3600, UDP: 120, ICMP: 2   <=== HERE IT IS
embryonic timeout (seconds) : 5
ack-delay (milliseconds) : 200
WAN Optimization RTT (milliseconds): 65535
half-closed timeout (seconds) : 3600
TOS rewrite : disabled
syn retry count : 4
TCP MSS min : 0
TCP MSS max : 1460
tcp-options drop range : 0-0
tcp-options allow range : 0-0
tcp-options clear range : 1-255
selective-ack : clear
timestamp : clear
window-scale : clear
window-scale factor : 0
reserved-bits : allow
random-seq-num : enabled
SYN data : allow
exceed-mss : drop
urgent-flag : allow
conn-rate-limit : disabled
bandwidth-rate-limit : disabled -
Oracle 9.0.1 on Solaris/SPARC infinite connection timeouts
Hello, All!
I have the following system configuration:
Solaris 5.9 Maintenance Update 4;
Sun Ultra 10, 1 UltraSPARC IIi 440Mhz CPU, 1024M RAM;
Oracle 9.0.1 (9i Release 1).
The database was created with JServer option turned on (we're using the integrated CORBA functionality); archive log mode is OFF, database cache advice is ON.
In a nutshell,
==============
the problem is in really infinite connection timeouts of Oracle after a long time of inactivity (say, on Monday after two weekend days).
Long description.
=================
Client-side symptoms.
First, I tried to establish a connection to a published CORBA object, and received no response (I am already waiting for 1 hour 40 minutes).
Then, I tried to connect to the database with SQLPlus, remotely:
$ sqlplus /nolog
SQL> CONNECT SYS/change_on_install@ORCL AS SYSDBA;
All the same, I receive no reply. However, the listener is alive, I can establish a network connection to both ports 1521 and 2481. Client-side logs (specified via sqlnet.ora) say the following:
=========== skipped ===========
[09-АВГ-2004 10:27:24:770] nioqsn: entry
[09-АВГ-2004 10:27:24:770] nioqsn: exit
[09-АВГ-2004 10:27:24:770] nioqrc: entry
[09-АВГ-2004 10:27:24:771] nsdo: cid=0, opcode=84, bl=0, what=1, uflgs=0x20, c
[09-АВГ-2004 10:27:24:771] nsdo: rank=64, nsctxrnk=0
[09-АВГ-2004 10:27:24:771] nsdo: nsctx: state=8, flg=0x400d, mvd=0
[09-АВГ-2004 10:27:24:771] nsdo: gtn=127, gtc=127, ptn=10, ptc=2011
[09-АВГ-2004 10:27:24:771] nsdofls: DATA flags: 0x0
[09-АВГ-2004 10:27:24:771] nsdofls: sending NSPTDA packet
[09-АВГ-2004 10:27:24:771] nspsend: plen=770, type=6
[09-АВГ-2004 10:27:24:771] nttwr: entry
[09-АВГ-2004 10:27:24:771] nttwr: socket 12 had bytes written=770
[09-АВГ-2004 10:27:24:771] nttwr: exit
[09-АВГ-2004 10:27:24:771] nspsend: 770 bytes to transport
[09-АВГ-2004 10:27:24:772] nsdo: nsctxrnk=0
[09-АВГ-2004 10:27:24:772] nsdo: cid=0, opcode=85, bl=0, what=0, uflgs=0x0, cf
[09-АВГ-2004 10:27:24:772] nsdo: rank=64, nsctxrnk=0
[09-АВГ-2004 10:27:24:772] nsdo: nsctx: state=8, flg=0x400d, mvd=0
[09-АВГ-2004 10:27:24:772] nsdo: gtn=127, gtc=127, ptn=10, ptc=2011
[09-АВГ-2004 10:27:24:772] nsdo: switching to application buffer
[09-АВГ-2004 10:27:24:772] nsrdr: recving a packet
[09-АВГ-2004 10:27:24:772] nsprecv: reading from transport...
[09-АВГ-2004 10:27:24:772] nttrd: entry
If I try to connect to the database locally, I receive the same result:
$ sqlplus /nolog
SQL> CONNECT / AS SYSDBA;
-- no reply.
Server-side symptoms.
Oracle logs in ${ORACLE_BASE}/admin/ORCL have no entries referring the last two days. Oracle server debug log (specified via sqlnet.ora) says the following:
========== skipped ===================
nttrd: socket 20 had bytes read=181
nttrd: exit
nsprecv: 181 bytes from transport
nsprecv: tlen=181, plen=181, type=6
nsrdr: got NSPTDA packet
nsrdr: NSPTDA flags: 0x0
nsdo: what=1, bl=2009
nsdo: nsctxrnk=0
nioqrc: exit
nioqsn: entry
nioqrc: entry
nsdo: cid=0, opcode=84, bl=0, what=1, uflgs=0x20, cflgs=0x3
nsdo: rank=64, nsctxrnk=0
nsdo: nsctx: state=8, flg=0x420c, mvd=0
nsdo: gtn=156, gtc=156, ptn=10, ptc=2019
nsdofls: DATA flags: 0x0
nsdofls: sending NSPTDA packet
nspsend: plen=93, type=6
nttwr: entry
nttwr: socket 20 had bytes written=93
nttwr: exit
nspsend: 93 bytes to transport
nsdo: nsctxrnk=0
nsdo: cid=0, opcode=85, bl=0, what=0, uflgs=0x0, cflgs=0x3
nsdo: rank=64, nsctxrnk=0
nsdo: nsctx: state=8, flg=0x420c, mvd=0
nsdo: gtn=156, gtc=156, ptn=10, ptc=2019
nsdo: switching to application buffer
nsrdr: recving a packet
nsprecv: reading from transport...
nttrd: entry
nttrd: socket 20 had bytes read=770
nttrd: exit
nsprecv: 770 bytes from transport
nsprecv: tlen=770, plen=770, type=6
nsrdr: got NSPTDA packet
nsrdr: NSPTDA flags: 0x0
nsdo: what=1, bl=2009
nsdo: nsctxrnk=0
nioqrc: exit
Oracle local client debug log (oracle enterprise manager runs on the same machine) says:
...nsevwait: nsevwait: nsevwait: nsevwait: nsevwait:
nsevwait: nsevwait: nsevwait: nsevwait: nsevwait:
nsevwait: nsevwait: nsevwait: nsevwait: nsevwait: ...
ps -e -o "user,pid,pcpu,pmem,rss,vsz,args" says there're several (3) processes running as oracle, with "args" oracle_ORCL, which eat up nearly all available memory (they have "pmem" values of 25, 25 and 17 per cent, respectively). One of these processes has "pcpu" value varying from 97% to 100%, moreover, this is a user, NOT system time (according to sdtperfmeter, disk/page/swap activity is extremely low; system load holds at value of 4). Here is a sample vmstat output:
cpu
cs us sy id
287 99 1 0
285 100 0 0
280 100 0 0
272 100 0 0
298 98 2 0
275 99 1 0
267 100 0 0
307 97 3 0
282 100 0 0
270 100 0 0
307 98 2 0
287 100 0 0
Here cs is the number of cpu context switches per second;
us is cpu user time;
sy is cpu system time.
The question is: what oracle may be doing and how can I fix the problem?
Thanks in advance.
Now, 3 hours later, the first two (of three) connections got established, but subsequent database queries are in the same nearly dead state.
CPU usage remains about 100%, system load 4.
I know that oracle restart (and/or system restart) will cure the problem -- but only until next weekend.
Can this be oracle misconfiguration? -
Proper method to reset tcp connection after timeout error
I have an application that I am building that communicates with a Modbus TCP device. If a communications error occurs I would like to be able to reset the TCP communications. What I have is a control that fires an event when pushed. In this event I have a sequence that first closes the TCP connection and then opens a new connection. My application starts and runs fine. To test the reset function I removed the ethernet cable from the device and waited for a timeout to occur. I plugged the cable back in and pushed my reset control. Occasionally the reset will occur, but most times I will get a timeout error at the Open TCP VI. After this, the only way I can establish communications is to exit my application, then disable and re-enable my network device. Then when I restart my application I have communications with my device.
Any help would be appreciated on how I should be resetting my TCP connection.
Thanks
Terry
Terry S wrote:
I have attached a example vi (LV10) that shows just the TCP connection and Reset. An error will occur when trying to perform the open tcp in the reset event.
As written your code should be fine. There is nothing inherently wrong with it. However, depending on the device you are communicating with, you may be trying to reestablish the connection too quickly after you closed it. The device may not allow multiple connections to it and may require some time to clean things up on its end after you close a connection. As an experiment, try waiting a short time between the TCP Close and the TCP Open. If possible you may want to try using Wireshark to see what is happening on the network. It can be useful in diagnosing what is going on.
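The "wait between close and open" experiment could look something like this outside of LabVIEW. This is a hedged Python sketch only; the `reconnect` function, its parameter names and the default delay values are assumptions for illustration, and the right settle time depends entirely on the device.

```python
import socket
import time


def reconnect(old_sock, address, settle_s=1.0, attempts=3, connect_timeout_s=5.0):
    """Close old_sock, pause, then try to open a fresh connection to address."""
    if old_sock is not None:
        try:
            old_sock.shutdown(socket.SHUT_RDWR)  # tell the peer we are done
        except OSError:
            pass  # the link may already be dead, which is fine here
        old_sock.close()
    last_error = None
    for attempt in range(attempts):
        # Give the device time to release its side; wait longer on each retry.
        time.sleep(settle_s * (attempt + 1))
        try:
            return socket.create_connection(address, timeout=connect_timeout_s)
        except OSError as exc:
            last_error = exc
    raise ConnectionError(f"could not reconnect to {address}") from last_error
```

If a pause like this makes the reset reliable, that points at the device needing time to free its single connection slot rather than at a problem in the application.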
Mark Yedinak
"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot -
ASA TCP Idle Connection Timeout Suspense
Hello, I replaced our Cisco ASA 5520 with a Cisco ASA 5585. Though both ASAs were configured with default TCP idle connection timeout values, people are now starting to complain that idle SSH connections are being terminated. This is proper behavior, but they claim it didn't occur with the old firewall. Our users are setting keepalives of 1800 seconds to get around this until I can bump the setting to infinite (setting 0). Is there a bug with the feature in older ASA OS?
Hi,
Before looking for a bug I would check the ASA logs (hopefully you are storing them to a separate Syslog server) and see why the connections are torn down (Teardown reason) and how long have they been on the ASAs connection table before they were torn down.
You also have the option to perform traffic capture on the ASA for the traffic in question and confirm why or which party terminates the connection.
I guess you can use the MPF on the ASA to configure separate idle timeouts for just these SSH Connections if you do not want to touch the global timeout values.
I have not run into any problems with the timeout settings on the older software. In the newer software (8.3+) I have run into these problems. In those situations the ASA has not removed connections that have reached the timeout value. I have seen connections that have been idle for over 1000h.
- Jouni -
When I create a TCP connection from a VM to the internet, if I'm idle for more than a few minutes (say a SSH session), the TCP flow is torn down by some AZURE networking element in between.
Incoming connections from the internet in don't seem to be affected.
I assume this is an Azure firewall timeout somewhere.
Is there any way to raise this?
Hi,
Thanks for posting here.
Here are some suggestions:
[1] - You can make sure the TCP connection is not idle. To keep your TCP connection active you can keep sending some data before 60 seconds pass. This could be done via chunked transfer encoding; send something, or just send blank lines, to keep the connection active.
[2] - If you are using WCF based application please have a look at below link:
Reference:
http://code.msdn.microsoft.com/WCF-Azure-NetTCP-Keep-Alive-09f50fd9
[3] - If you are using TCP sockets then ServicePointManager.SetTcpKeepAlive(true, 30000, 30000) might be used to do this. TCP Keep-Alive packets will keep the connection from your client to the load balancer open during a long-running HTTP request. For example if you're using .NET WebRequest objects in your client you would set ServicePointManager.SetTcpKeepAlive(…) appropriately.
Reference -
http://msdn.microsoft.com/en-us/library/system.net.servicepointmanager.settcpkeepalive.aspx
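For clients that are not written in .NET, the same keep-alive idea from suggestion [3] can be applied directly to a socket. The following Python sketch is an illustration only: the function name `enable_keepalive` and the 30-second defaults are assumptions, and the per-socket tuning options are guarded because they are not available on every platform.

```python
import socket


def enable_keepalive(sock, idle_s=30, interval_s=30, probes=4):
    """Turn on TCP keep-alive probes so an idle connection still carries traffic."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs below exist on Linux and some other systems only,
    # so each one is applied only if this Python build exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)  # idle time before first probe
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)  # gap between probes
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)  # failed probes before drop
    return sock
```

The idle time should be chosen comfortably below whatever idle timeout the network element in between enforces; where the tuning options are missing, only the system-wide keep-alive defaults apply.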
Hope this helps you.
Girish Prajwal -
WAAS - TCP Connection High-Water Mark
Is there a way to tell the historical statistic for maximum number of TCP connections in a WAE? We are in the middle of a deployment and I am wondering how well we sized our HQ cluster.
Through CM dashboard -> manage devices and go to edge or core device, you will be able to see the connections for each device.
The other option, is to check and see if you are seeing a lot of bypass traffic.
Once WAAS reaches max, traffic will be bypassed until max sessions drops