Deadlock between two SSLEngines
This may be related to my previous post, or it may be a different issue. I have the same SSLEngine implementation running in the client and the server. Here is what occurs between the two SSLEngines:
client writes 100 bytes, server receives
server writes 1217 bytes back, client receives
client writes 139 bytes
client writes 6 bytes
client writes 53 bytes (SSLEngine status goes from WRAP to UNWRAP)
server receives 100 bytes
server receives 98 bytes (SSLEngine status stays in UNWRAP)
This is very similar to my previous post: all the packet sizes are the same, but there the server received the 6 bytes and processed the handshake up to the failure exception I posted. Here, even though it receives all the bytes, it suddenly stops processing and waits for more while the client has no more to send.
What in the world could be going on? Thanks for any ideas,
dean
I think I figured this out, but I still have a deadlock, and this time I roughly know what it is. With my fixed code, it is doing this.
Last few exchanges:
client sends 6 bytes (status=OK, hsStatus=NEED_WRAP)
client sends 53 bytes (status=OK, hsStatus=NEED_UNWRAP)
server receives 59 bytes (status=OK, hsStatus=NEED_UNWRAP)
I noticed that when I fed the 59 bytes to the SSLEngine, it only consumed 6 bytes and did not read up to the 59. I am assuming the engine is missing these bytes and needs them re-fed to it. There is a problem with this, though: how do I know when to stop re-feeding bytes to the engine and when to get more from the socket because it doesn't have enough?
I guess I loop until all bytes are consumed. I will have to go try that.
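That loop can be sketched as follows (a minimal illustration with hypothetical names, not the poster's actual code). The key point for the question above is that the engine itself tells you when to stop: a BUFFER_UNDERFLOW status means "get more bytes from the socket", while any other status with bytes still remaining means "call unwrap() again on the same buffer".

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;

public class UnwrapLoop {
    /**
     * Feed netIn to the engine repeatedly until it has consumed everything
     * it can. Returns true when more bytes are needed from the socket
     * (BUFFER_UNDERFLOW), false when the buffer was fully drained.
     */
    static boolean unwrapAll(SSLEngine engine, ByteBuffer netIn, ByteBuffer appIn)
            throws SSLException {
        while (netIn.hasRemaining()) {
            SSLEngineResult result = engine.unwrap(netIn, appIn);
            switch (result.getStatus()) {
                case BUFFER_UNDERFLOW:
                    return true;            // partial record: read more from the socket
                case BUFFER_OVERFLOW:
                    // appIn is too small; the caller must enlarge it first
                    throw new SSLException("application buffer too small");
                case CLOSED:
                    return false;
                case OK:
                    break;                  // one record consumed; loop for the next
            }
            // During the handshake the engine may hand back delegated tasks
            Runnable task;
            while ((task = engine.getDelegatedTask()) != null) {
                task.run();
            }
        }
        return false;                       // all buffered bytes consumed
    }
}
```

With this shape, the 59-byte buffer above would be drained record by record (6 bytes, then 53) instead of stopping after the first unwrap().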
Similar Messages
-
Possible solution to avoid a deadlock when two inserts happen on the same table from two different machines.
Below are the details from deadlock trace.
Deadlock encountered .... Printing deadlock information
Wait-for graph
NULL
Node:1
KEY: 8:72057594811318272 (ffffffffffff) CleanCnt:3 Mode:RangeS-S Flags: 0x1
Grant List 2:
Owner:0x00000013F494A980 Mode: RangeS-S Flg:0x40 Ref:0 Life:02000000 SPID:376 ECID:0 XactLockInfo: 0x000000055014F400
SPID: 376 ECID: 0 Statement Type: INSERT Line #: 70
Input Buf: RPC Event: Proc [Database Id = 8 Object Id = 89923542]
Requested by:
ResType:LockOwner Stype:'OR'Xdes:0x0000002AA53383B0 Mode: RangeI-N SPID:238 BatchID:0 ECID:0 TaskProxy:(0x00000027669B4538) Value:0x10d8d500 Cost:(0/38828)
NULL
Node:2
KEY: 8:72057594811318272 (ffffffffffff) CleanCnt:3 Mode:RangeS-S Flags: 0x1
Grant List 2:
Owner:0x0000000B3486A780 Mode: RangeS-S Flg:0x40 Ref:0 Life:02000000 SPID:238 ECID:0 XactLockInfo: 0x0000002AA53383F0
SPID: 238 ECID: 0 Statement Type: INSERT Line #: 70
Input Buf: RPC Event: Proc [Database Id = 8 Object Id = 89923542]
Requested by:
ResType:LockOwner Stype:'OR'Xdes:0x000000055014F3C0 Mode: RangeI-N SPID:376 BatchID:0 ECID:0 TaskProxy:(0x000000080426E538) Value:0x30614e80 Cost:(0/41748)
NULL
Victim Resource Owner:
ResType:LockOwner Stype:'OR'Xdes:0x0000002AA53383B0 Mode: RangeI-N SPID:238 BatchID:0 ECID:0 TaskProxy:(0x00000027669B4538) Value:0x10d8d500 Cost:(0/38828)
deadlock-list
deadlock victim=process5daddc8
process-list
process id=process5daddc8 taskpriority=0 logused=38828 waitresource=KEY: 8:72057594811318272 (ffffffffffff) waittime=2444 ownerId=2994026815 transactionname=user_transaction lasttranstarted=2014-07-25T12:46:57.347 XDES=0x2aa53383b0 lockMode=RangeI-N schedulerid=43 kpid=14156 status=suspended spid=238 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2014-07-25T12:46:57.463 lastbatchcompleted=2014-07-25T12:46:57.463 clientapp=pa hostname=pa02 hostpid=1596 loginname=myuser isolationlevel=serializable (4) xactid=2994026815 currentdb=8 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
executionStack
frame procname=mydb.dbo.SaveBill line=70 stmtstart=6148 stmtend=8060 sqlhandle=0x03000800d61f5c056bd3860170a300000100000000000000
INSERT INTO [dbo].[Prod1] .....
inputbuf
Proc [Database Id = 8 Object Id = 89923542]
process id=process5d84988 taskpriority=0 logused=41748 waitresource=KEY: 8:72057594811318272 (ffffffffffff) waittime=2444 ownerId=2994024748 transactionname=user_transaction lasttranstarted=2014-07-25T12:46:57.320 XDES=0x55014f3c0 lockMode=RangeI-N schedulerid=39 kpid=14292 status=suspended spid=376 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2014-07-25T12:46:57.440 lastbatchcompleted=2014-07-25T12:46:57.440 clientapp=pa hostname=pa01 hostpid=1548 loginname=myuser isolationlevel=serializable (4) xactid=2994024748 currentdb=8 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
executionStack
frame procname=pa.dbo.SaveBill line=70 stmtstart=6148 stmtend=8060 sqlhandle=0x03000800d61f5c056bd3860170a300000100000000000000
INSERT INTO [dbo].[Prod1]....
inputbuf
Proc [Database Id = 8 Object Id = 89923542]
resource-list
keylock hobtid=72057594811318272 dbid=8 objectname=pa.dbo.prod1 indexname=PK_a id=lock1608ee1380 mode=RangeS-S associatedObjectId=72057594811318272
owner-list
owner id=process5d84988 mode=RangeS-S
waiter-list
waiter id=process5daddc8 mode=RangeI-N requestType=convert
keylock hobtid=72057594811318272 dbid=8 objectname=pa.dbo.prod1 indexname=PK_a id=lock1608ee1380 mode=RangeS-S associatedObjectId=72057594811318272
owner-list
owner id=process5daddc8 mode=RangeS-S
waiter-list
waiter id=process5d84988 mode=RangeI-N requestType=convert
Don't know. Perhaps these can help. I scanned the second link but didn't see much about ending deadlocks. I'd say the fourth link probably has better information than the first three, but maybe read them all just in case the fourth is missing something one of the first three has.
Deadlocking
Detecting and Ending Deadlocks
Minimizing Deadlocks
Handling Deadlocks in SQL Server
Google search for "SQL Deadlock"
La vida loca -
Cross-business-area transactions between two business areas in classic G/L
Dear experts,
We are not on the new G/L; we are still on the classic G/L. We have two business areas with cross-business-area transactions, and the client wants a tallied trial balance for each month.
How can we achieve this? I know that in the new G/L the system takes care of a tallied trial balance by using the zero-balance clearing account.
Thanks and regards,
Bramhaiah
Please activate business area financial statements in OB65.
Indicator: Business area financial statements required?
This indicator specifies that a balance sheet and/or a P&L statement is to be created per business area for internal purposes.
Use
If the indicator is set, then the "Business area" field is always ready for input when you enter documents, regardless of the field control of the posting keys and of the accounts. This indicator results in required entries in the Controlling (CO), Materials Management (MM), and Sales and Distribution (SD) components.
Rgds
Murali. N -
Oracle deadlock - how to use the "synchronized" keyword in a transaction?
Hi,
I use WL6.1 SP4 and Oracle 8.1.6, with some Java objects that execute a lot of SQL queries (mixed update, insert, and select) using plain JDBC calls and WebLogic connection pools. These objects are called by servlets.
I recently experienced deadlocks when two users call the object at the same time (see error below).
I execute the queries using the "synchronized" keyword in the following way:
synchronized (this) {
    conConnection.setAutoCommit(false);
    executeTransaction(myStatement);
    conConnection.commit();
}
executeTransaction is overridden in subclasses and is the method that executes all the queries. It calls methods in other objects. These methods are not declared as synchronized.
1) Should they be?
2) Should I use the "synchronized" keyword in another way?
3) This part of the code is also called when I do only "select" statements. I guess it should only be synchronized when we do "update" and "insert", which could lead to a deadlock?
4) Do you have any idea why this deadlock occurs even though I use the "synchronized" keyword, so one thread should wait until the other one has finished?
Thanks for any idea,
Stéphanie
----------------- error:
<ExecuteThread: '4' for queue: 'default'> <> <> <000000> <SQL request sent to database: UPDATE PARTICIPANT par SET par.PARTICIPANTLASTRANK = 4 WHERE par.IDPARTICIPANT = 8983566>
<ExecuteThread: '11' for queue: 'default'> <> <> <000000> <SQL request sent to database: UPDATE PARTICIPANT par SET par.PARTICIPANTLASTRANK = 6 WHERE par.IDPARTICIPANT = 8983570>
ORA-00060: deadlock detected while waiting for resource
at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:134)
at oracle.jdbc.ttc7.TTIoer.processError(TTIoer.java:289)
at oracle.jdbc.ttc7.Oall7.receive(Oall7.java:573)
at oracle.jdbc.ttc7.TTC7Protocol.doOall7(TTC7Protocol.java:1891)
at oracle.jdbc.ttc7.TTC7Protocol.parseExecuteFetch(TTC7Protocol.java:1093)
at oracle.jdbc.driver.OracleStatement.executeNonQuery(OracleStatement.java:2047)
at oracle.jdbc.driver.OracleStatement.doExecuteOther(OracleStatement.java:1940)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:2709)
at oracle.jdbc.driver.OracleStatement.executeUpdate(OracleStatement.java:796)
at weblogic.jdbc.pool.Statement.executeUpdate(Statement.java:872)
at weblogic.jdbc.rmi.internal.StatementImpl.executeUpdate(StatementImpl.java:89)
at weblogic.jdbc.rmi.SerialStatement.executeUpdate(SerialStatement.java:100)
at bfinance.framework.EDBBLBean.executeSQL(EDBBLBean.java:299)
Hi Stephanie,
I'd try to group update statements together. Usually it helps.
Regards,
Slava Imeshev
"Stephanie" <[email protected]> wrote in message
news:[email protected]...
Thanks for your answer.
In the case you describe, is there a way to ensure that tx-2 waits for tx-1 to finish before beginning?
My transaction which causes the problem is the following (simplified):
UPDATE tableA SET islast=0 WHERE externalid=myid;
for (int i=0; i< aVector.size(); i++) {
    INSERT INTO tableA (id, islast, ranking, externalid) (SELECT SEQ_tableA.nextval, 1, 0, myid);
    UPDATE tableA SET ranking = /* calculated ranking */ WHERE externalid=myid AND islast=1;
    UPDATE tableB ....
}
commit;
tx-1 and tx-2 execute this transaction at the same time, and tx-1 begins first. The deadlock appears when tx-2 executes the second UPDATE tableA query. I don't see how I can avoid executing these two update queries, so if I can find another way to prevent the deadlock, it would be great!
Stéphanie
Joseph Weinstein <[email protected]_this> wrote in message
news:<[email protected]_this>...
Stephanie wrote:
Hi,
I use WL6.1 SP4, Oracle 8.1.6, with some Java objects which execute a
lot
of SQL queries (mixed update, insert and select) using plain JDBC
calls,
and Weblogic connection pools. These objects are called by servlets.
I experienced recently deadlocks when two users call the object at the same time (see error below).
Hi. The error you are getting isn't necessarily from a lack of synchronization of your Java objects. It has to do with the order in which you access DBMS data. You are getting ordinary DBMS deadlocks, which are caused when two DBMS connections each hold a lock the other wants in order to proceed. The DBMS will quickly discover this and will kill one transaction in order to let the other one proceed:
time 0: tx-1 and tx-2 have started.....
time 1: tx-1: update tableA set val = 1 where key = 'A'
time 2: tx-2: update tableB set val = 2 where key = 'B'
time 3: tx-1: update tableB set val = 1 where key = 'B' (waits because tx-2 has the row locked)
time 4: tx-2: update tableA set val = 2 where key = 'A' (waits because tx-1 has the row locked)
This is a deadlock. The solution is to organize your application code so that every transaction accesses the data in the same order, e.g. update tableA first, then update tableB. This will prevent deadlocks.
Joe Weinstein at BEA
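Joe's rule can be illustrated with a small, self-contained Java sketch (in-memory ReentrantLocks standing in for DBMS row locks; the class and method names here are purely illustrative, not real JDBC). Both threads touch the same two "rows" in opposite order, but because each update sorts its keys before locking, the lock-wait cycle from the timeline above can never form:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

public class OrderedLocking {
    // Hypothetical stand-in for row locks held by two DB transactions.
    static final Map<String, ReentrantLock> rowLocks = new TreeMap<>();
    static final Map<String, Integer> rows = new TreeMap<>();

    static {
        for (String key : new String[] {"A", "B"}) {
            rowLocks.put(key, new ReentrantLock());
            rows.put(key, 0);
        }
    }

    /** Update several rows, always acquiring locks in sorted key order. */
    static void updateAll(int value, String... keys) {
        String[] sorted = keys.clone();
        Arrays.sort(sorted);                       // canonical order prevents the cycle
        for (String k : sorted) rowLocks.get(k).lock();
        try {
            for (String k : sorted) rows.put(k, value);
        } finally {
            for (String k : sorted) rowLocks.get(k).unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // tx-1 touches (A, B); tx-2 touches (B, A): same rows, opposite order
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) updateAll(1, "A", "B"); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) updateAll(2, "B", "A"); });
        t1.start(); t2.start();
        t1.join(); t2.join();   // completes; without the sort, this pattern can deadlock
    }
}
```

The same idea applies to Stéphanie's transaction: it is the canonical access order across all transactions, not Java-side synchronization, that removes the deadlock.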
-
Select for update query not working..
hi, i am trying to get this bit of code to work so i can then go to the next part of demonstrating a deadlock between two transactions. however, i cannot go any further as my initial code does not work at all. here it goes:
//////////User A////////////////////////////////
DECLARE
v_salary squad.salary%TYPE := 300;
v_pos squad.position%TYPE := 'Forward';
BEGIN
UPDATE squad
SET salary = salary + v_salary
WHERE sname = 'Henry';
FOR UPDATE;
UPDATE squad
SET position = v_pos
WHERE sname = 'Fabregas';
COMMIT;
END;
//////////////////////User B/////////////
DECLARE
v_salary squad.salary%TYPE := 200;
v_pos squad.position%TYPE := 'Forward';
BEGIN
UPDATE squad
SET position = v_pos
WHERE sname = 'Fabregas';
FOR UPDATE;
UPDATE squad
SET salary = salary + v_salary
WHERE sname = 'Henry';
FOR UPDATE;
COMMIT;
END;
Basically user A creates a lock and so does user B; user B then requests a lock held by user A and vice versa, i.e. a deadlock.
Hi
You get the following error:
ORA-06550: line 8, column 7:
PLS-00103: Encountered the symbol "UPDATE" when expecting one of the following:
because the FOR UPDATE; is invalid in your statement.
Try this:
//////////User A////////////////////////////////
DECLARE
v_salary squad.salary%TYPE := 300;
v_pos squad.position%TYPE := 'Forward';
v_n number;
BEGIN
UPDATE squad
SET salary = salary + v_salary
WHERE sname = 'Henry';
select 1 into v_n from squad
WHERE sname = 'Fabregas'
for update;
UPDATE squad
SET position = v_pos
WHERE sname = 'Fabregas';
COMMIT;
END;
//////////////////////User B/////////////
DECLARE
v_salary squad.salary%TYPE := 200;
v_pos squad.position%TYPE := 'Forward';
v_n number;
BEGIN
UPDATE squad
SET position = v_pos
WHERE sname = 'Fabregas';
select 1 into v_n from squad
WHERE sname = 'Henry'
for update;
UPDATE squad
SET salary = salary + v_salary
WHERE sname = 'Henry';
COMMIT;
END;
To synchronize the blocks, first call these two statements in one SQL*Plus session:
select 1 from squad WHERE sname = 'Fabregas' for update;
select 1 from squad WHERE sname = 'Henry' for update;
After this, start the user A code in another SQL*Plus session, and then start the user B code. After this, call rollback or commit in the first SQL*Plus session.
Ott Karesz
http://www.trendo-kft.hu -
Dealing with hard timeout of guarded service
Hi, I'm investigating the cause of, and the subsequent behavior after, a hard timeout of a guarded service. In my experience, the grid members are unable to recover properly. I am trying to figure out whether there is something in our configuration that may be aggravating the situation, and also whether I might be able to improve the behavior of our client code.
What I have done is purposely lower the service guardian's timeout, to about 25 seconds, so that a certain EntryProcessor will always time-out. The behavior I see after the timeout is very similar to the behavior we see when the problem crops up in the real world. I know that I can raise the timeout or use the PriorityTask API, but even if I do that, we may run into timeouts due to deadlock. If that happens, I want to know what I can expect the grid to do.
Here's what I'm seeing:
I am running 2 storage-enabled members with backup-count=0. I am running my "client" as a non-storage-enabled member.
First, the client kicks off the EntryProcessor:
Map<Object, Integer> readProcessorResult = requestCache.invokeAll(AlwaysFilter.INSTANCE, processor);
The two members begin logging successful executions of the EntryProcessor. Then, one of the members happens to run it against an entry that takes longer than the service guardian's timeout. We get a soft and hard timeout:
[ERROR] Coherence: 2012-01-26 10:56:20.103/333.297Oracle Coherence GE 3.7.1.2 <Error> (thread=Cluster, member=1): Detected soft timeout) of {WrapperGuardable Guard{Daemon=DistributedCache:jdw-read-request-service} Service=PartitionedCache{Name=jdw-read-request-service, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=0, AssignedPartitions=128}}
[WARN ] Coherence: 2012-01-26 10:56:20.121/333.315Oracle Coherence GE 3.7.1.2 <Warning> (thread=Recovery Thread, member=1): Attempting recovery of Guard{Daemon=DistributedCache:jdw-read-request-service}
[ERROR] Coherence: 2012-01-26 10:56:22.604/335.798Oracle Coherence GE 3.7.1.2 <Error> (thread=Cluster, member=1): Detected hard timeout) of {WrapperGuardable Guard{Daemon=DistributedCache:jdw-read-request-service} Service=PartitionedCache{Name=jdw-read-request-service, State=(SERVICE_STARTED), LocalStorage=enabled, PartitionCount=257, BackupCount=0, AssignedPartitions=128}}
Now, for some reason (why?), Coherence determines that the service is unrecoverable and it must stop the cluster:
[WARN ] Coherence: 2012-01-26 10:56:22.605/335.799Oracle Coherence GE 3.7.1.2 <Warning> (thread=Termination Thread, member=1): Terminating Guard{Daemon=DistributedCache:jdw-read-request-service}
Coherence <Error>: Halting this cluster node due to unrecoverable service failure
[ERROR] Coherence: 2012-01-26 10:56:23.613/336.807Oracle Coherence GE 3.7.1.2 <Error> (thread=Cluster, member=1): StopRunning ClusterService{Name=Cluster, State=(SERVICE_STARTED, STATE_JOINED), Id=0, Version=3.7.1, OldestMemberId=1} due to unhandled exception:
[ERROR] Coherence: 2012-01-26 10:56:23.613/336.807Oracle Coherence GE 3.7.1.2 <Error> (thread=PacketListener1P, member=1): Stopping cluster due to unhandled exception: com.tangosol.net.messaging.ConnectionException: UdpSocket.receive: unable to reopen socket; State=STATE_CLOSED
at com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:58)
at com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:1)
at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:20)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
at java.lang.Thread.run(Unknown Source)
[ERROR] Coherence: 2012-01-26 10:56:23.641/336.835Oracle Coherence GE 3.7.1.2 <Error> (thread=Cluster, member=n/a):
java.lang.NullPointerException: null
at com.tangosol.coherence.component.net.Cluster$ClusterService$TcpRing.onAcceptException(Cluster.CDB:13) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.net.TcpRing.onAccept(TcpRing.CDB:25) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.net.TcpRing.onSelect(TcpRing.CDB:27) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.net.TcpRing.select(TcpRing.CDB:14) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onWait(ClusterService.CDB:6) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:39) ~[coherence-3.7.1.2.jar:3.7.1.2]
at java.lang.Thread.run(Unknown Source) [na:1.6.0_30]
[WARN ] Coherence: 2012-01-26 10:56:24.976/338.170Oracle Coherence GE 3.7.1.2 <Warning> (thread=Invocation:jdw-invocation-service, member=n/a): failed to stop 95 worker threads; abandoning
Coherence <Error>: Halted the cluster:
Cluster is not running: State=5
Exception in thread "Cluster|SERVICE_STOPPED|Member(Id=1, Timestamp=2012-01-26 10:50:58.044, Address=192.168.1.67:9001, MachineId=52295, Location=site:,machine:DEN12956L,process:10012, Role=CoherenceServer)" java.nio.channels.ClosedSelectorException
at sun.nio.ch.SelectorImpl.keys(Unknown Source)
at com.tangosol.coherence.component.net.TcpRing.disconnectAll(TcpRing.CDB:6)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService$TcpRing.onLeft(ClusterService.CDB:4)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onStopRunning(ClusterService.CDB:7)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onException(ClusterService.CDB:28)
at com.tangosol.coherence.component.net.Cluster$ClusterService.onException(Cluster.CDB:7)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:85)
[ERROR] Coherence: 2012-01-26 10:56:25.616/338.810 Oracle Coherence GE 3.7.1.2 <Error> (thread=Cluster, member=n/a): StopRunning ClusterService{Name=Cluster, State=(SERVICE_STOPPED, STATE_JOINED), Id=0, Version=3.7.1} due to unhandled exception:
at java.lang.Thread.run(Unknown Source)
[ERROR] Coherence: 2012-01-26 10:56:25.616/338.810Oracle Coherence GE 3.7.1.2 <Error> (thread=Cluster, member=n/a):
java.nio.channels.ClosedSelectorException: null
at sun.nio.ch.SelectorImpl.keys(Unknown Source) ~[na:1.6.0_30]
at com.tangosol.coherence.component.net.TcpRing.close(TcpRing.CDB:11) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.ClusterService.onExit(ClusterService.CDB:1) ~[coherence-3.7.1.2.jar:3.7.1.2]
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:68) ~[coherence-3.7.1.2.jar:3.7.1.2]
at java.lang.Thread.run(Unknown Source) [na:1.6.0_30]
Around the same time that member 1 decided to stop its cluster, member 2, which had been happily executing entry processors, begins taking responsibility for member 1's partitions. That's what I'd expect.
[WARN ] Coherence: 2012-01-26 10:56:23.643/336.784Oracle Coherence GE 3.7.1.2 <Warning> (thread=DistributedCache:sys-id-dist-service, member=2): Assigned 128 orphaned primary partitions
[WARN ] Coherence: 2012-01-26 10:56:23.646/336.787Oracle Coherence GE 3.7.1.2 <Warning> thread=DistributedCache:sourceMetadataReviewCache-service, member=2): Assigned 128 orphaned primary partitions
......
Member 1 now restarts its cluster, re-joins as member 4, and starts restarting its services:
[INFO ] Coherence: 2012-01-26 10:56:26.008/339.202Oracle Coherence GE 3.7.1.2 <Info> (thread=main, member=n/a): Restarting cluster
[INFO ] Coherence: 2012-01-26 10:56:26.327/339.521Oracle Coherence GE 3.7.1.2 <Info> (thread=Cluster, member=n/a): This Member(Id=4, Timestamp=2012-01-26 10:56:26.126, Address=192.168.1.67:9001, MachineId=52295, Location=site:,machine:DEN12956L,process:10012, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) joined cluster "NIR_GRID_DEV" with senior Member(Id=2, Timestamp=2012-01-26 10:51:03.593, Address=192.168.1.67:9003, MachineId=52295, Location=site:,machine:DEN12956L,process:10024, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2)
[INFO ] Coherence: 2012-01-26 10:56:26.364/339.558Oracle Coherence GE 3.7.1.2 <Info> (thread=main, member=n/a): Started cluster Name=NIR_GRID_DEV
Group{Address=224.3.7.0, Port=60000, TTL=0}
MasterMemberSet(
ThisMember=Member(Id=4, Timestamp=2012-01-26 10:56:26.126, Address=192.168.1.67:9001, MachineId=52295, Location=site:,machine:DEN12956L,process:10012, Role=CoherenceServer)
OldestMember=Member(Id=2, Timestamp=2012-01-26 10:51:03.593, Address=192.168.1.67:9003, MachineId=52295, Location=site:,machine:DEN12956L,process:10024, Role=CoherenceServer)
ActualMemberSet=MemberSet(Size=3
Member(Id=2, Timestamp=2012-01-26 10:51:03.593, Address=192.168.1.67:9003, MachineId=52295, Location=site:,machine:DEN12956L,process:10024, Role=CoherenceServer)
Member(Id=3, Timestamp=2012-01-26 10:55:05.522, Address=192.168.1.67:9005, MachineId=52295, Location=site:,machine:DEN12956L,process:13268, Role=IntellijRtExecutionAppMain)
Member(Id=4, Timestamp=2012-01-26 10:56:26.126, Address=192.168.1.67:9001, MachineId=52295, Location=site:,machine:DEN12956L,process:10012, Role=CoherenceServer)
MemberId|ServiceVersion|ServiceJoined|MemberState
2|3.7.1|2012-01-26 10:51:03.593|JOINED,
3|3.7.1|2012-01-26 10:55:05.522|JOINED,
4|3.7.1|2012-01-26 10:56:26.337|JOINED
RecycleMillis=1200000
RecycleSet=MemberSet(Size=0
TcpRing{Connections=[2, 3]}
IpMonitor{AddressListSize=0}
[INFO ] Coherence: 2012-01-26 10:56:26.365/339.559Oracle Coherence GE 3.7.1.2 <Info> (thread=main, member=4): Restarting Service:Management
[INFO ] Coherence: 2012-01-26 10:56:26.417/339.611Oracle Coherence GE 3.7.1.2 <Info> (thread=main, member=4): Restarting Service:jdwSourceManagerCache-service
......
While that is happening, member 2 now also encounters a hard timeout on the same service and begins going through the same process that member 1 just went through. I am not sure why member 2 encounters this timeout. Perhaps it is because it ran another entry processor that took too long? It is difficult to tell. Except this time we start having problems when member 2 tries to stop the cluster, and I'm not sure exactly why. Perhaps because member 4 is not yet in a good state?
[ERROR] Coherence: 2012-01-26 10:56:35.282/348.423Oracle Coherence GE 3.7.1.2 <Error> (thread=PacketListener1, member=2): Stopping cluster due to unhandled exception:
com.tangosol.net.messaging.ConnectionException: UdpSocket.receive: unable to reopen socket; State=STATE_CLOSED
at com.tangosol.coherence.component.net.socket.UdpSocket.receive(UdpSocket.CDB:58)
at com.tangosol.coherence.component.net.UdpPacket.receive(UdpPacket.CDB:1)
at com.tangosol.coherence.component.util.daemon.queueProcessor.packetProcessor.PacketListener.onNotify(PacketListener.CDB:20)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
at java.lang.Thread.run(Unknown Source)
Coherence <Error>: Halted the cluster:
Cluster is not running: State=5
[ERROR] Coherence: 2012-01-26 10:56:42.147/355.288Oracle Coherence GE 3.7.1.2 <Error> (thread=main, member=n/a): Failed to restart services: com.tangosol.net.RequestTimeoutException: Timeout while waiting for cluster to stop.
[ERROR] Coherence: 2012-01-26 10:56:51.324/364.465Oracle Coherence GE 3.7.1.2 <Error> (thread=main, member=n/a): Failed to restart services: com.tangosol.net.RequestTimeoutException: Timeout while waiting for cluster to stop.
...... (error repeats)
And the newly re-joined member 4, which had been restarting services and claiming partitions from the newly-dead member 2, appears to restart execution of the entry processor but then complains that its thread was interrupted. I am not sure why this happens or what it means.
[INFO ] Coherence: 2012-01-26 10:56:52.685/365.879Oracle Coherence GE 3.7.1.2 <Info> (thread=DistributedCache:jdw-read-request-service, member=4): Restarting NamedCache: coherence.common.sequencegenerators
[ERROR] Coherence: 2012-01-26 10:56:52.686/365.880Oracle Coherence GE 3.7.1.2 <Error> (thread=DistributedCache:jdw-read-request-service, member=4): This thread was interrupted while waiting for the results of a request:
Poll
PollId=2, active
InitTimeMillis=1327597012685
Service=DistributedCacheForSequenceGenerators (60)
RespondedMemberSet=[]
LeftMemberSet=[]
RemainingMemberSet=[4]
},
Meanwhile, my client, which ran the invoke() with the EntryProcessor, has received a message stating "no storage enabled members exist," which, although it may be true, doesn't seem like it's really the cause of the problem.
Could I be having problems because all my cluster members are executing the same kind of long-running EntryProcessor and all of them are getting service guardian timeouts at around the same time? How do I deal with that? For example, what if a database is running very slowly one day and my entry processors all start to execute slowly? I don't want my whole grid having problems. Should I, for example, limit the number of concurrent EntryProcessors to < count of grid members by using a LimitFilter?
What does it mean that there was a timeout waiting for the cluster to stop?
And what about the client? If I get an exception after running a grid command, is it valid to wait a few seconds for things to stabilize and then try again? What's my best bet as far as that goes?
Thanks! And let me know if I should file a support request instead.
rehevkor5 wrote:
I appreciate the responses so far, but unfortunately they give me no new information.
I am aware that I can try to avoid timeouts by using the PriorityTask API and heartbeats, but that still gives me no guarantee that I will not run into a deadlock situation and encounter a timeout. That is fine as long as my grid can deal with it. Currently, it does not behave well when that happens (all members eventually die). I challenge anyone reading this post to try running an entry processor on their grid that is designed to time out (not by using sleep() as that is interruptable via a soft timeout). All your members will probably encounter the timeout at the same time, and it probably will not end well. If, however, you can handle this situation or have some approach that mitigates it, I would love to hear from you!
I am also aware that I can disable the service guardians, or configure them to only log and take no action. However, this leaves me vulnerable to deadlocks.
Therefore, I would still appreciate any responses that address either of these two issues:
1. Configuration changes or other fixes that allow my members to recover successfully from a service timeout
2. Operational best practices that allow my grid to continue running even if an entry processor that is running on every member of the grid times out at the same time. For example, limiting the number of concurrent entry processors.
One operational best practice is to design your system so it can't get into a deadlock. If it can get into a deadlock once, it can get into it many times. A system that does that is not a well-behaved or well-designed system.
If your code can get into a deadlock across the cluster, that usually means you are trying to do something you should not, such as operating on multiple resources in an arbitrary order while holding locks on them (e.g. violating the threading guidelines). That is the typical cause of a distributed deadlock, and you can guard against it by performing your multiple locking operations in a consistent order. You should not expect Coherence to retry and possibly get into the same problem; all you would achieve is converting the deadlock into a livelock.
Coherence is not supposed to guard against your errors. With the guardian it gives you a way to find out what your error is, but it should not attempt the impossible by trying to correct it. You should design your system so that it does not have these kinds of errors.
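To make that role concrete, here is a tiny plain-Java analogue of what a guardian does (this is only an illustrative sketch with made-up names, not the Coherence GuardSupport API): the guarded code posts heartbeats, and a monitor merely detects that a thread has gone quiet for too long. Notice that detection is all it can do; it cannot untangle the deadlock itself, which is exactly the point above.

```java
import java.util.concurrent.atomic.AtomicLong;

public class GuardianSketch {
    // The worker bumps this timestamp; the guardian checks its staleness.
    static final AtomicLong lastHeartbeat = new AtomicLong(System.nanoTime());

    /** Called periodically by the guarded (worker) thread to prove liveness. */
    static void heartbeat() {
        lastHeartbeat.set(System.nanoTime());
    }

    /** True if the guarded thread has not checked in within timeoutMillis. */
    static boolean isStalled(long timeoutMillis) {
        long quietMillis = (System.nanoTime() - lastHeartbeat.get()) / 1_000_000;
        return quietMillis > timeoutMillis;
    }
}
```

A long-running entry processor that cooperates with its guardian would call the heartbeat regularly (Coherence exposes this idea through the PriorityTask/GuardSupport machinery mentioned above), so only a genuinely stuck thread trips the timeout.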
If you think there is a specific, reproducible problem in how Coherence reacts to your code which can bring down the system, then create a test case and submit it in an SR. The kinds of exceptions you get suggest that may be necessary. If it is indeed Coherence's fault that it dies, and not something you did, then I am fairly certain it will be given quite high priority, considering it is an error that can bring a cluster down.
Coherence server-side constructs have an API contract. The foremost aspect of that contract is that operations complete without an error (unless otherwise documented) within a small period of time. If code goes against this, it has to provide feedback or customization with PriorityTask/GuardSupport. Code that can deadlock violates this contract. After all, we are talking about a grid, not a distributed process manager.
I agree that the grid should not die due to a deadlock, but you should not expect Coherence to fix a deadlock either. Dying due to an exception and having a distributed deadlock are two different and independent issues.
Best regards,
Robert
Edited by: robvarga on Feb 3, 2012 10:12 AM -
Two internal tables to one common internal table
hi
i have two internal tables
itab1 having fields
f1
f2
f3
f4
and the second internal table has fields
HERE F1 IS THE COMMON FIELD BETWEEN THE TWO
f1
f5
f6
f7
f8
f9
f10
i want the data of both the internal tables in one final table having
F1
F2
F3
F4
F5
F6
F7
F8
F9
F10
IE THE COMBINED DATA OF BOTH, IE I AM ADDING THE COLUMNS OF INTERNAL TABLE 1 AND INTERNAL TABLE 2 TO THE FINAL OUTPUT TABLE
PLEASE SUGGEST?
REGARDS
ARORA

hi RAAM
the situation is like this
here are the internal tables with their data and rows
IT_APC Table[58x64]
it_dep Table[14x48]
it_Dep_Apc Table[1x108]
now i am combining the data of it_apc and it_dep into it_dep_apc
both have only one common field bukrs
with rows and columns as displayed above
now if i do as below, as suggested by you
LOOP AT IT_APC INTO WA_APC.
READ TABLE IT_DEP INTO WA_DEP WITH KEY BUKRS = WA_APC-BUKRS
BINARY SEARCH.
then not all the data is combined, i suppose, as i need the final table to have all the data from both the internal tables....
ie common rows plus extra rows for bukrs
regards
Arora -
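Outside ABAP, the column-combining merge discussed in this thread is just a keyed join of two row lists: index the second table by the common field once, then extend each row of the first table. A rough sketch in Java for illustration (the field names f1..f10 mirror the post; the map-based row type is an assumption):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeTables {

    // Rows of itab1 carry the key field f1 plus f2..f4; rows of itab2 carry
    // f1 plus f5..f10. Both are modeled as maps from field name to value.
    // The result keeps every row of itab1 and, where f1 matches, adds the
    // columns from itab2 -- the equivalent of LOOP AT itab1 + READ TABLE
    // itab2 WITH KEY f1, but with a hash index instead of a per-row search.
    static List<Map<String, String>> merge(List<Map<String, String>> itab1,
                                           List<Map<String, String>> itab2) {
        // Index itab2 by f1 once.
        Map<String, Map<String, String>> byKey = new HashMap<>();
        for (Map<String, String> row : itab2) {
            byKey.put(row.get("f1"), row);
        }

        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> row : itab1) {
            Map<String, String> combined = new LinkedHashMap<>(row);
            Map<String, String> match = byKey.get(row.get("f1"));
            if (match != null) {
                combined.putAll(match); // adds f5..f10 alongside f1..f4
            }
            result.add(combined);
        }
        return result;
    }
}
```

To also keep the "extra rows for bukrs" (rows of the second table that have no partner in the first), append the unmatched entries of the index afterwards; that turns the left join above into a full outer join.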
Hi,
I have been facing a problem for the past two days. We have a two node RAC and the DB was working fine until Monday.
Suddenly we came across a lot of deadlocks involving two tables, and there were repeated ORA-00060 errors in the alert log. When we analyzed the trace we got the tables and the query. We are not sure about the cause of the deadlock, because all the columns involved in the query are properly indexed and all foreign keys are also indexed.
When we try the same scenario locally it works fine, but when it gets simultaneous hits, say 10 users, the query hangs. This was working fine last week.
Below is the observation in ADDM:
FINDING 1: 100% impact (31499 seconds)
SQL statements were found waiting for row lock waits.
RECOMMENDATION 1: Application Analysis, 96% benefit (30314 seconds)
ACTION: Significant row contention was detected in the TABLE
"NIC_GS.T_INSURED_LIST" with object id 270988. Trace the cause of row
contention in the application logic using the given blocked SQL.
RELEVANT OBJECT: database object with id 270988
RATIONALE: The SQL statement with SQL_ID "41k7uj9f36tv0" was blocked on
row locks.
RELEVANT OBJECT: SQL statement with SQL_ID 41k7uj9f36tv0
update T_INSURED_LIST set UPDATE_TIME=:1, POLICY_ID=:2,
SUM_INSURED=:3, INSERT_TIME=:4, FIELD12=:5, FIELD13=:6 where
INSURED_ID=:7
RATIONALE: The SQL statement with SQL_ID "9dg72r8w5nap6" was blocked on
row locks.
RELEVANT OBJECT: SQL statement with SQL_ID 9dg72r8w5nap6
update T_INSURED_LIST set UPDATE_TIME=:1, POLICY_ID=:2,
INSERT_TIME=:3, FIELD12=:4, FIELD13=:5 where INSURED_ID=:6
RATIONALE: The SQL statement with SQL_ID "d6n84rch33cbq" was blocked on
row locks.
RELEVANT OBJECT: SQL statement with SQL_ID d6n84rch33cbq
update T_INSURED_LIST set sum_insured=:1 where insured_id=:2
FINDING 2: 98% impact (31024 seconds)
SQL statements consuming significant database time were found.
RECOMMENDATION 1: SQL Tuning, 53% benefit (16830 seconds)
ACTION: Investigate the SQL statement with SQL_ID "41k7uj9f36tv0" for
possible performance improvements.
RELEVANT OBJECT: SQL statement with SQL_ID 41k7uj9f36tv0 and
PLAN_HASH 610793629
update T_INSURED_LIST set UPDATE_TIME=:1, POLICY_ID=:2,
SUM_INSURED=:3, INSERT_TIME=:4, FIELD12=:5, FIELD13=:6 where
INSURED_ID=:7
RATIONALE: SQL statement with SQL_ID "41k7uj9f36tv0" was executed 63
times and had an average elapsed time of 267 seconds.
RATIONALE: Waiting for event "enq: TX - row lock contention" in wait
class "Application" accounted for 99% of the database time spent in
processing the SQL statement with SQL_ID "41k7uj9f36tv0".
Please suggest....

Hi,
Thanks for your response..
We have AIX server with oracle 10.2.0.4.
We have narrowed down the issue : The table contains multiple insurance products and the problem is not occurring all the products but in only one product.
I have been attaching the full trace of Deadlock.
----------enqueue 0x7000002cf73bff0------------------------
lock version : 37
Owner node : 0
grant_level : KJUSERNL
req_level : KJUSEREX
bast_level : KJUSERNL
notify_func : 0
resp : 7000002aa711890
procp : 7000002ca743c90
pid : 4460708
proc version : 0
oprocp : 0
opid : 0
group lock owner : 7000002c84f0398
possible pid : 4460708
xid : 101E-01EE-00000002
dd_time : 2.0 secs
dd_count : 0
timeout : 0.0 secs
On_timer_q? : N
On_dd_q? : Y
lock_state : OPENING CONVERTING
Open Options : KJUSERDEADLOCK
Convert options : KJUSERGETVALUE
History : 0x1495149a
Msg_Seq : 0x0
res_seq : 1
valblk : 0x00000000000000000000000000000000 .
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[494.4460708] on resource TX-0087002D-000012DF
Submitting asynchronized dump request [28]
*** 2012-08-22 15:51:13.021
kjddopr: skip converting lock 7000002c8a214d0 dd_cnt 1
user session for deadlock lock 7000002cf73c698
pid=242 serial=2 audsid=323336032 user: 103/Samp_GS
O/S info: user: Samp, term: , ospid: 1234, machine: Samp-PEBAAP1A-AR
program:
Current SQL Statement:
update T_POLICY_CT set UPDATE_TIME=:1, INSERT_TIME=:2, FIELD01=:3, FIELD02=:4, FIELD03=:5, FIELD04=:6, FIELD07=:7, FIELD66=:8 where POLICY_CT_ID=:9
user session for deadlock lock 7000002cf73c548
pid=244 serial=2 audsid=323336034 user: 103/Samp_GS
O/S info: user: Samp, term: , ospid: 1234, machine: Samp-PEBAAP1A-AR
program:
Current SQL Statement:
update T_INSURED_LIST set sum_insured=:1 where insured_id=:2
user session for deadlock lock 7000002c8adf120
pid=244 serial=2 audsid=323336034 user: 103/Samp_GS
O/S info: user: Samp, term: , ospid: 1234, machine: Samp-PEBAAP1A-AR
program:
Current SQL Statement:
update T_INSURED_LIST set sum_insured=:1 where insured_id=:2
user session for deadlock lock 7000002c8a21380
pid=242 serial=2 audsid=323336032 user: 103/Samp_GS
O/S info: user: Samp, term: , ospid: 1234, machine: Samp-PEBAAP1A-AR
program:
Current SQL Statement:
update T_POLICY_CT set UPDATE_TIME=:1, INSERT_TIME=:2, FIELD01=:3, FIELD02=:4, FIELD03=:5, FIELD04=:6, FIELD07=:7, FIELD66=:8 where POLICY_CT_ID=:9
Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x2d0025][0x4ff2],[TX]
----------resource 0x7000002aa711890----------------------
resname : [0x2d0025][0x4ff2],[TX]
Local node : 0
dir_node : 0
master_node : 0
hv idx : 118
hv last r.inc : 4
current inc : 4
hv status : 0
hv master : 1
open options : dd
grant_bits : KJUSERNL KJUSEREX
grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX
count : 1 0 0 0 0 1
val_state : KJUSERVS_NOVALUE
valblk : 0x00000000000000000000000000000000 .
access_node : 0
vbreq_state : 0
state : x0
resp : 7000002aa711890
On Scan_q? : N
Total accesses: 43
Imm. accesses: 39
Granted_locks : 1
Cvting_locks : 1
value_block: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 7000002cf73c548 gl KJUSEREX rp 7000002aa711890 [0x2d0025][0x4ff2],[TX]
master 0 gl owner 7000002cf501998 possible pid 2551964 xid 100F-00F4-00000002 bast 0 rseq 5 mseq 0 history 0x14951495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 7000002cf73c698 gl KJUSERNL rl KJUSEREX rp 7000002aa711890 [0x2d0025][0x4ff2],[TX]
master 0 gl owner 7000002c8544830 possible pid 2588702 xid 100F-00F2-00000002 bast 0 rseq 5 mseq 0 history 0x1495149a
convert opt KJUSERGETVALUE
----------enqueue 0x7000002cf73c548------------------------
lock version : 109
Owner node : 0
grant_level : KJUSEREX
req_level : KJUSEREX
bast_level : KJUSERNL
notify_func : 0
resp : 7000002aa711890
procp : 7000002ca6f5870
pid : 2588702
proc version : 0
oprocp : 0
opid : 0
group lock owner : 7000002cf501998
possible pid : 2551964
xid : 100F-00F4-00000002
dd_time : 0.0 secs
dd_count : 0
timeout : 0.0 secs
On_timer_q? : N
On_dd_q? : N
lock_state : GRANTED
Open Options : KJUSERDEADLOCK
Convert options : KJUSERNOQUEUE
History : 0x14951495
Msg_Seq : 0x0
res_seq : 5
valblk : 0x00000000000000000000000000000000 .
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[244.2551964] on resource TX-002D0025-00004FF2
Submitting asynchronized dump request [28]
Please suggest.... -
Question about using Db cursors safely outside a transaction
I have a question re: proper use of the Dbc (Db cursor) to avoid
self-deadlock when two cursors are opened by the same thread and used only
for reading. (This is when open in full transactional mode, with an
environment and all.)
My first question is, should I bother? Or is it equivalent to always start a
transaction, even before read operations? My reading of the doc says it is
not equivalent. The doc says "in the presence of transactions all locks are
held until commit". I read it as saying that reading with a cursor outside a
transaction provides degree 2 isolation (cursor stability), with greater
concurrency than reading inside a transaction (which provides degree 3
isolation). What I'm attempting here is to avoid degree 3 isolation, and
locks held until commit, for cursor reads -- I want the greater concurrency.
This is just a technique to do reading outside a transaction, all writing
will be done within transactions, and all reading will be done with a txnid
too, whenever a transaction is in progress. Am I OK so far?
Here's how I imagine getting away with it. Tell me if this works please :-)
and (of course) if there's a better way. Suppose DB_THREAD was specified,
but to avoid the dreaded self-deadlock when using multiple cursors outside a
transaction, all such cursors are used only to read and are dup'ed from a
single cursor (one kept as a prototype in each thread for that purpose,
solely to ensure a single locker ID for all these read cursors; the
prototype cursor is never used to lock anything). As I read the doc, that
prevents the thread from self-deadlocking, provided it doesn't start any
transactions while such read cursors are extant (other than the prototype
cursor, which shouldn't matter because it never holds any locks). I
understand that if the same thread did start a transaction, it could
self-deadlock with such a read cursor, but I won't be doing that.
My next question is, am I still OK if multiple Db's are placed in one file?
The doc says in that case lock contention can only occur for page
allocations -- seeming to imply that competition among read-only cursors
won't ever cause it -- though they could be contending with other threads
that are writing -- could thread self-deadlock ever occur here? I'm not
worried about real deadlock with the other (writer) thread, just thread
self-deadlock between my two read cursors.
Many thanks if you could confirm my reasoning (or dash my hopes) before I
find out the long, hard, nondeterministic way if it's any good.

Good, so I've overdesigned my solution -- I can dispense with dup()ing a prototype cursor and just use multiple read-only cursors made by Db::cursor with a NULL txnid.
Many thanks for your prompt response.
I am David Christie at samizdat.org BTW -- we exchanged email last year re: a potential C++ std container interface for Berkeley DB -- sometime before the merger. I hope it's all worked out well for your sleepycat guys. It's nice to see some of you are still around and responding to the open source community. -
If Two Users try to schedule a report at the same time does that lead to a Deadlock?
When two users access the same server and need to schedule the same report at the same time, does that lead to a deadlock situation?
What are the odds in such scenarios?
Please help me understanding.
Regards,
Shiva

When two users are, say, trying to schedule the same report at the same time - the relevant Job Server has a default capacity of running 5 concurrent jobs at a time (this value can be increased) - that means it can run 5 schedules at the same time.
Let's say the two users have now scheduled 6 reports - the first 5 schedules will be in the "running" state, but since the server has only 5 concurrent job slots, the 6th schedule will stay "pending" until one of the existing jobs completes, and then the 6th report will start "running".
There is nothing called deadlock while scheduling. While viewing the reports, there are enough resources available in the BO system to make the report available for various users at the same time.
Thank you, Rahul -
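The capacity behaviour Rahul describes (five schedules running, the sixth pending until a slot frees) can be modeled with a counting semaphore. This is only an illustration of the queueing logic; the permit count of 5 mirrors the job server's default, and everything else is made up:

```java
import java.util.concurrent.Semaphore;

public class JobServerModel {

    private static final int CONCURRENT_JOBS = 5; // default job-server capacity
    private final Semaphore slots = new Semaphore(CONCURRENT_JOBS, true);

    // Returns true if the schedule can start "running" immediately,
    // false if it would sit in "pending" until a slot frees up.
    public boolean tryStart() {
        return slots.tryAcquire();
    }

    // Called when a running schedule completes, freeing its slot.
    public void finish() {
        slots.release();
    }

    public static void main(String[] args) {
        JobServerModel server = new JobServerModel();
        for (int i = 1; i <= 5; i++) {
            System.out.println("schedule " + i + " running: " + server.tryStart());
        }
        // The 6th schedule cannot start yet...
        System.out.println("schedule 6 running: " + server.tryStart());
        server.finish(); // ...until one of the first five completes.
        System.out.println("schedule 6 retry running: " + server.tryStart());
    }
}
```

The key point from the reply holds here too: this is queueing, not deadlock, because every running job eventually releases its slot and nothing waits on a resource held by a waiter.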
Deadlock when executing package simultaneously from two different sessions
I have written a package which will do the below tasks .
package A
Delete data which was older than one year from master table A and then populate data from stage table A to master table A .
truncate summary table and populate the summary table from the master table .
The package is executed from a Java application with the help of a Java scheduler. Sometimes package A is still executing when another instance of package A is scheduled, which creates a deadlock in Oracle. We cannot use the dbms_lock package in our application due to restrictions. I want to handle this situation on the DB side in such a way that session B waits until session A completes the execution of the package. Can someone please tell me how to handle this scenario in PL/SQL?
I thought of creating a function that returns the execution status of package A by reading a flag from a temporary table, so that the next run can be scheduled based on the function's return status. Is there any other way to pause the execution of package A in session B and resume it after session A has successfully completed?
create or replace pkg a
populate master ;
populate smry ;
populate app_tables ;
end pkg ;
create or replace pkg body
populate master()
delete from master where loaddate < sysdate -365;
loop
fetch from stage a
insert to master a
end loop
populate smry()
truncate sumary a ;
insert into smry
select values from master ;
populate app_tables()
populate master;
populate smry ;
end pkg body

I have a question about your requirements. I'm not questioning them, just trying to understand them. You wrote:
Delete data which was older than one year from master table A and then populate data from stage table A to master table A .
truncate summary table and populate the summary table from the master table.

If this is all there is to the requirement, why would a second invocation be scheduled so soon? If you delete all data older than one year, why would you need to do it again so soon?
Notwithstanding the above you basically need a serialization management process.
For all batch frameworks I have worked with we always include batch control and status tables to:
1. guarantee that batches are run in the proper order
2. allow for batch restart
3. allow for suspension or termination of single batch jobs or job streams
4. provide for reporting of batch statistics - batch run time, records processed, etc.
5. simplify the testing of complex batch streams. Tests can be performed on any one batch step or any combination of steps by enabling/disabling flags in the control tables.
6. eliminate the possibility of the problem you are reporting
Using one or more batch control and status tables, in my opiinion, is the simplest and best way to serialize the batch jobs you are dealing. Such tables gives you maximum flexibility while placing the fewest constraints on the system.
In the system I work with we try to have a clear line of demarcation between processes that control the work to be done and the processes that actually do the work.
The processes that do the work never determine what data they work with; they are parameterized so that they are told what data to use. All they know is how to process the data that they are told to process. This makes it easier to scale to add additional 'worker' processes by having the 'control' processes break up the data into different batches and running 'worker's in parallel.
I would suggest designing and implementing a control hierarchy to oversee the execution of the worker processes. This will be your serialization manager. -
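Since the original poster invokes the package from a Java scheduler, one lightweight guard (in addition to the batch control tables described above) is to make the scheduler itself skip a run while the previous one is still executing. A sketch; the actual PL/SQL call inside is a placeholder:

```java
import java.util.concurrent.locks.ReentrantLock;

public class NonOverlappingJob {

    private final ReentrantLock running = new ReentrantLock();

    // Called by the scheduler on every tick. If the previous invocation of
    // the PL/SQL package is still executing, this run is skipped instead of
    // piling a second session onto the same tables.
    public boolean runIfIdle(Runnable callPackageA) {
        if (!running.tryLock()) {
            return false; // previous run still in progress -- skipped
        }
        try {
            // e.g. a CallableStatement executing "BEGIN pkg_a.populate_app_tables; END;"
            callPackageA.run();
            return true;
        } finally {
            running.unlock();
        }
    }
}
```

Note this only serializes within one JVM. If several application nodes can fire the schedule, a DB-side guard is still needed, e.g. the batch control table from the reply above, read with SELECT ... FOR UPDATE, so that concurrent starters block on the control row rather than deadlock on the data.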
Error-Code: ora-4020 Deadlock error, while updating a record
Hello!
Is it possible to get a deadlock error when two or more users try to update the same record simultaneously? And what would be the best solution to circumvent this?
Thanks
Your help is appreciated.
Hello Vidhya,
Issue either Commit or Rollback without much delay once the statement(s) is/are executed.
vmkrish
[email protected]
RSBKCHECKBUFFER job scheduled twice and both canceling suddenly with two errors
Hello,
This job, RSBKCHECKBUFFER, is scheduled to run twice in the production system, both starting at 12:00 pm. It was running fine until 12.04.11, but both jobs have been failing since 12.05.11 with two different SQL errors at times:
DBIF_RSQL_SQL_ERROR
CX_SY_OPEN_SQL_DB
1. Database error text........: "SQL1224N The database manager is not able to
accept new requests, has terminated all requests in progress, or has
terminated the specified request because of an error or a forced interrupt.
SQLSTATE=55032 row=1"
2. Database error text........: "SQL0911N The current transaction has been rolled
back because of a deadlock or timeout. Reason code "2". SQLSTATE=40001 row=1"
Also, both jobs are scheduled to run with two different parameters
Parameters
&0000000000001
&0000000000000
Can anyone let me know what could be the reason for the sudden failure since 12.05.11?
Also, if I schedule the job to run only once in the system, will it solve the problem?

Hello Shradha,
1. Database error text........: "SQL1224N The database manager is not able to accept new requests, has terminated all requests in progress, or has terminated the specified request because of an error or a forced interrupt.
SQLSTATE=55032 row=1"
- As far as I know, this is a serious problem in a production environment. It looks like your DB was being restarted at that time. Please have your DBA team review db2diag.log for error details.
2. Database error text........: "SQL0911N The current transaction has been rolled back because of a deadlock or timeout. Reason code "2". SQLSTATE=40001 row=1" .
- It looks like the second job overlapped with the first one. Please run the job only once a day, or change the schedule of both jobs to ensure no overlap.
Thanks,
Siva Kumar -
Max Degree of Parallelism, Timeouts and Deadlocks
We have been trying to track down some timeout issues with our SQL Server. Our SQL Server runs an AccPac accounting system, an internal intranet site, and some SharePoint content DBs.
We know that some of the queries are complex and take some time, though the timeouts seem to happen randomly and not always on the same pages/queries. IIS and the SQL timeouts are set to 60 and 120 seconds.
Looking at some of the SQL Server settings, I noticed that MAXDOP was set to 1. Before doing extensive research on it and looking through the BOL, another DBA and I changed it to 0.
Server is a Hyper-V VM with:
Processors:4
NUMA nodes:2
Sockets: 2
Interesting thing happened. Our Timeouts seem to have disappeared. Going from several a day, we are now at 1 every few days. Though now the issue we are having is that our Deadlocks have gone through the roof. Going from one or two every few days, we are
up to 8+ a day!
We have been changing our Select statements to include WITH (NOLOCK) so they do not compete with the UPDATE statements they usually fall victim to. The Deadlocks and timeouts do not seem to be related to any of the SharePoint Content DBs. All of the deadlocks
are with our Intranet Site and When it communicates with the AccPac DB, or the Internet Site on its own.
Any Suggestions on where I should be focusing my energy on benchmarking and tuning the server?
Thank you,
Scott

Thank you all for your replies.
The server had 30GB of RAM and then we bumped it up to 40GB at the same time we changed the MAXDOP to 0.
It was set to 1 because if it isn't, MS won't support your SharePoint installation. This is from the setup guide for SharePoint on SQL Server; MAXDOP = 1 is a must in their book for official support. It always forces serial plans because, to be honest, the SharePoint queries are extremely terrible.
I understand this, though I would guess that the SharePoint install didn't actually set MAXDOP = 1 during installation? We basically have two SharePoint sites on the server. One has about 10 users and the other has maybe 20 users. The sites are not used very much either, so I didn't think there would be too much impact.
Though now the issue we are having is that our Deadlocks have gone through the roof.
You probably didn't get this before (though they probably still happened) because the executions were forced serially and you dodged many-a-bullet because of this artificial speed bump. Deadlocks are application based, pure and simple.
We typically do not alter the accounting system's DB contents directly; we are only peering into the database to present information to the user. We looked at READ_COMMITTED_SNAPSHOT, though since that is a database-level setting and not one on individual queries, we do not want to alter the accounting DB, as we cannot know the potential ramifications.
A typical deadlock occurs when the accounting system is creating or modifying an order's master record so no one else can modify it, though instead of a row lock, it locks the table. This is something that is out of our control. When we do a SELECT against the same table from the intranet site, we get a deadlock unless we use WITH (NOLOCK). The data that we get is not super critical. The only potential issue is that while the accounting system has an uncommitted transaction adding multiple rows to an order, our SELECT might miss a line item or two.
We have been changing our Select statements to include WITH (NOLOCK) so they do not compete with the UPDATE statements they usually fall victim to.
This really isn't going to get very far to be honest. Are you deadlocking on the same rows? It seems to be the order of operations taken by either the queries or the logic used in them. Without deadlock information there really is nothing to diagnose, but
that definitely sounds like the same resource being used in multiple places when it probably doesn't need to be.
This is one of the typical deadlocks that we get. Intranet Site is getting Totals for Orders while the accounting system is in the process of setting its internal record Lock on an Order.
<EVENT_INSTANCE>
<EventType>DEADLOCK_GRAPH</EventType>
<PostTime>2014-05-12T15:26:09.447</PostTime>
<SPID>23</SPID>
<TextData>
<deadlock-list>
<deadlock victim="process2f848b048">
<process-list>
<process id="process2f848b048" taskpriority="0" logused="0" waitresource="OBJECT: 12:644249400:0 " waittime="1295" ownerId="247639995" transactionname="SELECT" lasttranstarted="2014-05-12T15:26:08.150" XDES="0x69d1d3620" lockMode="IS" schedulerid="2" kpid="2856" status="suspended" spid="184" sbid="0" ecid="0" priority="0" trancount="0" lastbatchstarted="2014-05-12T15:26:08.150" lastbatchcompleted="2014-05-12T15:26:08.150" lastattention="2014-05-12T14:50:52.280" clientapp=".Net SqlClient Data Provider" hostname="VSVR-WWW-INT12" hostpid="15060" loginname="SFA" isolationlevel="read committed (2)" xactid="247639995" currentdb="7" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">
<executionStack>
<frame procname="SFA.dbo.SAGE_SO_order_total_no_history_credit" line="20" stmtstart="1190" stmtend="5542" sqlhandle="0x030007004c642a7a7c17d000e9a100000100000000000000">
SELECT SUM(t.price * t.qtyord) AS On_Order_Total
FROM PRODATA01..somast m
INNER JOIN PRODATA01..sotran t ON m.sono = t.sono
INNER JOIN SFA..item i ON i.our_part_number COLLATE DATABASE_DEFAULT = t.item COLLATE DATABASE_DEFAULT
INNER JOIN SFA..supplier s ON s.supplier_key = i.supplier_key
INNER JOIN SFA..customer c ON c.abbreviation COLLATE DATABASE_DEFAULT = m.custno COLLATE DATABASE_DEFAULT
INNER JOIN SFA..sales_order_ownership soo ON soo.so_id_col = m.id_col
LEFT JOIN PRODATA01..potran p ON p.id_col = t.po_id_col
LEFT JOIN SFA..alloc_inv a ON a.sono COLLATE DATABASE_DEFAULT = t.sono COLLATE DATABASE_DEFAULT AND a.tranlineno = t.tranlineno
WHERE c.is_visible = 1 AND m.sostat NOT IN ('V','X') AND m.sotype IN ('C','','O')
AND t.sostat NOT IN ('V','X') AND t.sotype IN ('C','','O')
--AND t.rqdate BETWEEN @start_ordate AND @end_ordate
AND UPPER(LEFT(t.item,4)) <> 'SHIP' AND t.item NOT LIKE '[_]%'
AND ((SUBSTRING(m.ornum,2,1) = 'A' AND p.expdate <= @end_ordate) OR (t.rqdate <= @en </frame>
</executionStack>
<inputbuf>
Proc [Database Id = 7 Object Id = 2049598540] </inputbuf>
</process>
<process id="process51df0c2c8" taskpriority="0" logused="28364" waitresource="OBJECT: 12:1369823992:0 " waittime="1032" ownerId="247639856" transactionname="user_transaction" lasttranstarted="2014-05-12T15:26:07.940" XDES="0xf8b5b620" lockMode="X" schedulerid="1" kpid="7640" status="suspended" spid="292" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2014-05-12T15:26:08.410" lastbatchcompleted="2014-05-12T15:26:08.410" clientapp="Sage Pro ERP version 7.5" hostname="VSVR-DESKTOP" hostpid="15892" loginname="AISSCH" isolationlevel="read uncommitted (1)" xactid="247639856" currentdb="12" lockTimeout="4294967295" clientoption1="536870944" clientoption2="128056">
<executionStack>
<frame procname="adhoc" line="1" stmtstart="22" sqlhandle="0x02000000304ac5350b86da8b9422b389413bf23015ac25d0">
UPDATE PRODATA01..SOTRAN WITH (TABLOCK HOLDLOCK) SET lckuser = lckuser WHERE id_col = @P1 </frame>
<frame procname="unknown" line="1" sqlhandle="0x000000000000000000000000000000000000000000000000">
unknown </frame>
</executionStack>
<inputbuf>
(@P1 float)UPDATE PRODATA01..SOTRAN WITH (TABLOCK HOLDLOCK) SET lckuser = lckuser WHERE id_col = @P1 </inputbuf>
</process>
</process-list>
<resource-list>
<objectlock lockPartition="0" objid="644249400" subresource="FULL" dbid="12" objectname="PRODATA01.dbo.somast" id="lock18b5e3900" mode="X" associatedObjectId="644249400">
<owner-list>
<owner id="process51df0c2c8" mode="X" />
</owner-list>
<waiter-list>
<waiter id="process2f848b048" mode="IS" requestType="wait" />
</waiter-list>
</objectlock>
<objectlock lockPartition="0" objid="1369823992" subresource="FULL" dbid="12" objectname="PRODATA01.dbo.sotran" id="lock2ce1c4680" mode="SIX" associatedObjectId="1369823992">
<owner-list>
<owner id="process2f848b048" mode="IS" />
</owner-list>
<waiter-list>
<waiter id="process51df0c2c8" mode="X" requestType="convert" />
</waiter-list>
</objectlock>
</resource-list>
</deadlock>
</deadlock-list>
</TextData>
<TransactionID />
<LoginName>sa</LoginName>
<StartTime>2014-05-12T15:26:09.447</StartTime>
<ServerName>VSVR-SQL</ServerName>
<LoginSid>AQ==</LoginSid>
<EventSequence>2335848</EventSequence>
<IsSystem>1</IsSystem>
<SessionLoginName />
</EVENT_INSTANCE>
I'd (in parallel) look at why parallel plans are being chosen. Not that parallel plans are a bad thing, but is the cost of the execution so high that parallelism is chosen all of the time?
How can I determine the cost of different statements? The Current Cost threshold value is 5.
The last place I would set my effort is on the Dev team. Internal Intranet queries should not take 60 to 120 seconds. That's just asking for issues that you already have. If some larger functionality with that is needed, do it on the back end as
aggregation over a certain time period and use that new static data. Recompute as needed. This is especially true if your deadlocks are happening on these resources (chances are, it is).
We are working with the long queries, trying to break them up. We thought about back-end processing of the data so it is available when users need it, but some of the pages that take time are not accessed that often, so if we gathered the data every 10 minutes in the background, it would be computed far more times a day than it would be requested on demand.
Thank you all again! -
Multiple threads block each other, but it's not a deadlock!
I have recently upgraded to Berkeley 2.4.16/DB 4.6.21 on Linux 2.6.9-68.9.ELsmp #1 SMP 2008 x86_64 GNU/Linux
Right away I noticed that when I run two threads, one doing a putDocument and another doing a query, they get stuck as if they were deadlocked; however, running db_deadlock does not change the quagmire.
Here is the simplest class I could come up with that produces the problem I have been dealing with:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.Environment;
import com.sleepycat.db.EnvironmentConfig;
import com.sleepycat.db.ReplicationManagerAckPolicy;
import com.sleepycat.db.ReplicationTimeoutType;
import com.sleepycat.dbxml.XmlContainer;
import com.sleepycat.dbxml.XmlContainerConfig;
import com.sleepycat.dbxml.XmlDocumentConfig;
import com.sleepycat.dbxml.XmlManager;
import com.sleepycat.dbxml.XmlManagerConfig;
import com.sleepycat.dbxml.XmlQueryContext;
import com.sleepycat.dbxml.XmlResults;
import com.sleepycat.dbxml.XmlUpdateContext;
import com.sleepycat.dbxml.XmlValue;
public class TestEnvironment {

    private XmlManager i_xmlManager = null;
    private XmlContainer i_container = null;
    private XmlContainerConfig i_xmlContainerConfig = null;
    private XmlDocumentConfig i_docCfg = null;

    public TestEnvironment(File dataDir, File dbErr, File dbOut)
            throws DatabaseException, FileNotFoundException {
        final EnvironmentConfig cfg = new EnvironmentConfig();
        cfg.setErrorStream(new FileOutputStream(dbErr));
        cfg.setMessageStream(new FileOutputStream(dbOut));
        cfg.setAllowCreate(true);
        cfg.setInitializeLocking(true);
        cfg.setInitializeLogging(true);
        cfg.setInitializeCache(true);
        cfg.setTransactional(true);
        cfg.setRunRecovery(false);
        cfg.setTxnNoSync(true);
        cfg.setTxnNotDurable(false);
        cfg.setTxnTimeout(60000000L);
        cfg.setCacheSize(1073741824L);
        cfg.setThreaded(true);
        cfg.setInitializeReplication(false);
        cfg.setReplicationLimit(1048576L);
        cfg.setVerboseReplication(true);
        cfg.setReplicationManagerAckPolicy(ReplicationManagerAckPolicy.NONE);
        cfg.setMaxLockers(100000);
        cfg.setMaxLockObjects(100000);
        cfg.setMaxLocks(100000);
        cfg.setLockDown(false);
        cfg.setSystemMemory(false);
        cfg.setInitializeCDB(false);
        final Environment env = new Environment(dataDir, cfg);
        env.setReplicationTimeout(ReplicationTimeoutType.ACK_TIMEOUT, 100000);
        env.setReplicationTimeout(ReplicationTimeoutType.CONNECTION_RETRY, 100000);
        env.setReplicationTimeout(ReplicationTimeoutType.ELECTION_RETRY, 100000);
        env.setReplicationTimeout(ReplicationTimeoutType.ELECTION_TIMEOUT, 90000);
        final XmlManagerConfig mgrCfg = new XmlManagerConfig();
        mgrCfg.setAdoptEnvironment(true);
        mgrCfg.setAllowAutoOpen(true);
        mgrCfg.setAllowExternalAccess(false);
        XmlManager.setLogCategory(XmlManager.CATEGORY_ALL, true);
        XmlManager.setLogLevel(XmlManager.LEVEL_ALL, true);
        i_xmlManager = new XmlManager(env, mgrCfg);
        i_xmlManager.setDefaultContainerType(XmlContainer.NodeContainer);
        i_xmlContainerConfig = new XmlContainerConfig();
        i_xmlContainerConfig.setAllowValidation(false);
        i_xmlContainerConfig.setIndexNodes(true);
        i_xmlContainerConfig.setNodeContainer(true);
        i_xmlContainerConfig.setTransactional(true);
        if (i_xmlManager.existsContainer("container.dbxml") != 0) {
            i_container = i_xmlManager.openContainer("container.dbxml", i_xmlContainerConfig);
        } else {
            i_container = i_xmlManager.createContainer("container.dbxml", i_xmlContainerConfig);
        }
        i_docCfg = new XmlDocumentConfig();
        i_docCfg.setGenerateName(true);
        final TestEnvironment thisRef = this;
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                try {
                    thisRef.close();
                    System.out.println("Shutting down the TestEnvironment.");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    public void close() throws DatabaseException {
        if (i_container != null) {
            i_container.close();
            i_container = null;
        }
        if (i_xmlManager != null) {
            i_xmlManager.close();
            i_xmlManager = null;
        }
    }

    public void insert(String doc) throws DatabaseException {
        System.out.println('[' + Thread.currentThread().getName() +
            "] insert received document to be inserted");
        final long beforeT = System.currentTimeMillis();
        final XmlUpdateContext ctxt = i_xmlManager.createUpdateContext();
        i_container.putDocument(null, doc, ctxt, i_docCfg);
        final long afterT = System.currentTimeMillis();
        System.out.println('[' + Thread.currentThread().getName() +
            "] insert took " + (afterT - beforeT) + " ms. ");
    }

    public String[] query(String xquery) throws DatabaseException {
        System.out.println('[' + Thread.currentThread().getName() +
            "] query \"" + xquery + "\" received.");
        String[] retVal = {};
        final long beforeT = System.currentTimeMillis();
        XmlQueryContext qctxt = null;
        XmlResults rs = null;
        XmlValue nextValue = null;
        try {
            qctxt = i_xmlManager.createQueryContext();
            qctxt.setQueryTimeoutSeconds(10);
            rs = i_xmlManager.query(xquery, qctxt);
            if (rs != null) {
                retVal = new String[rs.size()];
                for (int i = 0; i < retVal.length && rs.hasNext(); i++) {
                    nextValue = rs.next();
                    retVal[i] = nextValue.asString();
                    nextValue.delete();
                    nextValue = null;
                }
            }
        } finally {
            if (nextValue != null) {
                nextValue.delete();
            }
            if (qctxt != null) {
                qctxt.delete();
            }
            if (rs != null) {
                rs.delete();
            }
        }
        final long afterT = System.currentTimeMillis();
        System.out.println('[' + Thread.currentThread().getName() +
            "] query \"" + xquery + "\" took " + (afterT - beforeT) + " ms. ");
        return retVal;
    }
}

If I call the insert and the query methods in parallel from two different threads, they will both get stuck: the former in com.sleepycat.dbxml.dbxml_javaJNI.XmlContainer_putDocument__SWIG_0(Native Method) and the latter in com.sleepycat.dbxml.dbxml_javaJNI.XmlManager_query__SWIG_2(Native Method). I would really appreciate help with this issue. I've looked through all compilation flags for db and dbxml, and all runtime configuration parameters, and I cannot figure out why this is not working. Thank you!
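One detail worth noting in the environment setup above is that nothing ever runs the lock-table deadlock detector, so when the two implicit (auto-commit) operations conflict, both simply block in the native call. Berkeley DB can instead break lock cycles automatically as they form. Below is a hedged configuration sketch, not a tested fix: it assumes the `com.sleepycat.db` Java API (`EnvironmentConfig.setLockDetectMode` and `setLockTimeout`, plus `com.sleepycat.db.LockDetectMode`), and only the locking-related settings are shown.

```java
// Sketch: enabling automatic deadlock detection on the environment, so a
// lock cycle is rejected (DB_LOCK_DEADLOCK) as soon as it forms instead of
// both threads hanging in the native call until an external detector runs.
// Assumes the com.sleepycat.db Java API; only lock-related settings shown.
final EnvironmentConfig cfg = new EnvironmentConfig();
cfg.setAllowCreate(true);
cfg.setInitializeLocking(true);
cfg.setInitializeLogging(true);
cfg.setInitializeCache(true);
cfg.setTransactional(true);
// Run the detector on every lock conflict; the victim is the locker
// holding the fewest write locks.
cfg.setLockDetectMode(LockDetectMode.MINWRITE);
// Optionally also bound how long any single lock request may wait
// (value is in microseconds).
cfg.setLockTimeout(10000000L);
```

Either mechanism converts an indefinite hang into an exception the application can catch and retry.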
Edited by: gmfeinberg on Mar 26, 2009 10:41 AM for formatting

Upon your suggestion, I added explicit transactions to my code, and now it looks like this:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.Environment;
import com.sleepycat.db.EnvironmentConfig;
import com.sleepycat.db.LockDetectMode;
import com.sleepycat.dbxml.XmlContainer;
import com.sleepycat.dbxml.XmlContainerConfig;
import com.sleepycat.dbxml.XmlDocumentConfig;
import com.sleepycat.dbxml.XmlException;
import com.sleepycat.dbxml.XmlManager;
import com.sleepycat.dbxml.XmlManagerConfig;
import com.sleepycat.dbxml.XmlQueryContext;
import com.sleepycat.dbxml.XmlResults;
import com.sleepycat.dbxml.XmlTransaction;
import com.sleepycat.dbxml.XmlValue;
public class TestEnvironment {

    private static final int DEADLOCK_DETECTOR_INTERVAL = 5000;

    private XmlManager i_xmlManager = null;
    private XmlContainer i_container = null;
    private XmlContainerConfig i_xmlContainerConfig = null;
    private XmlDocumentConfig i_docCfg = null;
    private boolean i_shuttingDown = false;

    public TestEnvironment(File dataDir, File dbErr, File dbOut)
            throws XmlException, DatabaseException, FileNotFoundException {
        final EnvironmentConfig cfg = new EnvironmentConfig();
        cfg.setErrorStream(new FileOutputStream(dbErr));
        cfg.setMessageStream(new FileOutputStream(dbOut));
        cfg.setAllowCreate(true);
        cfg.setInitializeLocking(true);
        cfg.setInitializeLogging(true);
        cfg.setInitializeCache(true);
        cfg.setTransactional(true);
        cfg.setRunRecovery(false);
        cfg.setTxnNoSync(true);
        cfg.setTxnNotDurable(false);
        //cfg.setTxnTimeout(500000L);
        cfg.setCacheSize(1073741824L);
        cfg.setThreaded(true);
        cfg.setInitializeReplication(false);
        cfg.setMaxLockers(100000);
        cfg.setMaxLockObjects(100000);
        cfg.setMaxLocks(100000);
        cfg.setLockDown(false);
        cfg.setSystemMemory(false);
        cfg.setInitializeCDB(false);
        final Environment env = new Environment(dataDir, cfg);
        final XmlManagerConfig mgrCfg = new XmlManagerConfig();
        mgrCfg.setAdoptEnvironment(true);
        mgrCfg.setAllowAutoOpen(true);
        mgrCfg.setAllowExternalAccess(false);
        XmlManager.setLogCategory(XmlManager.CATEGORY_ALL, true);
        XmlManager.setLogLevel(XmlManager.LEVEL_ALL, true);
        i_xmlManager = new XmlManager(env, mgrCfg);
        i_xmlManager.setDefaultContainerType(XmlContainer.NodeContainer);
        i_xmlContainerConfig = new XmlContainerConfig();
        i_xmlContainerConfig.setAllowValidation(false);
        i_xmlContainerConfig.setIndexNodes(true);
        i_xmlContainerConfig.setNodeContainer(true);
        i_xmlContainerConfig.setTransactional(true);
        i_xmlContainerConfig.setStatisticsEnabled(false);
        if (i_xmlManager.existsContainer("container.dbxml") != 0) {
            i_container = i_xmlManager.openContainer("container.dbxml", i_xmlContainerConfig);
        } else {
            i_container = i_xmlManager.createContainer("container.dbxml", i_xmlContainerConfig);
        }
        i_docCfg = new XmlDocumentConfig();
        i_docCfg.setGenerateName(true);
        final TestEnvironment thisRef = this;
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                try {
                    thisRef.close();
                    System.out.println("Shutting down the TestEnvironment.");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
        final Thread deadLockDetector = new Thread("deadLockDetector") {
            @Override
            public void run() {
                while (!i_shuttingDown) {
                    try {
                        i_xmlManager.getEnvironment().detectDeadlocks(LockDetectMode.YOUNGEST);
                        System.out.println('[' + Thread.currentThread().getName() +
                            "] ran deadlock detector.");
                        Thread.sleep(DEADLOCK_DETECTOR_INTERVAL);
                    } catch (XmlException e) {
                        e.printStackTrace();
                    } catch (DatabaseException e) {
                        e.printStackTrace();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        };
        deadLockDetector.start();
    }

    public void close() throws XmlException {
        i_shuttingDown = true;
        if (i_container != null) {
            i_container.close();
            i_container = null;
        }
        if (i_xmlManager != null) {
            i_xmlManager.close();
            i_xmlManager = null;
        }
    }

    public void insert(String doc) throws XmlException {
        System.out.println('[' + Thread.currentThread().getName() +
            "] insert received document to be inserted");
        final long beforeT = System.currentTimeMillis();
        final XmlTransaction txn = i_xmlManager.createTransaction();
        try {
            i_container.putDocument(txn, null, doc, i_docCfg);
            txn.commit();
        } catch (XmlException e) {
            txn.abort();
            throw e;
        } finally {
            txn.delete();
        }
        final long afterT = System.currentTimeMillis();
        System.out.println('[' + Thread.currentThread().getName() +
            "] insert took " + (afterT - beforeT) + " ms. ");
    }

    public String[] query(String xquery) throws XmlException {
        System.out.println('[' + Thread.currentThread().getName() +
            "] query \"" + xquery + "\" received.");
        String[] retVal = {};
        final long beforeT = System.currentTimeMillis();
        XmlQueryContext qctxt = null;
        XmlResults rs = null;
        XmlValue nextValue = null;
        final XmlTransaction txn = i_xmlManager.createTransaction();
        try {
            qctxt = i_xmlManager.createQueryContext();
            qctxt.setQueryTimeoutSeconds(10);
            rs = i_xmlManager.query(txn, xquery, qctxt);
            if (rs != null) {
                retVal = new String[rs.size()];
                for (int i = 0; i < retVal.length && rs.hasNext(); i++) {
                    nextValue = rs.next();
                    retVal[i] = nextValue.asString();
                    nextValue.delete();
                    nextValue = null;
                }
            }
            txn.commit();
        } catch (XmlException e) {
            txn.abort();
            throw e;
        } finally {
            txn.delete();
            if (nextValue != null) {
                nextValue.delete();
            }
            if (qctxt != null) {
                qctxt.delete();
            }
            if (rs != null) {
                rs.delete();
            }
        }
        final long afterT = System.currentTimeMillis();
        System.out.println('[' + Thread.currentThread().getName() +
            "] query \"" + xquery + "\" took " + (afterT - beforeT) + " ms. ");
        return retVal;
    }
}

For the purpose of brevity I omitted the main method, but it merely runs two parallel threads: one inserting a collection of different XML documents, the other querying. Each thread runs a semi-tight loop with a 10 millisecond sleep after each iteration. The documents being inserted are all fairly similar and well-formed. Here is what happens: if I do not set a maximum transaction timeout, every SINGLE concurrent pair of inserts/queries results in a deadlock that is only broken when the deadlock detector thread runs every five seconds. Our application does require massive inserts/queries run in parallel sometimes. If I do set a maximum transaction timeout that expires before the deadlock detector thread runs, I get the following exception:
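The shape of the omitted main method can be sketched as follows. This is a self-contained stand-in, not the original code: the two `Runnable` placeholders would in reality call `env.insert(doc)` and `env.query(xquery)` on the TestEnvironment above, and the names here are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the omitted main(): two parallel threads in
// semi-tight loops with a 10 ms sleep after each iteration, as described.
// The Runnables are placeholders for the real insert/query calls.
public class ParallelDriver {

    public static int runLoops(Runnable insertTask, Runnable queryTask,
                               int iterations) throws InterruptedException {
        final AtomicInteger completed = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(2);
        final Runnable[] tasks = { insertTask, queryTask };
        for (final Runnable task : tasks) {
            new Thread(() -> {
                try {
                    for (int i = 0; i < iterations; i++) {
                        task.run();       // would be env.insert(...) / env.query(...)
                        completed.incrementAndGet();
                        Thread.sleep(10); // the 10 ms pause between iterations
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        done.await();
        return completed.get(); // total iterations completed across both threads
    }

    public static void main(String[] args) throws InterruptedException {
        int n = runLoops(() -> {}, () -> {}, 5);
        System.out.println(n); // 10
    }
}
```

With this driver shape, both loops issue one transactional operation roughly every 10 ms, which is exactly the concurrency pattern that produces the conflicting insert/query pairs.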
com.sleepycat.dbxml.XmlException: Error: DB_LOCK_NOTGRANTED: Lock not granted, errcode = DATABASE_ERROR
at com.sleepycat.dbxml.dbxml_javaJNI.XmlContainer_putDocument__SWIG_3(Native Method)
at com.sleepycat.dbxml.XmlContainer.putDocument(XmlContainer.java:736)
at com.sleepycat.dbxml.XmlContainer.putDocument(XmlContainer.java:232)
at com.sleepycat.dbxml.XmlContainer.putDocument(XmlContainer.java:218)
at TestEnvironment.insert(TestEnvironment.java:178)
at TestEnvironment$3.run(TestEnvironment.java:327)

which does not seem to be a deadlock.
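Whether the failure surfaces as DB_LOCK_DEADLOCK or, as here, DB_LOCK_NOTGRANTED after a timeout, Berkeley DB treats lock conflicts as retryable: the usual pattern is to abort the losing transaction and run it again. A generic retry sketch is below; the exception type is a placeholder (in the real application one would catch `XmlException` and check its error code, an API detail that varies by dbxml version), and the helper names are hypothetical.

```java
import java.util.concurrent.Callable;

// Generic retry-on-lock-conflict sketch. LockConflictException is a
// placeholder for catching XmlException with DB_LOCK_DEADLOCK or
// DB_LOCK_NOTGRANTED; the transaction body should commit on success
// and abort before the exception propagates here.
public class RetryHelper {

    public static class LockConflictException extends RuntimeException {}

    public static <T> T withRetries(Callable<T> txnBody, int maxAttempts)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return txnBody.call();          // one full transaction attempt
            } catch (LockConflictException e) {
                if (attempt >= maxAttempts) {
                    throw e;                     // give up after maxAttempts tries
                }
                Thread.sleep(10L * attempt);     // small linear backoff, then retry
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] failures = {2}; // simulate two lock conflicts, then success
        String r = withRetries(() -> {
            if (failures[0]-- > 0) throw new LockConflictException();
            return "committed";
        }, 5);
        System.out.println(r); // committed
    }
}
```

Wrapping the insert and query bodies this way keeps throughput up even when conflicts are frequent, since each retry reacquires its locks in a fresh transaction.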
Now I understand that deadlocks are a fact of life, but I thought they were statistically rare, and in this scenario every single insert/query pair results in a deadlock. The same code works without any deadlocks on the Berkeley DB XML 2.3/Berkeley DB 4.5 combination.
Could there be a patch or a bugfix that I missed? This seems very suspicious. I also tried the same code on both Linux and Windows, with the same result.
Thank you,
Alexander.