BDB v5.0.73 - EnvironmentFailureException: (JE 5.0.73) JAVA_ERROR

Hi there!
I'm using Berkeley DB Java Edition as a caching layer between two applications (the backend application is quite slow, so we cache some requests). We cache XML data (a single request XML can be ~4 MB): if an entry already exists we update it, otherwise we insert a new one. There can be many concurrent hits at the same time.
I tested with JMeter: with 10 threads everything works fine, but if I increase to 20 threads the following error occurs:
2013-05-14 15:31:15,914 [ERROR] CacheImpl - error occured while trying to get data from cache.
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.73) JAVA_ERROR: Java Error occurred, recovery may not be possible. fetchTarget of 0x11/0x1d1d parent IN=8 IN class=com.sleepycat.je.tree.IN lastFullVersion=0x1b/0x4cd lastLoggedVersion=0x1b/0x4cd parent.getDirty()=false state=0 [same fetchTarget detail repeated 12 times]
     at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1507)
     at com.sleepycat.je.Environment.checkEnv(Environment.java:2185)
     at com.sleepycat.je.Environment.beginTransactionInternal(Environment.java:1313)
     at com.sleepycat.je.Environment.beginTransaction(Environment.java:1284)
     at com.ebcont.redbull.bullchecker.cache.impl.CacheImpl.get(CacheImpl.java:157)
     at com.ebcont.redbull.bullchecker.handler.EndpointHandler.doPerform(EndpointHandler.java:132)
     at com.ebcont.redbull.bullchecker.WSCacheEndpointServlet.doPost(WSCacheEndpointServlet.java:86)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
     at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Java heap space
2013-05-14 15:31:15,939 [ERROR] CacheImpl - error occured while trying to get data from cache.
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.73) JAVA_ERROR (identical exception and stack trace as above)
After restarting the server, I get the following error while trying to get data from the cache:
java.lang.OutOfMemoryError: Java heap space
     at com.sleepycat.je.log.LogUtils.readBytesNoLength(LogUtils.java:365)
     at com.sleepycat.je.tree.LN.readFromLog(LN.java:786)
     at com.sleepycat.je.log.entry.LNLogEntry.readBaseLNEntry(LNLogEntry.java:196)
     at com.sleepycat.je.log.entry.LNLogEntry.readEntry(LNLogEntry.java:130)
     at com.sleepycat.je.log.LogManager.getLogEntryFromLogSource(LogManager.java:1008)
     at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:848)
     at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:809)
     at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1412)
     at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
     at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2261)
     at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1466)
     at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1593)
     at com.sleepycat.je.Cursor.retrieveNextAllowPhantoms(Cursor.java:2924)
     at com.sleepycat.je.Cursor.retrieveNextNoDups(Cursor.java:2801)
     at com.sleepycat.je.Cursor.retrieveNext(Cursor.java:2775)
     at com.sleepycat.je.Cursor.getNextNoDup(Cursor.java:1244)
     at com.ebcont.redbull.bullchecker.cache.impl.BDBCacheImpl.getStoredKeys(BDBCacheImpl.java:244)
     at com.ebcont.redbull.bullchecker.CacheStatisticServlet.doPost(CacheStatisticServlet.java:108)
     at com.ebcont.redbull.bullchecker.CacheStatisticServlet.doGet(CacheStatisticServlet.java:74)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
     at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
My BDB configuration:
environmentConfig.setReadOnly(false);
databaseConfig.setReadOnly(false);
environmentConfig.setAllowCreate(true);
databaseConfig.setAllowCreate(true);
environmentConfig.setTransactional(true);
databaseConfig.setTransactional(true);
environmentConfig.setCachePercent(60);
environmentConfig.setLockTimeout(2000, TimeUnit.MILLISECONDS);
environmentConfig.setCacheMode(CacheMode.DEFAULT);
environment path: C:/tmp/berkeleydb
Tomcat JVM Parameters: Initial Memory Pool: 1024
Maximum Memory Pool: 2048
Server: Windows Server 2008
Memory: 8 GB

Hi,
The stack trace shows an OutOfMemoryError caused by running out of heap space.
Could you tell us the exact Java version you are using, on what OS, and what JVM options, in particular the maximum heap size (-Xmx)?
Also, what JE cache size do you use? (If you do not set either MAX_MEMORY or MAX_MEMORY_PERCENT, the JE cache size defaults to 60% of the JVM max heap size.)
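For example, the cache could be capped at an absolute size instead of a heap percentage. This is only a sketch of the configuration (it needs je.jar on the classpath), and the 512 MB figure is purely illustrative, not a recommendation:

```java
import com.sleepycat.je.EnvironmentConfig;

public class CacheSizing {
    public static EnvironmentConfig withFixedCache() {
        EnvironmentConfig cfg = new EnvironmentConfig();
        cfg.setAllowCreate(true);
        cfg.setTransactional(true);
        // Cap the JE cache at an absolute size (bytes) rather than
        // letting it default to 60% of the JVM max heap.
        cfg.setCacheSize(512L * 1024 * 1024);
        return cfg;
    }
}
```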
You should also look into the way you are using transactions and cursors. It is possible that you are using long-running transactions that accumulate a large number of locks, or that you are opening more and more transactions without completing them (by committing or aborting). Is either the case for your application? You can check the lock and transaction statistics using Environment.getStats() and Environment.getTransactionStats() respectively.
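The statistics mentioned above can be pulled out with something like the following sketch (requires je.jar; the particular counters printed are just examples of what to look at):

```java
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentStats;
import com.sleepycat.je.StatsConfig;
import com.sleepycat.je.TransactionStats;

public class CacheDiagnostics {
    /** Print a few counters that hint at cache pressure and txn leaks. */
    public static void dump(Environment env) {
        StatsConfig sc = new StatsConfig();
        sc.setFast(true); // cheap, approximate counters only

        EnvironmentStats es = env.getStats(sc);
        System.out.println("cache misses:      " + es.getNCacheMiss());
        System.out.println("cache total bytes: " + es.getCacheTotalBytes());

        TransactionStats ts = env.getTransactionStats(sc);
        // A steadily growing active count suggests transactions are
        // being opened without being committed or aborted.
        System.out.println("active txns:       " + ts.getNActive());
    }
}
```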
Aside from properly ending transactions and closing cursors, you should also examine your cache statistics to understand the memory profile. See the following documentation sections:
http://docs.oracle.com/cd/E17277_02/html/GettingStartedGuide/cachesize.html
http://www.oracle.com/technetwork/database/berkeleydb/je-faq-096044.html#HowcanIestimatemyapplicationsoptimalcachesize
http://www.oracle.com/technetwork/database/berkeleydb/je-faq-096044.html#WhyshouldtheJEcachebelargeenoughtoholdtheBtreeinternalnodes
Regards,
Andrei

Similar Messages

  • LOG_FILE_NOT_FOUND bug possible in current BDB JE?

    I've seen references to the LOG_FILE_NOT_FOUND bug in older BDB JE versions (4.x and 5.x <= 5.0.34); however, I seem to be hitting something similar with 5.0.48.
    I have a non-transactional, deferred-write DB that seems to have gotten itself into an inconsistent state. It was fine loading several million records, but after ~8 hours of operation, bailed out with:
    com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.55) /tmp/data/index fetchTarget of 0x9f1/0x24d34eb parent IN=44832 IN class=com.sleepycat.je.tree.BIN lastFullVersion=0xdcf/0x5a96c91 lastLoggedVersion=0xdcf/0x5a96c91 parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1429)
         at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
         at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2229)
         at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1434)
         at com.sleepycat.je.Cursor.searchInternal(Cursor.java:2716)
         at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2576)
         at com.sleepycat.je.Cursor.searchNoDups(Cursor.java:2430)
         at com.sleepycat.je.Cursor.search(Cursor.java:2397)
         at com.sleepycat.je.Database.get(Database.java:1042)
         at com.xxxx.db.BDBCalendarStorageBackend.indexCalendar(BDBCalendarStorageBackend.java:95)
         at com.xxxx.indexer.TicketIndexer.indexDeltaLogs(TicketIndexer.java:201)
         at com.xxxx.indexer.DeltaLogLoader.run(DeltaLogLoader.java:87)
    Caused by: java.io.FileNotFoundException: /tmp/data/index/000009f1.jdb (No such file or directory)
         at java.io.RandomAccessFile.open(Native Method)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:101)
         at com.sleepycat.je.log.FileManager$6.<init>(FileManager.java:1282)
         at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1281)
         at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1147)
         at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1102)
         at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:808)
         at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:772)
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1412)
         ... 11 more
    Subsequent opens/use on the DB pretty much instantly yield the same error. I tried upgrading to 5.0.55 (hence the ver in the output above) but still get the same error.
    As a recovery attempt, I used DbDump to try to dump the DB, but it failed with a similar error. Enabling salvage mode let me successfully dump it; however, reloading it into a clean environment by programmatically running DbLoad.load() (so I can set up my env) caused the following error (after about 30% of the DB had been restored):
    Exception in thread "main" com.sleepycat.je.EnvironmentFailureException: (JE 5.0.55) Node 11991 should have been split before calling insertEntry UNEXPECTED_STATE: Unexpected internal state, may have side effects. fetchTarget of 0x25/0x155a822 parent IN=2286 IN class=com.sleepycat.je.tree.IN lastFullVersion=0x3e/0x118d8f6 lastLoggedVersion=0x3e/0x118d8f6 parent.getDirty()=false state=0
         at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:376)
         at com.sleepycat.je.tree.IN.insertEntry1(IN.java:2326)
         at com.sleepycat.je.tree.IN.insertEntry(IN.java:2296)
         at com.sleepycat.je.tree.BINDelta.reconstituteBIN(BINDelta.java:216)
         at com.sleepycat.je.tree.BINDelta.reconstituteBIN(BINDelta.java:144)
         at com.sleepycat.je.log.entry.BINDeltaLogEntry.getIN(BINDeltaLogEntry.java:53)
         at com.sleepycat.je.log.entry.BINDeltaLogEntry.getResolvedItem(BINDeltaLogEntry.java:43)
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1422)
         at com.sleepycat.je.tree.Tree.searchSubTreeUntilSplit(Tree.java:1786)
         at com.sleepycat.je.tree.Tree.searchSubTreeSplitsAllowed(Tree.java:1729)
         at com.sleepycat.je.tree.Tree.searchSplitsAllowed(Tree.java:1296)
         at com.sleepycat.je.tree.Tree.findBinForInsert(Tree.java:2205)
         at com.sleepycat.je.dbi.CursorImpl.putInternal(CursorImpl.java:834)
         at com.sleepycat.je.dbi.CursorImpl.put(CursorImpl.java:779)
         at com.sleepycat.je.Cursor.putAllowPhantoms(Cursor.java:2243)
         at com.sleepycat.je.Cursor.putNoNotify(Cursor.java:2200)
         at com.sleepycat.je.Cursor.putNotify(Cursor.java:2117)
         at com.sleepycat.je.Cursor.putNoDups(Cursor.java:2052)
         at com.sleepycat.je.Cursor.putInternal(Cursor.java:2020)
         at com.sleepycat.je.Database.putInternal(Database.java:1324)
         at com.sleepycat.je.Database.put(Database.java:1194)
         at com.sleepycat.je.util.DbLoad.loadData(DbLoad.java:544)
         at com.sleepycat.je.util.DbLoad.load(DbLoad.java:414)
         at com.xxxx.db.BDBCalendarStorageBackend.loadBDBDump(BDBCalendarStorageBackend.java:254)
         at com.xxxx.cli.BDBTool.run(BDBTool.java:49)
         at com.xxxx.cli.AbstractBaseCommand.execute(AbstractBaseCommand.java:114)
         at com.xxxx.cli.BDBTool.main(BDBTool.java:69)
    The only other slightly exotic thing I'm using is a custom partial BTree comparator, however, it quite happily loaded/updated literally tens of millions of records for hours before the FileNotFound error cropped up, so it seems unlikely that would be the cause.
    Any ideas?
    Thanks in advance,
    fb.

    > Thanks heaps to Mark for working through this with me.
    You're welcome. Thanks for following up and explaining it for the benefit of others. And I'm very glad it wasn't a JE bug!
    > My solution is to switch to using a secondary database for providing differentiated "uniqueness" vs "ordering".
    An index for uniqueness may be a good solution. But as you said in email, it adds significant overhead (memory and disk). This overhead can be minimized by keeping your keys (primary and secondary) as small as possible and enabling key prefixing.
    I'd also like to point out that adding a secondary isn't always the best choice. For example, if the number of keys with the same C1 value is fairly small, another way of checking for uniqueness (when inserting) is to iterate over them, looking for a match on C1:C3. The cost of this iteration may be less than the cost of maintaining a uniqueness index. To make this work, you'll have to use Serializable isolation during the iteration, to prevent another thread from inserting a key in that range.
    If you're pushing the performance limits of your hardware, it may be worth trying more than one such approach and comparing the performance. If performance is not a big concern, then the additional index is the simplest approach to get right.
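    The iterate-and-check alternative could look roughly like this. It is only a sketch: the key layout (a byte prefix encoding C1, with the full key encoding C1:C3) and the prefix-matching helper are hypothetical, and it needs je.jar on the classpath:

    ```java
    import com.sleepycat.je.Cursor;
    import com.sleepycat.je.Database;
    import com.sleepycat.je.DatabaseEntry;
    import com.sleepycat.je.Environment;
    import com.sleepycat.je.OperationStatus;
    import com.sleepycat.je.Transaction;
    import com.sleepycat.je.TransactionConfig;

    public class UniquenessCheck {
        /** Scan for an existing key with the given prefix before inserting. */
        public static boolean prefixExists(Environment env, Database db,
                                           byte[] prefix) {
            TransactionConfig txnCfg = new TransactionConfig();
            // Serializable isolation prevents another thread from
            // inserting a phantom key into the scanned range mid-check.
            txnCfg.setSerializableIsolation(true);
            Transaction txn = env.beginTransaction(null, txnCfg);
            Cursor cursor = db.openCursor(txn, null);
            try {
                DatabaseEntry key = new DatabaseEntry(prefix);
                DatabaseEntry data = new DatabaseEntry();
                // Position at the first key >= prefix, then verify it
                // actually starts with the prefix.
                OperationStatus st = cursor.getSearchKeyRange(key, data, null);
                return st == OperationStatus.SUCCESS
                    && startsWith(key.getData(), prefix);
            } finally {
                cursor.close();
                txn.commit();
            }
        }

        private static boolean startsWith(byte[] key, byte[] prefix) {
            if (key.length < prefix.length) return false;
            for (int i = 0; i < prefix.length; i++)
                if (key[i] != prefix[i]) return false;
            return true;
        }
    }
    ```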
    --mark

  • EnvironmentFailureException thrown while recovering the database!

    While recovering the database, an EnvironmentFailureException with LOG_FILE_NOT_FOUND was thrown. It was thrown after some data had already been recovered, and the remaining data cannot be recovered because of the exception.
    I upgraded JE to 4.1.7, but the data still cannot be recovered!
    Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 4.0.92) /home/admin/shopcenter/cdncleaner fetchTarget of 0x64/0x3b8f73 parent IN=8811763 IN class=com.sleepycat.je.tree.IN lastFullVersion=0xffffffff/0xffffffff parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1241)
         at com.sleepycat.je.tree.Tree.searchSubTreeInternal(Tree.java:1858)
         at com.sleepycat.je.tree.Tree.searchSubTree(Tree.java:1682)
         at com.sleepycat.je.tree.Tree.search(Tree.java:1548)
         at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:2054)
         at com.sleepycat.je.Cursor.searchInternal(Cursor.java:2088)
         at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2058)
         at com.sleepycat.je.Cursor.search(Cursor.java:1926)
         at com.sleepycat.je.Cursor.getSearchKey(Cursor.java:1351)
         at com.sleepycat.util.keyrange.RangeCursor.doGetSearchKey(RangeCursor.java:966)
         at com.sleepycat.util.keyrange.RangeCursor.getSearchKey(RangeCursor.java:593)
         at com.sleepycat.collections.DataCursor.doGetSearchKey(DataCursor.java:571)
         at com.sleepycat.collections.DataCursor.initForPut(DataCursor.java:812)
         at com.sleepycat.collections.DataCursor.put(DataCursor.java:752)
         at com.sleepycat.collections.StoredContainer.putKeyValue(StoredContainer.java:322)
         at com.sleepycat.collections.StoredMap.put(StoredMap.java:280)
         at com.taobao.shopservice.picture.core.util.BdbStoredQueueImpl.offer(BdbStoredQueueImpl.java:118)
         at com.taobao.shopservice.picture.core.service.CdnClearServiceImpl.clearCdnCache(CdnClearServiceImpl.java:45)
         at sun.reflect.GeneratedMethodAccessor484.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
         at com.taobao.shopservice.common.monitor.ProfileInterceptor.invoke(ProfileInterceptor.java:26)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
         at $Proxy74.clearCdnCache(Unknown Source)
         at com.taobao.shopservice.picture.core.service.PictureWriteServiceImpl.movePicturesToRecycleBin(PictureWriteServiceImpl.java:302)
         at com.taobao.shopservice.picture.core.service.PictureWriteServiceImpl.deletePictures(PictureWriteServiceImpl.java:207)
         at sun.reflect.GeneratedMethodAccessor483.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
         at com.taobao.shopservice.common.monitor.ProfileInterceptor.invoke(ProfileInterceptor.java:26)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
         at $Proxy77.deletePictures(Unknown Source)
         at sun.reflect.GeneratedMethodAccessor482.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at com.taobao.hsf.rpc.tbremoting.provider.ProviderProcessor.handleRequest0(ProviderProcessor.java:222)
         at com.taobao.hsf.rpc.tbremoting.provider.ProviderProcessor.handleRequest(ProviderProcessor.java:174)
         at com.taobao.hsf.rpc.tbremoting.provider.ProviderProcessor.handleRequest(ProviderProcessor.java:41)
         at com.taobao.remoting.impl.DefaultMsgListener$1ProcessorExecuteTask.run(DefaultMsgListener.java:131)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.io.FileNotFoundException: /home/admin/shopcenter/cdncleaner/00000064.jdb (No such file or directory)
         at java.io.RandomAccessFile.open(Native Method)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
         at com.sleepycat.je.log.FileManager$1.<init>(FileManager.java:993)
         at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:992)
         at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:888)
         at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1073)
         at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:779)
         at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:743)
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1225)
         ... 49 more
    2011-03-24 00:00:27,967 INFO [org.quartz.core.JobRunShell] Job DEFAULT.cdnCleanerJobDetail threw a JobExecutionException:
    org.quartz.JobExecutionException: Invocation of method 'clearCdn' on target class [class com.taobao.shopservice.picture.core.job.clearcdn.CdnCleaner] failed [See nested exception: com.sleepycat.je.EnvironmentFailureException: (JE 4.0.92) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 4.0.92) /home/admin/shopcenter/cdncleaner fetchTarget of 0x64/0x3b8f73 parent IN=8811763 IN class=com.sleepycat.je.tree.IN lastFullVersion=0xffffffff/0xffffffff parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.]
         at sun.reflect.GeneratedConstructorAccessor102.newInstance(Unknown Source)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
         at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:85)
         at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:283)
         at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
         at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
         at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
    * Nested Exception (Underlying Cause) ---------------
    com.sleepycat.je.EnvironmentFailureException: (JE 4.0.92) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 4.0.92) /home/admin/shopcenter/cdncleaner fetchTarget of 0x64/0x3b8f73 parent IN=8811763 IN class=com.sleepycat.je.tree.IN lastFullVersion=0xffffffff/0xffffffff parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
         at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:197)
         at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1403)
         at com.sleepycat.je.Database.checkEnv(Database.java:1772)
         at com.sleepycat.je.Database.openCursor(Database.java:619)
         at com.sleepycat.collections.CurrentTransaction.openCursor(CurrentTransaction.java:416)
         at com.sleepycat.collections.MyRangeCursor.openCursor(MyRangeCursor.java:54)
         at com.sleepycat.collections.MyRangeCursor.<init>(MyRangeCursor.java:30)
         at com.sleepycat.collections.DataCursor.init(DataCursor.java:171)
         at com.sleepycat.collections.DataCursor.<init>(DataCursor.java:59)
         at com.sleepycat.collections.StoredContainer.getValue(StoredContainer.java:301)
         at com.sleepycat.collections.StoredMap.get(StoredMap.java:241)
         at com.taobao.shopservice.picture.core.util.BdbStoredQueueImpl.peek(BdbStoredQueueImpl.java:131)
         at com.taobao.shopservice.picture.core.util.BdbStoredQueueImpl.poll(BdbStoredQueueImpl.java:169)
         at com.taobao.shopservice.picture.core.job.clearcdn.CdnCleaner.clearCdn(CdnCleaner.java:194)
         at sun.reflect.GeneratedMethodAccessor641.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:283)
         at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:272)
         at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
         at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
         at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
    Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 4.0.92) /home/admin/shopcenter/cdncleaner fetchTarget of 0x64/0x3b8f73 parent IN=8811763 IN class=com.sleepycat.je.tree.IN lastFullVersion=0xffffffff/0xffffffff parent.getDirty()=true state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1241)
         at com.sleepycat.je.tree.Tree.searchSubTreeInternal(Tree.java:1858)
         at com.sleepycat.je.tree.Tree.searchSubTree(Tree.java:1682)
         at com.sleepycat.je.tree.Tree.search(Tree.java:1548)
         at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:2054)
         at com.sleepycat.je.Cursor.searchInternal(Cursor.java:2088)
         at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2058)
         at com.sleepycat.je.Cursor.search(Cursor.java:1926)
         at com.sleepycat.je.Cursor.getSearchKey(Cursor.java:1351)
         at com.sleepycat.util.keyrange.RangeCursor.doGetSearchKey(RangeCursor.java:966)
         at com.sleepycat.util.keyrange.RangeCursor.getSearchKey(RangeCursor.java:593)
         at com.sleepycat.collections.DataCursor.doGetSearchKey(DataCursor.java:571)
         at com.sleepycat.collections.DataCursor.initForPut(DataCursor.java:812)
         at com.sleepycat.collections.DataCursor.put(DataCursor.java:752)
         at com.sleepycat.collections.StoredContainer.putKeyValue(StoredContainer.java:322)
         at com.sleepycat.collections.StoredMap.put(StoredMap.java:280)
         at com.taobao.shopservice.picture.core.util.BdbStoredQueueImpl.offer(BdbStoredQueueImpl.java:118)
         at com.taobao.shopservice.picture.core.service.CdnClearServiceImpl.clearCdnCache(CdnClearServiceImpl.java:45)
         at sun.reflect.GeneratedMethodAccessor484.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
         at com.taobao.shopservice.common.monitor.ProfileInterceptor.invoke(ProfileInterceptor.java:26)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
         at $Proxy74.clearCdnCache(Unknown Source)
         at com.taobao.shopservice.picture.core.service.PictureWriteServiceImpl.movePicturesToRecycleBin(PictureWriteServiceImpl.java:302)
         at com.taobao.shopservice.picture.core.service.PictureWriteServiceImpl.deletePictures(PictureWriteServiceImpl.java:207)
         at sun.reflect.GeneratedMethodAccessor483.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:304)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
         at com.taobao.shopservice.common.monitor.ProfileInterceptor.invoke(ProfileInterceptor.java:26)
         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
         at $Proxy77.deletePictures(Unknown Source)
         at sun.reflect.GeneratedMethodAccessor482.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at com.taobao.hsf.rpc.tbremoting.provider.ProviderProcessor.handleRequest0(ProviderProcessor.java:222)
         at com.taobao.hsf.rpc.tbremoting.provider.ProviderProcessor.handleRequest(ProviderProcessor.java:174)
         at com.taobao.hsf.rpc.tbremoting.provider.ProviderProcessor.handleRequest(ProviderProcessor.java:41)
         at com.taobao.remoting.impl.DefaultMsgListener$1ProcessorExecuteTask.run(DefaultMsgListener.java:131)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.io.FileNotFoundException: /home/admin/shopcenter/cdncleaner/00000064.jdb (No such file or directory)
         at java.io.RandomAccessFile.open(Native Method)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
         at com.sleepycat.je.log.FileManager$1.<init>(FileManager.java:993)
         at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:992)
         at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:888)
         at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1073)
         at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:779)
         at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:743)
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1225)
         ... 49 more

    I mean that I can open the database and read some data from it that was stored before the database was closed, and then the exception is thrown.
    here is exception stack with JE 4.1.7:
    com.sleepycat.je.EnvironmentFailureException: (JE 4.1.7) F:\job fetchTarget of 0x64/0x4735f7 parent IN=7847269 IN class=com.sleepycat.je.tree.IN lastFullVersion=0x66/0x927f09 parent.getDirty()=false state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1337)
         at com.sleepycat.je.tree.IN.fetchTargetWithExclusiveLatch(IN.java:1278)
         at com.sleepycat.je.tree.Tree.getNextBinInternal(Tree.java:1358)
         at com.sleepycat.je.tree.Tree.getPrevBin(Tree.java:1240)
         at com.sleepycat.je.dbi.CursorImpl.getNextWithKeyChangeStatus(CursorImpl.java:1754)
         at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1617)
         at com.sleepycat.je.Cursor.retrieveNextAllowPhantoms(Cursor.java:2488)
         at com.sleepycat.je.Cursor.retrieveNext(Cursor.java:2304)
         at com.sleepycat.je.Cursor.getPrev(Cursor.java:1190)
         at com.ppsoft.bdb.test.Main.main(Main.java:52)
    Caused by: java.io.FileNotFoundException: F:\job\00000064.jdb (No such file or directory)
         at java.io.RandomAccessFile.open(Native Method)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
         at com.sleepycat.je.log.FileManager$1.<init>(FileManager.java:995)
         at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:994)
         at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:890)
         at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1074)
         at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:778)
         at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:742)
         at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1320)
         ... 9 more
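For what it's worth, a quick way to check which numbered .jdb files actually survive on disk is a short scan of the environment directory (a sketch, not part of the JE API; the class name JdbGapCheck is made up). Note that gaps in the number sequence are normal once the cleaner has deleted obsolete files -- the interesting question is only whether a file the Btree still references, 00000064.jdb in the traces above, is really gone:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// List which numbered .jdb log files are missing between the lowest and
// highest file number present in a JE environment directory.
public class JdbGapCheck {

    // Return the file numbers absent between min and max of the input.
    static List<Long> gaps(long[] fileNums) {
        long[] sorted = fileNums.clone();
        Arrays.sort(sorted);
        List<Long> missing = new ArrayList<>();
        for (int i = 1; i < sorted.length; i++) {
            for (long n = sorted[i - 1] + 1; n < sorted[i]; n++) {
                missing.add(n);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        File envDir = new File(args.length > 0 ? args[0] : ".");
        String[] names = envDir.list((d, n) -> n.endsWith(".jdb"));
        if (names == null || names.length == 0) {
            System.out.println("no .jdb files found");
            return;
        }
        long[] nums = new long[names.length];
        for (int i = 0; i < names.length; i++) {
            // JE log files are named with 8 hex digits, e.g. 00000064.jdb.
            nums[i] = Long.parseLong(
                names[i].substring(0, names[i].length() - 4), 16);
        }
        System.out.println("missing in range: " + gaps(nums));
    }
}
```

If the file named in the LOG_FILE_NOT_FOUND message is in the missing list, the question becomes who deleted it (the cleaner, a backup script, manual cleanup) rather than why JE cannot read it.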

  • EnvironmentFailureException question

    Hi,
    I'm relatively new to BDB... I've (part-)written a distributed crawler which uses BDB to store a persistent on-disk queue of URIs. This is the setup.
    /* Open a transactional Berkeley DB engine environment. */
    EnvironmentConfig envConfig = new EnvironmentConfig();
    envConfig.setAllowCreate(true);
    envConfig.setTransactional(true);
    env = new Environment(envDir, envConfig);

    /* Open a transactional entity store. */
    StoreConfig storeConfig = new StoreConfig();
    storeConfig.setAllowCreate(true);
    storeConfig.setTransactional(true);
    store = new EntityStore(env, this.getClass().getSimpleName(), storeConfig);

    /* Primary index of the queue. */
    urlIndex = store.getPrimaryIndex(String.class, URLObject.class);

    /* Secondary index of the queue. */
    countIndex = store.getSecondaryIndex(urlIndex, Integer.class, "count");
    The machine is running Java 1.6.0_12 on Debian 5.0.4. The environment directory is on a local partition (in fact, everything is pretty much local as far as BDB can see).
    This setup seems to work quite well. However, after about 30 hours of crawling, one server (of eight) dies with the initial exception:
    <DaemonThread name="Cleaner-1"/> caught exception: com.sleepycat.je.EnvironmentFailureException: (JE 4.0.71) /data/webdb/may10/crawl/q java.io.IOException: Input/output error LOG_READ: IOException on read, log is likely invalid. Environment is invalid and must be closed.
    com.sleepycat.je.EnvironmentFailureException: (JE 4.0.71) /data/webdb/may10/crawl/q java.io.IOException: Input/output error LOG_READ: IOException on read, log is likely invalid. Environment is invalid and must be closed.
    at com.sleepycat.je.log.FileManager.readFromFile(FileManager.java:1516)
    at com.sleepycat.je.log.FileReader$ReadWindow.fillFromFile(FileReader.java:1116)
    at com.sleepycat.je.log.FileReader$ReadWindow.fillNext(FileReader.java:1074)
    at com.sleepycat.je.log.FileReader.readData(FileReader.java:759)
    at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:315)
    at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:396)
    at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:236)
    at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:141)
    at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:161)
    at java.lang.Thread.run(Thread.java:619)
    Caused by: java.io.IOException: Input/output error
    at java.io.RandomAccessFile.readBytes(Native Method)
    at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
    at com.sleepycat.je.log.FileManager.readFromFileInternal(FileManager.java:1551)
    at com.sleepycat.je.log.FileManager.readFromFile(FileManager.java:1506)
    ... 9 more
    Exiting
    Followed by a plethora of similar exceptions for different lookup threads:
    Exception in thread "LookupThread-XXX" ...
    Also, in the je.info.0 file in the environment directory, I found the following:
    SEVERE [data/webdb/may10/crawl/q] Halted log file reading at file 0x997 offset 0x1edcb offset(decimal)=126411 prev=0x1ed84:
    entry=BINDelta(type=22,version=7)
    prev=0x1ed84
    size=886
    Next entry should be at 0x1f14f
    I'm generally at a loss as to why this happened. There's no obvious cause (such as running out of disk space) and it seems more likely that the index is corrupted. At the time of the exception, there is 1.5GB in the environment directory (~150 x 9.6M *.jdb files).
    I cannot really reproduce the error/create a test-case, so I don't know if updating JE would help -- I'd like to have as much information on the bug as possible, and have implemented as sure a fix as possible before I try restarting the crawl.
    Any help/thoughts greatly appreciated.

    Aidan,
    From the error message and the je.info logging you showed us, the complaint seems to be about the file /data/webdb/may10/crawl/q/00000997.jdb. But unfortunately, the Java IOException is not telling you much.
    What's happened is that our daemon thread, the log cleaner, was going along and hit an IOException when reading that file. We just wrap the IOException and send it back up, and in this case the exception is particularly uninformative ("Input/output error"). Since we toString the exception, I think there's really no more message to be had. We've noticed that JDKs on different platforms produce more or less informative IOExceptions -- perhaps it depends on what's available underneath.
    The string "LOG_READ: IOException on read, log is likely invalid. Environment is invalid and must be closed." is a wrapper from JE, and is potentially a bit alarmist. We take the conservative approach that if anything unexpected goes wrong with a read, we don't want to return bad data, so we shut down the whole environment. Other threads will note the invalidation of the environment and will also shut down. Though it can certainly always be a JE bug, in this case it seems more like some kind of underlying, transient system issue.
    JE does detect when it is able to do a read but thinks the data is bad; in that case you get a checksum exception. This is different, in that something killed off the read operation itself. Could something have sent an interrupt to the process? Though in those cases, we usually see an InterruptedException.
    One thing you can do is use the com.sleepycat.je.util.DbPrintLog utility to read just the afflicted file. Despite the JE suggestion of an invalid environment, this smells a little more like a transient platform I/O problem. You could run "java -jar je.jar DbPrintLog -h /data/webdb/may10/crawl/q -s 0x997 > temp.xml" and see if that file can be read. If it is successful, the utility will dump the contents of the log and you'll see a complete XML file with no exception at the end, which would reinforce the likelihood that a transient I/O incident occurred, and you may well be able to restart the environment with no problem.
    Hope that gives you a start,
    Linda

  • EnvironmentFailureException

    Hello!
    I've upgraded my systems to use 6.0.11 instead of 5.0.97. Everything works fine, storage is 20+ percent more compact and so on, BUT.
    1) Under constant load on the production server, after every 25-40 minutes of uptime the following
        > (JE 6.0.11) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 6.0.11) /usr/local/ae3/private/data/bdbj-lcl java.lang.AssertionError UNCAUGHT_EXCEPTION: Uncaught Exception in internal thread, unable to continue. Environment is invalid and must be closed.
        > com.sleepycat.je.EnvironmentFailureException
            > Environment invalid because of previous exception: (JE 6.0.11) /usr/local/ae3/private/data/bdbj-lcl java.lang.AssertionError UNCAUGHT_EXCEPTION: Uncaught Exception in internal thread, unable to continue. Environment is invalid and must be closed.
            > com.sleepycat.je.EnvironmentFailureException
                > null
                > java.lang.AssertionError
                  : com.sleepycat.je.evictor.LRUEvictor.processTarget(LRUEvictor.java:2014)
                  : com.sleepycat.je.evictor.LRUEvictor.findParentAndRetry(LRUEvictor.java:2182)
                  : com.sleepycat.je.evictor.LRUEvictor.processTarget(LRUEvictor.java:2019)
                  : com.sleepycat.je.evictor.LRUEvictor.evictBatch(LRUEvictor.java:1689)
                  : com.sleepycat.je.evictor.LRUEvictor.doEvict(LRUEvictor.java:1538)
                  : com.sleepycat.je.evictor.Evictor$BackgroundEvictTask.run(Evictor.java:739)
                  : java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                  : java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                  : java.lang.Thread.run(Thread.java:724)
    error appears and floods everything; I have to restart the instance, and then it works for roughly another 30 minutes.
    The same thing was happening on all the (not heavily loaded) servers during the log file upgrade to the new format, though the timing was random. None of the environments bigger than 1G in size were upgraded 'in one go' - I had to restart several times because of the same error.
    2) I had to reduce the number of cleaner threads to 1 (config.setConfigParam( EnvironmentConfig.CLEANER_THREADS, "1" )) - otherwise it was not starting AT ALL; every database instance it failed on complained with words like "expect BEING_CLEANED but CLEANED".
    I am using the low-level BDB JE API with cursors and byte arrays. No secondary databases. No DPL. Is anyone else experiencing anything like this?

    The 2nd issue, having to reduce the number of cleaner threads to 1 (config.setConfigParam( EnvironmentConfig.CLEANER_THREADS, "1" )) because otherwise it was not starting at all and failed with words like "expect BEING_CLEANED but CLEANED", is 100% reproducible.
    The main issue with com.sleepycat.je.evictor.LRUEvictor is currently happening on one production server (57G of data, 3/4 of which rotates per week: new data added, old data cleaned), but it was also happening on other servers while upgrading from the previous log format.
    My workaround was:
      if (environment != null && !environment.isValid()) {
           WorkerBdbj.LOG.event( "BDBJ-WORKER:FAILURE:FATAL",
                "Environment is invalid!",
                Convert.Throwable.toText( new IllegalStateException( "this:" + this + ", env:" + environment ) ) );
           try {
                environment.close();
           } catch (final Throwable t) {
                // ignore
           }
           Runtime.getRuntime().exit( -37 );
      }
    Thanks for the -da:com.sleepycat.je.evictor.LRUEvictor hint; anyway, I wouldn't have been able to assert it is safe to do so without your reply!
    Several times I noticed different stack traces (normally it is like in initial post).
        > (JE 6.0.11) JAVA_ERROR: Java Error occurred, recovery may not be possible.
        > com.sleepycat.je.EnvironmentFailureException
            > null
            > java.lang.AssertionError
              : com.sleepycat.je.evictor.LRUEvictor.processTarget(LRUEvictor.java:2014)
              : com.sleepycat.je.evictor.LRUEvictor.findParentAndRetry(LRUEvictor.java:2182)
              : com.sleepycat.je.evictor.LRUEvictor.processTarget(LRUEvictor.java:2019)
              : com.sleepycat.je.evictor.LRUEvictor.evictBatch(LRUEvictor.java:1689)
              : com.sleepycat.je.evictor.LRUEvictor.doEvict(LRUEvictor.java:1538)
              : com.sleepycat.je.evictor.Evictor.doCriticalEviction(Evictor.java:469)
              : com.sleepycat.je.dbi.EnvironmentImpl.criticalEviction(EnvironmentImpl.java:2726)
              : com.sleepycat.je.dbi.CursorImpl.criticalEviction(CursorImpl.java:624)
              : com.sleepycat.je.Cursor.beginMoveCursor(Cursor.java:4217)
              : com.sleepycat.je.Cursor.beginMoveCursor(Cursor.java:4237)
              : com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2795)
              : com.sleepycat.je.Cursor.searchNoDups(Cursor.java:2647)
              : com.sleepycat.je.Cursor.search(Cursor.java:2594)
              : com.sleepycat.je.Cursor.search(Cursor.java:2579)
              : com.sleepycat.je.Cursor.getSearchKey(Cursor.java:1698)
        > (JE 6.0.11) JAVA_ERROR: Java Error occurred, recovery may not be possible. 
        > com.sleepycat.je.EnvironmentFailureException
            > null
            > java.lang.AssertionError
              : com.sleepycat.je.evictor.LRUEvictor.processTarget(LRUEvictor.java:2014)
              : com.sleepycat.je.evictor.LRUEvictor.findParentAndRetry(LRUEvictor.java:2182)
              : com.sleepycat.je.evictor.LRUEvictor.processTarget(LRUEvictor.java:2019)
              : com.sleepycat.je.evictor.LRUEvictor.evictBatch(LRUEvictor.java:1689)
              : com.sleepycat.je.evictor.LRUEvictor.doEvict(LRUEvictor.java:1538)
              : com.sleepycat.je.evictor.Evictor.doCriticalEviction(Evictor.java:469)
              : com.sleepycat.je.dbi.EnvironmentImpl.criticalEviction(EnvironmentImpl.java:2726)
              : com.sleepycat.je.dbi.CursorImpl.criticalEviction(CursorImpl.java:624)
              : com.sleepycat.je.dbi.CursorImpl.close(CursorImpl.java:583)
              : com.sleepycat.je.Cursor.endMoveCursor(Cursor.java:4269)
              : com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2811)
              : com.sleepycat.je.Cursor.searchNoDups(Cursor.java:2647)
              : com.sleepycat.je.Cursor.search(Cursor.java:2594)
              : com.sleepycat.je.Cursor.search(Cursor.java:2579)
              : com.sleepycat.je.Cursor.getSearchKeyRange(Cursor.java:1757)

  • EnvironmentFailureException on opening EntityStore

    Adding a new secondary key field to an entity class made it impossible to open EntityStore:
    com.sleepycat.je.EnvironmentFailureException: (JE 4.1.6) UNEXPECTED_STATE: Unexpected internal state, may have side effects.
         at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:347)
         at com.sleepycat.compat.DbCompat.unexpectedState(DbCompat.java:507)
         at com.sleepycat.persist.impl.EnhancedAccessor.newInstance(EnhancedAccessor.java:104)
         at com.sleepycat.persist.impl.ComplexFormat.checkNewSecKeyInitializer(ComplexFormat.java:475)
         at com.sleepycat.persist.impl.ComplexFormat.initialize(ComplexFormat.java:451)
         at com.sleepycat.persist.impl.Format.initializeIfNeeded(Format.java:542)
         at com.sleepycat.persist.impl.ComplexFormat.initialize(ComplexFormat.java:334)
         at com.sleepycat.persist.impl.Format.initializeIfNeeded(Format.java:542)
         at com.sleepycat.persist.impl.ComplexFormat.initialize(ComplexFormat.java:334)
         at com.sleepycat.persist.impl.Format.initializeIfNeeded(Format.java:542)
         at com.sleepycat.persist.impl.PersistCatalog.init(PersistCatalog.java:454)
         at com.sleepycat.persist.impl.PersistCatalog.<init>(PersistCatalog.java:221)
         at com.sleepycat.persist.impl.Store.<init>(Store.java:186)
         at com.sleepycat.persist.EntityStore.<init>(EntityStore.java:185)
    Entity store opens fine if there are no changes to format at all or there is no @SecondaryKey annotation for the new field. Here is my entity class:
    @Entity
    public abstract class AbstractMessageEntity implements MessageEntity {
        @PrimaryKey
        private Long id;
        // ...
    }
    and after adding the new secondary key:
    @Entity( version = 3 )
    public abstract class AbstractMessageEntity implements MessageEntity {
        @PrimaryKey
        private Long id;
        @SecondaryKey( relate = Relationship.MANY_TO_ONE )
        private Long executionTime;
        // ...
    }
    AbstractMessageEntity has 3 persistent subclasses, but they were not changed except increasing their version from 0 to 3.
    I would greatly appreciate any workaround for this problem!

    Hi,
    According to the class information you gave, I assume you have an abstract class, AbstractMessageEntity, and three subclasses, say MessageEntity1, MessageEntity2, and MessageEntity3, and that those three subclasses are what you want to store, right?
    Before addressing your problem, I want to point out that the annotation on AbstractMessageEntity should be changed from @Entity to @Persistent, because in the DPL you cannot make an entity class a subclass of another entity class. This is just a reminder and is not related to your current error.
    Now back to your error. Actually, I cannot reproduce it. Below is the code I wrote to try to reproduce it:
    import static com.sleepycat.persist.model.Relationship.MANY_TO_ONE;
    import java.io.File;
    import com.sleepycat.je.Environment;
    import com.sleepycat.je.EnvironmentConfig;
    import com.sleepycat.persist.EntityStore;
    import com.sleepycat.persist.PrimaryIndex;
    import com.sleepycat.persist.StoreConfig;
    import com.sleepycat.persist.model.AnnotationModel;
    import com.sleepycat.persist.model.Entity;
    import com.sleepycat.persist.model.EntityModel;
    import com.sleepycat.persist.model.Persistent;
    import com.sleepycat.persist.model.PrimaryKey;
    import com.sleepycat.persist.model.SecondaryKey;
    public class AddNewSecKeyTest {
        private Environment env;
        private EntityStore store;
        private PrimaryIndex<Long, MessageEntity> primary;

        public static void main(String[] args) {
            AddNewSecKeyTest epc = new AddNewSecKeyTest();
            epc.open();
            epc.writeData();
            epc.close();
        }

        private void writeData() {
            primary.put(null, new MessageEntity(1));
        }

        private void getData() {
            AbstractMessageEntity data = primary.get(1L);
        }

        private void close() {
            store.close();
            store = null;
            env.close();
            env = null;
        }

        private void open() {
            EnvironmentConfig envConfig = new EnvironmentConfig();
            envConfig.setAllowCreate(true);
            File envHome = new File("./");
            env = new Environment(envHome, envConfig);
            EntityModel model = new AnnotationModel();
            StoreConfig config = new StoreConfig();
            config.setAllowCreate(envConfig.getAllowCreate());
            config.setTransactional(envConfig.getTransactional());
            config.setModel(model);
            store = new EntityStore(env, "test", config);
            primary = store.getPrimaryIndex(Long.class, MessageEntity.class);
        }

        @Persistent(version = 3)
        public static abstract class AbstractMessageEntity {
            AbstractMessageEntity(Long i) {
                this.id = i;
            }

            private AbstractMessageEntity() {}

            @PrimaryKey
            private Long id;

            @SecondaryKey( relate = MANY_TO_ONE )
            private Long executionTime;
        }

        @Entity(version = 3)
        public static class MessageEntity extends AbstractMessageEntity {
            private int f1;

            private MessageEntity() {}

            MessageEntity(int i) {
                super(Long.valueOf(i));
                this.f1 = i;
            }
        }
    }
    However, the above code runs successfully on my sandbox (with JE 4.1.6). I don't know how much your code differs from mine (I mean the class hierarchy), so please post your class hierarchy in terms of my code, and also the code showing how you serialize your classes.
    Thanks.
    Eric Wang
    BDB JE Team

  • [bdb bug] repeatedly opening and closing a db may cause a memory leak

    My test code is very simple:
    char *filename = "xxx.db";
    char *dbname = "xxx";
    for ( ; ; ) {
        DB *dbp;
        DB_TXN *txnp;
        db_create(&dbp, dbenvp, 0);
        dbenvp->txn_begin(dbenvp, NULL, &txnp, 0);
        ret = dbp->open(dbp, txnp, filename, dbname, DB_BTREE, DB_CREATE, 0);
        if (ret != 0) {
            printf("failed to open db:%s\n", db_strerror(ret));
            return 0;
        }
        txnp->commit(txnp, 0);
        dbp->close(dbp, DB_NOSYNC);
    }
    I ran my test program for a long time, repeatedly opening and closing the db, then used the ps command and found that the RSS was increasing slowly:
    ps -va
    PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
    1986 pts/0 S 0:00 466 588 4999 980 0.3 -bash
    2615 pts/0 R 0:01 588 2 5141 2500 0.9 ./test
    after a few minutes:
    ps -va
    PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
    1986 pts/0 S 0:00 473 588 4999 976 0.3 -bash
    2615 pts/0 R 30:02 689 2 156561 117892 46.2 ./test
    I had read bdb's source code before, so I tried to debug this for about a week and found something that looks like a bug:
    If you open a db with both a filename and a dbname, bdb will open one db handle for the master db and one db handle for the subdb. Both handles get a file id from an internal API called __dbreg_get_id; however, only the subdb's id is returned to bdb's log region by calling __dbreg_pop_id. This leads to an id leak when the db is opened and closed repeatedly; as a result, __dbreg_add_dbentry calls realloc repeatedly to enlarge the dbentry area, which seems to be the reason for the growing RSS.
    Is this not a bug?
    Sorry for my poor English :)
    Edited by: user9222236 on 2010-2-25 10:38 PM

    I have tested my program with Oracle Berkeley DB releases 4.8.26 and 4.7.25 on Red Hat 9.0 (kernel 2.4.20-8smp on an i686) and AIX version 5.
    The problem is easy to reproduce by calling the open method of the db handle with both filename and dbname specified and then calling the close method.
    My program is very simple:
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include "db.h"

    int main(int argc, char * argv[])
    {
        int ret, count;
        DB_ENV *dbenvp;
        char * filename = "test.dbf";
        char * dbname = "test";

        db_env_create(&dbenvp, 0);
        dbenvp->open(dbenvp, "/home/bdb/code/test/env",
            DB_CREATE|DB_INIT_LOCK|DB_INIT_LOG|DB_INIT_TXN|DB_INIT_MPOOL, 0);
        for (count = 0; count < 10000000; count++) {
            DB *dbp;
            DB_TXN *txnp;
            db_create(&dbp, dbenvp, 0);
            dbenvp->txn_begin(dbenvp, NULL, &txnp, 0);
            ret = dbp->open(dbp, txnp, filename, dbname, DB_BTREE, DB_CREATE, 0);
            if (ret != 0) {
                printf("failed to open db:%s\n", db_strerror(ret));
                return 0;
            }
            txnp->commit(txnp, 0);
            dbp->close(dbp, DB_NOSYNC);
        }
        dbenvp->close(dbenvp, 0);
        return 0;
    }
    DB_CONFIG is like below:
    set_cachesize 0 20000 0
    set_flags db_auto_commit
    set_flags db_txn_nosync
    set_flags db_log_inmemory
    set_lk_detect db_lock_minlocks
    Edited by: user9222236 on 2010-2-28 5:42 PM
    Edited by: user9222236 on 2010-2-28 5:45 PM

  • How feasible would it be to DIY BDB JE encryption

    Hello All,
    I'm aware that BDB JE won't be supporting encryption.
    However, if I wanted to be bold/foolish enough to implement encryption myself for my project, what would the options be? I have encryption code (http://www.jasypt.org/). I have a small BDB JE database of less than a megabyte and plenty of RAM.
    Our client wants to host an application that deals with healthcare records with a 3rd-party host and comply with HIPAA and departmental security and encryption policies. A hosting provider is handling the operating system, and I cannot, in good faith, promise the client that their hosting provider wouldn't screw up an installation of the BDB native/C database.
    I see 2 options for this:
    1. Encrypting the payload and leaving the PK indexes unencrypted. This complies with regulations, but removes the query benefits of using BDB (we wouldn't be able to index confidential fields). This also makes the people we answer to nervous. I'd rather not do it this way.
    2. Doing all database operations in memory and manually saving, in encrypted form, to disk periodically as well as on shutdown. The app would decrypt from file on startup. I'd be interested in pursuing this if it is the best option.
    So I'll ask:
    1. Is there a strategy I didn't think of that would encrypt the database more reliably? Is there an API in the DB that I didn't think of that I could easily ensure encryption?
2. Can the DB be run from memory? I assume it'd perform quite well. Would its memory usage be reasonable? (I have < 1 MB of data and 0.75 GB of RAM for a small JEE app.)
3. If the DB can be run from memory, is there any reason why that would be a terrible idea?...beyond the 2 obvious concerns of the app shutting down without writing to the disk and storing more data than I have RAM to allocate (I have a half gig of RAM, after app startup, to store 500k or so of data).
    Any strategic guidance would be greatly appreciated. I can implement the app in BDB 3.x or 4.x beta.
    Thanks,
    Steven
    PS: I realize that in the grand scheme, my strategy is flawed from the beginning....dealing with super-secret data on a 3rd party host, who I assume is barely competent, with a very low budget, but in this economy, we're happy to be employed. :) If this is just too much of a square peg in a round hole problem, I'll just use serialization and the collections API for storage and encrypt manually in the same way described above.
    An embedded, encrypted BDB JE install would be a great problem to solve as I like working with BDB JE much more than the BDB C version or a JPA + RDBMS solution, but am working with patient data for my next few projects.
    Edited by: JavaGeek_Boston on Oct 2, 2009 1:59 PM

    My confusion about "I'll just use serialization and the collections API for storage and encrypt manually in the same way" is that I thought you meant the JE collections API. You meant the built-in Java collections stuff in java.util.
    I don't think I have anything to add about the pure in-memory transitional approach you described.
No, there are no built-in APIs that would work to implement full encryption in JE. You can either encrypt each data record (the payload, as you call it), or encrypt all keys and data records individually. If you encrypt keys, then of course you lose sorting. If you decide you want to do this, I can make suggestions about how to do it with bindings.
    For DIY JE encryption, you would have to change the JE implementation. I suggest an off-line email discussion with Charles and myself if you want to explore that option.
--mark

  • How to let two programs access the same BDB data

    I want to use two programs to access data in one BDB database, but get this error
    multiple databases specified but not supported by file
    db open failed:Invalid argument
    I do not know how to deal with it.....
    anyone can help?
    thank you first

    Hello,
    Can you clarify a bit more as to what you are doing?
    A message like:
    multiple databases specified but not supported by file
    open: Invalid argument
    is expected when you are incorrectly creating multiple databases
    within a single physical file.
    The details are at:
    http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am/opensub.html
    http://www.oracle.com/technology/documentation/berkeley-db/db/api_c/db_open.html
    It is possible to contain multiple databases in a single physical file
    but the application has to follow some steps in order to do that.
For example, you cannot attempt to open a second database in a file that
was not initially created using a database name. Doing so will generate
the error you posted.
    Thanks,
    Sandra

  • Are there any plans to migrate the BDB project to Visual Studio 2005?

    The problem is when BDB is linked into an application built with VS8, deleting some BDB objects results in mixing memory managers and application crashes.
    That is, if BDB is built with VS6, the BDB DLL will use VS6's new and delete operators. If the application that uses BDB is built with VS8 its objects will be allocated using VS8's new and delete:
    DbSequence *seq = new DbSequence(db, 0);
    delete seq;
    The first line calls application's new and then the DbSequence constructor. The second line, on the other hand, calls the destructor, which in turn calls the operator delete (Microsoft's "scalar deleting destructor"). The destructor resides in the BDB DLL, so the delete call is made in the context of the DLL, trying to delete the memory block that was allocated by the executable.

    Thank you for the reply, Michael.
I have finally been able to work around all inter-DLL dependencies between VS6 and VS8 builds. You may want to mention in the README that if the C++ interface is used, the application is expected to be built with the same compiler used to build BDB, even if custom callbacks for malloc, realloc and free are used in the code.
    The BDB project builds partially in VS2005. A few subprojects fail by default and the db_dll project requires a couple of small fixes. For example, advapi32.lib is missing from the list of input libraries in db_dll project (SetSecurityDescriptorDacl and InitializeSecurityDescriptor). The resulting DLL seems to work fine.
    Also, those applications that build with 32-bit time_t will fail to link. It's easy to fix, but I wonder if this will have any effect on the format of the database.
    Andre

  • Poor performance of the BDB cache

    I'm experiencing incredibly poor performance of the BDB cache and wanted to share my experience, in case anybody has any suggestions.
    Overview
Stone Steps maintains a fork of a web log analysis tool - the Webalizer (http://www.stonesteps.ca/projects/webalizer/). One of the problems with the Webalizer is that it maintains all data (i.e. URLs, search strings, IP addresses, etc) in memory, which puts a cap on the maximum size of the data set that can be analyzed. Naturally, BDB was picked as the fastest database to maintain analyzed data on disk and produce reports by querying the database. Unfortunately, once the database grows beyond the cache size, overall performance goes down the drain.
    Note that the version of SSW available for download does not support BDB in the way described below. I can make the source available for you, however, if you find your own large log files to analyze.
    The Database
    Stone Steps Webalizer (SSW) is a command-line utility and needs to preserve all intermediate data for the month on disk. The original approach was to use a plain-text file (webalizer.current, for those who know anything about SSW). The BDB database that replaced this plain text file consists of the following databases:
    sequences (maintains record IDs for all other tables)
urls - primary database containing URL data: record ID (key), the URL itself, and grouped data such as number of hits, transfer size, etc.
urls.values - secondary database that contains a hash of the URL (key) and the record ID linking it to the primary database; this database is used for value lookups.
urls.hits - secondary database that contains the number of hits for each URL (key) and the record ID to link it to the primary database; this database is used to order URLs in the report by the number of hits.
    The remaining databases are here just to indicate the database structure. They are the same in nature as the two described above. The legend is as follows: (s) will indicate a secondary database, (p) - primary database, (sf) - filtered secondary database (using DB_DONOTINDEX).
    urls.xfer (s), urls.entry (s), urls.exit (s), urls.groups.hits (sf), urls.groups.xfer (sf)
    hosts (p), hosts.values (s), hosts.hits (s), hosts.xfer (s), hosts.groups.hits (sf), hosts.groups.xfer (sf)
    downloads (p), downloads.values (s), downloads.xfer (s)
agents (p), agents.values (s), agents.hits (s), agents.visits (s), agents.groups.visits (sf)
referrers (p), referrers.values (s), referrers.hits (s), referrers.groups.hits (sf)
    search (p), search.values (s), search.hits (s)
    users (p), users.values (s), users.hits (s), users.groups.hits (sf)
    errors (p), errors.values (s), errors.hits (s)
    dhosts (p), dhosts.values (s)
    statuscodes (HTTP status codes)
    totals.daily (31 days)
    totals.hourly (24 hours)
    totals (one record)
    countries (a couple of hundred countries)
    system (one record)
    visits.active (active visits - variable length)
    downloads.active (active downloads - variable length)
    All these databases (49 of them) are maintained in a single file. Maintaining a single database file is a requirement, so that the entire database for the month can be renamed, backed up and used to produce reports on demand.
    Database Size
    One of the sample Squid logs I received from a user contains 4.4M records and is about 800MB in size. The resulting database is 625MB in size. Note that there is no duplication of text data - only nodes and such values as hits and transfer sizes are duplicated. Each record also contains some small overhead (record version for upgrades, etc).
    Here are the sizes of the URL databases (other URL secondary databases are similar to urls.hits described below):
    urls (p):
    8192 Underlying database page size
    2031 Overflow key/data size
    1471636 Number of unique keys in the tree
    1471636 Number of data items in the tree
    193 Number of tree internal pages
    577738 Number of bytes free in tree internal pages (63% ff)
    55312 Number of tree leaf pages
    145M Number of bytes free in tree leaf pages (67% ff)
    2620 Number of tree overflow pages
    16M Number of bytes free in tree overflow pages (25% ff)
    urls.hits (s)
    8192 Underlying database page size
    2031 Overflow key/data size
    2 Number of levels in the tree
    823 Number of unique keys in the tree
    1471636 Number of data items in the tree
    31 Number of tree internal pages
    201970 Number of bytes free in tree internal pages (20% ff)
    45 Number of tree leaf pages
    243550 Number of bytes free in tree leaf pages (33% ff)
    2814 Number of tree duplicate pages
    8360024 Number of bytes free in tree duplicate pages (63% ff)
    0 Number of tree overflow pages
    The Testbed
    I'm running all these tests using the latest BDB (v4.6) built from the source on Win2K3 server (release version). The test machine is 1.7GHz P4 with 1GB of RAM and an IDE hard drive. Not the fastest machine, but it was able to handle a log file like described before at a speed of 20K records/sec.
BDB is configured in a single file in a BDB environment, using private memory, since only one process ever has access to the database.
    I ran a performance monitor while running SSW, capturing private bytes, disk read/write I/O, system cache size, etc.
    I also used a code profiler to analyze SSW and BDB performance.
    The Problem
    Small log files, such as 100MB, can be processed in no time - BDB handles them really well. However, once the entire BDB cache is filled up, the machine goes into some weird state and can sit in this state for hours and hours before completing the analysis.
    Another problem is that traversing large primary or secondary databases is a really slow and painful process. It is really not that much data!
    Overall, the 20K rec/sec quoted above drop down to 2K rec/sec. And that's all after most of the analysis has been done, just trying to save the database.
    The Tests
    SSW runs in two modes, memory mode and database mode. In memory mode, all data is kept in memory in SSW's own hash tables and then saved to BDB at the end of each run.
In memory mode, the entire BDB is dumped to disk at the end of the run. First, it runs fairly fast, until the BDB cache is filled up. Then writing (disk I/O) goes at a snail's pace, at about 3.5MB/sec, even though this disk can write at about 12-15MB/sec.
Another problem is that the OS cache gets filled up, chewing through all available memory long before completion. In order to deal with this problem, I disabled the system cache using the DB_DIRECT_DB/LOG options. I could see the OS cache left alone, but once the BDB cache was filled up, processing speed was as good as stopped.
    Then I flipped options and used DB_DSYNC_DB/LOG options to disable OS disk buffering. This improved overall performance and even though OS cache was filling up, it was being flushed as well and, eventually, SSW finished processing this log, sporting 2K rec/sec. At least it finished, though - other combinations of these options lead to never-ending tests.
    In the database mode, stale data is put into BDB after processing every N records (e.g. 300K rec). In this mode, BDB behaves similarly - until the cache is filled up, the performance is somewhat decent, but then the story repeats.
    Some of the other things I tried/observed:
    * I tried to experiment with the trickle option. In all honesty, I hoped that this would be the solution to my problems - trickle some, make sure it's on disk and then continue. Well, trickling was pretty much useless and didn't make any positive impact.
    * I disabled threading support, which gave me some performance boost during regular value lookups throughout the test run, but it didn't help either.
* I experimented with page size, ranging them from the default 8K to 64K. Using large pages helped a bit, but as soon as the BDB cache filled up, the story repeated.
    * The Db.put method, which was called 73557 times while profiling saving the database at the end, took 281 seconds. Interestingly enough, this method called ReadFile function (Win32) 20000 times, which took 258 seconds. The majority of the Db.put time was wasted on looking up records that were being updated! These lookups seem to be the true problem here.
    * I tried libHoard - it usually provides better performance, even in a single-threaded process, but libHoard didn't help much in this case.

I have been able to improve processing speed up to 6-8 times with these two techniques:
1. A separate trickle thread was created that would periodically call DbEnv::memp_trickle. This works especially well on multicore machines, but also speeds things up a bit on single-CPU boxes. This alone improved speed from 2K rec/sec to about 4K rec/sec.
2. Maintaining multiple secondary databases in real time proved to be the bottleneck. The code was changed to create secondary databases at the end of the run (calling Db::associate with the DB_CREATE flag), right before the reports are generated, which use these secondary databases. This improved speed from 4K rec/sec to 14K rec/sec.

Hello Stone,
I am facing a similar problem, and I too hope to resolve it with memp_trickle. I had these queries:
1. What was the % of clean pages that you specified?
2. At what interval were you calling the thread to invoke memp_trickle?
This would give me a rough idea about how to tune my app. Would really appreciate it if you can answer these queries.
Regards,
Nishith.

  • The chapter about php_db4 in the BDB documentation needs to be updated

    Hi all,
    I think the chapter about php_db4 in the BerkeleyDB documentation should be updated.
The first sentence "A PHP 4 extension for this release of Berkeley DB..." gave me the impression that the extension can ONLY be run with PHP 4. I've got the idea that DBXML's php extension can run with PHP 5, so BDB's extension should work with PHP 5 too. (Perhaps I'm not clever enough, but it did give me the wrong thought.)
The documentation is too brief for everyone to know how to solve the problems in the compile process (and such subjects are hard to find through search engines).
I had compiled BDB 4.5 on my Fedora Core 6. When I built php_db4, it could not finish normally (it failed with some errors). When I added CPPFLAGS=-DHAVE_CXX_STDHEADERS, the problem was solved - this should be written somewhere in the documentation or at least in the INSTALL file, right?
In the INSTALL file, it says "PHP can itself be forced to link against libpthread either by manually editing its build files (which some distributions do), or by building it with --with-experimental-zts". But at least I can't find this option in the configure of PHP 5.2.1. Only an option '--enable-maintainer-zts' can be found, and it is noted as 'for code maintainers only' (so it should not be enabled by end-users with less experience, should it?). PHP 5.2.1's TSRM is pthreads-enabled by default, so I don't know whether I should follow the note about pthreads or not.
    Anyway, I can use the native API of BerkeleyDB in my php code now. Thanks for the developers! Hope the documentation can be updated with more directives, so the new users of php_db4 can use it more smoothly:-)

    nikkap wrote:
    I haven't updated to the latest software because I didn't want my google maps to change to the new version they made with poorer directions.
    To each their own, but you can always install the Google Maps app.
    nikkap wrote:
    I also didn't want to lose battery charge quicker and didn't want certain functions to cease working like wifi which I've heard has been a problem.  It just seems that since my phone is so 'old' that updating it could cause more harm than good. 
    Nothing is further from the truth.  Installing updates does not cause hardware issues or hardware to stop working.  Anyone that tells you otherwise is completely ignorant.
I (and millions of others) have installed iOS updates with little or no adverse effects.  The vast majority of issues that arise during an iOS update can be rectified with basic troubleshooting from the User's Guide.
    There is no legitimate reason to not update the iOS which adds new features and fixes security issues in iOS.
    nikkap wrote:
    My other problem is that someone proxied the call to ATT to obtain the unlock so I don't know exactly what happened and if in fact they did agree to unlock or perhaps lied instead or did something else.  But I don't have the account information for ATT since it's a group account so i'm SOL. 
    Well, there you go.  You need to contact AT&T or get the correct information and go to AT&T's website and request the unlock.
    There are no codes, as you've already found out.
    Once the unlock has been approved, AT&T will email you to let you know it has been processed and that the next step is to restore the device.

  • Segfault in __lock_get_internal using BDB 4.7.25

    Hi,
I am having trouble finding the root cause of a segfault. The program generating the fault uses both the bdb and repmgr APIs; the segfault happens in a bdb call.
Here is a quick run-down of the problem. My test is set up with two nodes. The master node is started first, then queried by a client program. Then a client node is started. It replicates the database successfully, then is queried by the same client program. Each node is asked to perform two database gets; the first completes, the second causes the segfault, but only in the client node.
Each node is configured the same, except the client node will close and re-open the database after the synchronization is done.
I would appreciate any insight into what could be causing my problem, as I've noted the segfault occurs during a lock acquisition. The program is multi-threaded, but I enable the database to be thread-safe.
    I've included an example of the API calls made to setup each environment, a backtrace from the client corefile, and the verbose output from both nodes during the run.
    h5. Node Configuration Example
int master_port = 10001;
int client_port = 10002;
DB_ENV *env;
DB *db;
int env_flags = DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | DB_INIT_REP | DB_RECOVER | DB_THREAD;
int db_flags = DB_CREATE | DB_AUTO_COMMIT | DB_THREAD;
db_env_create(&env, 0);
env->set_lk_detect(env, DB_LOCK_DEFAULT);
if (master)
    env->repmgr_set_local_site(env, "localhost", master_port, 0);
else
    env->repmgr_set_local_site(env, "localhost", client_port, 0);
/*
 * The DB_REPMGR_PEER seems useless in this example. But the actual
 * design allows for a client to peer with another client.
 */
if (master)
    env->repmgr_add_remote_site(env, "localhost", 0, NULL, DB_REPMGR_PEER);
else
    env->repmgr_add_remote_site(env, "localhost", master_port, NULL, DB_REPMGR_PEER);
if (master)
    env->open(env, "/tmp/dbs_m", env_flags, 0);
else
    env->open(env, "/tmp/dbs_c", env_flags, 0);
db_create(&db, env, 0);
db->open(db, NULL, "DB", NULL, DB_BTREE, db_flags, 0);
env->repmgr_start(env, 3, DB_REP_ELECTION);
    h5. GDB backtrace
    GNU gdb 6.8
    Copyright (C) 2008 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "i686-pc-linux-gnu"...
    Reading symbols from /lib/libpthread.so.0...done.
    Loaded symbols for /lib/libpthread.so.0
    Reading symbols from /lib/libc.so.6...done.
    Loaded symbols for /lib/libc.so.6
    Reading symbols from /lib/ld-linux.so.2...done.
    Loaded symbols for /lib/ld-linux.so.2
    Reading symbols from /lib/libnss_files.so.2...done.
    Loaded symbols for /lib/libnss_files.so.2
    Reading symbols from /lib/libnss_dns.so.2...done.
    Loaded symbols for /lib/libnss_dns.so.2
    Reading symbols from /lib/libresolv.so.2...done.
    Loaded symbols for /lib/libresolv.so.2
    Core was generated by `./dbserver/dbserver bootstrap=localhost:24050 address=localhost:17000 -'.
    Program terminated with signal 11, Segmentation fault.
    [New process 685]
    #0 0x0814239f in __lock_get_internal (lt=0x40140868, sh_locker=0x4032d508, flags=0, obj=0x81f7108, lock_mode=DB_LOCK_READ,
    timeout=0, lock=0xbe7ff5bc) at ../dist/../lock/lock.c:586
    586               OBJECT_LOCK(lt, region, obj, lock->ndx);
    (gdb) bt full
    #0 0x0814239f in __lock_get_internal (lt=0x40140868, sh_locker=0x4032d508, flags=0, obj=0x81f7108, lock_mode=DB_LOCK_READ,
    timeout=0, lock=0xbe7ff5bc) at ../dist/../lock/lock.c:586
         newl = (struct __db_lock *) 0x0
         lp = (struct __db_lock *) 0x40142e48
         env = (ENV *) 0x40140860
         sh_obj = (DB_LOCKOBJ *) 0x0
         region = (DB_LOCKREGION *) 0x40140880
         ip = (DB_THREAD_INFO *) 0x40142e48
         ndx = 3196058196
         part_id = 1074222655
         did_abort = 1073875436
         ihold = 0
         grant_dirty = 1075064392
         no_dd = 0
         ret = 0
         t_ret = 1073875436
         holder = 1075064392
         sh_off = 0
         action = 3196056724
    #1 0x08141da2 in __lock_get (env=0x401407f0, locker=0x4032d508, flags=0, obj=0x81f7108, lock_mode=DB_LOCK_READ,
    lock=0xbe7ff5bc) at ../dist/../lock/lock.c:456
         lt = (DB_LOCKTAB *) 0x40140868
         ret = 0
    #2 0x08181674 in __db_lget (dbc=0x81f7080, action=0, pgno=1075054832, mode=DB_LOCK_READ, lkflags=0, lockp=0xbe7ff5bc)
    at ../dist/../db/db_meta.c:1035
         dbp = (DB *) 0x401407e0
         couple = {{op = DB_LOCK_DUMP, mode = DB_LOCK_NG, timeout = 3196058052, obj = 0x400546b8, lock = {off = 32, ndx = 0,
          gen = 0, mode = DB_LOCK_NG}}, {op = 136380459, mode = 3196057632, timeout = 3196057624, obj = 0xbe7ff9c4, lock = {
          off = 3196057576, ndx = 0, gen = 1073916640, mode = DB_LOCK_NG}}, {op = 1073875800, mode = 3196057972, timeout = 0,
        obj = 0x0, lock = {off = 35, ndx = 66195, gen = 3196057576, mode = 43}}}
         reqp = (DB_LOCKREQ *) 0x0
         txn = (DB_TXN *) 0x0
         env = (ENV *) 0x401407f0
         has_timeout = 0
         i = 0
         ret = -1
    #3 0x080d8f9e in __bam_get_root (dbc=0x81f7080, pg=1075054832, slevel=1, flags=1409, stack=0xbe7ff6a8)
    at ../dist/../btree/bt_search.c:94
         cp = (BTREE_CURSOR *) 0x8259248
         dbp = (DB *) 0x401407e0
         lock = {off = 1073709056, ndx = 510075, gen = 2260372568, mode = 3758112764}
         mpf = (DB_MPOOLFILE *) 0x401407f8
         h = (PAGE *) 0x0
         lock_mode = DB_LOCK_READ
         ret = 89980928
         t_ret = 134764095
    #4 0x080d9407 in __bam_search (dbc=0x81f7080, root_pgno=1075054832, key=0xbe7ffa6c, flags=1409, slevel=1, recnop=0x0,
    exactp=0xbe7ff8b0) at ../dist/../btree/bt_search.c:203
         t = (BTREE *) 0x401408f8
         cp = (BTREE_CURSOR *) 0x8259248
         dbp = (DB *) 0x401407e0
         lock = {off = 0, ndx = 0, gen = 0, mode = DB_LOCK_NG}
         mpf = (DB_MPOOLFILE *) 0x401407f8
         env = (ENV *) 0x401407f0
         h = (PAGE *) 0x0
         base = 0
         i = 0
         indx = 0
         inp = (db_indx_t *) 0x0
         lim = 0
         lock_mode = DB_LOCK_NG
         pg = 0
         recno = 0
         adjust = 0
         cmp = 0
         deloffset = 0
         ret = 0
         set_stack = 0
         stack = 0
         t_ret = 0
         func = (int (*)(DB *, const DBT *, const DBT *)) 0
    #5 0x0819b1d1 in __bamc_search (dbc=0x81f7080, root_pgno=0, key=0xbe7ffa6c, flags=26, exactp=0xbe7ff8b0)
    at ../dist/../btree/bt_cursor.c:2501
         t = (BTREE *) 0x401408f8
         cp = (BTREE_CURSOR *) 0x8259248
         dbp = (DB *) 0x401407e0
         h = (PAGE *) 0x0
         indx = 0
         inp = (db_indx_t *) 0x0
         bt_lpgno = 0
         recno = 0
         sflags = 1409
         cmp = 0
         ret = 0
         t_ret = 0
    #6 0x08196ff7 in __bamc_get (dbc=0x81f7080, key=0xbe7ffa6c, data=0xbe7ffa50, flags=26, pgnop=0xbe7ff93c)
    at ../dist/../btree/bt_cursor.c:970
         cp = (BTREE_CURSOR *) 0x8259248
         dbp = (DB *) 0x401407e0
         mpf = (DB_MPOOLFILE *) 0x401407f8
         orig_pgno = 0
         orig_indx = 0
         exact = 1075236764
         newopd = 1
         ret = 136272648
    #7 0x0816f6fc in __dbc_get (dbc_arg=0x81f7080, key=0xbe7ffa6c, data=0xbe7ffa50, flags=26) at ../dist/../db/db_cam.c:700
         dbp = (DB *) 0x401407e0
         dbc = (DBC *) 0x0
         dbc_n = (DBC *) 0x81f7080
         opd = (DBC *) 0x0
         cp = (DBC_INTERNAL *) 0x8259248
         cp_n = (DBC_INTERNAL *) 0x0
         mpf = (DB_MPOOLFILE *) 0x401407f8
         env = (ENV *) 0x401407f0
         pgno = 0
         indx_off = 0
         multi = 0
         orig_ulen = 0
         tmp_flags = 0
         tmp_read_uncommitted = 0
         tmp_rmw = 0
         type = 64 '@'
         key_small = 0
         ret = 136268720
         t_ret = -1098909244
    #8 0x0817a1ac in __db_get (dbp=0x8258bb0, ip=0x0, txn=0x0, key=0xbe7ffa6c, data=0xbe7ffa50, flags=26)
    at ../dist/../db/db_iface.c:760
         dbc = (DBC *) 0x81f7080
         mode = 0
         ret = 0
         t_ret = 1075208764
    #9 0x08179f6c in __db_get_pp (dbp=0x8258bb0, txn=0x0, key=0xbe7ffa6c, data=0xbe7ffa50, flags=0)
    at ../dist/../db/db_iface.c:684
         ip = (DB_THREAD_INFO *) 0x0
         env = (ENV *) 0x81f4bb0
         mode = 0
         handle_check = 1
         ignore_lease = 0
         ret = 0
         t_ret = 1073880126
         txn_local = 0
    #10 0x0804c7a8 in _get (database=0x81f37a8, txn=0x0, query=0x821d1a0, callName=0x81cc1b7 "GET") at ../dbserver/database.c:503
         k = {data = 0x81f67e8, size = 22, ulen = 22, dlen = 0, doff = 0, app_data = 0x0, flags = 0}
         v = {data = 0x821d2a0, size = 255, ulen = 255, dlen = 0, doff = 0, app_data = 0x0, flags = 256}
         err = 136263592
         __PRETTY_FUNCTION__ = "_get"
    #11 0x0804c8f0 in get (database=0x81f37a8, txn_id=3, query=0x821d1a0) at ../dbserver/database.c:643
         txn = (DB_TXN *) 0x416a7db4
    #12 0x08053f1d in workerThreadMain (threadArg=0x7c87b) at ../dbserver/server.c:433
         type = ISProtocol_IDENTIFYMASTER
         class = <value optimized out>
         s = {context = 0x8211930, protocol = 0x8211980, socketToClient = 3, query = 0x821d1a0, deleteClientSocket = ISFalse,
      abortActiveTxn = ISFalse}
         __PRETTY_FUNCTION__ = "workerThreadMain"
    #13 0x4001d0ba in pthread_start_thread () from /lib/libpthread.so.0
    No symbol table info available.
    #14 0x400fad6a in clone () from /lib/libc.so.6
    No symbol table info available.
    h5. Verbose Master Node Log
    REP_UNDEF: rep_start: Found old version log 14
    CLIENT: db rep_send_message: msgv = 5 logv 14 gen = 0 eid -1, type newclient, LSN [0][0] nogroup nobuf
    CLIENT: starting election thread
    CLIENT: elect thread to do: 0
    CLIENT: repmgr elect: opcode 0, finished 0, master -2
    CLIENT: elect thread to do: 1
    CLIENT: Start election nsites 1, ack 1, priority 100
    CLIENT: Tallying VOTE1[0] (2147483647, 1)
    CLIENT: Beginning an election
    CLIENT: db rep_send_message: msgv = 5 logv 14 gen = 0 eid -1, type vote1, LSN [1][8702] nogroup nobuf
    CLIENT: Tallying VOTE2[0] (2147483647, 1)
    CLIENT: Counted my vote 1
    CLIENT: Skipping phase2 wait: already got 1 votes
    CLIENT: Got enough votes to win; election done; winner is 2147483647, gen 0
    CLIENT: Election finished in 0.039845000 sec
    CLIENT: Election done; egen 2
    CLIENT: Ended election with 0, sites 0, egen 2, flags 0x200a01
    CLIENT: Election done; egen 2
    CLIENT: New master gen 2, egen 3
    MASTER: rep_start: Old log version was 14
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type newmaster, LSN [1][8702] nobuf
    MASTER: restore_prep: No prepares. Skip.
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][8702]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][8785]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][8821]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][8904] perm
    MASTER: rep_send_function returned: -30975
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][8948]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9034]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9115]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9202]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9287] flush perm
    MASTER: rep_send_function returned: -30975
    MASTER: election thread is exiting
    MASTER: accepted a new connection
    MASTER: handshake introduces unknown site localhost:10002
    MASTER: EID 0 is assigned for site localhost:10002
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 0 eid 0, type newclient, LSN [0][0] nogroup
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type newsite, LSN [0][0] nobuf
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type newmaster, LSN [1][9367] nobuf
    MASTER: NEWSITE info from site localhost:10002 was already known
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 0 eid 0, type master_req, LSN [0][0] nogroup
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type newmaster, LSN [1][9367] nobuf
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type verify_req, LSN [1][8658]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type verify, LSN [1][8658] nobuf
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type update_req, LSN [0][0]
    MASTER: Walk_dir: Getting info for dir: db
    MASTER: Walk_dir: Dir db has 10 files
    MASTER: Walk_dir: File 0 name: __db.001
    MASTER: Walk_dir: File 1 name: __db.002
    MASTER: Walk_dir: File 2 name: __db.rep.gen
    MASTER: Walk_dir: File 3 name: __db.rep.egen
    MASTER: Walk_dir: File 4 name: __db.003
    MASTER: Walk_dir: File 5 name: __db.004
    MASTER: Walk_dir: File 6 name: __db.005
    MASTER: Walk_dir: File 7 name: __db.006
    MASTER: Walk_dir: File 8 name: log.0000000001
    MASTER: Walk_dir: File 9 name: ROUTER
    MASTER: Walk_dir: File 0 (of 1) ROUTER at 0x40356018: pgsize 4096, max_pgno 1
    MASTER: Walk_dir: Getting info for in-memory named files
    MASTER: Walk_dir: Dir INMEM has 0 files
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type update, LSN [1][9367] nobuf
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type page_req, LSN [0][0]
    MASTER: page_req: file 0 page 0 to 1
    MASTER: page_req: Open 0 via mpf_open
    MASTER: sendpages: file 0 page 0 to 1
    MASTER: sendpages: 0, page lsn [0][1]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type page, LSN [1][9367] nobuf resend
    MASTER: sendpages: 0, lsn [1][9367]
    MASTER: sendpages: 1, page lsn [1][9202]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type page, LSN [1][9367] nobuf resend
    MASTER: sendpages: 1, lsn [1][9367]
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log_req, LSN [1][28]
    MASTER: [1][28]: LOG_REQ max lsn: [1][9367]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][28] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][91] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][4266] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8441] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8535] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8575] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8658] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8702] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8785] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8821] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8904] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8948] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9034] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9115] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9202] nobuf resend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9287] nobuf resend
    MASTER: db rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type all_req, LSN [1][9287]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9287] nobuf resend logend
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9367]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9469]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9548] flush perm
    MASTER: will await acknowledgement: need 1
    MASTER: rep_send_function returned: 110
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9628]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9696]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9785] flush perm
    MASTER: will await acknowledgement: need 1
    MASTER: got ack [1][9548](2) from site localhost:10002
    MASTER: got ack [1][9785](2) from site localhost:10002
    MASTER: got ack [1][9287](2) from site localhost:10002
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type start_sync, LSN [1][9785] nobuf
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9865]
    MASTER: db rep_send_message: msgv = 5 logv 14 gen = 2 eid -1, type log, LSN [1][9948] flush perm
    MASTER: will await acknowledgement: need 1
    MASTER: got ack [1][9948](2) from site localhost:10002
    EOF on connection from site localhost:10002
    h5. Verbose Client Node Log
    REP_UNDEF: EID 0 is assigned for site localhost:10001
    REP_UNDEF: rep_start: Found old version log 14
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 0 eid -1, type newclient, LSN [0][0] nogroup nobuf
    CLIENT: starting election thread
    CLIENT: elect thread to do: 0
    CLIENT: repmgr elect: opcode 0, finished 0, master -2
    CLIENT: init connection to site localhost:10001 with result 115
    CLIENT: handshake from connection to localhost:10001
    CLIENT: handshake with no known master to wake election thread
    CLIENT: reusing existing elect thread
    CLIENT: repmgr elect: opcode 3, finished 0, master -2
    CLIENT: elect thread to do: 3
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 0 eid -1, type newclient, LSN [0][0] nogroup nobuf
    CLIENT: repmgr elect: opcode 0, finished 0, master -2
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type newsite, LSN [0][0]
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 0 eid -1, type master_req, LSN [0][0] nogroup nobuf
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type newmaster, LSN [1][9367]
    CLIENT: Election done; egen 1
    CLIENT: Updating gen from 0 to 2 from master 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type newmaster, LSN [1][9367]
    CLIENT: egen: 3. rep version 5
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type verify_req, LSN [1][8658] any nobuf
    CLIENT: sending request to peer
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type verify, LSN [1][8658]
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type update_req, LSN [0][0] nobuf
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type update, LSN [1][9367]
    CLIENT: Update setup for 1 files.
    CLIENT: Update setup: First LSN [1][28].
    CLIENT: Update setup: Last LSN [1][9367]
    CLIENT: Walk_dir: Getting info for dir: db2
    CLIENT: Walk_dir: Dir db2 has 11 files
    CLIENT: Walk_dir: File 0 name: __db.001
    CLIENT: Walk_dir: File 1 name: __db.002
    CLIENT: Walk_dir: File 2 name: __db.rep.gen
    CLIENT: Walk_dir: File 3 name: __db.rep.egen
    CLIENT: Walk_dir: File 4 name: __db.003
    CLIENT: Walk_dir: File 5 name: __db.004
    CLIENT: Walk_dir: File 6 name: __db.005
    CLIENT: Walk_dir: File 7 name: __db.006
    CLIENT: Walk_dir: File 8 name: log.0000000001
    CLIENT: Walk_dir: File 9 name: ROUTER
    CLIENT: Walk_dir: File 0 (of 1) ROUTER at 0x40356018: pgsize 4096, max_pgno 1
    CLIENT: Walk_dir: File 10 name: __db.rep.db
    CLIENT: Walk_dir: Getting info for in-memory named files
    CLIENT: Walk_dir: Dir INMEM has 0 files
    CLIENT: Next file 0: pgsize 4096, maxpg 1
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type page_req, LSN [0][0] any nobuf
    CLIENT: sending request to peer
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type page, LSN [1][9367] resend
    CLIENT: PAGE: Received page 0 from file 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type page, LSN [1][9367] resend
    CLIENT: PAGE: Write page 0 into mpool
    CLIENT: rep_write_page: Calling fop_create for ROUTER
    CLIENT: PAGE_GAP: pgno 0, max_pg 1 ready 0, waiting 0 max_wait 0
    CLIENT: FILEDONE: have 1 pages. Need 2.
    CLIENT: PAGE: Received page 1 from file 0
    CLIENT: PAGE: Write page 1 into mpool
    CLIENT: PAGE_GAP: pgno 1, max_pg 1 ready 1, waiting 0 max_wait 0
    CLIENT: FILEDONE: have 2 pages. Need 2.
    CLIENT: NEXTFILE: have 1 files. RECOVER_LOG now
    CLIENT: NEXTFILE: LOG_REQ from LSN [1][28] to [1][9367]
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type log_req, LSN [1][28] any nobuf
    CLIENT: sending request to peer
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][28] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][91] resend
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][4266] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8441] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8535] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8575] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8658] resend
    CLIENT: Returning NOTPERM [1][8658], cmp = 1
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8702] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8785] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8821] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8904] resend
    CLIENT: Returning NOTPERM [1][8904], cmp = 1
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][8948] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9034] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9115] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9202] resend
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9287] resend
    CLIENT: Returning NOTPERM [1][9287], cmp = 1
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: Returning LOGREADY up to [1][9287], cmp = 0
    CLIENT: Election done; egen 3
    Recovery starting from [1][28]
    Recovery complete at Fri Jul 31 10:11:33 2009
    Maximum transaction ID 80000002 Recovery checkpoint [0][0]
    CLIENT: db2 rep_send_message: msgv = 5 logv 14 gen = 2 eid 0, type all_req, LSN [1][9287] any nobuf
    CLIENT: sending request to peer
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9287] resend logend
    CLIENT: Start-up is done [1][9287]
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9367]
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9469]
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9548] flush
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9628]
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: Returning ISPERM [1][9548], cmp = 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9696]
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9785] flush
    CLIENT: Returning NOTPERM [1][9785], cmp = 1
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: Returning ISPERM [1][9785], cmp = 0
    CLIENT: Returning ISPERM [1][9287], cmp = -1
    CLIENT: election thread is exiting
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type start_sync, LSN [1][9785]
    CLIENT: ALIVE: Completed sync [1][9785]
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9865]
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: db2 rep_process_message: msgv = 5 logv 14 gen = 2 eid 0, type log, LSN [1][9948] flush
    CLIENT: rep_apply: Set apply_th 1
    CLIENT: rep_apply: Decrement apply_th 0
    CLIENT: Returning ISPERM [1][9948], cmp = 0
    Regards,
    Chris

    I was able to track this issue down to a usage error: I was making a DB API call from within a callback, which violates the API's re-entrancy assumptions.
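    For anyone hitting the same thing: the usual workaround is to record the needed work inside the callback and perform the actual DB call afterwards, from ordinary application code. A minimal sketch of that deferral pattern (the queue and function names here are illustrative, not part of the Berkeley DB API; the real DB call is shown only as a comment):

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* Illustrative pending-work queue; in a real application this would
     * hold whatever arguments the deferred DB call needs. */
    #define MAX_PENDING 16
    static int pending[MAX_PENDING];
    static int npending = 0;

    /* Inside a DB callback, calling back into the DB API violates its
     * re-entrancy assumptions -- so just record the request here... */
    static void event_callback(int event_id)
    {
        if (npending < MAX_PENDING)
            pending[npending++] = event_id;  /* defer; no db_*() calls here */
    }

    /* ...and drain the queue later, outside the callback, where DB API
     * calls are safe. */
    static void drain_pending(void)
    {
        for (int i = 0; i < npending; i++)
            printf("safe to call DB API for event %d now\n", pending[i]);
            /* in real code: perform the deferred DB operation here */
        npending = 0;
    }

    int main(void)
    {
        event_callback(1);   /* simulate callbacks firing */
        event_callback(2);
        drain_pending();
        return 0;
    }
    ```
    
    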

  • How to read the BDB log ... and other questions

    I am using a BDB database backend in the application OpenLDAP. When the BDB database is created there, it produces, in addition to the data storage files, a log file (log.0000000001). "file log.0000000001" reports it to be a binary file. How does one read that log?
    I asked this question on the OpenLDAP forum and was advised that it can be read using tools provided by Oracle to support Berkeley DB, and was further advised to go to the Oracle Berkeley DB site. Well, I have done that and looked around for evidence of any such "tools", but have found nothing.
    I was also advised there (in the OpenLDAP forum) that keeping that log file in the same directory as the data files is not a good idea, and that it should be on a different spindle for performance reasons. I have looked at the online BDB reference manual but find no configuration option to move the log file to a different location.
    Help? Thanks.

    Hi Robert,
    The information about setting log directories can be found here:
    http://www.oracle.com/technology/documentation/berkeley-db/db/api_c/env_set_lg_dir.html
    General information about log files that you may want to read about:
    http://www.oracle.com/technology/documentation/berkeley-db/db/gsg_txn/C/index.html
    You can use db_printlog to display the log files:
    http://www.oracle.com/technology/documentation/berkeley-db/db/utility/db_printlog.html
    The above link will also point you to a place to review the output. The db_printlog utility should be installed as part of your distribution.
    Ron
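    To tie the two answers together: the log directory can be changed without recompiling by putting a `set_lg_dir` line in a DB_CONFIG file in the environment home directory, which OpenLDAP's BDB backend will pick up. A sketch (the path is a placeholder for wherever your second spindle is mounted):

    ```
    # DB_CONFIG file in the database environment home directory
    # Place log.* files on a separate disk for performance:
    set_lg_dir /logs/bdb
    ```

    With the environment in place, `db_printlog -h <environment home>` dumps the binary log files in a human-readable form.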

  • How to load a BDB file into memory?

    The entire BDB database needs to reside in memory for performance reasons, it needs to be in memory all the time, not paged in on demand. The physical memory and virtual process address space are large enough to hold this file. How can I load it into memory just before accessing the first entry? I've read the C++ API reference, and it seems that I can do the following:
    1. Create a DB environment;
    2. Call DB_ENV->set_cachesize() to set a memory pool large enough to hold the BDB file;
    3. Call DB_MPOOLFILE->open() to open the BDB file in the memory pool of that DB environment;
    4. Create a DB handle in that DB environment and open the BDB file (again) via this DB handle.
    My questions are:
    1. Is there a more elegant way than using that DB environment? If the DB environment is a must, then:
    2. Does step 3 above load the BDB file into the memory pool, or does it just reserve enough space for that file?
    Thanks in advance,
    Feng

    Hello,
    Does the documentation on "Memory-only or Flash configurations" at:
    http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/program_ram.html
    answer the question?
    From there we have:
    By default, databases are periodically flushed from the Berkeley DB memory cache to backing physical files in the filesystem. To keep databases from being written to backing physical files, pass the DB_MPOOL_NOFILE flag to the DB_MPOOLFILE->set_flags() method. This flag implies the application's databases must fit entirely in the Berkeley DB cache, of course. To avoid a database file growing to consume the entire cache, applications can limit the size of individual databases in the cache by calling the DB_MPOOLFILE->set_maxsize() method.
    Thanks,
    Sandra
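    One detail worth noting when sizing the cache to hold the whole file: DB_ENV->set_cachesize() takes the size as separate gigabyte and byte arguments (plus a cache-region count). A small sketch of that split (pure arithmetic; the helper name is made up, and the actual set_cachesize() call is shown only as a comment):

    ```c
    #include <assert.h>
    #include <stdio.h>

    #define GB (1024UL * 1024UL * 1024UL)

    /* Split a total cache size into the (gbytes, bytes) pair that
     * DB_ENV->set_cachesize() expects. */
    static void split_cachesize(unsigned long total,
                                unsigned long *gbytes, unsigned long *bytes)
    {
        *gbytes = total / GB;
        *bytes  = total % GB;
    }

    int main(void)
    {
        unsigned long g, b;

        /* e.g. a 5.5 GB database file (5 GB + 512 MB) */
        split_cachesize(5UL * GB + 512UL * 1024UL * 1024UL, &g, &b);
        printf("gbytes=%lu bytes=%lu\n", g, b);
        /* in real code: dbenv->set_cachesize(dbenv, g, b, 1); */
        assert(g == 5 && b == 536870912UL);
        return 0;
    }
    ```
    
    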
