Berkeley DB
Dear Ron,
Hi,
I have two databases created using Berkeley DB, and I need to merge them into one database. Could you please tell me the fastest way to do so?
Currently I am using the following method: open the first database, look up each of its keys in the second database, merge the data of the two keys, and rewrite the merged data into the second database. But the speed is slow. I read about cursor operations (join, union, intersection, ...), but I am not sure whether those operations give the best merging speed.
thank you for your help.
Best Regards,
Ahmad
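For what it's worth, the key-level merge logic Ahmad describes (concatenate the two values when a key exists in both databases, otherwise keep the single value) can be sketched with plain in-memory maps, leaving the BDB cursor plumbing aside. Class and method names here are illustrative only, not part of any BDB API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MergeSketch {
    /**
     * Merge src into dst: when a key exists in both maps, the destination
     * value becomes dst-value + src-value (mirroring "merge the data of
     * both keys"); otherwise the single value is kept.
     */
    static Map<String, String> merge(Map<String, String> dst, Map<String, String> src) {
        for (Map.Entry<String, String> e : src.entrySet()) {
            dst.merge(e.getKey(), e.getValue(), (oldV, newV) -> oldV + newV);
        }
        return dst;
    }

    public static void main(String[] args) {
        Map<String, String> a = new LinkedHashMap<>();
        a.put("k1", "red");
        a.put("k2", "green");
        Map<String, String> b = new LinkedHashMap<>();
        b.put("k2", "blue");
        b.put("k3", "white");
        System.out.println(merge(a, b)); // prints {k1=red, k2=greenblue, k3=white}
    }
}
```

The same single pass over the source with a lookup-then-put on the destination is what a cursor-based BDB merge would do; the speed questions are mostly about cache and page configuration, not the loop shape.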
Hi Alex,
Thank you for your quick response. I will adjust the database configuration (the page size and the cache size) and let you know the result as soon as possible.
Now I have another problem:
I ran the following code and got the error shown after the end of the code:
package dbmerging;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import com.sleepycat.db.Cursor;
import com.sleepycat.db.Database;
import com.sleepycat.db.DatabaseConfig;
import com.sleepycat.db.DatabaseEntry;
import com.sleepycat.db.DatabaseException;
import com.sleepycat.db.DatabaseType;
import com.sleepycat.db.Environment;
import com.sleepycat.db.EnvironmentConfig;
import com.sleepycat.db.LockMode;
import com.sleepycat.db.OperationStatus;

public class Merging2DB {

    public Merging2DB() {
        // Helper entry that converts between String and byte[].
        class StringEntry extends DatabaseEntry {
            StringEntry() {
            }
            StringEntry(String value) {
                setString(value);
            }
            void setString(String value) {
                byte[] data = value.getBytes();
                setData(data);
                setSize(data.length);
            }
            String getString() {
                return new String(getData(), getOffset(), getSize());
            }
        }

        String moddata = "";
        long sttime = 0;
        long entime = 0;
        long eltime = 0;

        Cursor cursor = null;
        Environment myDbE = null;
        Environment myDbE1 = null;
        Database dbin = null;
        Database dbout = null;
        File envHome = new File("/database2006/index1");
        File envHome1 = new File("/database2006/index2");
        String dbFileName = "mydb.db";

        try {
            // Open the environments. Create them if they do not already exist.
            EnvironmentConfig envConfig = new EnvironmentConfig();
            envConfig.setAllowCreate(true);
            envConfig.setCacheSize(2097152);
            envConfig.setInitializeCache(true);

            // Open the databases. Create them if they do not already exist.
            DatabaseConfig dbConfig = new DatabaseConfig();
            dbConfig.setAllowCreate(true);
            dbConfig.setPageSize(8192);
            dbConfig.setType(DatabaseType.BTREE);

            myDbE = new Environment(envHome, envConfig);
            myDbE1 = new Environment(envHome1, envConfig);
            dbin = myDbE.openDatabase(null, dbFileName, null, dbConfig);
            dbout = myDbE1.openDatabase(null, dbFileName, null, dbConfig);
        } catch (Exception e) {
            System.out.println(e);
        }

        sttime = System.currentTimeMillis();
        StringEntry foundKey = new StringEntry();
        StringEntry foundData = new StringEntry();
        try {
            // Walk every record of the second database.
            cursor = dbout.openCursor(null, null);
            while (cursor.getNext(foundKey, foundData, LockMode.DEFAULT)
                    == OperationStatus.SUCCESS) {
                String theKey = foundKey.getString();
                String theData = foundData.getString();
                try {
                    // Look the key up in the first database; concatenate on a hit.
                    StringEntry searchdata = new StringEntry();
                    if (dbin.get(null, foundKey, searchdata, LockMode.DEFAULT)
                            == OperationStatus.SUCCESS) {
                        String retData = searchdata.getString();
                        moddata = retData + theData;
                    } else {
                        moddata = theData;
                    }
                    try {
                        StringEntry updata = new StringEntry(moddata);
                        dbin.put(null, foundKey, updata);
                    } catch (DatabaseException dbe) {
                        System.out.println(dbe);
                    }
                } catch (Exception e1) {
                    System.out.println(e1);
                }
            }
        } catch (DatabaseException dbe) {
            System.out.println(dbe);
        }

        entime = System.currentTimeMillis();
        eltime = entime - sttime;
        System.out.println("\n eltime = " + eltime);

        String etime = String.valueOf(eltime);
        String wrp = "d://database2006//TimeMerging.txt";
        try {
            BufferedWriter br = new BufferedWriter(new FileWriter(wrp, true));
            br.write("time of Merging = " + etime);
            br.newLine();
            br.close();
        } catch (IOException e) {
            System.out.println(e);
        }

        try {
            if (cursor != null) cursor.close();
            if (myDbE != null) myDbE.close();
            if (myDbE1 != null) myDbE1.close();
            if (dbin != null) dbin.close();
            if (dbout != null) dbout.close();
        } catch (DatabaseException dbex) {
        }

        System.out.println("\n End of Merging.....");
    }

    public static void main(String[] args) {
        Merging2DB merging2DB1 = new Merging2DB();
    }
}
The error is :
java.lang.NullPointerException
at java.lang.String.checkBounds(String.java:286)
at java.lang.String.<init>(String.java:370)
at dbmerging.Merging2DB$1$StringEntry.getString(Merging2DB.java:42)
at dbmerging.Merging2DB.<init>(Merging2DB.java:90)
at dbmerging.Merging2DB.main(Merging2DB.java:142)
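For context on the stack trace above: String.checkBounds dereferences the byte array, so new String(data, offset, size) throws a NullPointerException when data is null. If a DatabaseEntry's getData() comes back null (for instance when the entry was never filled in, or, depending on the BDB version, when a zero-length value is returned), getString() fails exactly like this. A minimal sketch in plain Java, independent of BDB (method names here are illustrative only):

```java
public class NullDataSketch {
    // Mirrors the failing StringEntry.getString(): new String(data, offset, size)
    // throws NullPointerException when data is null.
    static String toStringUnguarded(byte[] data, int offset, int size) {
        return new String(data, offset, size);
    }

    // A guarded variant that treats missing data as the empty string.
    static String toStringGuarded(byte[] data, int offset, int size) {
        return data == null ? "" : new String(data, offset, size);
    }

    public static void main(String[] args) {
        try {
            toStringUnguarded(null, 0, 0);
        } catch (NullPointerException expected) {
            System.out.println("NPE, as in the stack trace above");
        }
        System.out.println("guarded: \"" + toStringGuarded(null, 0, 0) + "\"");
    }
}
```

Whether the empty key Ahmad mentions later produces a null data array is worth verifying against the specific BDB release in use.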
Also, when I defined a cursor to iterate over the same database above (index1),
the following errors occurred:
"Database handles still open at environment close
Open database handle: mydb.db"
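The "handles still open" message usually means handles were closed out of order; note that in the code above the environments are closed before the databases they own. The safe pattern, closing in reverse order of opening (cursor, then databases, then environments), can be sketched generically with AutoCloseable, with no BDB types involved; the strings below are placeholders for real handles:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CloseOrderSketch {
    public static void main(String[] args) throws Exception {
        // Track handles as they are opened; pop to close in LIFO order.
        Deque<AutoCloseable> open = new ArrayDeque<>();
        open.push(() -> System.out.println("close environment"));
        open.push(() -> System.out.println("close database"));
        open.push(() -> System.out.println("close cursor"));
        while (!open.isEmpty()) {
            open.pop().close(); // cursor first, environment last
        }
    }
}
```

A stack of handles makes the reverse ordering automatic instead of relying on a hand-written sequence of close calls.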
"An unexpected exception has been detected in native code outside the VM.
Unexpected Signal : EXCEPTION_ACCESS_VIOLATION (0xc0000005) occurred at PC=0x2DCF723
Function=_db_add_recovery+0x442F
Library=D:\db-4.4.20.tar\db-4.4.20\build_win32\Debug\libdb44d.dll
Current Java thread:
at com.sleepycat.db.internal.db_javaJNI.Dbc_close0(Native Method)
at com.sleepycat.db.internal.Dbc.close0(Dbc.java:43)
at com.sleepycat.db.internal.Dbc.close(Dbc.java:37)
- locked <0x10055720> (a com.sleepycat.db.internal.Dbc)
at com.sleepycat.db.Cursor.close(Cursor.java:36)
- locked <0x100532e0> (a com.sleepycat.db.Cursor)
at dbmerging.CursorExample.<init>(CursorExample.java:83)
at dbmerging.CursorExample.main(CursorExample.java:94)
Dynamic libraries:
0x00400000 - 0x00407000 C:\jbuilderx\jdk1.4\bin\javaw.exe
0x7C900000 - 0x7C9B0000 C:\WINDOWS\system32\ntdll.dll
0x7C800000 - 0x7C8F4000 C:\WINDOWS\system32\kernel32.dll
0x77DD0000 - 0x77E6B000 C:\WINDOWS\system32\ADVAPI32.dll
0x77E70000 - 0x77F01000 C:\WINDOWS\system32\RPCRT4.dll
0x77D40000 - 0x77DD0000 C:\WINDOWS\system32\USER32.dll
0x77F10000 - 0x77F56000 C:\WINDOWS\system32\GDI32.dll
0x77C10000 - 0x77C68000 C:\WINDOWS\system32\MSVCRT.dll
0x629C0000 - 0x629C9000 C:\WINDOWS\system32\LPK.DLL
0x74D90000 - 0x74DFB000 C:\WINDOWS\system32\USP10.dll
0x08000000 - 0x08136000 C:\jbuilderx\jdk1.4\jre\bin\client\jvm.dll
0x76B40000 - 0x76B6D000 C:\WINDOWS\system32\WINMM.dll
0x10000000 - 0x10007000 C:\jbuilderx\jdk1.4\jre\bin\hpi.dll
0x00940000 - 0x0094E000 C:\jbuilderx\jdk1.4\jre\bin\verify.dll
0x00950000 - 0x00968000 C:\jbuilderx\jdk1.4\jre\bin\java.dll
0x00970000 - 0x0097D000 C:\jbuilderx\jdk1.4\jre\bin\zip.dll
0x02D60000 - 0x02D7F000 D:\db-4.4.20.tar\db-4.4.20\build_win32\Debug\libdb_java44d.dll
0x02D80000 - 0x02EA1000 D:\db-4.4.20.tar\db-4.4.20\build_win32\Debug\libdb44d.dll
0x02EB0000 - 0x02F10000 C:\WINDOWS\system32\MSVCRTD.dll
0x02F10000 - 0x02F8E000 C:\WINDOWS\system32\MSVCP60D.dll
0x76C90000 - 0x76CB8000 C:\WINDOWS\system32\imagehlp.dll
0x59A60000 - 0x59B01000 C:\WINDOWS\system32\DBGHELP.dll
0x77C00000 - 0x77C08000 C:\WINDOWS\system32\VERSION.dll
0x76BF0000 - 0x76BFB000 C:\WINDOWS\system32\PSAPI.DLL
Heap at VM Abort:
Heap
def new generation total 576K, used 358K [0x10010000, 0x100b0000, 0x104f0000)
eden space 512K, 57% used [0x10010000, 0x100598c0, 0x10090000)
from space 64K, 100% used [0x100a0000, 0x100b0000, 0x100b0000)
to space 64K, 0% used [0x10090000, 0x10090000, 0x100a0000)
tenured generation total 1408K, used 42K [0x104f0000, 0x10650000, 0x14010000)
the space 1408K, 3% used [0x104f0000, 0x104fa9a8, 0x104faa00, 0x10650000)
compacting perm gen total 4096K, used 1278K [0x14010000, 0x14410000, 0x18010000)
the space 4096K, 31% used [0x14010000, 0x1414f848, 0x1414fa00, 0x14410000)
Local Time = Sat Aug 05 15:53:58 2006
Elapsed Time = 0
# The exception above was detected in native code outside the VM
# Java VM: Java HotSpot(TM) Client VM (1.4.2_01-b06 mixed mode)
# An error report file has been saved as hs_err_pid3144.log.
# Please refer to the file for further information.
Exception in thread "main"
I checked for the existence of a key stored in the database that is "" or null, and I found one key = "". I do not know whether this causes the above problem or not.
Best regards,
Ahmad.
Message was edited by:
user525205
Similar Messages
-
Can multiple threads share the same cursor in berkeley db java edition?
We use Berkeley DB to store our path-computation results. We now have two threads that need to retrieve records from the database. The first thread accesses the database from the very beginning and reads a certain number of records. The second thread then needs to access the database and read the remaining records, starting from the position where the cursor stopped in the first thread. But we cannot let these two threads share the same cursor, so we have to open the database separately in the two threads and use an individual cursor for each. This means that in the second thread we have to let the cursor skip the first batch of records before reading the rest, which wastes time. Ideally, the second thread would start reading records exactly where the first thread stopped. I have tried using a transactional cursor, hoping to let the two threads share it, but that didn't seem to work.
Can anyone give any suggestion? Thank you so much!
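Not BDB-specific, but the hand-off sgao describes can be sketched with an ordered map: thread 1 records the last key it read, and thread 2 resumes from the strictly greater tail of the key space instead of skipping records one by one. This relies on keys being ordered and handed off, not on sharing a cursor; all names here are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class ResumeScanSketch {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> records = new ConcurrentSkipListMap<>();
        for (int i = 1; i <= 6; i++) {
            records.put(i, "rec" + i);
        }

        // "Thread 1": read the first three records, remember the last key seen.
        int lastKey = 0;
        int read = 0;
        for (Integer k : records.keySet()) {
            lastKey = k;
            if (++read == 3) break;
        }

        // "Thread 2": resume from the key strictly after lastKey; no skipping.
        for (Map.Entry<Integer, String> e : records.tailMap(lastKey, false).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With a BTREE database the analogous move is positioning a fresh cursor with a range search on the handed-off key rather than iterating from the start.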
sgao

If your question is really about using the BDB Java Edition product, please post to the JE forum:
Berkeley DB Java Edition
If your question is about using the Java API of the BDB (C-based) product, then this is the correct forum.
--mark -
DbEnv memory leak on Windows 7 x64 (may be a Berkeley DB environment bug on x64)
I am a new Berkeley DB programmer; this is my first time using it.
I created a BDB for PNG images, about 40 GB; the key is an ACE_UINT64 and the value is an ACE_Message_Block.
I used LoadRunner to create 100 users that fetch images through my program.
It works correctly on Windows 7 32-bit, but it leaks memory on 64-bit.
I open the environment with DB_PRIVATE | DB_INIT_MPOOL | DB_THREAD, set the cache size to 1 GB, set the value Dbt's flags to DB_DBT_MALLOC, and call free(dbt.get_data()).
My server process's committed memory in taskmgr.exe stays at 1 GB, but the system's used memory keeps increasing until all memory is exhausted, and then my server thread stalls inside Berkeley DB.
I found 8 GB of memory in use; the system plus LoadRunner plus VS2008 account for at most 1.5 GB, and my server process stays at 1 GB, so what is using the rest?
When I shut down the server process, all the memory comes back.
When I changed my code to read image.png directly from the file system instead of Berkeley DB Storage, memory usage was correct.
So something must be wrong in how my code uses Berkeley DB, probably in the Dbt allocation. How can I free the memory on x64?
I need help: what is wrong with my DbEnv? How can I free the Dbt on 64-bit?
int IMG_Storage_Environment::Initialize( ISF_Profile_Visitor & Profile )
{
    Env = new DbEnv( 0 );
    int env_flags = DB_CREATE |     // If the environment does not exist, create it
                    DB_PRIVATE |
                    DB_INIT_MPOOL | // Initialize the cache
                    DB_THREAD;      // Free-thread the env handle
    if ( Env->set_cachesize( 1, 0, 1 ) == 0 && Env->open( NULL, env_flags, 0 ) == 0 )
        return ERR_SUCCESS;
}

int IMG_Storage_BerkeleyDB::Initialize( ACE_StrItrT Layer, ACE_StrItrT Path )
{
    this->db = new Db( IMG_Storage_Environment::Instance()->getDbEnv(), 0 );
    if ( 0 == db->open( NULL, STR_T2A( Path ), NULL, DB_UNKNOWN, DB_RDONLY, 0 ) )
    {
        ISF_DEBUG( "Open DB: %s Succeed", Path );
        return ERR_SUCCESS;
    }
}

int IMG_Storage_BerkeleyDB::GetTile( int x, int y, int z, ACE_Message_Block & Data )
{
    ACE_UINT64 uKey = this->Key( x, y, z );
    Dbt dbKey( &uKey, sizeof( uKey ) );
    Dbt dbData;
    dbData.set_flags( DB_DBT_MALLOC );
    int err = db->get( NULL, &dbKey, &dbData, 0 );
    if ( 0 == err )
    {
        Data.size( dbData.get_size() );
        Data.rd_ptr( Data.base() );
        Data.wr_ptr( dbData.get_size() );
        ACE_OS::memcpy( Data.rd_ptr(), dbData.get_data(), dbData.get_size() );
    }
    else
    {
        ISF_DEBUG( "Image Not exist, Using Empty Image", err );
    }
    free( dbData.get_data() );
    return ERR_SUCCESS;
}
Edited by: 886522 on 2011-9-21, 1:31 AM and 1:39 AM

I encountered the same problem, although I run Berkeley DB (version 6.0.20, C#) under the .NET Framework on Windows Server 2008 (x64). Any Win32 BDB application runs well, but it runs into trouble on x64 when BDB is compiled for x64, even though the DLL compiled and linked fine for Win32. The bug is that Berkeley DB takes an amount of memory comparable to the size of the databases, regardless of the cache size. My estimate is that all the memory BDB mallocs is never freed.
-
Load an existing Berkeley DB file into memory
Dear Experts,
I have created some Berkeley DB (BDB) files onto disk.
I noticed that when I issue key-value retrievals, the page faults are substantial, and the CPU utilization is low.
One sample of the time command line output is as follow:
1.36user 1.45system 0:10.83elapsed 26%CPU (0avgtext+0avgdata 723504maxresident)k
108224inputs+528outputs (581major+76329minor)pagefaults 0swaps
I suspect that the bottleneck is the high frequency of file I/O.
This may be because of page faults of the BDB file, and the pages are loaded in/out of disk fairly frequently.
I wish to explore how to reduce this page fault, and hence expedite the retrieval time.
One way I have read is to load the entire BDB file into main memory.
There are some example programs on docs.oracle.com, under the heading "Writing In-Memory Berkeley DB Applications".
However, I could not get them to work.
I enclosed below my code:
--------------- start of code snippets ---------------

/* Initialize our handles */
DB *dbp = NULL;
DB_ENV *envp = NULL;
DB_MPOOLFILE *mpf = NULL;
const char *db_name = "db.id_url"; /* A BDB file on disk, size 66,813,952 */
u_int32_t open_flags;

/* Create the environment */
db_env_create(&envp, 0);
open_flags =
    DB_CREATE     | /* Create the environment if it does not exist */
    DB_INIT_LOCK  | /* Initialize the locking subsystem */
    DB_INIT_LOG   | /* Initialize the logging subsystem */
    DB_INIT_MPOOL | /* Initialize the memory pool (in-memory cache) */
    DB_INIT_TXN   |
    DB_PRIVATE;     /* Region files are not backed by the filesystem.
                     * Instead, they are backed by heap memory. */

/* Specify the size of the in-memory cache. */
envp->set_cachesize(envp, 0, 70 * 1024 * 1024, 1); /* 70 MB, more than the BDB file size of 66,813,952 */

/*
 * Now actually open the environment. Notice that the environment home
 * directory is NULL. This is required for an in-memory only application.
 */
envp->open(envp, NULL, open_flags, 0);

/* Open the MPOOL file in the environment. */
envp->memp_fcreate(envp, &mpf, 0);
int pagesize = 4096;
if ((ret = mpf->open(mpf, "db.id_url", 0, 0, pagesize)) != 0) {
    envp->err(envp, ret, "DB_MPOOLFILE->open: ");
    goto err;
}

int cnt, hits = 66813952 / pagesize;
void *p = 0;
for (cnt = 0; cnt < hits; ++cnt) {
    db_pgno_t pageno = cnt;
    mpf->get(mpf, &pageno, NULL, 0, &p);
}
fprintf(stderr, "\n\nretrieve %5d pages\n", cnt);

/* Initialize the DB handle */
db_create(&dbp, envp, 0);

/*
 * Set the database open flags. Autocommit is used because we are
 * transactional.
 */
open_flags = DB_CREATE | DB_AUTO_COMMIT;
dbp->open(dbp,        /* Pointer to the database */
          NULL,       /* Txn pointer */
          NULL,       /* File name -- NULL for in-memory */
          db_name,    /* Logical db name */
          DB_BTREE,   /* Database type (using btree) */
          open_flags, /* Open flags */
          0);         /* File mode. Default is 0 */

DBT key, data;
int test_key = 103456;
memset(&key, 0, sizeof(key));
memset(&data, 0, sizeof(data));
key.data = (int*)&test_key;
key.size = sizeof(test_key);
dbp->get(dbp, NULL, &key, &data, 0);
printf("%d --> %s ", *((int*)key.data), (char*)data.data);

/* Close our database handle, if it was opened. */
if (dbp != NULL)
    dbp->close(dbp, 0);
if (mpf != NULL)
    (void)mpf->close(mpf, 0);

/* Close our environment, if it was opened. */
if (envp != NULL)
    envp->close(envp, 0);

/* Final status message and return. */
printf("I'm all done.\n");

--------------- end of code snippets ---------------
After compilation, the code output is:
retrieve 16312 pages
103456 --> (null) I'm all done.
However, the lookup for test_key did not return the correct value.
I have been reading and trying this for the past 3 days.
I will appreciate any help/tips.
Thank you for your kind attention.
WAN
Singapore

Hi Mike,
Thank you for your 3 steps:
-- create the database
-- load the database
-- run you retrievals
Recall that my original intention is to load in an existing BDB file (70Mbytes) completely into memory.
So following your 3 steps above, this is what I did:
Step-1 (create the database)
I have followed the oracle article on http://docs.oracle.com/cd/E17076_02/html/articles/inmemory/C/index.html
In this step, I have created the environment, set the cachesize to be bigger than the BDB file.
However, I have some problem with the code that opens the DB handle.
The code on the oracle page is as follow:
* Open the database. Note that the file name is NULL.
* This forces the database to be stored in the cache only.
* Also note that the database has a name, even though its
* file name is NULL.
ret = dbp->open(dbp, /* Pointer to the database */
NULL, /* Txn pointer */
NULL, /* File name is not specified on purpose */
db_name, /* Logical db name. */
DB_BTREE, /* Database type (using btree) */
db_flags, /* Open flags */
0); /* File mode. Using defaults */
Note that the open(..) API does not include the BDB file name.
The documentation says that this is so that the API will know that it needs an in-memory database.
However, how do I tell the API the source of the existing BDB file from which I wish to load entirely into memory ?
Do I need to create another DB handle (non-in-memory, with a file name as argument) that reads from this BDB file, and then call DB->put(.) that inserts the records into the in-memory DB ?
Step-2 (load the database)
My question in this step 2 is the same as my last question in step 1: how do I tell the API to load my existing BDB file into memory?
That is, should I create another DB handle (non-in-memory) that reads from the existing BDB file, use a cursor to read in EVERY key-value pair, and then insert into the in-memory DB?
Am I correct to say that by using the cursor to read in EVERY key-value pair, I am effectively warming the file cache, so that the BDB retrieval performance can be maximized ?
Step-3 (run your retrievals)
Are the retrieval API, e.g. c_get(..), get(..), for the in-memory DB, the same as the file-based DB ?
Thank you and always appreciative for your tips.
WAN
Singapore -
Multiple issues, Berkeley JE 3.2.15, 3.2.76
Hi all,
We have been using JE 3.2.15 with great satisfaction for more than a year, but we've been recently running into what seems to be a deadlock with the following stack trace:
#1) "Thread-12" daemon prio=4 tid=0x0b174320 nid=0x72 in Object.wait() [0x208cd000..0x208cdbb8]
at java.lang.Object.wait(Native Method)
- waiting on <0xf4202fc0> (a com.sleepycat.je.txn.Txn)
at com.sleepycat.je.txn.LockManager.lock(LockManager.java:227)
- locked <0xf4202fc0> (a com.sleepycat.je.txn.Txn)
at com.sleepycat.je.txn.Txn.lockInternal(Txn.java:295)
at com.sleepycat.je.txn.Locker.lock(Locker.java:283)
at com.sleepycat.je.dbi.CursorImpl.lockLNDeletedAllowed(CursorImpl.java:2375)
at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:2297)
at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:1983)
at com.sleepycat.je.Cursor.searchInternal(Cursor.java:1188)
at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:1158)
at com.sleepycat.je.Cursor.search(Cursor.java:1024)
at com.sleepycat.je.Database.get(Database.java:557)
The call does not seem to return, not for as long as 22 hours anyway. Or maybe it's just very slow performing a single get (these are done in loops), giving the appearance that it's stuck. It appears more or less randomly (every couple of months, maybe) at various sites. We have not been able to reproduce it on our test systems.
We recently upgraded to 3.2.76, kinda hoping that this issue had been fixed. While the new version appeared to work fine during internal testing and then for two deployments, it failed repeatedly during the third deployment. The original issue just manifested itself again (the above stack trace is from 3.2.76). We also experienced the following:
#2) Environment invalid because of previous exception: com.sleepycat.je.log.DbChecksumException: (JE 3.2.76) Read invalid log entry type: 49
at com.sleepycat.je.log.LogEntryHeader.<init>(LogEntryHeader.java:69)
at com.sleepycat.je.log.LogManager.getLogEntryFromLogSource(LogManager.java:631)
at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:597)
at com.sleepycat.je.tree.IN.fetchTarget(IN.java:958)
at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:1963)
at com.sleepycat.je.Cursor.searchInternal(Cursor.java:1188)
at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:1158)
at com.sleepycat.je.Cursor.search(Cursor.java:1024)
at com.sleepycat.je.Database.get(Database.java:557)
#3) <DaemonThread name="Checkpointer"/> caught exception: com.sleepycat.je.DatabaseException: (JE 3.2.76) fetchTarget of 0x61d/0xa2d452 parent IN=147872417 lastFullVersion=0x69c/0x27e251 parent.getDirty()=true state=0 com.sleepycat.je.log.DbChecksumException: (JE 3.2.76) Read invalid log entry type: 58
com.sleepycat.je.DatabaseException: (JE 3.2.76) fetchTarget of 0x61d/0xa2d452 parent IN=147872417 lastFullVersion=0x69c/0x27e251 parent.getDirty()=true state=0 com.sleepycat.je.log.DbChecksumException: (JE 3.2.76) Read invalid log entry type: 58
at com.sleepycat.je.tree.IN.fetchTarget(IN.java:989)
at com.sleepycat.je.cleaner.Cleaner.migrateLN(Cleaner.java:1100)
at com.sleepycat.je.cleaner.Cleaner.lazyMigrateLNs(Cleaner.java:928)
at com.sleepycat.je.tree.BIN.logInternal(BIN.java:1117)
at com.sleepycat.je.tree.IN.log(IN.java:2657)
at com.sleepycat.je.recovery.Checkpointer.logTargetAndUpdateParent(Checkpointer.java:975)
at com.sleepycat.je.recovery.Checkpointer.flushIN(Checkpointer.java:810)
at com.sleepycat.je.recovery.Checkpointer.flushDirtyNodes(Checkpointer.java:670)
at com.sleepycat.je.recovery.Checkpointer.doCheckpoint(Checkpointer.java:442)
at com.sleepycat.je.recovery.Checkpointer.onWakeup(Checkpointer.java:211)
at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:191)
at java.lang.Thread.run(Thread.java:595)
Caused by: com.sleepycat.je.log.DbChecksumException: (JE 3.2.76) Read invalid log entry type: 58
at com.sleepycat.je.log.LogEntryHeader.<init>(LogEntryHeader.java:69)
at com.sleepycat.je.log.LogManager.getLogEntryFromLogSource(LogManager.java:631)
at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:597)
at com.sleepycat.je.tree.IN.fetchTarget(IN.java:958)
... 11 more
* An out of memory condition:
#4) Exception in thread "Cleaner-1" java.lang.OutOfMemoryError: Java heap space
at com.sleepycat.je.log.LogUtils.readByteArray(LogUtils.java:204)
at com.sleepycat.je.log.entry.LNLogEntry.readEntry(LNLogEntry.java:104)
at com.sleepycat.je.log.FileReader.readEntry(FileReader.java:238)
at com.sleepycat.je.log.CleanerFileReader.processEntry(CleanerFileReader.java:140)
at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:321)
at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:411)
at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:259)
at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:161)
at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:191)
at java.lang.Thread.run(Thread.java:595)
It is possible, of course, that the actual memory issue is somewhere else in the application and Berkeley just happened to be the one getting the error. However the load is very much 24 hours-periodic and the most significant change was the Berkeley upgrade, which does not seem to be running entirely correctly...
I was wondering:
- is issue #1 indeed a deadlock? Has anyone else experienced it? Is there a fix/workaround?
- are #2 and #3 known issues? Note that these happened both while processing 3.2.15 files and, after failure and deletion, pure 3.2.76 files.
- is it possible that an invalid log entry would cause #4? Or is there a memory leak/incorrect allocation in BDB?
If that helps, the standard installation is on Sun x86 hardware running Solaris and (Sun) Java 1.5.0_11. Berkeley is run as an XA resource. Like I said we haven't seen any of these issues on our internal test systems. The customer that experienced #2, #3 and #4 has since been rolled back to 3.2.15, and #1 is rather infrequent. So that's pretty much all the information we have and are going to get, but if there's anything else you'd like to know...
Thanks in advance,
Matthieu Bentot

Hello Matthieu,
Thank you for your clear explanation.
On #1, the deadlock, I don't see any known JE bug that could cause this, in the current release or that has been fixed since 3.2.15. The thread dump you sent implies that a thread is waiting to acquire a record lock, which means that some other thread or transaction holds the record lock at that time. There are a few things that come to mind:
1) If you are setting the lock timeout to a large value, or to zero which means "wait forever", this kind of problem could occur as the result of normal record lock contention. I assume you're not doing this or you would have mentioned it, but I thought I should ask. Are you calling EnvironmentConfig or TransactionConfig.setLockTimeout, or setting the lock timeout via the je.properties file?
2) Are you performing retries when DeadlockException is thrown (as suggested in our documentation)? By retry I mean that when you catch DeadlockException, you close any open cursors, abort the transaction (if you're using transactions), and then start the operation or transaction from the beginning. One possibility is that two threads are continuously retrying, both trying to access the same record(s). If so, one possible solution is to delay for a small time interval before retrying, to give other threads a chance to finish their operation or transaction.
3) The most common reason for the problem you're seeing is that a cursor or transaction is accidentally left open. A cursor or transaction holds record locks until it is closed (or committed or aborted in the case of a transaction). If you have a cursor or transaction that has "leaked" -- been discarded without being closed -- or you have a thread that keeps a cursor or transaction open for a long period, another thread trying to access the record will be unable to lock it. Normally, DeadlockException will be thrown in one thread or the other. But if you are setting the lock timeout (1) or retrying operations (2), this could cause the problem you're seeing. Are you seeing DeadlockException?
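The retry-with-delay idea in point (2) above can be sketched generically. The details depend on your transaction setup, so the helper below uses plain Java with a hypothetical operation type rather than the JE API; in a real version the catch block would catch DeadlockException and abort the transaction before retrying:

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Run op, retrying up to maxRetries times with a small delay between attempts. */
    static <T> T withRetries(Callable<T> op, int maxRetries, long delayMillis) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();          // e.g. open cursor, do the reads, close cursor
            } catch (Exception e) {        // stand-in for catching DeadlockException
                last = e;                  // a real version would abort the txn here
                Thread.sleep(delayMillis); // give the competing thread a chance to finish
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2};              // simulate an op that fails twice, then succeeds
        String result = withRetries(() -> {
            if (failures[0]-- > 0) throw new IllegalStateException("simulated deadlock");
            return "ok";
        }, 5, 10);
        System.out.println(result);        // prints "ok"
    }
}
```

The delay between attempts is exactly the backoff Mark suggests to stop two threads from retrying against each other indefinitely.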
If you can describe more about how you handle DeadlockException, send your environment, database, cursor and transaction configuration settings, and describe how you close cursors and commit/abort transactions, hopefully we will be able to narrow this down.
Your #2 and #3 are the same problem, which is very likely a known problem that we're currently focussed on. It is interesting that you never saw this problem in 3.2.15 -- that could be a clue for us.
When you started seeing the problem, were there any changes other than moving to JE 3.2.76? Did you start using a network file system of some kind? Or use different hardware that provides more concurrency? Or update your application to perform with more concurrency?
You mentioned that this occurred in a deployment but not in your test environment. If you have the JE log files from that deployment and/or you can reproduce the problem there, we would very much like to work with you to resolve this. If so, please send me an email -- mark.hayes at the obvious .com -- and I'll ask you to send log files and/or run with a debugging JE jar file when trying to reproduce the problem.
It is possible that #4 is a side effect of the earlier problems, but it's very difficult to say for sure. I don't see any known bugs since 3.2.76 that could cause an OutOfMemoryError.
Please send us more information on #1 as I mentioned above, and send me email on the checksum problem (#2 and #3).
Thanks,
--mark -
How to use Berkeley DB in an application without installing
I'm using Berkeley DB in a python project, and I am wondering if I can make the libraries available to python without specifically installing berkeley DB.
How can you embed Berkeley DB in an application generally?
Has anyone done this with python and bsddb3?
Thank you
Sachi.

The Python BDB libraries are maintained independently of Oracle. You will probably have more success asking this question on the BDB Python mailing list at https://mailman.jcea.es/listinfo/pybsddb
Lauren Foutz -
Berkeley DB JE hangs while calling the environment sync method.
Hi,
I am developing an application in Java using the JE edition of Berkeley DB. I am facing an issue: during the shutdown operation I call the sync methods of the EntityStore and Environment classes. EntityStore.sync() completes, but control never returns from Environment.sync(). Below is the code I use to perform the sync operation:

if (myEnv != null) {
    try {
        entityStore.sync(); // Sync the EntityStore; this completes.
        log.error("Finished Syncing of Entity Store");
        myEnv.sync();       // Sync the Environment; control never returns from this call.
        log.error("Finished Syncing of Environment");
    } catch (DatabaseException dbe) {
        log.error("DataBase exception during close operation ", dbe);
    } catch (IllegalStateException ie) {
        logger.fatal("Cursor not closed: ", ie);
    }
}

During my unit testing I was changing the system date and performing some DB operations. While shutting down the application, the above code executes, but the system hangs and control never comes out of the Environment.sync() call. Can someone tell me why sync causes the system to hang?
Thanks in advance.

Hello,
You did not mention the version of BDB JE. In any case, for BDB JE questions the correct forum is:
Berkeley DB Java Edition
Thanks,
Sandra -
Berkeley DB XML as Message Store
Hi,
I need to build a 'Message Store' (as referred to in 'Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions') for our SOA. Components of the SOA are coupled by JMS / sockets and send XML messages to each other. The central message store would consume all messages and so provide a message history and allow comparison of messages for support, performance measurement, troubleshooting, etc.
One approach which occurred to me would be a lightweight Java framework on top of Berkeley DB XML, which would handle connectivity to all integration points.
All of our messages are small (< 5K) but some types are high frequency (say 2K/s).
Can anyone comment on this approach or share experiences, please?
Many thanks!
Pete

The short answer is that you need to shut down the FastCGI Perl script before you copy the files over, and then re-start it once the file copy is complete. Basically, what's happening is that the FastCGI Perl script has cached data in file-system backed shared memory. When you replace the underlying files via a file system copy "under the covers" so to speak, the in-memory and on-disk data become out of sync. Subsequent access to the repository can fail in many ways; a core dump is often going to be the result.
If you don't want to shut down the FastCGI Perl script during the copy, there are several other options that you could consider.
1) Copy the new files into a new/alternate directory location, stop/re-start the FastCGI Perl script when the copy is complete, pointing to the new location. That minimizes the "downtime" or,
2) Delete and insert documents one at a time through the API or,
3) Update documents one at a time through the API or,
4) Truncate the container and insert the documents.
Options 2, 3 & 4 will allow the FastCGI Perl script to keep on running, but will probably take more time and may result in larger database files.
Replication really won't help much if your goal is to completely replace the repository once every 24 hours. The same action would have to be applied to the master and replicated to the other repository locations. The same action of "replace everything" is still occurring.
I hope that this helps.
Regards,
Dave -
Berkeley DB I/O performance.
I am testing the I/O performance of Berkeley DB. I create two databases and insert the same number of tuples into them. Keys in the first database are 16 bytes long and in the second they are 32 bytes long. Values in both databases are NULL. The resulting file of the first database is 273.3 MB with 69445 leaf pages in it. The second is 434.2 MB and has 109891 leaf pages.
To test the I/O performance, I scan both databases using a cursor one tuple at a time (not using bulk operations). Obviously, the time to scan the second database would be expected to be about twice that of the first one. To my surprise, the results are very close. Too close for me. Here is one example.
8.74 sec
8.96 sec
Since the default pool size is rather small, I wonder why the result can be like this.
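As a quick sanity check on the "2x" expectation (an editorial aside using the sizes quoted above): the second database is not twice the volume of the first, because page headers, per-record metadata, and internal pages cost the same in both databases; only the key bytes doubled.

```cpp
#include <cassert>

// Both the on-disk size and the leaf-page count grow by about 1.6x,
// not 2x, since fixed per-page and per-record overhead dominates the
// extra 16 key bytes: 434.2 MB / 273.3 MB, and 109891 / 69445 leaf pages.
double growth_ratio(double second, double first)
{
    return second / first;
}
```

So even before OS read-ahead is considered, the raw I/O volume predicts roughly a 1.6x gap, and sequential prefetching by the kernel can shrink the wall-clock difference much further.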
Here is my code.
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cassert>
#include <ctime>
#include "db_cxx.h"
#define CNT 10000000
using namespace std;
int main()
{
    DbEnv *env = NULL;
    Db *db = NULL;
    Db *db2 = NULL;
    Dbc *cur = NULL;
    Dbt key, value;
    int i, flag;
    clock_t start;
    char buf[256];
    try {
        env = new DbEnv(0);
        assert(env != NULL);
        env->open("./data/", DB_CREATE | DB_INIT_MPOOL, 0);
        db = new Db(env, 0);
        assert(db != NULL);
        db2 = new Db(env, 0);
        assert(db2 != NULL);
        db->open(NULL, "my_db.db", NULL, DB_BTREE, DB_CREATE, 0);
        db2->open(NULL, "my_db_2.db", NULL, DB_BTREE, DB_CREATE, 0);
        for (i = 0; i < CNT; i++) {
            sprintf(buf, "%016d", i);
            key.set_data(buf);
            key.set_size(16);
            value.set_data(NULL);
            value.set_size(0);
            flag = db->put(NULL, &key, &value, 0);
            assert(flag == 0);
            sprintf(buf, "%032d", i);
            key.set_data(buf);
            key.set_size(32);
            value.set_data(NULL);
            value.set_size(0);
            flag = db2->put(NULL, &key, &value, 0);
            assert(flag == 0);
        }
        db->sync(0);
        db2->sync(0);
        db->cursor(NULL, &cur, 0);
        start = clock();
        while ((flag = cur->get(&key, &value, DB_NEXT)) == 0) {
            assert(key.get_size() == 16);
            assert(value.get_size() == 0);
        }
        assert(flag == DB_NOTFOUND);
        printf("%lf\n", (double) (clock() - start) / CLOCKS_PER_SEC);
        cur->close();
        db2->cursor(NULL, &cur, 0);
        start = clock();
        while ((flag = cur->get(&key, &value, DB_NEXT)) == 0) {
            assert(key.get_size() == 32);
            assert(value.get_size() == 0);
        }
        assert(flag == DB_NOTFOUND);
        printf("%lf\n", (double) (clock() - start) / CLOCKS_PER_SEC);
        cur->close();
        db->close(0);
        db2->close(0);
        env->close(0);
    } catch (DbException &e) {
        exit(1);
    } catch (exception &e) {
        exit(1);
    }
    return (0);
}
Edited by: mdzfirst on 2011-2-24 12:42 PM
Thanks for your answer. You are right. The OS is helping here, though unexpectedly. However, tuning the I/O efficiency of my application is one goal of my experiment. So I wonder: can I just turn off the read-ahead / prefetching? I have googled it and found some methods to flush the OS cache. Nevertheless, a cold start is not enough. Can I configure BDB or the OS (Ubuntu) so that 'direct' I/O is performed all the time?
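One avenue to explore (a sketch, not from the original thread): Berkeley DB itself offers DbEnv::set_flags(DB_DIRECT_DB, 1), which turns on O_DIRECT for database files on platforms that support it (it must be set before DbEnv::open). On Linux you can also evict a file's pages from the OS cache between runs with posix_fadvise. The helper below is standalone POSIX code, not part of the BDB API:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Evict a file's cached pages so the next scan starts cold (Linux).
// Returns 0 on success, -1 if the file cannot be opened, or the
// posix_fadvise error code otherwise.
int drop_file_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    fsync(fd);  // flush any dirty pages first; result ignored for read-only use
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return rc;
}
```

Note that O_DIRECT imposes alignment requirements and usually slows sequential workloads down, so measure both ways before committing to it.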
-
Berkeley db fatal region error detected run recovery
Hi,
I have initialized a Berkeley DB object.
Then I am using multithreading to put data into the DB.
Here is how I open the database handle.
ret = dbp->open(dbp, /* Pointer to the database */
NULL, /* Txn pointer */
db_file_name, /* File name */
db_logical_name , /* Logical db name (unneeded) */
DB_BTREE, /* Database type (using btree) */
DB_CREATE, /* Open flags */
0644); /* File mode. Using defaults */
Each thread puts data into the same handle when it needs to.
I am getting "berkeley db fatal region error detected run recovery".
What is the problem? Does it have anything to do with the way the handle is created, or with whether multiple threads can put data into the DB?
jb
Hi jb,
user8712854 wrote:
I am getting "berkeley db fatal region error detected run recovery".
This is a generic Berkeley DB panic message that means something went wrong. When do you get this error? Did you enable the verbose error messages? Are there any other warning/error messages reported? Just posting the way you open the database doesn't help much at all. I reviewed the other questions you asked on the forum, and it seems that you are not following the advice or consulting the documentation, but prefer to open new threads. I advise you to first work on configuring Berkeley DB for your needs, and read the following pages in order to understand which Berkeley DB product best suits your application and how it should be configured for a multithreaded application:
[The Berkeley DB products|http://www.oracle.com/technology/documentation/berkeley-db/db/ref/intro/products.html]
[Concurrent Data Store introduction|http://www.oracle.com/technology/documentation/berkeley-db/db/ref/cam/intro.html]
[Multithreaded applications|http://www.oracle.com/technology/documentation/berkeley-db/db/ref/program/mt.html]
[Berkeley DB handles|http://www.oracle.com/technology/documentation/berkeley-db/db/ref/program/scope.html]
On the other hand, if the code that's calling the Berkeley DB APIs is not too big and is not private, you can post it here so we can review it for you and let you know what's wrong with it.
The procedures you should follow in order to fix a "run recovery" error are described here:
[Recovery procedures|http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/recovery.html]
Thanks,
Bogdan Coman -
Berkeley DB C++ query on floating index
I'm using the Berkeley DB C++ API 6.0 on OS X. My application creates a database with the following tables:
Primary table: (int, myStruct) -> myStruct is a buffer.
Secondary index: (float, myStruct) -> The float key is a piece of information that I retrieve from the myStruct buffer with the following callback.
int meanExtractor(Db *sdbp, const Dbt *pkey, const Dbt *pdata, Dbt *skey)
{
    Dbt data = *pdata;
    feature<float> f;
    restoreDescriptor(f, data);
    void* mean = malloc( sizeof(float) );
    memcpy( mean, &f.mean, sizeof(float) );
    skey->set_data(mean);
    skey->set_size(sizeof(float));
    skey->set_flags( DB_DBT_APPMALLOC );
    return 0;
}
When I iterate over the secondary index and print the key/data pairs, the float keys are stored correctly. My problem is that I can't query this table. I would like to execute this SQL query, for example:
SELECT * FROM secondary index WHERE keys > 1.5 && keys < 3.4
My table is filled with 50000 keys between 0.001 and 49.999. The thing is, when I use this method, for example:
// I assume the Db and the table are already opened
float i = 0.05;
Dbt key = Dbt(&i, sizeof(float));
Dbt vald;
Dbc* dbc;
db->cursor( txn, &dbc, 0 );
int ret;
ret = dbc->get( &key, &vald, DB_SET_RANGE );
It retrieves this key: 0.275. It should retrieve 0.05 (because it exists), or at least 0.051. And for any other floating-point value in the Dbt key, it gives me some nonsensical values. If I use the DB_SET flag, it just doesn't find any keys. My idea was to set the cursor to the smallest key greater than or equal to my key, and then to iterate with the DB_NEXT flag until I reach the end of my range.
This must come from the search algorithm of Berkeley DB, but I saw some (useful, but not sufficient) examples that do exactly what I need with the Java API, so it proves that it is possible to do...
I'm pretty stuck on this one, so if anybody has had this problem before, thanks for helping me. I can post other parts of my code if necessary.
Hi,
Since the default byte comparison does not reflect the sort order of float numbers, have you set the bt_compare function for your secondary database? From your description, your query relies on the correct float sort order, so I think you should set a custom bt_compare function.
Also, since you are not doing an exact search but a range search, DBC->get(DB_SET) does not work for you. I think you need to use the DB_SET_RANGE flag to get the nearest (i.e., smallest greater-or-equal) item. You can check the documentation for DBC (or Dbc in C++) for more information.
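To illustrate the first point (a standalone sketch with hypothetical names; the real callback registered via Db::set_bt_compare takes Db*/Dbt* arguments, and its exact signature varies by release): the core of a float-aware comparator decodes both keys as floats and compares numerically, because lexicographic byte order does not match numeric order for IEEE floats.

```cpp
#include <cstring>

// Compare two 4-byte keys as floats. memcpy avoids alignment and
// strict-aliasing problems when reading from a raw key buffer.
int compare_float_keys(const void *a, const void *b)
{
    float fa, fb;
    std::memcpy(&fa, a, sizeof(float));
    std::memcpy(&fb, b, sizeof(float));
    if (fa < fb) return -1;
    if (fa > fb) return 1;
    return 0;
}
```

With a comparator like this installed on the secondary database, DB_SET_RANGE positions the cursor at the smallest key greater than or equal to the probe, and iterating with DB_NEXT until the key exceeds the upper bound implements the "keys > 1.5 AND keys < 3.4" range scan.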
Regards,
Winter, Oracle Berkeley DB -
Berkeley db lost data when using autocommit
I have written an app which uses Berkeley DB 4.5.20 (libdb_cxx-4.5.so). There are several threads reading and writing the same DB using autocommit. All the reading threads share one DB handle and all writing threads have their own DB handles. In the reading threads, a cursor and an explicit transaction are used to retrieve data. In the writing threads, the db->put method is used to write data. In the app, the return value of put is checked to see whether it succeeded.
After several days running, the app definitely loses some data (there is a thread that dumps the incoming data to disk for comparison), and the return value of the put method seems to always indicate success. Can someone point out the possible reason, or how to debug this from log files or something else?
The DBENV is opened with flags,
DB_INIT_TXN | DB_INIT_MPOOL |
DB_INIT_LOCK | DB_INIT_LOG |
DB_CREATE | DB_RECOVER |
DB_THREAD
The DB handle is opened with flags, DB_CREATE | DB_AUTO_COMMIT | DB_THREAD
thanks
Edited by: user13262865 on 2010-8-19 10:20 PM
Edited by: user13262865 on 2010-8-19 10:28 PM
Hi,
Thanks for your reply. Actually, I got no hint from the link :). Today I found that the db->put method throws a DB_RUNRECOVERY exception while a checkpoint thread is running, after the app has run for about ten days. The checkpoint thread's stack is as follows:
Thread 12 (Thread 26744):
#0 0x000000368accc5f2 in select () from /lib64/libc.so.6
#1 0x00002b367aa29043 in __os_sleep ()
from /usr/local/BerkeleyDB.4.5/lib/libdb_cxx-4.5.so
#2 0x00002b367aa24090 in __memp_sync_int ()
from /usr/local/BerkeleyDB.4.5/lib/libdb_cxx-4.5.so
#3 0x00002b367aa2470e in __memp_sync ()
from /usr/local/BerkeleyDB.4.5/lib/libdb_cxx-4.5.so
#4 0x00002b367aa316aa in __txn_checkpoint ()
from /usr/local/BerkeleyDB.4.5/lib/libdb_cxx-4.5.so
#5 0x00002b367aa319f1 in __txn_checkpoint_pp ()
from /usr/local/BerkeleyDB.4.5/lib/libdb_cxx-4.5.so
#6 0x00002b367a965257 in DbEnv::txn_checkpoint(unsigned int, unsigned int, unsigned int) () from /usr/local/BerkeleyDB.4.5/lib/libdb_cxx-4.5.so
#7 0x0000000000437e1f in DBEnv::checkPoint (this=0x676ae0)
at ../source/DBEnv.cpp:46
#8 0x000000000043e436 in thread_check_point (arg=0x7fffcce2b6b0)
at ../source/ChkPointer.cpp:43
#9 0x000000368b806367 in start_thread () from /lib64/libpthread.so.0
#10 0x000000368acd30ad in clone () from /lib64/libc.so.6
My Berkeley DB version is db-4.5.20.NC
Edited by: user13262865 on 2010-8-31 4:50 AM
Edited by: user13262865 on 2010-8-31 4:51 AM -
Dear all,
I am trying to set up Berkeley DB from Sleepycat Software (an open
source database implementation) as a backend database for Tuxedo with
X/Open transaction support on a HP-UX 11 System. According to the
documentation, this should work. I have successfully compiled and
started the resource manager (called DBRM) and from the logs
everything looks fine.
The trouble starts, however, when I try to start services that use
DBRM. The startup call for opening the database environment ("database
environment" is a Berkeley DB-specific term that refers to a grouping
of files that are opened together with transaction support) fails with
the error message
error: 12 (Not enough space)
Some digging in the documentation for Berkeley DB reveals the
following OS specific snippet (DBENV->open is the function call that
causes the error message above):
<quote>
An ENOMEM error is returned from DBENV->open or DBENV->remove.
Due to the constraints of the PA-RISC memory architecture, HP-UX
does not allow a process to map a file into its address space
multiple times. For this reason, each Berkeley DB environment may
be opened only once by a process on HP-UX, i.e., calls to
DBENV->open will fail if the specified Berkeley DB environment
has been opened and not subsequently closed.
</quote>
OK. So it appears that a call to DBENV->open does a mmap and that
cannot happen twice on the same file in the same process. Looking at
the source for the resource manager DBRM it appears, that there is
indeed a Berkeley DB enviroment that is opened (once), otherwise
transactions would not work. A ps -l on the machine in question looks
like this (I have snipped a couple of columns to fit into a newsreader):
UID PID PPID C PRI NI ADDR SZ TIME COMD
101 29791 1 0 155 20 1017d2c00 84 0:00 DBRM
101 29787 1 0 155 20 10155bb00 81 0:00 TMS_QM
101 29786 1 0 155 20 106d54400 81 0:00 TMS_QM
101 29790 1 0 155 20 100ed2200 84 0:00 DBRM
0 6742 775 0 154 20 1016e3f00 34 0:00 telnetd
101 29858 6743 2 178 20 100ef3900 29 0:00 ps
101 29788 1 0 155 20 100dfc500 81 0:00 TMS_QM
101 29789 1 0 155 20 1024c8c00 84 0:00 DBRM
101 29785 1 0 155 20 1010d7e00 253 0:00 BBL
101 6743 6742 0 158 20 1017d2e00 222 0:00 bash
So every DBRM is started as its own process and the service process
(which does not appear above) would be its own process as well. So how
can it happen that mmap on the same file is called twice in the same
process? What exactly does tmboot do in terms of startup code? Is it
just a couple of fork/execs, or is there more involved?
Thanks for any suggestions,
Joerg Lenneis
email: [email protected]
Peter Holditch:
Joerg,
Comments in-line.
Joerg Lenneis wrote:[snip]
I have no experience of Berkeley DB. Normally the xa_open routine provided by
your database, and called by tx_open, will connect the server process itself to
the database. What that means is database specific. I expect that in the case of
Berkeley DB, it has done the mmap for you. I guess the open parameters in your
code above are also in your OPENINFO string in the Tuxedo ubbconfig file?
It does not sound to me like you have a problem.
Fortunately, I do not any more. Your comments and looking at the
source for the xa interface have put me on the right track. What I did
not realise is that (as you point out in the paragraph above) a
Tuxedo service process that uses a resource manager gets the
following structure linked in:
const struct xa_switch_t db_xa_switch = {
"Berkeley DB", /* name[RMNAMESZ] */
TMNOMIGRATE, /* flags */
0, /* version */
__db_xa_open, /* xa_open_entry */
__db_xa_close, /* xa_close_entry */
__db_xa_start, /* xa_start_entry */
__db_xa_end, /* xa_end_entry */
__db_xa_rollback, /* xa_rollback_entry */
__db_xa_prepare, /* xa_prepare_entry */
__db_xa_commit, /* xa_commit_entry */
__db_xa_recover, /* xa_recover_entry */
__db_xa_forget, /* xa_forget_entry */
__db_xa_complete /* xa_complete_entry */
};
This is database specific, of course, so it would look different for,
say, Oracle. The entries in that structure are pointers to various
functions which are called by Tuxedo on behalf of the server process
on startup and whenever transaction management is necessary. xa_open
does indeed open the database, which means opening an environment
with an mmap somewhere in the case of Berkeley DB. In my code I then
tried to open the environment again (you are right, the OPENINFO string
is the same in ubbconfig as in my code), which led to the error message
posted in my initial message.
I had previously thought that the service process would contact the
resource manager via some IPC mechanism for opening the database.
If I am mistaken, then things look a bit dire. Provided that this is
even the correct thing to do, I could move the tx_open() after the call
to env->open, but this would still mean there are two mmaps in the
same process. I also need both calls to i) initiate the transaction
subsystem and ii) get hold of the pointer DB_ENV *env, which is the
handle for all subsequent DB access.
In the case of servers using OCI to access Oracle, there is an OCI API that
allows a connection established through xa to be associated with an OCI
connection endpoint. I suspect there is an equivalent function provided by
Berkeley DB?
There is not, but see my comments below about how to get to the
Berkeley DB environment.
[snip]
I doubt it. xa works because xa routines are called in the same thread as the
data access routines. Typically, a server thread will run like this...
xa_start(Tuxedo Transaction ID) /* this is done by the Tux. service dispatcher
before your code is executed */
manipulate_data(whatever parameters necessary) /* this is the code you wrote in
your service routine */
xa_end() /* Tuxedo calls this after your service calls tpreturn or tpforward */
The association between the Tuxedo Transaction ID and the data manipulation is
made by the database because of this calling sequence.
OK, this makes sense. Good to know this as well ...
[snip]
For somebody else trying this, here is the correct way:
==================================================
int
tpsvrinit(int argc, char *argv[])
{
    int ret;
    if (tpopen() < 0)
        userlog("error tpopen");
    userlog("startup, opening database\n");
    if ((ret = db_create(&dbp, NULL, DB_XA_CREATE)) != 0) {
        userlog("error %i db_create: %s", ret, db_strerror(ret));
        return -1;
    }
    if ((ret = dbp->open(dbp, "sometablename", NULL, DB_BTREE, DB_CREATE, 0644)) != 0) {
        userlog("error %i db->open", ret);
        return -1;
    }
    return (0);
}
==================================================
What happens is that the call to the xa_open() function implicitly
opens the Berkeley DB environment for the database in question, which is
given in the OPENINFO string in the configuration file. It is an error
to specify the environment in the call to db_create() in such a
context. Calls that change the database do not need an environment
specified, and the calls to begin/commit/abort transactions that are
normally used by Berkeley DB, and which use the environment, are
superseded by tpopen(), tpclose() and friends. It would be an error to
use those calls anyway.
Thank you very much Peter for your comments which have helped a lot.
Joerg Lenneis
email: [email protected] -
Berkeley DB Sessions at Oracle OpenWorld Sept 19 - 23
All,
Just posting some of the Berkeley DB related sessions at Oracle OpenWorld this year. Hope to see you there.
Session ID: S317033
Title: Oracle Berkeley DB: Enabling Your Mobile Data Strategy
Abstract: Mobile data is everywhere. Deploying applications and updates, as well as collecting data from the field and synchronizing it with the Oracle Database server infrastructure, is everyone's concern today in IT. Mobile devices, by their very nature, are easily damaged, lost, or stolen. Therefore, enabling secure, rapid mobile deployment and synchronization is critically important. By combining Oracle Berkeley DB 11g and Oracle Database Lite Mobile Server, you can easily link your mobile devices, users, applications, and data with the corporate infrastructure in a safe and reliable manner. This session will discuss several real-world use cases.
Speaker(s):
Eric Jensen, Oracle, Principal Product Manager
Greg Rekounas, Rekounas.org,
Event: JavaOne and Oracle Develop
Stream(s): ORACLE DEVELOP, DEVELOP
Track(s): Database Development
Tags: Add Berkeley DB
Session Type: Conference Session
Session Category: Case Study
Duration: 60 min.
Schedule: Wednesday, September 22, 11:30 | Hotel Nikko, Golden Gate
Session ID: S318539
Title: Effortlessly Enhance Your Mobile Applications with Oracle Berkeley DB and SQLite
Abstract: In this session, you'll learn the new SQL capabilities of Oracle Berkeley DB 11g. You'll discover how Oracle Berkeley DB is a drop-in replacement for SQLite; applications get improved performance and concurrency without sacrificing simplicity and ease of use. This hands-on lab explores seamless data synchronization for mobile applications using the Oracle Mobile Sync Server to synchronize data with the Oracle Database. Oracle Berkeley DB is an OSS embedded database that has the features, options, reliability, and flexibility that are ideal for developing lightweight commercial mobile applications. Oracle Berkeley DB supports a wide range of mobile platforms, including Android.
Speaker(s):
Dave Segleau, Oracle, Product Manager
Ashok Joshi, Oracle, Senior Director, Development
Ron Cohen, Oracle, Member of Technical Staff
Eric Jensen, Oracle, Principal Product Manager
Event: JavaOne and Oracle Develop
Stream(s): ORACLE DEVELOP, DEVELOP
Track(s): Database Development
Tags: Add 11g, Berkeley DB, Embedded Development, Embedded Technology
Session Type: Hands-on Lab
Session Category: Features
Duration: 60 min.
Schedule: Wednesday, September 22, 16:45 | Hilton San Francisco, Imperial Ballroom A
Session ID: S317032
Title: Oracle Berkeley DB: Adding Scalability, Concurrency, and Reliability to SQLite
Abstract: Oracle Berkeley DB and SQLite: two industry-leading libraries in a single package. This session will look at use cases where the Oracle Berkeley DB library's advantages bring strong enhancements to common SQLite scenarios. You'll learn how Oracle Berkeley DB's scalability, concurrency, and reliability significantly benefit SQLite applications. The session will focus on Web services, multithreaded applications, and metadata management. It will also explore how to leverage the powerful features in SQLite to maximize the functionality of your application while reducing development costs.
Speaker(s):
Jack Kreindler, Genie DB,
Scott Post, Thomson Reuters, Architect
Dave Segleau, Oracle, Product Manager
Event: JavaOne and Oracle Develop
Stream(s): ORACLE DEVELOP, DEVELOP
Track(s): Database Development
Tags: Add Berkeley DB
Session Type: Conference Session
Session Category: Features
Duration: 60 min.
Schedule: Monday, September 20, 11:30 | Hotel Nikko, Nikko Ballroom I
Session ID: S317038
Title: Oracle Berkeley DB Java Edition: High Availability for Your Java Data
Abstract: Oracle Berkeley DB Java Edition is the most scalable, highest performance Java application data store available today. This session will focus on the latest features, including triggers and sync with Oracle Database as well as new performance and scalability enhancements for high availability, with an emphasis on real-world use cases. We'll discuss deployment, configuration, and maximized throughput scenarios. You'll learn how you can use Oracle Berkeley DB Java Edition High Availability to increase the reliability and performance of your Java application data storage.
Speaker(s):
Steve Shoaff, UnboundID Corp, CEO
Alex Feinberg, Linkedin,
Ashok Joshi, Oracle, Senior Director, Development
Event: JavaOne and Oracle Develop
Stream(s): ORACLE DEVELOP, DEVELOP
Track(s): Database Development
Tags: Add Berkeley DB
Session Type: Conference Session
Session Category: Features
Duration: 60 min.
Schedule: Thursday, September 23, 12:30 | Hotel Nikko, Mendocino I / II
Session ID: S314396
Title: Java SE for Embedded Meets Oracle Berkeley DB at the Edge
Abstract: This session covers a special case of edge-to-enterprise computing, where the edge consists of embedded devices running Java SE for Embedded in combination with Oracle Berkeley DB Java Edition, a widely used embedded database. The approach fits a larger emerging trend in which edge embedded devices are "smart"--that is, they come equipped with an embedded (in-process) database for structured persistent storage of data as needed. In addition, these devices may optionally come with a thin middleware layer that can perform certain basic data processing operations locally. The session highlights the synergies between both technologies and how they can be utilized. Topics covered include implementation and performance optimization.
Speaker(s): Carlos Lucasius, Oracle , Java Embedded Engineering
Carlos Lucasius works in the Java Embedded and Real-Time Engineering product team at Oracle Corporation, where he is involved in development, testing, and technical support. Prior to joining Sun (now Oracle), he worked as a consultant to IT departments at various companies in both North America and Europe; specific application domains he was involved in include artificial intelligence, pattern recognition, advanced data processing, simulation, and optimization as applied to complex systems and processes such as intelligent instruments and industrial manufacturing. Carlos has presented frequently at scientific conferences, universities/colleges, and corporations across North America and Europe. He has also published a number of papers in refereed international journals covering applied scientific research in the abovementioned areas.
Event: JavaOne and Oracle Develop
Stream(s): JAVAONE
Track(s): Java for Devices, Card, and TV
Session Type: Conference Session
Session Category: Case Study
Duration: 60 min.
Schedule: Tuesday, September 21, 13:00 | Hilton San Francisco, Golden Gate 1
Session ID: S313952
Title: Developing Applications with Oracle Berkeley DB for Java and Java ME Smartphones
Abstract: Oracle Berkeley DB is a high-performance, embeddable database engine for developers of mission-critical systems. It runs directly in the application that uses it, so no separate server is required and no human administration is needed, and it provides developers with fast, reliable, local persistence with zero administration. The Java ME platform provides a new, rich user experience for cell phones comparable to the graphical user interfaces found on the iPhone, Google Android, and other next-generation cell phones. This session demonstrates how to use Oracle Berkeley DB and the Java ME platform to deliver rich database applications for today's cell phones.
Speaker(s): Hinkmond Wong, Oracle, Principal Member of Technical Staff
Hinkmond Wong is a principal engineer with the Java Micro Edition (Java ME) group at Oracle. He was the specification lead for the Java Community Process (JCP) Java Specification Requests (JSRs) 36, 46, 218, and 219, Java ME Connected Device Configuration (CDC) and Foundation Profile. He holds a B.S.E degree in Electrical Engineering from the University of Michigan (Ann Arbor) and an M.S.E degree in Computer Engineering from Santa Clara University. Hinkmond's interests include performance tuning in Java ME and porting the Java ME platform to many types of embedded devices. His recent projects include investigating ports of Java ME to mobile devices, such as Linux/ARM-based smartphones and is the tech lead of CDC and Foundation Profile libraries. He is the author of the book titled "Developing Jini Applications Using J2ME Technology".
Event: JavaOne and Oracle Develop
Stream(s): JAVAONE
Track(s): Java ME and Mobile, JavaFX and Rich User Experience
Tags: Add Application Development, Java ME, Java Mobile, JavaFX Mobile, Mobile Applications
Session Type: Conference Session
Session Category: Tips and Tricks
Duration: 60 min.
Schedule: Monday, September 20, 11:30 | Hilton San Francisco, Golden Gate 3
I think I have them all. If I have missed any, please reply and I can update the list, or just post the info in the reply.
Thanks,
Greg Rekounas
Are there any links to access these seminars?
-
Berkeley DB Oracle Open World Sessions
Oracle Open World is going to be on Sep. 21 through 25 at the Moscone Center in San Francisco. You can register here.
If you're already registered, you can pre-enroll for any of these Berkeley DB sessions:
S298846: Hands-on Lab: Lightning-Fast Java Object Persistence Using Oracle Berkeley DB Java Edition (Sun 9/21, 10:30)
S299649: Panel: Choosing the Right Embedded Database for Your Application (Microsoft/TriCipher/Actuate, Mon 9/22, 13:00)
S299631: Oracle Berkeley DB XML in MapGuide Open Source and Autodesk MapGuide Enterprise (Autodesk, Mon 9/22, 14:30)
S299709: A Fully Customizable Point-of-Sale System at FEC, Using Oracle Berkeley DB (Firich Enterprises, Mon 9/22, 17:30)
S299654: Carrier-Grade Applications Using Oracle Berkeley DB (Adaptive Mobile, Tue 9/23, 17:30)
S298845: Accelerating Application Performance with Oracle Berkeley DB (Riverbed Technology, Wed 9/24, 17:00)
[http://orana.info/]
[http://www.rittmanmead.com/blog]
[http://www.livestream.com/openworldlive]
twitter @oracleopenworld, @oracletechnet, #oow09 -
I am using Berkeley DB XML (v. 2.5.16 and the bundled underlying Berkeley DB 4.8.26, which I suppose is now fairly old) to manage an XML database which is read by a large number (order 100) of independent worker processes communicating via MPI. These processes only read from the database; a single master process performs writes.
Everything works as expected with one or two worker processes. But with three or more, I am experiencing database panics with the error
pthread lock failed: Invalid argument
PANIC: Invalid argument
From searching with Google I can see that issues arising from incorrectly setting up the environment to support concurrency are fairly common. But I have not been able to find a match for this problem, and as far as I can make out from the documentation I am using the correct combination of flags; I use DB_REGISTER and DB_RECOVER to handle the fact that multiple processes join the environment independently. Each process uses a single environment handle, and joins using
DB_ENV* env;
db_env_create(&env, 0);
u_int32_t env_flags = DB_INIT_LOG | DB_INIT_MPOOL | DB_REGISTER | DB_RECOVER | DB_INIT_TXN | DB_CREATE;
env->open(env, path to environment, env_flags, 0);
Although the environment requests DB_INIT_TXN, I am not currently using transactions. There is an intention to implement this later, but my understanding was that concurrent reads would function correctly without the full transaction infrastructure.
All workers seem to join the environment correctly, but then fail when an attempt is made to read from the database. They will all try to access the same XML document in the same container (because it gives them instructions about what work to perform). However, the worker processes open each container setting the read-only flag:
DbXml::XmlContainerConfig models_config;
models_config.setReadOnly(true);
DbXml::XmlContainer models = this->mgr->openContainer(path to container, models_config);
Following the database panic, the stack trace is
[lcd-ds283:27730] [ 0] 2 libsystem_platform.dylib 0x00007fff8eed35aa _sigtramp + 26
[lcd-ds283:27730] [ 1] 3 ??? 0x0000000000000000 0x0 + 0
[lcd-ds283:27730] [ 2] 4 libsystem_c.dylib 0x00007fff87890bba abort + 125
[lcd-ds283:27730] [ 3] 5 libc++abi.dylib 0x00007fff83aff141 __cxa_bad_cast + 0
[lcd-ds283:27730] [ 4] 6 libc++abi.dylib 0x00007fff83b24aa4 _ZL25default_terminate_handlerv + 240
[lcd-ds283:27730] [ 5] 7 libobjc.A.dylib 0x00007fff89ac0322 _ZL15_objc_terminatev + 124
[lcd-ds283:27730] [ 6] 8 libc++abi.dylib 0x00007fff83b223e1 _ZSt11__terminatePFvvE + 8
[lcd-ds283:27730] [ 7] 9 libc++abi.dylib 0x00007fff83b21e6b _ZN10__cxxabiv1L22exception_cleanup_funcE19_Unwind_Reason_CodeP17_Unwind_Exception + 0
[lcd-ds283:27730] [ 8] 10 libdbxml-2.5.dylib 0x000000010f30e4de _ZN5DbXml18DictionaryDatabaseC2EP8__db_envPNS_11TransactionERKNSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEERKNS_15ContainerConfigEb + 1038
[lcd-ds283:27730] [ 9] 11 libdbxml-2.5.dylib 0x000000010f2f348c _ZN5DbXml9Container12openInternalEPNS_11TransactionERKNS_15ContainerConfigEb + 1068
[lcd-ds283:27730] [10] 12 libdbxml-2.5.dylib 0x000000010f2f2dec _ZN5DbXml9ContainerC2ERNS_7ManagerERKNSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEEPNS_11TransactionERKNS_15ContainerConfigEb + 492
[lcd-ds283:27730] [11] 13 libdbxml-2.5.dylib 0x000000010f32a0af _ZN5DbXml7Manager14ContainerStore13findContainerERS0_RKNSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEEPNS_11TransactionERKNS_15ContainerConfigEb + 175
[lcd-ds283:27730] [12] 14 libdbxml-2.5.dylib 0x000000010f329f75 _ZN5DbXml7Manager13openContainerERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEPNS_11TransactionERKNS_15ContainerConfigEb + 101
[lcd-ds283:27730] [13] 15 libdbxml-2.5.dylib 0x000000010f34cd46 _ZN5DbXml10XmlManager13openContainerERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEERKNS_18XmlContainerConfigE + 102
Can I ask if it's clear to anyone what I am doing wrong? Is it possible that the root of the problem lies in the MPI code or its usage? If the writer process crashes while holding an active transaction or open database handles, it could leave the environment in an inconsistent state, which would cause the readers to throw a PANIC error when they notice it.
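For what it's worth, my understanding (an assumption on my part, not something I have confirmed for this exact failure) is that if a crashed writer has left the environment inconsistent, recovery has to be run from a single process while no other process has the environment open, for example with the standard db_recover utility:

```shell
# Run normal recovery on the environment directory "test" used by the
# code below; -v prints progress. No other process may have the
# environment open while this runs.
db_recover -h test -v
```

Alternatively, the first process to reopen the environment with the DB_RECOVER flag (as the writer already does) should perform the same recovery.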
Thanks for looking into this.
It looks like there was a small typo in the code I quoted, and I think that is what caused the segmentation fault or memory corruption. Although I checked a few times that the snippet produced the expected results before posting it, I must have been unlucky: it simply happened not to segfault on those attempts.
This is a corrected version:
#include <iostream>
#include <vector>
#include "dbxml/db.h"
#include "dbxml/dbxml/DbXml.hpp"
#include "boost/mpi.hpp"
static std::string envname = std::string("test");
static std::string pkgname = std::string("packages.dbxml");
static std::string intname = std::string("integrations.dbxml");
int main(int argc, char *argv[])
  {
    boost::mpi::environment mpi_env;
    boost::mpi::communicator mpi_world;

    if(mpi_world.rank() == 0)
      {
        std::cerr << "-- Writer creating environment" << std::endl;
        DB_ENV *env;
        int dberr = ::db_env_create(&env, 0);
        std::cerr << "** creation response = " << dberr << std::endl;
        if(dberr > 0) std::cerr << "** " << ::db_strerror(dberr) << std::endl;

        std::cerr << "-- Writer opening environment" << std::endl;
        u_int32_t env_flags = DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_REGISTER | DB_RECOVER | DB_INIT_TXN | DB_CREATE;
        dberr = env->open(env, envname.c_str(), env_flags, 0);
        std::cerr << "** opening response = " << dberr << std::endl;
        if(dberr > 0) std::cerr << "** " << ::db_strerror(dberr) << std::endl;

        // set up XmlManager object
        DbXml::XmlManager *mgr = new DbXml::XmlManager(env, DbXml::DBXML_ADOPT_DBENV | DbXml::DBXML_ALLOW_EXTERNAL_ACCESS);

        // create containers - these will be used by the workers
        DbXml::XmlContainerConfig pkg_config;
        DbXml::XmlContainerConfig int_config;
        pkg_config.setTransactional(true);
        int_config.setTransactional(true);

        std::cerr << "-- Writer creating containers" << std::endl;
        DbXml::XmlContainer packages = mgr->createContainer(pkgname.c_str(), pkg_config);
        DbXml::XmlContainer integrations = mgr->createContainer(intname.c_str(), int_config);

        std::cerr << "-- Writer instructing workers" << std::endl;
        std::vector<boost::mpi::request> reqs(mpi_world.size() - 1);
        for(unsigned int i = 1; i < mpi_world.size(); i++)
          {
            reqs[i - 1] = mpi_world.isend(i, 0); // instruct workers to open the environment
          }

        // wait for all messages to be received
        boost::mpi::wait_all(reqs.begin(), reqs.end());

        std::cerr << "-- Writer waiting for termination responses" << std::endl;

        // wait for workers to advise successful termination
        unsigned int outstanding_workers = mpi_world.size() - 1;
        while(outstanding_workers > 0)
          {
            boost::mpi::status stat = mpi_world.probe();
            switch(stat.tag())
              {
                case 1:
                  {
                    mpi_world.recv(stat.source(), 1);
                    outstanding_workers--;
                    break;
                  }
              }
          }

        delete mgr; // exit, closing database and environment
      }
    else
      {
        mpi_world.recv(0, 0);
        std::cerr << "++ Reader " << mpi_world.rank() << " beginning work" << std::endl;

        DB_ENV *env;
        ::db_env_create(&env, 0);
        u_int32_t env_flags = DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_REGISTER | DB_RECOVER | DB_INIT_TXN | DB_CREATE;
        env->open(env, envname.c_str(), env_flags, 0);

        // set up XmlManager object
        DbXml::XmlManager *mgr = new DbXml::XmlManager(env, DbXml::DBXML_ADOPT_DBENV | DbXml::DBXML_ALLOW_EXTERNAL_ACCESS);

        // open containers which were set up by the master
        DbXml::XmlContainerConfig pkg_config;
        DbXml::XmlContainerConfig int_config;
        pkg_config.setTransactional(true);
        pkg_config.setReadOnly(true);
        int_config.setTransactional(true);
        int_config.setReadOnly(true);

        DbXml::XmlContainer packages = mgr->openContainer(pkgname.c_str(), pkg_config);
        DbXml::XmlContainer integrations = mgr->openContainer(intname.c_str(), int_config);

        mpi_world.isend(0, 1);

        delete mgr; // exit, closing database and environment
      }

    return (EXIT_SUCCESS);
  }
This repeatably causes the crash on OS X Mavericks 10.9.1. Also, I have checked that it repeatably causes the crash on a virtualized OS X Mountain Lion 10.8.5. But I do not see any crashes on a virtualized Ubuntu 13.10. My full code likewise works as expected with a large number of readers under the virtualized Ubuntu. I am compiling with clang and libc++ on OS X, and gcc 4.8.1 and libstdc++ on Ubuntu, but using openmpi in both cases. Edit: I have also compiled with clang and libc++ on Ubuntu, and it works equally well.
Because the virtualized OS X experiences the crash, I hope the fact that it works on Ubuntu is not just an artefact of virtualization. (Unfortunately I don't currently have a physical Linux machine with which to check.) In that case the implication would seem to be that it's an OS X-specific problem. 2nd edit (14 Feb 2014): I have now managed to test on a physical Linux cluster, and it appears to work as expected. Therefore it does appear to be an OS X-specific issue.
In either OS X 10.8 or 10.9, the crash produces this result:
-- Writer creating environment
** creation response = 0
-- Writer opening environment
** opening response = 0
-- Writer creating containers
++ Reader 7 beginning work
-- Writer instructing workers
-- Writer waiting for termination responses
++ Reader 1 beginning work
++ Reader 2 beginning work
++ Reader 3 beginning work
++ Reader 4 beginning work
++ Reader 5 beginning work
++ Reader 6 beginning work
pthread lock failed: Invalid argument
PANIC: Invalid argument
PANIC: fatal region error detected; run recovery
PANIC: fatal region error detected; run recovery
PANIC: fatal region error detected; run recovery
PANIC: fatal region error detected; run recovery
PANIC: fatal region error detected; run recovery
PANIC: fatal region error detected; run recovery
PANIC: fatal region error detected; run recovery
libc++abi.dylib: terminate called throwing an exception
[mountainlion-test-rig:00319] *** Process received signal ***
[mountainlion-test-rig:00319] Signal: Abort trap: 6 (6)
[mountainlion-test-rig:00319] Signal code: (0)
David