Concurrency questions, Secondary Databases, etc.
Hi,
I have the following requirements:
- Multiple threads; every thread opens one or more databases
- Databases have RefCounting if they are used in more than one thread
- Every database has a SecondaryDatabase associated
- All threads are performing only put operations on the databases
- Keys and SecondaryKeys are unique (no duplicates)
I tested with normal Databases and SecondaryDatabases:
- no Transactions used
- deferredWrite is set to true
Everything worked and the performance is within our expectations.
My questions now:
- Does this setup work as long as all threads are only writing (put)?
(I read in another post that SecondaryDatabases work only with transactions...
I think that's true only if you read and write?)
- Is there anything I should take care of? I already checked my SecondaryKeyCreator
for concurrency issues...
- Does it help (in this setup) to disable the CheckpointerThread? The Databases are
synced and closed after all writes are finished. We don't need recovery...
- Are there any penalties if I increase the LogFileSize? We are writing around 80 to 150 GB
of data, and with the default size (10 MB) we get a lot of files...
- Caching is a non-issue as long as we are only writing... is this correct?
Sorry for the amount of questions & thanks in advance for any answers!
Greets,
Chris
Hi Chris,
- Does this setup work as long as all threads are only writing (put)? (I read in another post that SecondaryDatabases work only with transactions... I think that's true only if you read and write?) >
When using secondaries, if you don't configure transactions and there is an exception during the write operation, corruption can result. If you are reading and writing, lock conflict exceptions are likely to occur -- with transactions you can simply retry, but without transactions you can't. Since you are not reading, it is unlikely that this type of exception will occur. See below for more.
- Is there anything I should take care of? I already checked my SecondaryKeyCreator for concurrency issues... >
Since your secondary keys are unique, you'll get an exception if you attempt to write a primary record containing a secondary key that already exists. To avoid corruption, you'll have to prevent this from happening. If you are assigning secondary keys from a sequence, or something similar, then you'll be fine. Another way is to check the keys for existence in the secondary before the write. To do this, open the secondary as a regular database (openDatabase not openSecondaryDatabase). You don't want to read the primary (that could cause lock conflicts), which is what happens when you use openSecondaryDatabase and read via the secondary.
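As a rough sketch of that existence check (the names here are hypothetical, and it assumes the secondary was created without sorted duplicates, which matches your unique-key setup):

import com.sleepycat.je.*;

class SecondaryKeyChecker {
    private final Database secAsPlainDb;

    // Open the secondary's store as a plain Database (openDatabase, not
    // openSecondaryDatabase) so the check never reads the primary record.
    SecondaryKeyChecker(Environment env, String secDbName) {
        DatabaseConfig config = new DatabaseConfig();
        config.setAllowCreate(false);
        secAsPlainDb = env.openDatabase(null, secDbName, config);
    }

    boolean exists(byte[] secondaryKey) {
        DatabaseEntry key = new DatabaseEntry(secondaryKey);
        DatabaseEntry data = new DatabaseEntry();
        data.setPartial(0, 0, true); // existence check only; don't fetch data
        return secAsPlainDb.get(null, key, data, LockMode.DEFAULT)
                == OperationStatus.SUCCESS;
    }
}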
- Does it help (in this setup) to disable the CheckpointerThread? The databases are synced and closed after all writes are finished. We don't need recovery... >
Yes, if you don't care about recovery time, then disabling the checkpointer during the write operations will reduce the amount of disk space used and overall overhead.
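A sketch of one way to do that at environment-open time (je.env.runCheckpointer is the real JE parameter; the surrounding class is hypothetical):

import java.io.File;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;

class BulkLoadEnv {
    static Environment open(File envHome) {
        EnvironmentConfig config = new EnvironmentConfig();
        config.setAllowCreate(true);
        // Disable the checkpointer daemon for the duration of the bulk load;
        // calling Environment.sync() before closing still makes the data durable.
        config.setConfigParam("je.env.runCheckpointer", "false");
        return new Environment(envHome, config);
    }
}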
When you say you don't need recovery, what do you mean? In general, this means that if there is a crash, you can either 1) revert to a backup or 2) recreate the data from scratch.
- Are there any penalties if I increase the LogFileSize? We are writing around 80 to 150 GB of data and with the default size (10 MB) we get a lot of files... >
The log cleaner may become inefficient if the log files are too large, so I don't recommend a file size larger than 50 MB.
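The file size is controlled by the je.log.fileMax parameter; a hedged sketch of raising it to 50 MB at environment creation:

import com.sleepycat.je.EnvironmentConfig;

class LogFileSizeConfig {
    static EnvironmentConfig largerLogFiles() {
        EnvironmentConfig config = new EnvironmentConfig();
        config.setAllowCreate(true);
        // Raise each log file from the 10 MB default to 50 MB (value in bytes).
        config.setConfigParam("je.log.fileMax", String.valueOf(50 * 1024 * 1024));
        return config;
    }
}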
- Caching is a non-issue as long as we are only writing... is this correct? >
The JE cache is not just a cache, it's the memory space for internal information and the Btree. For good performance during the write operations, you should configure the cache large enough to hold all internal Btree nodes. The DbCacheSize program (in com.sleepycat.je.util) can be used to calculate this size.
An exception to this rule is when you are inserting keys sequentially. If both the primary keys and secondary keys are assigned and written sequentially, then the cache size can normally be much smaller, perhaps only 10 MB. But this is an unusual use case, especially with secondaries.
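For reference, DbCacheSize is run from the command line; a typical invocation looks like this (the record count and key/data sizes are placeholders for your workload):

java -cp je.jar com.sleepycat.je.util.DbCacheSize -records 1000000 -key 32 -data 500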
--mark
Similar Messages
-
Hi,
First of all I am using the Java API of BerkeleyDB.
I have a secondary database associated with a primary one. I am making three kinds of queries:
-search for the data of a given key
-search for the key of a given data
-search for the data of a given range of keys.
I have been told that using a secondary index would make all these queries much more efficient than using a single primary database. Is that true?
It is quite natural for the second query since the
get(Transaction txn, DatabaseEntry key, DatabaseEntry pKey, DatabaseEntry data, LockMode lockMode)
method supports finding the record at once whereas using a primary database, one has to sequentially search for it. But I am confused about the first and third ones: would using them make the queries faster? If it would, then how would one implement them?
thank you in advance
Hi,
First of all I am using the Java API of BerkeleyDB.
I have a secondary database associated with a primary
one. I am making three kind of queries:
-search for the data of a given key
-search for the key of a given data
-search for the data of a given range of keys.
I have been told that using a secondary index would
make all these queries much more efficient than using
a single primary database. Is that true? >
This is true for the second type of query. You would create a secondary index (a secondary database) which has as its key the data items that you will query on, and as its data portion the key of the primary database.
It is quite natural for the second query since the
get(Transaction txn, DatabaseEntry key, DatabaseEntry
pKey, DatabaseEntry data, LockMode lockMode)
method supports finding the record at once whereas
using a primary database, one has to sequentially
search for it. But I am confused about the first and
third ones: would using them make the queries faster?
If it would, then how would one implement them? >
For the first type of query, using a secondary index makes no sense, since you would be querying by the primary key. Secondary indexes are of use when you want to query your records on items that are part of the data portion.
Regarding the third type of queries, the most efficient way to speed up your queries is to use key range search cursors (Cursor.getSearchKeyRange()):
http://www.oracle.com/technology/documentation/berkeley-db/db/java/index.html
You can check also an interesting comparison between the use of range search cursors and secondary databases for queries that search on a given range of keys or a given range of data items here:
Re: Retriving Miltiple Records
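As an illustration of the range-scan approach, here is a hedged sketch using the base API (key construction and the comparison helper are application-specific; JE's default key order is byte-wise unsigned):

import com.sleepycat.je.*;

class RangeScan {
    // Visit all records whose keys fall in [startKey, endKey).
    static void scan(Database db, byte[] startKey, byte[] endKey) {
        DatabaseEntry key = new DatabaseEntry(startKey);
        DatabaseEntry data = new DatabaseEntry();
        Cursor cursor = db.openCursor(null, null);
        try {
            // Position at the first key >= startKey.
            OperationStatus status = cursor.getSearchKeyRange(key, data, LockMode.DEFAULT);
            while (status == OperationStatus.SUCCESS
                    && compareUnsigned(key.getData(), endKey) < 0) {
                // ... process key/data here ...
                status = cursor.getNext(key, data, LockMode.DEFAULT);
            }
        } finally {
            cursor.close();
        }
    }

    // Byte-wise unsigned comparison, matching JE's default key ordering.
    private static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }
}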
Regards,
Andrei -
Concurrent access of a primary record referenced by secondary database
Hello
We need to implement the trick :
get record from secondary database, then update it in primary database, so the key for secondary database will be modified
we are facing strange issue - when we are working in multi-threaded environment, several threads can access same record in secondary database and update it, while only one thread should be allowed to do this,
we are using LockMode.RMW in secondary cursor searches on secondary database (cursor.getSearchRange), we were assuming that will locks associated record in primary database - but it seems it doesn't.
Do we miss something?
Thank you in advance!
I have reproduced this and the fix is well underway but not completely done.
In the meantime, a work around is to use a transaction. If you read via a secondary with an explicit transaction, the secondary and primary will both be locked by the transaction.
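A sketch of that workaround (the handles and key bytes are hypothetical; the essential points are the explicit transaction and LockMode.RMW):

import com.sleepycat.je.*;

class LockedSecondaryRead {
    // Reading via the secondary under an explicit transaction locks both
    // the secondary and the primary record until commit.
    static void readAndUpdate(Environment env, SecondaryDatabase secDb,
                              Database primaryDb, byte[] secKeyBytes) {
        Transaction txn = env.beginTransaction(null, null);
        try {
            DatabaseEntry secKey = new DatabaseEntry(secKeyBytes);
            DatabaseEntry priKey = new DatabaseEntry();
            DatabaseEntry data = new DatabaseEntry();
            SecondaryCursor cursor = secDb.openSecondaryCursor(txn, null);
            try {
                if (cursor.getSearchKeyRange(secKey, priKey, data, LockMode.RMW)
                        == OperationStatus.SUCCESS) {
                    // ... modify data, then write through the primary; JE
                    // moves the record to its new secondary key automatically.
                    primaryDb.put(txn, priKey, data);
                }
            } finally {
                cursor.close();
            }
            txn.commit();
            txn = null;
        } finally {
            if (txn != null) txn.abort();
        }
    }
}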
Mark -
Secondary database performance and CacheMode
This is somewhat a follow-on thread related to: Lock/isolation with secondary databases
In the same environment, I'm noticing fairly low-performance numbers on my queries, which are essentially a series of key range-scans on my secondary index.
Example output:
08:07:37.803 BDB - Retrieved 177 entries out of index (177 ranges, 177 iters, 1.000 iters/range) in: 87ms
08:07:38.835 BDB - Retrieved 855 entries out of index (885 ranges, 857 iters, 0.968 iters/range) in: 346ms
08:07:40.838 BDB - Retrieved 281 entries out of index (283 ranges, 282 iters, 0.996 iters/range) in: 101ms
08:07:41.944 BDB - Retrieved 418 entries out of index (439 ranges, 419 iters, 0.954 iters/range) in: 160ms
08:07:44.285 BDB - Retrieved 2807 entries out of index (2939 ranges, 2816 iters, 0.958 iters/range) in: 1033ms
08:07:50.422 BDB - Retrieved 253 entries out of index (266 ranges, 262 iters, 0.985 iters/range) in: 117ms
08:07:52.095 BDB - Retrieved 2838 entries out of index (3021 ranges, 2852 iters, 0.944 iters/range) in: 835ms
08:07:58.253 BDB - Retrieved 598 entries out of index (644 ranges, 598 iters, 0.929 iters/range) in: 193ms
08:07:59.912 BDB - Retrieved 143 entries out of index (156 ranges, 145 iters, 0.929 iters/range) in: 32ms
08:08:00.788 BDB - Retrieved 913 entries out of index (954 ranges, 919 iters, 0.963 iters/range) in: 326ms
08:08:03.087 BDB - Retrieved 325 entries out of index (332 ranges, 326 iters, 0.982 iters/range) in: 103ms
To explain those numbers, a "range" corresponds to a sortedMap.subMap() call (ie: a range scan between a start/end key) and iters is the number of iterations over the subMap results to find the entry we were after (implementation detail).
In most cases, the iters/range is close to 1, which means that only 1 key is traversed per subMap() call - so, in essence, 500 entries means 500 ostensibly random range-scans, taking only the first item out of each rangescan.
However, it seems kind of slow - 2816 entries is taking 1033ms, which means we're really seeing a key/query rate of ~2700 keys/sec.
Here's performance profile output of this process happening (via jvisualvm): https://img.skitch.com/20120718-rbrbgu13b5x5atxegfdes8wwdx.jpg
Here's stats output after it running for a few minutes:
I/O: Log file opens, fsyncs, reads, writes, cache misses.
bufferBytes=3,145,728
endOfLog=0x143b/0xd5b1a4
nBytesReadFromWriteQueue=0
nBytesWrittenFromWriteQueue=0
nCacheMiss=1,954,580
nFSyncRequests=11
nFSyncTime=12,055
nFSyncTimeouts=0
nFSyncs=11
nFileOpens=602,386
nLogBuffers=3
nLogFSyncs=96
nNotResident=1,954,650
nOpenFiles=100
nRandomReadBytes=6,946,009,825
nRandomReads=2,577,442
nRandomWriteBytes=1,846,577,783
nRandomWrites=1,961
nReadsFromWriteQueue=0
nRepeatFaultReads=317,585
nSequentialReadBytes=2,361,120,318
nSequentialReads=653,138
nSequentialWriteBytes=262,075,923
nSequentialWrites=257
nTempBufferWrites=0
nWriteQueueOverflow=0
nWriteQueueOverflowFailures=0
nWritesFromWriteQueue=0
Cache: Current size, allocations, and eviction activity.
adminBytes=248,252
avgBatchCACHEMODE=0
avgBatchCRITICAL=0
avgBatchDAEMON=0
avgBatchEVICTORTHREAD=0
avgBatchMANUAL=0
cacheTotalBytes=2,234,217,972
dataBytes=2,230,823,768
lockBytes=224
nBINsEvictedCACHEMODE=0
nBINsEvictedCRITICAL=0
nBINsEvictedDAEMON=0
nBINsEvictedEVICTORTHREAD=0
nBINsEvictedMANUAL=0
nBINsFetch=7,104,094
nBINsFetchMiss=575,490
nBINsStripped=0
nBatchesCACHEMODE=0
nBatchesCRITICAL=0
nBatchesDAEMON=0
nBatchesEVICTORTHREAD=0
nBatchesMANUAL=0
nCachedBINs=575,857
nCachedUpperINs=8,018
nEvictPasses=0
nINCompactKey=268,311
nINNoTarget=107,602
nINSparseTarget=468,257
nLNsFetch=1,771,930
nLNsFetchMiss=914,516
nNodesEvicted=0
nNodesScanned=0
nNodesSelected=0
nRootNodesEvicted=0
nThreadUnavailable=0
nUpperINsEvictedCACHEMODE=0
nUpperINsEvictedCRITICAL=0
nUpperINsEvictedDAEMON=0
nUpperINsEvictedEVICTORTHREAD=0
nUpperINsEvictedMANUAL=0
nUpperINsFetch=11,797,499
nUpperINsFetchMiss=8,280
requiredEvictBytes=0
sharedCacheTotalBytes=0
Cleaning: Frequency and extent of log file cleaning activity.
cleanerBackLog=0
correctedAvgLNSize=87.11789
estimatedAvgLNSize=82.74727
fileDeletionBacklog=0
nBINDeltasCleaned=2,393,935
nBINDeltasDead=239,276
nBINDeltasMigrated=2,154,659
nBINDeltasObsolete=35,516,504
nCleanerDeletions=96
nCleanerEntriesRead=9,257,406
nCleanerProbeRuns=0
nCleanerRuns=96
nClusterLNsProcessed=0
nINsCleaned=299,195
nINsDead=2,651
nINsMigrated=296,544
nINsObsolete=247,703
nLNQueueHits=2,683,648
nLNsCleaned=5,856,844
nLNsDead=88,852
nLNsLocked=29
nLNsMarked=5,767,969
nLNsMigrated=23
nLNsObsolete=641,166
nMarkLNsProcessed=0
nPendingLNsLocked=1,386
nPendingLNsProcessed=1,415
nRepeatIteratorReads=0
nToBeCleanedLNsProcessed=0
totalLogSize=10,088,795,476
Node Compression: Removal and compression of internal btree nodes.
cursorsBins=0
dbClosedBins=0
inCompQueueSize=0
nonEmptyBins=0
processedBins=22
splitBins=0
Checkpoints: Frequency and extent of checkpointing activity.
lastCheckpointEnd=0x143b/0xaf23b3
lastCheckpointId=850
lastCheckpointStart=0x143a/0xf604ef
nCheckpoints=11
nDeltaINFlush=1,718,813
nFullBINFlush=398,326
nFullINFlush=483,103
Environment: General environment wide statistics.
btreeRelatchesRequired=205,758
Locks: Locks held by data operations, latching contention on lock table.
nLatchAcquireNoWaitUnsuccessful=0
nLatchAcquiresNoWaitSuccessful=0
nLatchAcquiresNoWaiters=0
nLatchAcquiresSelfOwned=0
nLatchAcquiresWithContention=0
nLatchReleases=0
nOwners=2
nReadLocks=2
nRequests=10,571,692
nTotalLocks=2
nWaiters=0
nWaits=0
nWriteLocks=0
My database(s) are sizeable, but on an SSD in a machine with more RAM than DB size (16GB vs 10GB). I have CacheMode.EVICT_LN turned on, however, am thinking this may be harmful. I have tried turning it off, but it doesn't seem to make a dramatic difference.
Really, I only want the secondary DB cached (as this is where all the read-queries happen), however, I'm not sure if it's (meaningfully) possible to only cache a secondary DB, as presumably it needs to look up the primary DB's leaf-nodes to return data anyway.
Additionally, the updates to the DB(s) tend to be fairly large - ie: potentially modifying ~500,000 entries at a time (which is about 2.3% of the DB), which I'm worried tends to blow the secondary DB cache (though I don't know how to prove it one way or another).
I understand different CacheModes can be set on separate databases (and even at a cursor level), however, it's somewhat opaque as to how this works in practice.
I've tried to run DbCacheSize, but a combination of variable length keys combined with key-prefixing being enabled makes it almost impossible to get meaningful numbers out of it (or at the very least, rather confusing :)
So, my questions are:
- Is this actually slow in the first place (ie: 2700 random keys/sec)?
- Can I speed this up with caching? (I've failed so far)
- Is it possible (or useful) to cache a secondary DB in preference to the primary?
- Would switching from using a StoredSortedMap to raw (possibly reusable) cursors give me a significant advantage?
Thanks so much in advance,
fb.
nBINsFetchMiss=575,490 >
The first step in tuning the JE cache, as related to performance, is to ensure that nBINsFetchMiss goes to zero. That tells you that you've sized your cache large enough to hold all internal nodes (I know you have lots of memory, but we need to prove that by looking at the stats).
If all your internal nodes are in cache, that means your entire secondary DB is in cache, because you've configured duplicates (right?). A dup DB does not keep its LNs in cache, so it consists of nothing but internal nodes in cache.
If you're using EVICT_LN (please do!), you also want to make sure that nEvictPasses=0, and I see that it is.
Here are some random hints:
+ In general always use getStats(new StatsConfig().setClear(true)). If you don't clear the stats every time interval, then they are cumulative and it's almost impossible to correlate them to what's going on in that time interval. (A sketch of this pattern follows these hints.)
+ If you're starting with a non-empty env, first load the entire data set and clear the stats, so the fetches for populating the cache don't show up in subsequent stats.
+ If you're having trouble using DbCacheSize, you may want to find out experimentally how much cache is needed to hold the internal nodes, for a given data set in your app. You can do this simply by reading your data set into cache. When nEvictPasses becomes non-zero, the cache has overflowed. This is going to be much more accurate than DbCacheSize anyway.
+ When you measure performance, you need to collect the JE stats (as you have) plus all app performance info (txn rate, etc) for the same time interval. They need to be correlated. The full set of JE environment settings, database settings, and JVM params is also needed.
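A minimal sketch of that interval-stats pattern (the print target is a placeholder; EnvironmentStats.toString() includes nBINsFetchMiss, nEvictPasses, and the rest):

import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentStats;
import com.sleepycat.je.StatsConfig;

class StatsSampler {
    // Call once per measurement interval; clearing on each read makes the
    // counters cover only the interval since the previous call.
    static void sample(Environment env) {
        StatsConfig config = new StatsConfig();
        config.setClear(true);
        EnvironmentStats stats = env.getStats(config);
        System.out.println(stats);
    }
}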
On the question of using StoredSortedMap.subMap vs a Cursor directly, there may be an optimization you can make, if your LNs are not in cache, and they're not if you're using EVICT_LN, or if you're not using EVICT_LN but not all LNs fit. However, I think you can make the same optimization using StoredSortedMap.
Namely when using a key range (whatever API is used), it is necessary to read one key past the range you want, because that's the only way to find out whether there are more keys in the range. If you use subMap or the Cursor API in the most obvious way, this will not only have to find the next key outside the range but will also fetch its LN. I'm guessing this is part of the reason you're seeing a lower operation rate than you might expect. (However, note that you're actually getting double the rate you mention from a JE perspective, because each secondary read is actually two JE reads, counting the secondary DB and primary DB.)
Before I write a bunch more about how to do that optimization, I think it's worth confirming that the extra LN is being fetched. If you do the measurements as I described, and you're using EVICT_LN, you should be able to get the ratio of LNs fetched (nLNsFetchMiss) to the number of range lookups. So if there is only one key in the range, and I'm right about reading one key beyond it, you'll see double LNs fetched as number of operations.
--mark -
How does an ADBC connection benefit from using SAP HANA as a secondary database?
Hi,
I have one more important question.
How does an ADBC connection benefit from using SAP HANA as a secondary database, in terms of performance, for accessing data from the HANA database as a secondary database?
I have 2 options; which is better for good performance when accessing the data?
1. In ABAP reports, will using CONNECTION (“HDB”) in the SELECT statements improve the
performance?
e.g.: select * from BSEG into TABLE IT_TAB CONNECTION (“HDB”).
2. Or create the stored procedure in HANA Studio and call it
from ABAP as below using Native SQL:
EXEC SQL
SET CONNECTION (‘HDB’).
ENDEXEC.
EXEC SQL.
EXECUTE PROCEDURE proc (IN p_in1
OUT p_out1 OUT p_out2 )
ENDEXEC.
Regards,
Pravin
Message was edited by: Jens Weiler
Branched from http://scn.sap.com/thread/3498161
Hi Pravin,
Option 1: In this case ADBC might even worsen the performance due to the overhead in the ADBC framework. OpenSQL is the way to go here, as OpenSQL - from the ABAP point of view - features the optimal communication with the database, while ADBC has overhead like constructor calls for the statement, parameter binding, etc.
Option 2: In this case ADBC is comparable with EXEC SQL but offers more options, e.g. a clean concept of multiple connections (connection objects via CL_SQL_CONNECTION), exception handling, etc. So I strongly propose favouring ADBC over EXEC SQL, but not simply for performance reasons. You might have a look at the ABAP Language Help in your system for more information on ADBC and its advantages over EXEC SQL.
Cheers,
Jasmin -
Problem using secondary database, sequence (and custom tuple binding)
I get an exception when I try to open a Sequence to a database that has a custom tuple binding and a secondary database. I have a guess what the issue is (below), but it boils down to my custom tuple-binding being invoked when opening the sequence. Here is the exception:
java.lang.IndexOutOfBoundsException
at com.sleepycat.bind.tuple.TupleInput.readUnsignedInt(TupleInput.java:414)
at com.sleepycat.bind.tuple.TupleInput.readInt(TupleInput.java:233)
at COM.shopsidekick.db.community.Shop_URLTupleBinding.entryToObject(Shop_URLTupleBinding.java:72)
at com.sleepycat.bind.tuple.TupleBinding.entryToObject(TupleBinding.java:73)
at COM.tagster.db.community.SecondaryURLKeyCreator.createSecondaryKey(SecondaryURLKeyCreator.java:38)
at com.sleepycat.je.SecondaryDatabase.updateSecondary(SecondaryDatabase.java:546)
at com.sleepycat.je.SecondaryTrigger.databaseUpdated(SecondaryTrigger.java:42)
at com.sleepycat.je.Database.notifyTriggers(Database.java:1343)
at com.sleepycat.je.Cursor.putInternal(Cursor.java:770)
at com.sleepycat.je.Cursor.putNoOverwrite(Cursor.java:352)
at com.sleepycat.je.Sequence.<init>(Sequence.java:139)
at com.sleepycat.je.Database.openSequence(Database.java:332)
Here is my code:
// URL ID DB
DatabaseConfig urlDBConfig = new DatabaseConfig();
urlDBConfig.setAllowCreate(true);
urlDBConfig.setReadOnly(false);
urlDBConfig.setTransactional(true);
urlDBConfig.setSortedDuplicates(false); // No sorted duplicates (can't have them with a secondary DB)
mURLDatabase = mDBEnv.openDatabase(txn, "URLDatabase", urlDBConfig);
// Reverse URL lookup DB table
SecondaryConfig secondaryURLDBConfig = new SecondaryConfig();
secondaryURLDBConfig.setAllowCreate(true);
secondaryURLDBConfig.setReadOnly(false);
secondaryURLDBConfig.setTransactional(true);
TupleBinding urlTupleBinding = DataHelper.instance().createURLTupleBinding();
SecondaryURLKeyCreator secondaryURLKeyCreator = new SecondaryURLKeyCreator(urlTupleBinding);
secondaryURLDBConfig.setKeyCreator(secondaryURLKeyCreator);
mReverseLookpupURLDatabase = mDBEnv.openSecondaryDatabase(txn, "SecondaryURLDatabase", mURLDatabase, secondaryURLDBConfig);
// Open the URL ID sequence
SequenceConfig urlIDSequenceConfig = new SequenceConfig();
urlIDSequenceConfig.setAllowCreate(true);
urlIDSequenceConfig.setInitialValue(1);
mURLSequence = mURLDatabase.openSequence(txn, new DatabaseEntry(URLID_SEQUENCE_NAME.getBytes("UTF-8")), urlIDSequenceConfig);
My secondary key creator class looks like this:
public class SecondaryURLKeyCreator implements SecondaryKeyCreator {
    // Member variables
    private TupleBinding mTupleBinding; // The tuple binding

    // Constructor.
    public SecondaryURLKeyCreator(TupleBinding iTupleBinding) {
        mTupleBinding = iTupleBinding;
    }

    // Create the secondary key.
    public boolean createSecondaryKey(SecondaryDatabase iSecDB,
            DatabaseEntry iKeyEntry, DatabaseEntry iDataEntry,
            DatabaseEntry oResultEntry) {
        try {
            URLData urlData = (URLData) mTupleBinding.entryToObject(iDataEntry);
            String URL = urlData.getURL();
            oResultEntry.setData(URL.getBytes("UTF-8"));
        } catch (IOException willNeverOccur) {
        }
        // Success
        return true;
    }
}
I think I understand what is going on, and I only noticed it now because I added more fields to my custom data (and tuple binding):
com.sleepycat.je.Sequence.java line 139 (version 3.2.44) does this:
status = cursor.putNoOverwrite(key, makeData());
makeData creates a byte array of size MAX_DATA_SIZE (50 bytes) -- which has nothing to do with my custom data.
The trigger causes a call to SecondaryDatabase.updateSecondary(...) on the secondary DB.
updateSecondary calls createSecondaryKey in my SecondaryKeyCreator, which calls entryToObject() in my tuple binding, which calls TupleInput.readString(), etc. to match my custom data. Since what is being read goes past the byte array of size 50, I get the exception.
I didn't notice this before because my custom tuple binding used to read fewer than 50 bytes.
I think the problem is that my tuple binding is being invoked at all at this point -- opening a sequence -- since there is no data on which it can act.
Hi,
It looks like you're making a common mistake with sequences which is to store the sequence itself in a database that is also used for application data. The sequence should normally be stored in separate database to prevent configuration conflicts and actual data conflicts between the sequence record and the application records.
I suggest that you create another database whose only purpose is to hold the sequence record. This database will contain only a single record -- the sequence. If you have more than one sequence, storing all sequences in the same database makes sense and is safe.
The database used for storing sequences should not normally have any associated secondary databases and should not be configured for duplicates.
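A sketch of that layout, with hypothetical names (the dedicated database holds nothing but sequence records):

import com.sleepycat.je.*;

class SequenceHolder {
    // Keep sequences in their own database so the sequence record never
    // passes through the application's bindings or secondary key creators.
    static Sequence openUrlIdSequence(Environment env, Transaction txn)
            throws java.io.UnsupportedEncodingException {
        DatabaseConfig config = new DatabaseConfig();
        config.setAllowCreate(true);
        config.setTransactional(true);
        Database seqDb = env.openDatabase(txn, "SequenceDatabase", config);

        SequenceConfig seqConfig = new SequenceConfig();
        seqConfig.setAllowCreate(true);
        seqConfig.setInitialValue(1);
        return seqDb.openSequence(txn,
                new DatabaseEntry("urlID".getBytes("UTF-8")), seqConfig);
    }
}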
--mark -
Database much larger than expected when using secondary databases
Hi Guys,
When I load data into my database it is much larger than I expect it to be when I turn on secondary indexes. I am using the Base API.
I am persisting (using TupleBindings) the following data type:
A key of size ~80 bytes
A value consisting of 4 longs ~ 4*8 = 32 bytes
I am persisting ~280k of such records
I therefore expect ballpark 280k * (80+32) bytes ~ 31M of data to be persisted. I actually see ~40M - which is fine.
Now, when I add 4 secondary database indexes - on the 4 long values - I would expect to see approximately an additional 4 * 32 * 280k bytes -> ~35M
This would bring the total amount of data to (40M + 35M) ~ 75M
(although I would expect less given that many of the secondary keys are duplicates)
What I am seeing, however, is 153M.
Given that no other data is persisted that could account for this, is this what you would expect to see?
Is there any way to account for the extra unexpected 75M of data?
Thanks,
Joel
Edited by: JoelH on 10-Feb-2010 10:59
Hi Joel,
Now, when I add 4 secondary database indexes - on the 4 long values - I would expect to see approximately an additional 4 * 32 * 280k bytes -> ~35M >
A secondary index consists of a regular JE database containing key-value pairs, where the key of the pair is the secondary key and the value of the pair is the primary key. Your primary key is 80 bytes, so this would be 4 * (4 + 80) * 280k bytes -> ~91M.
The remaining space is taken by per-record overhead for every primary and secondary record, plus the index of keys itself. There are (280k * 5) records total. Duplicates do not take less space.
I assume this is an insert-only test, and therefore the log should contain little obsolete data. To be sure, please run the DbSpace utility.
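DbSpace is a command-line utility; a typical invocation (the environment path is a placeholder):

java -cp je.jar com.sleepycat.je.util.DbSpace -h /path/to/env/home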
Does this answer your question?
--mark -
Questions About Database Recovery (-30975)
Hello,
In Berkeley 4.5.20, we are seeing the following error sporadically, but more frequently than we'd like (which is, to say, not at all): "BerkeleyDbErrno=-30975 - DbEnv::open: DB_RUNRECOVERY: Fatal error, run database recovery"
This exception is being thrown mostly, if not exclusively, during the environment open call. Still investigating.
I will post my environment below, but first some questions.
1. How often should a database become corrupt?
2. What are the causes of this corruption? Can they be caused by "chance?" (I.e. app is properly coded.) Can they be caused by improper coding? If so, is there a list of common things to check?
3. Does Oracle expect application developers to create their own recovery handlers, especially for apps that require 100% uptime? E.g. using DB_ENV->set_event_notify or filtering on DB_RUNRECOVERY.
Our environment:
Windows Server 2003 SP2
Berkeley DB 4.5.20
set_verbose(DB_VERB_WAITSFOR, 1);
set_cachesize(0, 65536 * 1024, 1);
set_lg_max(10000000);
set_lk_detect(DB_LOCK_YOUNGEST);
set_timeout(60000000, DB_SET_LOCK_TIMEOUT);
set_timeout(60000000, DB_SET_TXN_TIMEOUT);
set_tx_max(100000);
set_flags(DB_TXN_NOSYNC, 1);
set_flags(DB_LOG_AUTOREMOVE, 1);
set_lk_max_lockers(10000);
set_lk_max_locks(10000);
set_lk_max_objects(10000);
open(sPath, DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_THREAD | DB_INIT_TXN | DB_RECOVER, 0);
set_pagesize (4096);
u_int32_t dbOpenFlags = DB_CREATE | DB_AUTO_COMMIT;
pDbPrimary->open(NULL, strFile, NULL, DB_HASH, dbOpenFlags, 0);
We also have a number of secondary databases.
One additional piece of information that might be relevant is that the databases where this happens (we have 8 in total managed by our process) seem to be the two specific databases that at times aren't opened until well after the process is up and running, due to the nature of their data. This is to say that 6 of the other databases are normally opened during startup of our service. We are still investigating this to see if this is consistently true.
Here is the output from the error logs (we didn't have this properly set up until now) when this error opening the environment happens:
12/17/2007 17:18:12 (e64/518) 1024: Berkeley Error: CDbBerkeley MapViewOfFile: Not enough storage is available to process this command.
12/17/2007 17:18:12 (e64/518) 1024: Berkeley Error: CDbBerkeley PANIC: Not enough space
12/17/2007 17:18:12 (e64/518) 1024: Berkeley Error: CDbBerkeley DeleteFile: C:\xxxxxxxx\Database\xxxJOB_OAT\__db.003: Access is denied.
12/17/2007 17:18:12 (e64/518) 1024: Berkeley Error: CDbBerkeley MapViewOfFile: Not enough storage is available to process this command.
12/17/2007 17:18:12 (e64/518) 1024: Berkeley Error: CDbBerkeley PANIC: Not enough space
12/17/2007 17:18:12 (e64/518) 1024: Berkeley Error: CDbBerkeley PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
12/17/2007 17:18:30 (e64/518) 1024: Berkeley Error: CDbBerkeley unable to join the environment
12/17/2007 17:18:30 (e64/518) 1024: Berkeley Error: CDbBerkeley DeleteFile: C:\xxxxxxxx\Database\xxxJOB_OAT\__db.003.del.0547204268: Access is denied.
12/17/2007 17:18:30 (e64/518) 1024: Berkeley Error: CDbBerkeley DeleteFile: C:\xxxxxxxx\Database\xxxJOB_OAT\__db.003: Access is denied.
12/17/2007 17:19:18 (e64/518) 1024: Database EInitialize failed. (C:\xxxxxxxx\Database\xxxJOB_OAT: BerkeleyDbErrno=-30975 - DbEnv::open: DB_RUNRECOVERY: Fatal error, run database recovery)
The last line is generated by a DbException and was all we were seeing up until now.
I also set_verbose(DB_VERB_RECOVERY, 1) and set_msgcall to the same log file. We get verbose messages on the 1st 7 database files that open successfully, but none from the last one, I assume because they output to set_errcall instead.
There is 67GB of free space on this disk by the way, so not sure what "Not enough space" means.
Thanks again for your help. -
Setting up a primary and secondary Database in Oracle 10G
Hi Experts
can you please tell me the steps involved in the creation of a primary and secondary database? This is the first time I am going to configure this setup. Please provide your helping hands.
Thanks alot in advance,
Ram
Absolutely glad to help.
Step 1: Clarify what it is you are trying to build. Are you talking about a Standby Database? Perhaps Physical or Logical Data Guard? If so what protection mode? Stand-alone or RAC? Or are you just trying to dup an existing database on another server in which case you can just use RMAN.
Step 2: Go to http://tahiti.oracle.com and read the relevant docs
Step 3: Go to http://metalink.oracle.com and look at the Knowledge Base docs
If you have any question thereafter contact us and be sure to include your version number, not a marketing label. 10g is a marketing label that refers to everything from the 10.1 Beta through 10.2.0.4. -
DB Connect Error while establishing secondary Database DB2 in mainframe sys
Hello,
I am trying to establish a secondary database connection between my ECC system and a mainframe system.
In my SAP system, the database is Oracle and version is 10.2
I want to connect to a mainframe system DB2 (AS400).
I have installed DB2 client in SAP server.
I have checked the access to Mainframe system from DB2 client.
I have made the entry in DBCON table via t-code DBCO, which looks like this
AS4_HOST=<ip_address>;AS4_DB_LIBRARY=<LIB_NAME>;
But when I run the standard program ABCD_TEST_CONNECTION, I get the error
Secondary DB connect FAILED!
Some doubts:
1) From an Oracle Database point, is it required to login to sqlplus and create a database link between Oracle Database and the Mainframe Database?
2) Is it required to have an entry in SM59, which connects the Mainframe system, if so what type of connection & name?
3) If anyone has faced situation like this, can you share what steps were taken to resolve the error?
If this is not the forum to post this question, can you suggest the correct forum?
Regards,
Vikas
Hello Prem,
I have established connectivity with Mainframe via DB-Connect.
THe steps are
1) Make entry in DBCON table
2) Install DB2 client in SAP-Server
3) There are some jobs which needs to be run in Mainframe system. The job name was mentioned in one of SAP notes.
If you are facing problems after this, just check SAP version and patch installed.
Regards,
Vikas -
The best way to populate a secondary database
I'm trying to create a secondary database over an existing primary database. I've looked over the GSG and Collections docs and haven't found an example that explicitly populates a secondary database.
The one thing I did find was setAllowPopulate(true) on the SecondaryConfig.
Is this the only way to get a secondary database populated from a primary? Or is there another way to achieve this?
Thanks
However, after primary and secondary are in sync, going forward, I'm unsure of the mechanics of how to automagically ensure that updates to primary db are reflected in secondary db. >
I'm sorry, I misunderstood your question earlier.
Does JE take care of updating secondary db in such cases (provided both DBs are open)? In other words, if I have a Map open on the primary and do a put(), I can turn around and query the secondary (with the apt key) and I should be able to retrieve the record I just put into the primary? >
Yes, JE maintains the secondaries automatically. The only requirement is that you always keep the secondary open while writing to the primary. JE uses your SecondaryKeyCreator implementation (you pass this object to SecondaryConfig.setKeyCreator when opening the secondary) to extract the secondary keys from the primary record, and automatically insert, update and delete records in the secondary databases as necessary.
For the base API and collections API, JE does not persistently store the association between primaries and secondaries, so you must always open your secondary databases explicitly after opening your primary databases. For the DPL API (persist package), JE maintains the relationship persistently, so you don't have to always open the secondary indexes explicitly.
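A minimal sketch of opening a secondary so it is populated from an existing primary and then maintained automatically (names are hypothetical; in the JE base API the method is SecondaryConfig.setAllowPopulate):

import com.sleepycat.je.*;

class SecondaryOpener {
    // While the returned handle stays open, writes to the primary keep the
    // secondary in sync; on first open it is populated from existing records.
    static SecondaryDatabase open(Environment env, Database primary,
                                  SecondaryKeyCreator keyCreator) {
        SecondaryConfig config = new SecondaryConfig();
        config.setAllowCreate(true);
        config.setAllowPopulate(true); // build from existing primary records
        config.setKeyCreator(keyCreator);
        return env.openSecondaryDatabase(null, "mySecondary", primary, config);
    }
}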
I couldn't find an example illustrating this (nice) feature - hence the questions. >
For the collections API (I see you're using the collections API):
http://www.oracle.com/technology/documentation/berkeley-db/je/collections/tutorial/UsingSecondaries.html
In the examples directory:
examples/collections/* -- all but the basic example use secondaries
Mark -
Question: 10gR2 database can not see the 11gR2 ASM diskgroup?
Hi there,
env:
uname -rm
2.6.18-92.1.22.el5xen x86_64
Single server(non-RAC)
note: we don't want to upgrade the 10gR2 database to 11gR2 yet. But we created the 11gR2 ASM, then an 11gR2 database on ASM, and plan to migrate the datafiles in the 10gR2 database to the 11gR2 ASM
1. oracle 10gR2 installed first version: 10.2.0.3.0
2. then install 11gR2 Grid Infrastructure, and created ASM (version 11gr2)
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Tue Oct 19 10:30:56 2010
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Automatic Storage Management option
SQL> col name form a15
SQL> col COMPATIBILITY form a15
SQL> col DATABASE_COMPATIBILITY form a15
SQL> l
1* select name , STATE, COMPATIBILITY, DATABASE_COMPATIBILITY from v$asm_diskgroup
SQL> /
NAME STATE COMPATIBILITY DATABASE_COMPAT
ORCL_DATA1 MOUNTED 11.2.0.0.0 10.1.0.0.0
ORA_DATA MOUNTED 10.1.0.0.0 10.1.0.0.0
3. in 10gR2 database
sqlplus /
SQL*Plus: Release 10.2.0.3.0 - Production on Tue Oct 19 12:12:31 2010
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning and Data Mining options
SQL> select * from v$asm_diskgroup;
no rows selected
4. pin the node into css
# /u01/app/product/11.2.0/grid/bin/crsctl pin css -n mynodename
CRS-4000: Command Pin failed, or completed with errors.
Question: 10gR2 database can not see the 11gR2 ASM diskgroup?
please help
Thanks
Scott
What is the output of
olsnodes -t -n
Also, check the unix error log and the ohasd error log to see if you find anything there. -
Need help with sorting records in primary and secondary databases
Hi,
I would like to store data in the primary and secondary dbs in a different order. For the main primary, I want it to be ordered by login_ts instead of uuid, which is the key.
For the user secondary database, I want it to be ordered by sip_user. For the timestamp secondary db, I want it to be ordered by login_ts.
This is what I have right now,
this is for main
uuid=029ae227-a188-4ba8-aea4-7cbc26783d6 sip_user=200003 login_ts=1264327630 logout_ts=
uuid=22966f76-8c8a-4ab4-b832-b36e8f8e14d sip_user=200003 login_ts=1264327688 logout_ts=
uuid=e1846e4a-e1f5-406d-b903-55905a2533a sip_user=200003 login_ts=1264327618 logout_ts=
uuid=e2f9a3cb-02d1-47ff-8af8-a3a371e20b5 sip_user=200003 login_ts=1264327613 logout_ts=
this is for user search
uuid=029ae227-a188-4ba8-aea4-7cbc26783d6 sip_user=200003 login_ts=1264327630 logout_ts=
uuid=22966f76-8c8a-4ab4-b832-b36e8f8e14d sip_user=200003 login_ts=1264327688 logout_ts=
uuid=e1846e4a-e1f5-406d-b903-55905a2533a sip_user=200003 login_ts=1264327618 logout_ts=
uuid=e2f9a3cb-02d1-47ff-8af8-a3a371e20b5 sip_user=200003 login_ts=1264327613 logout_ts=
this is for timestamp
uuid=029ae227-a188-4ba8-aea4-7cbc26783d6 sip_user=200003 login_ts=1264327630 logout_ts=
uuid=22966f76-8c8a-4ab4-b832-b36e8f8e14d sip_user=200003 login_ts=1264327688 logout_ts=
uuid=e1846e4a-e1f5-406d-b903-55905a2533a sip_user=200003 login_ts=1264327618 logout_ts=
uuid=e2f9a3cb-02d1-47ff-8af8-a3a371e20b5 sip_user=200003 login_ts=1264327613 logout_ts=
but what I want is :
this is for main
uuid=e2f9a3cb-02d1-47ff-8af8-a3a371e20b5 sip_user=200003 login_ts=1264327613 logout_ts=
uuid=e1846e4a-e1f5-406d-b903-55905a2533a sip_user=200004 login_ts=1264327618 logout_ts=
uuid=029ae227-a188-4ba8-aea4-7cbc26783d6 sip_user=200003 login_ts=1264327630 logout_ts=
uuid=22966f76-8c8a-4ab4-b832-b36e8f8e14d sip_user=200005 login_ts=1264327688 logout_ts=
this is for user search
uuid=e2f9a3cb-02d1-47ff-8af8-a3a371e20b5 sip_user=200003 login_ts=1264327613 logout_ts=
uuid=029ae227-a188-4ba8-aea4-7cbc26783d6 sip_user=200003 login_ts=1264327630 logout_ts=
uuid=e1846e4a-e1f5-406d-b903-55905a2533a sip_user=200004 login_ts=1264327618 logout_ts=
uuid=22966f76-8c8a-4ab4-b832-b36e8f8e14d sip_user=200004 login_ts=1264327688 logout_ts=
this is for timestamp
uuid=e2f9a3cb-02d1-47ff-8af8-a3a371e20b5 sip_user=200003 login_ts=1264327613 logout_ts=
uuid=e1846e4a-e1f5-406d-b903-55905a2533a sip_user=200003 login_ts=1264327618 logout_ts=
uuid=029ae227-a188-4ba8-aea4-7cbc26783d6 sip_user=200004 login_ts=1264327630 logout_ts=
uuid=22966f76-8c8a-4ab4-b832-b36e8f8e14d sip_user=200004 login_ts=1264327688 logout_ts=
Right now, I have:
int
compare_login_ts(DB *dbp, const DBT *a, const DBT *b)
{
    int time_a = atoi((char *)a->data);
    int time_b = atoi((char *)b->data);
    return (time_a - time_b);
}
for the timestamp secondary, I set that compare function:
if ((ret = (*sdb)->set_bt_compare(*sdb, compare_login_ts)) != 0) {
    /* handle error */
}
Does anyone know how I can make it sorted accordingly?
Hi,
The DB->set_bt_compare() is used to compare keys in Btree database. In the callback function, both the DBTs are key, but not data. Please refer to http://www.oracle.com/technology/documentation/berkeley-db/db/api_reference/C/dbset_bt_compare.html.
If you want any field in the data to be sorted, you might create a secondary index on it and define the compare function as you wish.
Regards,
Emily Fu, Oracle Berkeley DB -
How to use the mirrored and log shipped secondary database for update or insert operations
Hi,
I am doing a DR Test where I need to test the mirrored and log shipped secondary database but without stopping the mirroring or log shipping procedures. Is there a way to get the data out of mirrored and log shipped database to another database for update
or insert operations?
Database snapshot can be used only for mirrored database but updates cannot be done. Also the secondary database of log shipping cannot used for database snapshot. Any ideas of how this can be implemented?
Thanks,
Preetha
Hmm, in this case I think you need Merge Replication; otherwise it breaks down the purpose of DR... again in that case...
Best Regards, Uri Dimant, SQL Server MVP
http://sqlblog.com/blogs/uri_dimant/ -
Removing record from secondary database
Could somebody please explain - if I'm removing an entry from a secondary database using a cursor, will that entry be removed from the primary database as well, or do I need to remove it explicitly?
looks like it does, as stated in the javadocs, so nevermind ;)
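For the record, a hedged sketch of the behavior in question (the handles are hypothetical): deleting through a secondary cursor deletes the associated primary record, which in turn removes the index entries in all secondaries.

import com.sleepycat.je.*;

class SecondaryDelete {
    static boolean deleteBySecondaryKey(SecondaryDatabase secDb, byte[] secKeyBytes) {
        DatabaseEntry secKey = new DatabaseEntry(secKeyBytes);
        DatabaseEntry data = new DatabaseEntry();
        SecondaryCursor cursor = secDb.openSecondaryCursor(null, null);
        try {
            if (cursor.getSearchKey(secKey, data, LockMode.RMW)
                    == OperationStatus.SUCCESS) {
                // Removes the primary record and all its secondary entries.
                return cursor.delete() == OperationStatus.SUCCESS;
            }
            return false;
        } finally {
            cursor.close();
        }
    }
}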