Thread pool configuration for write-behind cache store operation?

Hi,
Does Coherence have a thread pool configuration for the Coherence CacheStore operation?
Or does the CacheStore implementation need to do that?
We're using write-behind and want to use multiple threads to speed up the store operation (storeAll()...).
Thanks in advance for your help.

user621063 wrote:
Hi,
Does Coherence have a thread pool configuration for the Coherence CacheStore operation?
Or does the CacheStore implementation need to do that?
We're using write-behind and want to use multiple threads to speed up the store operation (storeAll()...).
Thanks in advance for your help.

Hi,
Read/write-through operations are carried out on the worker threads (so if you configured a thread pool for the service, the same thread pool will be used for the cache-store operations).
For write-behind/read-ahead operations there is a single dedicated thread per cache, above whatever thread pool is configured; the exception is remove operations, which are synchronous and still carried out on the worker thread (see above).
All of the above is, of course, per storage node.
Best regards,
Robert
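For reference, a minimal configuration sketch of what Robert describes, using the Coherence 3.x cache configuration elements that appear elsewhere in this thread (the scheme, service, and store class names here are made up):

    <distributed-scheme>
        <scheme-name>example-write-behind-scheme</scheme-name>
        <service-name>ExampleService</service-name>
        <!-- Worker thread pool for the service; write-through/read-through
             CacheStore calls execute on these threads. -->
        <thread-count>8</thread-count>
        <backing-map-scheme>
            <read-write-backing-map-scheme>
                <internal-cache-scheme>
                    <local-scheme/>
                </internal-cache-scheme>
                <cachestore-scheme>
                    <class-scheme>
                        <!-- hypothetical CacheStore implementation -->
                        <class-name>com.example.ExampleCacheStore</class-name>
                    </class-scheme>
                </cachestore-scheme>
                <!-- A non-zero delay makes the store write-behind; store/storeAll
                     then run on the dedicated write-behind thread, not the pool above. -->
                <write-delay-seconds>10</write-delay-seconds>
            </read-write-backing-map-scheme>
        </backing-map-scheme>
        <autostart>true</autostart>
    </distributed-scheme>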

Similar Messages

  • Write-Behind Caching and Re-entrant Calls

    Support Team -
         The Coherence User Guide states that:
         "The CacheStore implementation must not call back into the hosting cache service. This includes OR/M solutions that may internally reference Coherence cache services. Note that calling into another cache service instance is allowed, though care should be taken to avoid deeply nested calls (as each call will "consume" a cache service thread and could result in deadlock if a cache service threadpool is exhausted)."
I have load-tested a use case wherein I have two caches: ABCache and BACache. ABCache is accessed by the application for write operations; BACache is accessed by the application for read operations. ABCache is a write-behind cache whose CacheStore populates BACache by reversing the key and value of each cache entry stored in ABCache.
         The solution worked under load with no issues.
         But can I use it? Or is it too dangerous?
         My write-behind thread-count setting is left at default (0). The documentation states that
         "If zero, all relevant tasks are performed on the service thread."
         What does this mean? Can I re-enter the caching service if my thread-count is zero?
         Thank you,
         Denis.

    Dimitri -
         I am not sure I fully understand your answer:
         1. "Your test worked because write-behing backing map invokes CacheStore methods asynchronously, on a write-behind thread." In my configuration, I have default value for thread-count, which is zero. According to the documentation, that means that CacheStore methods would be executed by the service thread and not by the write-behind thread. Do I understand this correctly?
         2. "If will fail if CacheStore method will need to be invoked synchronously on a service thread." I am not sure what is the purpose of the "service thread". In which scenarios the "CacheStore method will need to be invoked synchronously on a service thread"?
         Thank you,
         Denis.

  • Write-behind cache not removing entries after upgrade to 3.2

    We recently upgraded tangosol.jar and coherence.jar from version 3.0 to version 3.2. After the upgrade, our write-behind caches began consuming all available memory and crashing the JVMs because the entries were not being removed from the cache after being written to the database. We rolled back to the 3.0 jars without making any other modifications and the caches behave as expected. We'd really like to move to 3.2 for the improved network fault tolerance, but we need to resolve this issue first.
    What changes were made in 3.2 with respect to write-behind caches that might cause this issue? I've reviewed our configuration and our code and can't find anything unusual, but I'm not sure what I should be looking for.
    Any ideas?

    I've opened an SR, but I haven't heard back. In the meantime, I've continued digging and I've noticed something strange - in the store() method of our backing map implementation, we take the entry that we just persisted and remove it from the backing map.
    In my small-scale local tests, the size of the map is 1 when we enter store() and is 0 when we leave, as expected. If we process another entry using the 3.0 jars, it's again 1 and then 0. However, it gets more interesting with the 3.2 jars - the size of the map is 1 when we enter store() the first time and 0 when we leave, but if we process another entry, the size is 2 when we enter and 1 when we leave. This pattern continues such that both values increase by 1 every time we process an entry.
    This would imply that we're either removing the entries incorrectly, or they're somehow being reinserted into the map.
    Any ideas?
    Here's the body of our method (with a bunch of sysouts added to the normal logging because this app won't run correctly under a debugger):
    /**
     * Store the specified value under the specific key in the underlying
     * store, then remove the specific key from the internal map and hence
     * the cache itself. This method is intended to support both key/value
     * creation and value update for a specific key.
     *
     * @param oKey   key to store the value under
     * @param oValue value to be stored
     *
     * @throws UnsupportedOperationException if this implementation or the
     *                                       underlying store is read-only
     */
    public void store(Object oKey, Object oValue) {
        RemoveOnStoreRWBackingMap mapBacking = RemoveOnStoreRWBackingMap.this;
        System.out.println("map storing  " + oKey);
        System.out.println("Size before = " + mapBacking.entrySet().size());
        Iterator entries = mapBacking.entrySet().iterator();
        while (entries.hasNext()) {
            System.out.println("entry = " + entries.next());
        }
        String storeClassName = getCacheStore().getClass().getName();
        Logger log = Logger.getLogger(storeClassName);
        log.debug(storeClassName + ": In store method.  Storing " + oKey);
        long cFailuresBefore = getStoreFailures();
        log.debug(storeClassName + ": failures before=" + cFailuresBefore);
        super.store(oKey, oValue);
        long cFailuresAfter = getStoreFailures();
        log.debug(storeClassName + ": failures after=" + cFailuresAfter);
        if (cFailuresBefore == cFailuresAfter) {
            log.debug(storeClassName + ": About to remove");
            Converter converter = mapBacking.getContext().getKeyToInternalConverter();
            System.out.println("removed " + mapBacking.remove(converter.convert(oKey)));
            // System.out.println("removed " + mapBacking.getInternalCache().remove(converter.convert(oKey)));
            log.debug(storeClassName + ": Removed");
        }
        System.out.println("Size after = " + mapBacking.entrySet().size());
    }

  • Write behind cache, DB down, when should the system stop taking new data in

    Hello:
We are trying to use Coherence for our custom ESB, which brokers payloads of various sizes between consumer and provider applications.
Before Coherence, stopping our DB meant an organization-wide outage for critically important business services.
Since we have at least 40G of RAM in the production environment, we believe that our app
can use Coherence's write-behind option to tolerate at least several hours' worth of DB outage.
We are currently using a near cache backed by a distributed cache in write-behind mode.
9 business service JVMs (storage enabled=false) use 30 storage-enabled JVMs.
IMPORTANT: We need to create an automated alerting facility that determines when the
amount of unsaved data reaches a critical level once the DB goes down. This alert should help us decide when our application should stop accepting inbound traffic.
It is hard to use the QueueSize parameter for that because our payload memory footprint can vary from 1KB to 3MB.
We do not expire any entries, in order to support queries against the cache during a DB outage.
Our experiments with various flavors of overflow-scheme resulted in OutOfMemoryError, so
we decided to implement a RAM-only cache as a first step.
    <near-scheme>
        <scheme-name>message_payload_scheme</scheme-name>
        <front-scheme>
            <local-scheme>
                <scheme-ref>limited_entities_front_scheme</scheme-ref>
                <high-units>100</high-units>
            </local-scheme>
        </front-scheme>
        <back-scheme>
            <distributed-scheme>
                <backing-map-scheme>
                    <read-write-backing-map-scheme>
                        <internal-cache-scheme>
                            <local-scheme>
                                <scheme-ref>limited_bytes_scheme</scheme-ref>
                                <high-units>199229440</high-units>
                            </local-scheme>
                        </internal-cache-scheme>
                        <cachestore-scheme>
                            <class-scheme>
                                <class-name>com.comp.MessagePayloadStore</class-name>
                            </class-scheme>
                        </cachestore-scheme>
                        <read-only>false</read-only>
                        <write-delay-seconds>3</write-delay-seconds>
                        <write-requeue-threshold>2147483646</write-requeue-threshold>
                    </read-write-backing-map-scheme>
                </backing-map-scheme>
                <autostart>true</autostart>
            </distributed-scheme>
        </back-scheme>
    </near-scheme>

    <local-scheme>
        <scheme-name>limited_entities_front_scheme</scheme-name>
        <eviction-policy>LRU</eviction-policy>
        <unit-calculator>FIXED</unit-calculator>
    </local-scheme>

    <local-scheme>
        <scheme-name>limited_bytes_scheme</scheme-name>
        <eviction-policy>HYBRID</eviction-policy>
        <unit-calculator>BINARY</unit-calculator>
    </local-scheme>

Good info ... I feel like I need to restate my original question along with a couple of new questions raised by the discussion above.
Q1. Does Coherence evict 'dirty', or 'queued', or 'unsaved' objects for the cache configuration provided above?
The answer should be 'NO'; otherwise Coherence is unsafe to use as a system of record,
since it should not just drop unsaved information on the floor.
Q2. What happens to the front tier of the near+partitioned write-behind cache described above when the amount of unsaved data exceeds the maximum cache capacity defined via high-units?
I would expect map.put to start throwing exceptions: the cache storage is full, so it should not accept more data.
Q3. How can I determine the moment when the amount of dirty data in bytes (not in objects!) hits 85% of
the maximum allowed cache capacity configured in bytes (using the high-units parameter and the BINARY calculator)?
A 'DirtyUnits' counter could probably be built with some lower-level Coherence API. Can we use
this API?
Please understand that we purchased Coherence for reliability, for making our
system independent of short DB outages, for keeping our business services up
and running when DBAs need some time for admin operations like rebuilding an index.
Performance benefits are secondary and are not as obvious for our system, which
uses primary keys only and has a well-tuned, co-located Oracle back-end.
We simply cannot put Coherence into production unless we prove that Coherence
can reliably hold the data and give us information about an approaching crisis
(the cache full of unsaved data).
If possible, forward this message to Cameron Purdy,
who was presenting Coherence to our team several months ago.
    Thanks,
    Vasili Smaliak
    Applications Architect, Enterprise App Integration
    GMAC ResCap
    [email protected]
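Regarding Q3, a minimal monitoring sketch in the direction the poster asks about (the MBean name pattern, cache name, and threshold are assumptions; note that the QueueSize attribute counts queued entries, not bytes, so a true byte-based 'DirtyUnits' alert would still need custom instrumentation):

    import java.lang.management.ManagementFactory;
    import java.util.Set;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class WriteBehindQueueMonitor {
        // Hypothetical alert threshold, in queued entries
        private static final int QUEUE_SIZE_THRESHOLD = 10000;

        public static void checkQueues() throws Exception {
            // Assumes this JVM hosts (or is connected to) the Coherence management node
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Name pattern for per-cache MBeans; adjust cache/service names to your config
            ObjectName pattern = new ObjectName("Coherence:type=Cache,name=message_payload,*");
            Set<ObjectName> names = server.queryNames(pattern, null);
            for (ObjectName name : names) {
                Number queueSize = (Number) server.getAttribute(name, "QueueSize");
                if (queueSize.intValue() > QUEUE_SIZE_THRESHOLD) {
                    System.err.println("ALERT: write-behind queue on " + name
                            + " holds " + queueSize + " unsaved entries");
                }
            }
        }
    }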

  • Write-Behind Caching and Multiple Puts

    What happens when two consecutive puts are performed on the write-behind cache for the same key? Will CacheStore's store() or storeAll() be invoked once for every put() or only once for the last put() (the one which overrode the previous cached values)?

    Hi Denis,
     If you use write-behind, there will be no unnecessary database updates - only the last put() will result in a database update.
         Regards,
         Dimitri
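In other words (a minimal sketch; the cache name and values are made up):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class CoalescingDemo {
        public static void main(String[] args) {
            NamedCache cache = CacheFactory.getCache("example"); // hypothetical write-behind cache
            cache.put("key-1", "v1");
            cache.put("key-1", "v2"); // replaces the value still sitting in the write-behind queue
            // When the write-delay expires, the CacheStore receives a single
            // store("key-1", "v2"); the intermediate "v1" is never written out.
        }
    }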

  • Write-Behind Caching and Limited Internal Cache Size

    Let's say I have a write-behind cache and configure its internal cache to be of a fixed limited size, e.g. 10000 units. What would happen if more than 10000 units are added to the write-behind cache within the write-delay period? Would my CacheStore's storeAll() get all of the added values or would some of the values be missed because of the internal cache size limitation?

    Hi Denis,
         > If an entry is removed while it is still in the
         > write-behind queue, it will be removed from the queue
         > and CacheStore.store(oKey, oValue) will be invoked
         > immediately.
         >
         > Regards,
         > Dimitri
         Dimitri,
         Just to confirm that I understand it right: if there is a queued update to a key which is then remove()-ed from the cache, the following happens:
         First, CacheStore.store(key, queuedUpdateValue) is invoked.
         Afterwards, CacheStore.erase(key) is invoked.
         Both happen synchronously with the remove() call.
         I expected that only erase would be invoked.
         BR,
         Robert
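Robert's reading of the sequence, sketched with hypothetical names (the excerpt contains no confirmation of this reading, so treat it as his interpretation rather than documented behavior):

    cache.put("k", "v1");  // update sits in the write-behind queue
    cache.remove("k");     // synchronously: first store("k", "v1") flushes the
                           // queued update, then erase("k") deletes the row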

  • Write-Behind Caching and Old Values

    Is there a way to access the old value cached in the write-behind cache for the same key from the CacheStore's store() or storeAll() method?

> I have a business POJO with three parts: partA,
         > partB, partC inside. Each of these three parts is
         > persisted by a separate SQL. So, every time I persist
         > my POJO, up to 3 SQLs may be executed.
         I understand.
         > When a change happens in my POJO, it goes onto the
         > write-behind queue. In my CacheStore.store() or
         > CacheStore.storeAll() I would like to be able to make
         > an intelligent decision about which of the three
         > parts: partA, partB or partC has actually changed and
         > only run the SQL updates for the changed parts. This
         > would allow me to avoid massive amounts of
         > unnecessary SQL updates for the parts that did not
         > change.
         Right. Keep in mind that there are two conditions that you must be aware of:
         1) Multiple updates could have occurred to the object, meaning that the database update would have to "roll up" the results of multiple changes to the object.
         2) Some or all of the updates could have already occurred to the database. This may be a little trickier to understand, but it reflects the possible machine failure conditions that occurred while a write-behind was in progress.
         Although the latter are unlikely, they should be accounted for, and of course they are harder to test for with certainty. As a result, the updates to the information (the CacheStore implementation) must be built in an "idempotent" manner, i.e. allowing it to be executed more than once with no additional side-effects.
         > If I had access to the POJO stored under the same key
         > before the new value was put in cache, I could use
         > equals() on each of the three parts to find out
         > exactly which one of them changed.
         While this is true, you would need to compare the "known previous database state" version, not just the "old" version.
         > Of course, if this functionality is not available, I
         > would have to create dirty flags for each of the
         > three POJO parts. But I can't really clear my POJO's
         > flags and recache the POJO from within the store() or
         > storeAll(), right?
         Yes, but remember that those flags are "could be dirty" flags, because of the above failure modes that I described.
         Peace,
         Cameron Purdy
         Tangosol Coherence: The Java Data Grid
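A minimal sketch of the idempotency point Cameron makes (all type and method names here are invented; a real data access layer would use MERGE/upsert statements):

    public class IdempotentPojoStore {
        /** Stand-in for the real data access layer. */
        interface PartDao {
            void upsertPartA(Object key, Object part); // e.g. MERGE / INSERT-or-UPDATE
            void upsertPartB(Object key, Object part);
            void upsertPartC(Object key, Object part);
        }

        /** Stand-in for the business POJO with three independently persisted parts. */
        interface MyPojo {
            Object getPartA();
            Object getPartB();
            Object getPartC();
        }

        private final PartDao dao;

        IdempotentPojoStore(PartDao dao) {
            this.dao = dao;
        }

        /** CacheStore-style store(): safe to execute more than once for the same update. */
        public void store(Object key, Object value) {
            MyPojo pojo = (MyPojo) value;
            // Upserts roll multiple in-memory changes up into one database write,
            // and replaying them after a machine failure causes no extra side effects.
            dao.upsertPartA(key, pojo.getPartA());
            dao.upsertPartB(key, pojo.getPartB());
            dao.upsertPartC(key, pojo.getPartC());
        }
    }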

  • Migrating from 3.1 to 3.7 - write through for a custom cache store issues

We're migrating from 3.1 to 3.7. So far the migration and testing have been fairly uneventful, but there is one issue that came up yesterday that seems like it is going to be tricky to debug.
We have a set of storage-enabled nodes that use a custom CacheStore to read from and write behind to a mongo database. On another node connected to that caching service, read-throughs work just fine (I can set breakpoints on the CacheStore load method and see the load calls coming through) - but what's not working is when the other node does a Cache.put - the store method on the CacheStore is never called, and so far I don't see anything in the logs indicating there is a problem on either side (I'm going to make sure that the Coherence logging is up to the highest level on both nodes today when I'm doing more testing).
I can see the cache put start to dive into the coherence jar, but I don't have source jars for coherence, so it's fairly opaque what might be going wrong after the Cache.put(object, object) call. I can see that it dives into various Coherence methods, but beyond that it's hard to tell.
Any ideas on where to start debugging this?
This setup worked fine on 3.1, and as best we can tell all the API calls were converted over to their proper Coherence 3.7 versions, and the coherence.xml files were migrated to use the new xsd, etc.

    it seems that the issue might be related to this:
    2012-08-15 14:19:34.086 Tangosol Coherence 3.7.1.5 <Error> (thread=WriteBehindThread:CacheStoreWrapper(com.foo.cache.MongoCacheStore):Foo.com-CMS, member=13): Failed to store key="assetId=DEFAULT;assetStyle=DEFAULT;initial=c;siteId=foosite;"
    2012-08-15 14:19:34.087 Tangosol Coherence 3.7.1.5 <Error> (thread=WriteBehindThread:CacheStoreWrapper(com.foo.configrepo.cache.MongoCacheStore):Foo.com-CMS, member=13): (Wrapped) java.io.StreamCorruptedException: invalid type: 13
         at com.tangosol.util.ExternalizableHelper.fromBinary(ExternalizableHelper.java:266)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$ConverterFromBinary.convert(PartitionedCache.CDB:4)
         at com.tangosol.net.cache.BackingMapBinaryEntry.getValue(BackingMapBinaryEntry.java:124)
         at com.tangosol.net.cache.ReadWriteBackingMap$CacheStoreWrapper.storeInternal(ReadWriteBackingMap.java:5731)
         at com.tangosol.net.cache.ReadWriteBackingMap$StoreWrapper.store(ReadWriteBackingMap.java:4814)
         at com.tangosol.net.cache.ReadWriteBackingMap$WriteThread.run(ReadWriteBackingMap.java:4217)
         at com.tangosol.util.Daemon$DaemonWorker.run(Daemon.java:803)
         at java.lang.Thread.run(Thread.java:662)
    Caused by: java.io.StreamCorruptedException: invalid type: 13
         at com.tangosol.util.ExternalizableHelper.readObjectInternal(ExternalizableHelper.java:2303)
         at com.tangosol.util.ExternalizableHelper.deserializeInternal(ExternalizableHelper.java:2746)
         at com.tangosol.util.ExternalizableHelper.fromBinary(ExternalizableHelper.java:262)
         ... 7 more
Looks like it is an issue with the serialization? We're primarily using XmlBean, not POF, for serialization.
    Any tips on troubleshooting this?

  • How can I use the same thread pool implementation for different tasks?

    Dear java programmers,
I have written a class which submits Callable tasks to a thread pool while illustrating the progress of the overall procedure in a JFrame with a progress bar and text area. I want to use this class for several applications in which the process, and consequently the Callable object, varies. Simplified, my code looks like this:
    int threadPoolSize = 4;
    String[] chainArray = predock.PrepareDockEnvironment();
    int chainArrayLength = chainArray.length;
    String score = "null";
    ExecutorService executor = Executors.newFixedThreadPool(threadPoolSize);
    CompletionService<String> referee = new ExecutorCompletionService<String>(executor);

    // fill the pool with the first batch of tasks
    for (int i = 0; i < threadPoolSize; i++) {
        System.out.println("Submitting new thread for chain " + chainArray[i]);
        referee.submit(new Parser(chainArray[i]));
    }

    // each time a result is collected, submit the next task
    for (int chainIndex = threadPoolSize; chainIndex < chainArrayLength; chainIndex++) {
        System.out.println("Submitting new thread for chain " + chainArray[chainIndex]);
        referee.submit(new Parser(chainArray[chainIndex]));
        score = referee.poll(10, TimeUnit.MINUTES).get();
        System.out.println("The next score is " + score);
    }

    executor.shutdown();

    // collect the results of the final threadPoolSize tasks
    int index = chainArrayLength - threadPoolSize;
    score = "null";
    while (!executor.isTerminated()) {
        score = referee.poll(10, TimeUnit.MINUTES).get();
        System.out.println("The next score is " + score);
        index++;
    }
My question is: how can I replace the Parser object with something changeable, so that I can set it accordingly whenever I call this method to conduct a different task?
thanks,
Tom

OK, let's start from the beginning with more details. I have a class called ProgressGUI which opens a small window with 2 buttons ("start" and "stop"), a progress bar and a text area. It also implements a thread pool to conduct the analysis of multiple files.
My main GUI, which is much bigger than the latter, is in a class named GUI. There are 3 types of operations which use the thread pool, each one encapsulated in a different class (SMAP, Dock, EP). The user can set the necessary parameters and, on clicking a button, the ProgressGUI window opens and depicts the progress of the respective operation at each time step.
The code I posted is taken from ProgressGUI.class; at the moment, in order to conduct one of the supported operations, I replace "new Parser(chainArray[i])" with either "new SMAP(chainArray[i])", "new Dock(chainArray[i])" or "new EP(chainArray[i])". It would be redundant to have exactly the same thread pool implementation (shown in my first post) written 3 different times, when the only thing that needs to change is "new Parser(chainArray[i])".
What I thought at first was defining an abstract method named MainOperation and replacing "new Parser(chainArray[i])" with:

    new Callable() {
        public void call() {
            MainOperation();
        }
    }

For instance, when one wants to use SMAP.class, he would implement MainOperation as:

    public String MainOperation() {
        return new SMAP(chainArray[i]);
    }

That's the most reasonable explanation I can give, but apparently an abstract method cannot be called anywhere else in the abstract class (ProgressGUI.class in my case).
> Firstly it should be Callable not Runnable.
> Can you explain why? You are just running a method and ignoring any result or exception. However, it makes little difference.
ExecutorCompletionService takes Future objects as input, that's why it should be Callable and not Runnable. The returned value is a score (String).
> Secondly how can I change that runMyNewMethod() on demand, can I do it by defining it as abstract?
> How do you want to determine which method to run?
The user will click on the appropriate button and the GUI will initialize (perhaps implicitly) the body of the abstract method MainOperation accordingly. Don't worry about that, this is not the point.
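A common way to get this pluggable behavior (a sketch under assumptions: ProgressRunner and TaskFactory are invented names, and SMAP stands in for the poster's task classes) is to hand the pool a factory that creates the Callable for each input, so the pool logic is written only once:

    import java.util.concurrent.Callable;
    import java.util.concurrent.CompletionService;
    import java.util.concurrent.ExecutorCompletionService;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ProgressRunner {
        /** Factory that turns one input (a chain name) into the task to run. */
        public interface TaskFactory {
            Callable<String> create(String chain);
        }

        public static void runAll(String[] chains, int poolSize, TaskFactory factory)
                throws Exception {
            ExecutorService executor = Executors.newFixedThreadPool(poolSize);
            CompletionService<String> referee = new ExecutorCompletionService<String>(executor);
            for (String chain : chains) {
                referee.submit(factory.create(chain));
            }
            executor.shutdown();
            for (int i = 0; i < chains.length; i++) {
                // poll() may return null on timeout; error handling omitted for brevity
                String score = referee.poll(10, TimeUnit.MINUTES).get();
                System.out.println("The next score is " + score);
            }
        }
    }

    // Usage, e.g. for the SMAP operation:
    // ProgressRunner.runAll(chainArray, 4, new ProgressRunner.TaskFactory() {
    //     public Callable<String> create(String chain) { return new SMAP(chain); }
    // });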

  • ACE: Significance of mask in nat-pools configured for Source NAT

    Hi guys
If I am using source NAT in the ACE (one IP address, 10.10.10.200, used for all client address translations):
What would be the difference between nat-pools configured with different netmasks?
What is the recommended netmask for PAT, 255.255.255.255 or the VLAN interface's mask (/24 in this case),
and why?
    case1:
    interface vlan 7
    ip address 10.10.10.100 255.255.255.0
    nat-pool 1 10.10.10.200 10.10.10.200 netmask 255.255.255.0 pat
    service-policy input clientvips
    no shutdown
    case2:
    interface vlan 7
    ip address 10.10.10.100 255.255.255.0
    nat-pool 1 10.10.10.200 10.10.10.200 netmask 255.255.255.255 pat
    service-policy input clientvips
    no shutdown
    Thanks in Advance
    A.

    Gilles
    Thanks a lot. It makes more sense now.
I posted another question for an ACE design validation. Could you please validate it?
    I am planning to deploy ACE module in following manner:
    > ACE will be in one arm mode ( Only one vlan connected to the ACE).
    > Vips & Rservers (all serverfarms) will be in the same Vlan X.
    > Default gateway on the ACE & Real servers will be the upstream router
    > There will be Source NAT configured for all Serverfarms.
    ACE --- Vlan X -------Router--- internet
    .................|
    .................|-- Sfarm 1
    .................|
    .................|-- Sfarm 2
    .................|
    .................|-- Sfarm n
    I am pretty sure that it should work.
    Just wanted an expert opinion.
    Thanks

  • What is the best hard drive configuration for disk/media cache?

I just installed a 480 GB PCI SSD for my boot drive. I also have a 240 GB SSD and a 1TB 7200 rpm hd. Can anyone suggest which drives to place my source files, aep files, media and disk cache, and output on for the best read/write speed? Thanks for any advice. Same question for Premiere Pro too, if you know!
    I'm using a Mac Pro dual proc, 12 core, 40 GB.

    Hi YoshBear,
    Welcome to Adobe forums,
We recommend that you go through the following links, as they provide more information about setting up hardware for Premiere Pro and After Effects:
http://forums.adobe.com/thread/878520
http://helpx.adobe.com/after-effects/using/improve-performance.html
http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/products/creativesuite/production/cs6/pdfs/adobe-hardware-performance-whitepaper.pdf
    Regards
    Abhishek

  • TTL specified in put operation doesn't always work when using write-behind

I'm using a distributed cache with a write-behind cache store (see the config below). I found that when I do something like myCache.put(key, value, ttl), the entry survives beyond the specified ttl. I tried doing the same with a distributed cache with a write-through cachestore, and there everything happens correctly.
Is this sort of operation not permitted in caches backed by a write-behind cachestore? If not, wouldn't it be better to throw an UnsupportedOperationException?
I created a small test to simulate this. I added values to the cache with a TTL of 1 to 10 seconds and found that the 10-second entries stayed in the cache.
    Configuration used:
    <?xml version="1.0"?>
    <!DOCTYPE cache-config SYSTEM "cache-config.dtd">
    <cache-config>
         <caching-scheme-mapping>
              <cache-mapping>
                   <cache-name>TTL_TEST</cache-name>
                   <scheme-name>testScheme</scheme-name>
              </cache-mapping>
         </caching-scheme-mapping>
         <caching-schemes>
              <distributed-scheme>
                   <scheme-name>testScheme</scheme-name>
                   <service-name>testService</service-name>
                   <backing-map-scheme>
                        <read-write-backing-map-scheme>
                             <internal-cache-scheme>
                                  <local-scheme>
                                       <service-name>testBackLocalService</service-name>
                                  </local-scheme>
                             </internal-cache-scheme>
                             <cachestore-scheme>
                                  <class-scheme>
                                       <scheme-name>testBackStore</scheme-name>
                                       <class-name>TTLTestServer$TestCacheStore</class-name>
                                  </class-scheme>
                             </cachestore-scheme>
                             <write-delay>3s</write-delay>
                        </read-write-backing-map-scheme>
                   </backing-map-scheme>
                   <local-storage>true</local-storage>
                   <autostart>true</autostart>
              </distributed-scheme>
         </caching-schemes>
</cache-config>

Code of test:
    import java.util.Collection;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import org.joda.time.DateTime;
    import org.joda.time.Duration;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.util.StopWatch;
    import org.testng.annotations.BeforeClass;
    import org.testng.annotations.Test;
    import com.google.common.collect.Lists;
    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.net.cache.CacheStore;
@Test
public class TTLTestServer {
    private static final int RETRIES = 5;
    private static final Logger logger = LoggerFactory.getLogger(TTLTestServer.class);

    private NamedCache m_cache;

    /** List of time-to-lives in seconds to check. */
    private final List<Integer> m_listOfTTLs = Lists.newArrayList(1, 3, 5, 10);

    /** Test is done in separate threads to speed up the test. */
    private final ExecutorService m_executorService = Executors.newCachedThreadPool();

    @BeforeClass
    public void setup() {
        logger.info("Getting the cache");
        m_cache = CacheFactory.getCache("TTL_TEST");
    }

    /** Empty CacheStore. */
    public static class TestCacheStore implements CacheStore {
        public void erase(Object arg0) {}
        public void eraseAll(Collection arg0) {}
        public void store(Object arg0, Object arg1) {}
        public void storeAll(Map arg0) {}
        public Object load(Object arg0) { return null; }
        public Map loadAll(Collection arg0) { return null; }
    }

    public void testTTL() throws InterruptedException, ExecutionException {
        logger.info("Starting TTL test");
        List<Future<StopWatch>> futures = Lists.newArrayList();
        for (final Integer ttl : m_listOfTTLs) {
            futures.add(m_executorService.submit(new Callable<StopWatch>() {
                public StopWatch call() throws Exception {
                    StopWatch stopWatch = new StopWatch("TTL=" + ttl);
                    for (int retry = 0; retry < RETRIES; retry++) {
                        logger.info("Adding a value in cache for TTL={} in try={}", ttl, retry + 1);
                        stopWatch.start("Retry=" + retry);
                        m_cache.put(ttl, null, ttl * 1000);
                        waitUntilNotInCacheAnymore(ttl, retry);
                        stopWatch.stop();
                    }
                    return stopWatch;
                }

                private void waitUntilNotInCacheAnymore(final Integer ttl, final int currentTry) throws InterruptedException {
                    DateTime startTime = new DateTime();
                    long maxMillisToWait = ttl * 2 * 1000; // wait at most twice the TTL
                    while (m_cache.containsKey(ttl)) {
                        Duration timeTaken = new Duration(startTime, new DateTime());
                        if (timeTaken.getMillis() > maxMillisToWait) {
                            throw new RuntimeException("Already waiting " + timeTaken + " for ttl=" + ttl + " and retry=" + currentTry);
                        }
                        Thread.sleep(1000);
                    }
                }
            }));
        }
        logger.info("Waiting until all futures are finished");
        m_executorService.shutdown();
        logger.info("Getting results from futures");
        for (Future<StopWatch> future : futures) {
            StopWatch sw = future.get();
            logger.info(sw.prettyPrint());
        }
    }
}

Failure message:
    FAILED: testTTL
    java.util.concurrent.ExecutionException: java.lang.RuntimeException: Already waiting PT20.031S for ttl=10 and retry=0
         at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
         at java.util.concurrent.FutureTask.get(Unknown Source)
         at TTLTestServer.testTTL(TTLTestServer.java:159)
    Caused by: java.lang.RuntimeException: Already waiting PT20.031S for ttl=10 and retry=0
         at TTLTestServer$1.waitUntilNotInCacheAnymore(TTLTestServer.java:139)
         at TTLTestServer$1.call(TTLTestServer.java:122)
         at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
         at java.util.concurrent.FutureTask.run(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
     at java.lang.Thread.run(Unknown Source)
I'm using Coherence 3.4.2.
    Best regards
    Jan

Hi, still no luck. However, I noticed that setting the write-delay value of the write-behind store to 0s or 1s solved the problem. It only starts giving me "the node has already been removed" exceptions once the write-delay value is 2s or higher.
    You can find the coherence-cache-config.xml below:
    <?xml version="1.0"?>
    <!DOCTYPE cache-config SYSTEM "cache-config.dtd">
    <cache-config>
         <caching-scheme-mapping>
              <cache-mapping>
                   <cache-name>TTL_TEST</cache-name>
                   <scheme-name>testScheme</scheme-name>
              </cache-mapping>
         </caching-scheme-mapping>
         <caching-schemes>
              <distributed-scheme>
                   <scheme-name>testScheme</scheme-name>
                   <service-name>testService</service-name>
                   <backing-map-scheme>
                        <read-write-backing-map-scheme>
                             <internal-cache-scheme>
                                  <local-scheme>
                                       <service-name>testBackLocalService</service-name>
                                  </local-scheme>
                             </internal-cache-scheme>
                             <cachestore-scheme>
                                  <class-scheme>
                                       <scheme-name>testBackStore</scheme-name>
                                       <class-name>TTLTestServer$TestCacheStore</class-name>
                                  </class-scheme>
                             </cachestore-scheme>
                             <write-delay>2s</write-delay>
                        </read-write-backing-map-scheme>
                   </backing-map-scheme>
                   <local-storage>true</local-storage>
                   <autostart>true</autostart>
              </distributed-scheme>
         </caching-schemes>
</cache-config>

You can find the test program below:
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.Map;
    import org.joda.time.DateTime;
    import org.joda.time.Duration;
    import org.springframework.util.StopWatch;
    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.net.cache.CacheStore;
public class TTLTestServer {
    private static final int RETRIES = 5;

    private NamedCache m_cache;

    /** List of time-to-lives in seconds to check. */
    private final List<Integer> m_listOfTTLs = new ArrayList<Integer>();

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        new TTLTestServer().test();
    }

    /**
     * Empty CacheStore
     * @author jbe
     */
    public static class TestCacheStore implements CacheStore {
        public void erase(Object arg0) {}

        @SuppressWarnings("unchecked")
        public void eraseAll(Collection arg0) {}

        public void store(Object arg0, Object arg1) {}

        @SuppressWarnings("unchecked")
        public void storeAll(Map arg0) {}

        public Object load(Object arg0) { return null; }

        @SuppressWarnings("unchecked")
        public Map loadAll(Collection arg0) { return null; }
    }

    /**
     * Sets up and executes the test, setting values in a cache with a given
     * time-to-live value and waiting for each value to disappear.
     * @throws Exception
     */
    private void test() throws Exception {
        System.out.println(new DateTime() + " - Setting up TTL test");
        m_cache = CacheFactory.getCache("TTL_TEST");
        m_listOfTTLs.add(1);
        m_listOfTTLs.add(3);
        m_listOfTTLs.add(5);
        m_listOfTTLs.add(10);
        System.out.println(new DateTime() + " - Starting TTL test");
        for (final Integer ttl : m_listOfTTLs) {
            StopWatch sw = doTest(ttl);
            System.out.println(sw.prettyPrint());
        }
    }

    /**
     * Adds a value to the cache with the time-to-live given by the ttl parameter
     * and waits until it's removed from the cache. Repeats this {@link #RETRIES} times.
     * @param ttl
     * @return
     * @throws Exception
     */
    private StopWatch doTest(Integer ttl) throws Exception {
        StopWatch stopWatch = new StopWatch("TTL=" + ttl);
        for (int retry = 0; retry < RETRIES; retry++) {
            System.out.println(new DateTime() + " - Adding a value in cache for TTL=" + ttl + " in try= " + (retry + 1));
            stopWatch.start("Retry=" + retry);
            m_cache.put(ttl, null, ttl * 1000);
            waitUntilNotInCacheAnymore(ttl, retry);
            stopWatch.stop();
        }
        return stopWatch;
    }

    /**
     * Wait until the value for the given ttl is not in the cache anymore.
     * @param ttl
     * @param currentTry
     * @throws InterruptedException
     */
    private void waitUntilNotInCacheAnymore(final Integer ttl, final int currentTry) throws InterruptedException {
        DateTime startTime = new DateTime();
        long maxMillisToWait = ttl * 2 * 1000; // wait at most twice the TTL
        while (m_cache.containsKey(ttl)) {
            Duration timeTaken = new Duration(startTime, new DateTime());
            if (timeTaken.getMillis() > maxMillisToWait) {
                throw new RuntimeException("Already waiting " + timeTaken + " for ttl=" + ttl + " and retry=" + currentTry);
            }
            Thread.sleep(1000);
        }
    }
}

You can find the output below:
    2009-12-03T11:50:04.584+01:00 - Setting up TTL test
    2009-12-03 11:50:04.803/0.250 Oracle Coherence 3.5.2/463p2 <Info> (thread=main, member=n/a): Loaded operational configuration from resource "jar:file:/C:/Temp/coherence3.5.2/coherence-java-v3.5.2b463-p1_2/coherence/lib/coherence.jar!/tangosol-coherence.xml"
    2009-12-03 11:50:04.803/0.250 Oracle Coherence 3.5.2/463p2 <Info> (thread=main, member=n/a): Loaded operational overrides from resource "jar:file:/C:/Temp/coherence3.5.2/coherence-java-v3.5.2b463-p1_2/coherence/lib/coherence.jar!/tangosol-coherence-override-dev.xml"
    2009-12-03 11:50:04.803/0.250 Oracle Coherence 3.5.2/463p2 <D5> (thread=main, member=n/a): Optional configuration override "/tangosol-coherence-override.xml" is not specified
    2009-12-03 11:50:04.803/0.250 Oracle Coherence 3.5.2/463p2 <D5> (thread=main, member=n/a): Optional configuration override "/custom-mbeans.xml" is not specified
    Oracle Coherence Version 3.5.2/463p2
    Grid Edition: Development mode
    Copyright (c) 2000, 2009, Oracle and/or its affiliates. All rights reserved.
    2009-12-03 11:50:04.943/0.390 Oracle Coherence GE 3.5.2/463p2 <Info> (thread=main, member=n/a): Loaded cache configuration from "file:/C:/jb/workspace3.5/TTLTest/target/classes/coherence-cache-config.xml"
    2009-12-03 11:50:05.318/0.765 Oracle Coherence GE 3.5.2/463p2 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
    2009-12-03 11:50:08.568/4.015 Oracle Coherence GE 3.5.2/463p2 <Info> (thread=Cluster, member=n/a): Created a new cluster "cluster:0xD3FB" with Member(Id=1, Timestamp=2009-12-03 11:50:05.193, Address=172.16.44.32:8088, MachineId=36896, Location=process:11848, Role=TTLTestServerTTLTestServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) UID=0xAC102C20000001255429380990201F98
    2009-12-03 11:50:08.584/4.031 Oracle Coherence GE 3.5.2/463p2 <D5> (thread=Invocation:Management, member=1): Service Management joined the cluster with senior service member 1
    2009-12-03 11:50:08.756/4.203 Oracle Coherence GE 3.5.2/463p2 <D5> (thread=DistributedCache:testService, member=1): Service testService joined the cluster with senior service member 1
    2009-12-03T11:50:08.803+01:00 - Starting TTL test
    2009-12-03T11:50:08.818+01:00 - Adding a value in cache for TTL=1 in try= 1
    2009-12-03T11:50:09.818+01:00 - Adding a value in cache for TTL=1 in try= 2
    Exception in thread "main" (Wrapped: Failed request execution for testService service on Member(Id=1, Timestamp=2009-12-03 11:50:05.193, Address=172.16.44.32:8088, MachineId=36896, Location=process:11848, Role=TTLTestServerTTLTestServer)) java.lang.IllegalStateException: the node has already been removed
         at com.tangosol.util.Base.ensureRuntimeException(Base.java:293)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:36)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onContainsKeyRequest(DistributedCache.CDB:41)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$ContainsKeyRequest.run(DistributedCache.CDB:1)
         at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:12)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:9)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:136)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onNotify(DistributedCache.CDB:3)
         at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
         at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.IllegalStateException: the node has already been removed
         at com.tangosol.util.AbstractSparseArray$Crawler.remove(AbstractSparseArray.java:1274)
         at com.tangosol.net.cache.OldCache.evict(OldCache.java:580)
         at com.tangosol.net.cache.OldCache.containsKey(OldCache.java:171)
         at com.tangosol.net.cache.ReadWriteBackingMap.containsKey(ReadWriteBackingMap.java:597)
         at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache.onContainsKeyRequest(DistributedCache.CDB:25)
         ... 7 more
    2009-12-03 11:50:10.834/6.281 Oracle Coherence GE 3.5.2/463p2 <D4> (thread=ShutdownHook, member=1): ShutdownHook: stopping cluster node
2009-12-03 11:50:10.834/6.281 Oracle Coherence GE 3.5.2/463p2 <D5> (thread=Cluster, member=1): Service Cluster left the cluster
Best regards
    Jan

  • Can a db slowdown with write-behind cause a slowdown in cache operations?

If we have a Coherence cluster, and one cache configured with write-behind is having trouble writing to the db (i.e., it's slow), and we keep adding objects to the cache faster than the db can consume them, will flow control kick in and cause the writes to the cache to block or slow down? I.e., the classic producer-consumer problem, where we are adding objects to the cache faster than the cachestore can consume them.
What happens in this case? Will flow control kick in and block writes to the cache? Will an internal buffer just keep growing? Are there any knobs to tweak this behavior (e.g., in the case of spikes, where the producer temporarily produces faster than the consumer can consume, but then things go back to normal)?

user9222505 wrote:
I believe we discovered that the same thread pool is used for all requests to the cache, including gets, puts and calls into the cachestore. So if the writes are slow within the cachestore, then it uses up all of the threads and slows everything down.

Hi,
This is not really correct.
If a cache in a service is configured to use write-behind, then a separate thread for that service is started, which deals with write-behind store and storeAll operations.
The remove operations need to be handled synchronously to avoid corruption of the data-set in the scenario of reading an entry from the cache immediately after removing it (if it were not synchronously deleted from the backing storage, then reading it back could give an incorrect non-null value). Therefore remove operations are handled synchronously on the service / worker thread, and not delayed on the write-behind thread.
Gets are also handled synchronously, so they again are served on the service / worker thread.
So if the puts are slow and wait too much, that may delay other puts but should not contend with other threads. If the puts are computation-intensive, then obviously they hinder other threads because they consume the same CPU resource, not simply because they execute.
    Best regards,
    Robert

  • Cache write-behind complete check

Is there a surefire way to check that a cache, which has a store persisting objects to the database and write-behind set to 2 seconds, has persisted all objects put into the cache?
We have tried using JMX, querying the cache's QueueSize and waiting until it reaches 0. It turns out that when putting objects into the write-behind cache, the write-behind queue is not necessarily non-zero immediately after the put(s); e.g. QueueSize may be 0 even though objects still need to be persisted.
    For our nightly integration tests we need to clear out the cache, but want to make sure we do not call NamedCache.clear() on a cache that still has objects that need to be persisted.
    Any ideas?

    Hi Rob,
The problem may actually be the timeliness of updates to the QueueSize JMX attribute, as we're using the MBeanConnector to obtain information on our cache members (and providers). Assuming that objects actually make it to the write-behind queue during the cache put call, and since certain tests need to be sure these objects are persisted, instead of the accounting approach discussed previously I followed up on a forum thread about ReadWriteBackingMap flush calls.
To get access to ReadWriteBackingMap.flush(), I created a small test today using a subclass of ReadWriteBackingMap that registers the backing map with our CacheStore implementation:

    protected void configureCacheStore(CacheStore store, boolean readOnly) {
        super.configureCacheStore(store, readOnly);
        if (store instanceof CacheLoaderWriterProvider) {
            ((CacheLoaderWriterProvider) store).registerBackingMap(this);
        }
    }

Our cachestore (CacheLoaderWriterProvider) in turn exposes a call to the registered map's flush method as a JMX operation.
    Whenever we need to be sure the write-behind queue is empty during our tests, we'll call this JMX operation.
    Best Regards,
    Marcel.
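The JMX side Marcel describes might look roughly like this (a sketch under assumptions: the MBean interface and method names are invented, and only the flush plumbing is shown):

    import com.tangosol.net.cache.ReadWriteBackingMap;

    public class FlushableStoreSupport implements FlushableStoreSupportMBean {
        private volatile ReadWriteBackingMap m_mapBacking;

        /** Called from the backing map subclass's configureCacheStore(). */
        public void registerBackingMap(ReadWriteBackingMap map) {
            m_mapBacking = map;
        }

        /** Exposed as a JMX operation; tests invoke it before clearing the cache. */
        public void flushWriteBehindQueue() {
            ReadWriteBackingMap map = m_mapBacking;
            if (map != null) {
                map.flush(); // writes out everything still sitting in the write-behind queue
            }
        }
    }

    /** Hypothetical MBean interface for the operation above. */
    interface FlushableStoreSupportMBean {
        void flushWriteBehindQueue();
    }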

  • Coherence 3.3.1 Version, Write Behind Replicated Cache Error

    Hi,
I am using Coherence 3.3.1; I have a write-behind cache, and the put method is throwing the following exception:
java.lang.IllegalArgumentException: Invalid internal format: Inactive
at com.tangosol.coherence.component.util.BackingMapManagerContext.addInternalValueDecoration(BackingMapManagerContext.CDB:11)
at com.tangosol.net.cache.ReadWriteBackingMap.put(ReadWriteBackingMap.java:737)
at com.tangosol.coherence.component.util.CacheHandler.onLeaseUpdate(CacheHandler.CDB:52)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.ReplicatedCache.performUpdate(ReplicatedCache.CDB:11)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.ReplicatedCache.onLeaseUpdateRequest(ReplicatedCache.CDB:22)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.ReplicatedCache$LeaseUpdateRequest.onReceived(ReplicatedCache.CDB:5)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.onMessage(Service.CDB:9)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service.onNotify(Service.CDB:123)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.ReplicatedCache.onNotify(ReplicatedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:35)
at java.lang.Thread.run(Thread.java:534)
============
The same cache works fine if I change the value of the <write-delay-seconds> parameter to 0, i.e. if I make the cache write-through.
Could someone help me out with this issue?
    -thanks
    Krishan

    Write-behind caching is not supported with Replicated cache. Even with write-through, you'll end up generating replicated writes back to the back-end database, drastically increasing load.
    For more details, please see:
    http://wiki.tangosol.com/display/COH33UG/Read-Through,+Write-Through,+Refresh-Ahead+and+Write-Behind+Caching
For applications where write-behind would be used, the partitioned (distributed) cache is almost always a far better option. Is there a reason not to use it?
    Jon Purdy
    Oracle
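A minimal sketch of the change Jon suggests (scheme name made up): map the cache to a distributed-scheme instead of a replicated-scheme, keeping the existing read-write-backing-map-scheme contents unchanged.

    <distributed-scheme>
        <scheme-name>write-behind-scheme</scheme-name>
        <backing-map-scheme>
            <read-write-backing-map-scheme>
                <!-- existing internal-cache-scheme, cachestore-scheme and
                     write-delay-seconds configuration goes here unchanged -->
            </read-write-backing-map-scheme>
        </backing-map-scheme>
        <autostart>true</autostart>
    </distributed-scheme>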
