Coherence filters

Hi,
I am facing a few problems with Coherence filters.
My code looks like this:
     ValueExtractor extractor = new ReflectionExtractor("getMethod");
     Set<Map.Entry<String, ValueObject>> entries = cache.entrySet(new LikeFilter(extractor, datakey + "%", '~', true));
     ValueObject vo;
     List<Double> hmlist1 = new ArrayList<Double>();
     for (Map.Entry<String, ValueObject> entry : entries) {
          vo = entry.getValue();
          System.out.println("Search Value:" + vo.getData());
          double data = vo.getData();
          hmlist1.add(data);
     }
     System.out.println("the size and values-->" + hmlist1.size() + "......." + hmlist1.get(9));
     double min = (Double) cache.aggregate(hmlist1,
               new DoubleMin(extractor));
My question is:
I am able to get the size as well as the values of the list hmlist1 separately, as shown by the System.out calls above.
When I pass the same list hmlist1 to the aggregate function to find the min value, I get a NullPointerException.
I also tried to get the count of hmlist1 using the Count aggregator, and the count comes back as 0.
How can the size become zero inside the aggregate function?
Can anyone please help me?
Thanks in advance.
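
A minimal sketch of the likely mismatch, reusing the cache, extractor, and datakey from the question above (an editorial illustration, not the poster's confirmed intent): the aggregate(Collection, EntryAggregator) overload treats the collection as cache KEYS, so a list of extracted values selects no entries; with no entries, Count returns 0 and DoubleMin returns null, which throws a NullPointerException when unboxed into double. The aggregate(Filter, EntryAggregator) overload aggregates over the matching entries directly:

     // hmlist1 holds extracted values, not cache keys, so this selects no entries:
     // double min = (Double) cache.aggregate(hmlist1, new DoubleMin(extractor));

     // Aggregating over the same filtered entries instead:
     Filter like = new LikeFilter(extractor, datakey + "%", '~', true);
     double min = (Double) cache.aggregate(like, new DoubleMin(extractor));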

918079 wrote:
[the question above, quoted in full]

Hi,
which exact Coherence version (down to the patch level, so 3.x.x.x) do you use?
Also, could you please post that exception?
Best regards,
Robert

Similar Messages

  • Custom aggregators and filters in coherence

Hi
I want to get the max, min and avg functions using custom aggregators for my cache data.
I need to filter the data in the cache, so I am using LikeFilter to filter the data, and then I need to apply the custom aggregation on the filtered data.
Can anyone send me an example of how to implement this?

918079 wrote:
[the question above, quoted in full]

This is the forum for Application Express, not "Coherence".
    (Please indulge my curiosity: after previously making 4 posts in the Coherence forum, why did you post this here?)

  • Custom aggregators and filters in coherence aggregators

Hi
I want to get the max, min and avg functions using custom aggregators for my cache data.
I need to filter the data in the cache, so I am using LikeFilter to filter the data, and then I need to apply the custom aggregation on the filtered data.
Can anyone send me an example of how to implement this?

    918079 wrote:
I need to get the min, max, count and avg values for lakhs of records, multiple times, and set them on a value object.
If we use the default aggregators (DoubleMin, DoubleMax, etc.), then a new instance is created every time an aggregator function is called.
But I don't want to create multiple instances of the entry aggregators for each function.
Instead I want to create a custom aggregator which does all the aggregate functions (min, max, avg, count) internally, so I can reduce the number of instances.
Please help.

The worry about object allocation is not really justified. Let's see what allocates objects during the execution of a parallel aggregation.
1. You allocate memory when deserializing the aggregator: you can't really do anything about it.
2. You allocate memory for deserializing the entry if you use entry.getValue(). The deserialized object is cached in the entry during execution, so it does not matter whether you use a single custom aggregator or a parallel composite aggregator composed of three parallel aggregators. You can avoid this entirely by not deserializing, using indexes or POF extraction instead.
3. You allocate memory when you extract something out of the entry. This allocation practically never escapes eden, so it is unlikely to cause problems. If you don't deserialize the object, you likely can't avoid this one anyway.
4. You allocate a result object from the entry aggregators. This is one object per node per aggregation if you have a custom aggregator, versus an array and one additional object (for the average) otherwise. The min and max values are already accounted for in point 3.
Do you REALLY care about that one, very likely stack-allocated, object per node per aggregation? How much allocation do you think Coherence does when receiving messages from other nodes? Rest assured, it is more than one result object per node.
Optimize your aggregation by using POF extraction and indexes; that can give you visible results. But you should not worry about the rest for this use case.
    And if you want to know how to write custom aggregators, then look at the documentation of ParallelAwareAggregator and EntryAggregator, and also read the chapter from the Coherence book:
    http://www.packtpub.com/article/working-with-aggregators-in-oracle-coherence-3.5
    This chapter also contains an example of writing an average calculating aggregator. You can start from there to implement your own version, but as I mentioned, it won't make a visible difference in allocation for this use case. Optimizing away entry deserialization is what makes a difference.
    Best regards,
    Robert
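
For illustration, a hedged sketch of the composite approach described above, assuming an existing NamedCache named cache and a hypothetical numeric getData() accessor on the cached values (classes from com.tangosol.util.aggregator and com.tangosol.util.filter): one composite runs all four functions in a single pass over the entries.

     EntryAggregator stats = CompositeAggregator.createInstance(new EntryAggregator[] {
          new DoubleMin("getData"),     // min
          new DoubleMax("getData"),     // max
          new DoubleAverage("getData"), // avg
          new Count()                   // count
     });
     // The result is a List with one element per nested aggregator, in order:
     // [min, max, avg, count], computed in one aggregation call.
     List results = (List) cache.aggregate(AlwaysFilter.INSTANCE, stats);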

  • Want to input interference signal in labview + perform bandpass filtering/demodulation

    Hello to whom it may concern,
I have a fiber-based OCT system linked to a Tektronix TDS 2024B digital oscilloscope, which shows the interference fringes of the low-coherence light source. The signal is linked to the computer using the NI DAQ board USB-6251. I want to input the fringes using LabVIEW, and not only that, but also to perform a high-pass filter command, along with an active full-wave rectifier and a low-pass filter, to see the group velocity of the interference pulse. I am new to LabVIEW, so this question may be a little elementary, but I am struggling here. Any help is greatly appreciated.

    Hello!
Basically, you will need to measure a signal in LabVIEW and then perform some processing on it. First, if you haven't already, I suggest getting familiar with LabVIEW and DAQmx, which are basically what you will use for acquisition. For processing, you might find some of our toolkits for filters and signal processing useful. I'm linking information on these subjects; if you wish, you can download them in evaluation mode and take a look at the examples of how to use those functions.
    NI LabVIEW Digital Filter Design Toolkit
    NI LabVIEW Advanced Signal Processing Toolkit
    Getting Started with NI-DAQmx
    Getting Started with LabVIEW
    Regards,
    Alina M

  • Coherence *** WISH LIST ***

    Some things I'd love to see in the future...
    1) Add TCMP and Extend protocols to wireshark so I can monitor what apps are actually doing, ie. when CQC registrations go out, how much data comes back, etc
2) Need some way to introduce various errors/disconnects into the cluster to see how applications behave when the cluster is not performing correctly. Need to simulate deadlocks, timeouts, service restarts, and long GC delays, maybe using an MBean on each member. Waiting around for things to go wrong in production so you can play whack-a-mole is no good.
    3) There should be some way to tell what's going on behind the scenes. I'm using tangosol.coherence.log.level=9 and it does not log when filters, queries, etc are received by the cluster or node. It would be very helpful for diagnostics if a cluster could be monitored for what processors and CQCs are running on it.
4) Have verbose logging levels include thread pool utilization info whenever some threshold (i.e. 80%) is crossed: "pool usage high: avg [7/8] 87% use".
    5) Configuring coherence is still too much of a black art. coherence should come "out of the box" with JVM args which limit GC times such that cluster members are never declared as paused, removed from the cluster, etc. Let application performance be as poor as necessary using these fail-safe defaults but the cluster should protect itself first. It seems that if you have all Coherence client and server JVMs on one physical machine and its CPU utilization never goes over 50% then you should never have timeouts, rescheduled packets, nodes leaving the cluster, etc. Or is that not reasonable?
    6) Documentation of <Error> messages: When a log message like this shows up:
    2012-03-21 09:11:00.886/16140.585 Oracle Coherence GE 3.7.1.1 <Error> (thread=Cluster, member=6): Assertion failed: Member 4 is unknown to this member
    I have no idea what a likely cause or solution is. A document explaining all the possible errors and what to maybe do about them would be great.
    -Andrew

    snidely_whiplash wrote:
    Some things I'd love to see in the future...
    1) Add TCMP and Extend protocols to wireshark so I can monitor what apps are actually doing, ie. when CQC registrations go out, how much data comes back, etc
2) Need some way to introduce various errors/disconnects into the cluster to see how applications behave when the cluster is not performing correctly. Need to simulate deadlocks, timeouts, service restarts, and long GC delays, maybe using an MBean on each member. Waiting around for things to go wrong in production so you can play whack-a-mole is no good.
    Yep, that would be good.
    3) There should be some way to tell what's going on behind the scenes. I'm using tangosol.coherence.log.level=9 and it does not log when filters, queries, etc are received by the cluster or node. It would be very helpful for diagnostics if a cluster could be monitored for what processors and CQCs are running on it.
You can already intercept the deserialization of filters, queries, etc. in two (three) ways (a sketch of the first follows this list):
1. Change the POF configuration so that you have a custom PofSerializer which delegates to the originally configured PofSerializer but does whatever monitoring you want. Beware that this will make things slower if you write to a persistent store from it, so don't do that.
2. Change the serializer of the service to something which does the same (e.g. a ConfigurablePofContext subclass) which overrides and wraps the Serializer interface methods and does the monitoring.
3. With your favourite AOP framework, point an around advice which does the monitoring at the two methods from Serializer in ConfigurablePofContext.
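
A minimal sketch of option 1, with a hypothetical logging hook; the wrapped delegate is whatever PofSerializer was originally configured for the type:

     import java.io.IOException;
     import com.tangosol.io.pof.PofReader;
     import com.tangosol.io.pof.PofSerializer;
     import com.tangosol.io.pof.PofWriter;

     public class MonitoringPofSerializer implements PofSerializer {
          private final PofSerializer delegate; // the originally configured serializer

          public MonitoringPofSerializer(PofSerializer delegate) {
               this.delegate = delegate;
          }

          public void serialize(PofWriter out, Object o) throws IOException {
               System.out.println("serializing " + o.getClass().getName()); // hypothetical hook
               delegate.serialize(out, o);
          }

          public Object deserialize(PofReader in) throws IOException {
               Object o = delegate.deserialize(in);
               System.out.println("deserialized " + o.getClass().getName()); // hypothetical hook
               return o;
          }
     }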
4) Have verbose logging levels include thread pool utilization info whenever some threshold (i.e. 80%) is crossed: "pool usage high: avg [7/8] 87% use".
It is possibly not a good idea. If you have so many threads that this situation only rarely occurs, then you have overprovisioned your proxy or your service, and having fewer threads may give you less overhead (idle threads still occupy memory due to their allocated stack, which is 2 MB per thread by default, if I remember correctly).
    5) Configuring coherence is still too much of a black art. coherence should come "out of the box" with JVM args which limit GC times such that cluster members are never declared as paused, removed from the cluster, etc. Let application performance be as poor as necessary using these fail-safe defaults but the cluster should protect itself first. It seems that if you have all Coherence client and server JVMs on one physical machine and its CPU utilization never goes over 50% then you should never have timeouts, rescheduled packets, nodes leaving the cluster, etc. Or is that not reasonable?
    It is not reasonable. The ideal GC settings always depend on your application's garbage generation speed and the amount of long-lived data you should have in old gen and how frequently that long-lived data is replaced with newer long-lived data. There is no silver bullet.
    6) Documentation of <Error> messages: When a log message like this shows up:
    2012-03-21 09:11:00.886/16140.585 Oracle Coherence GE 3.7.1.1 <Error> (thread=Cluster, member=6): Assertion failed: Member 4 is unknown to this member
    I have no idea what a likely cause or solution is. A document explaining all the possible errors and what to maybe do about them would be great.
-Andrew

Theoretically there is one for TCMP; it may not be up-to-date if it does not contain your message:
    http://docs.oracle.com/cd/E24290_01/coh.371/e22838/appendix_errormsgs.htm
    Best regards,
    Robert
    Edited by: robvarga on Apr 12, 2012 11:07 AM

  • Best way to query cache - get() vs filters?

    Hi,
I am in a dilemma: whether to use the NamedCache.get() or entrySet(filter) methods to query the cache. Please guide me.
My understanding is that when using
1. get() or getAll(), Coherence checks whether the entry is in the cache; if it does not exist in the cache, Coherence gets it from the DataStore.
2. entrySet(Filters), Coherence just checks the cache and returns results based on what is available in the cache.
In that case, isn't it better to use get instead of entrySet when one is not sure whether up-to-date data is available in the cache?
1. What is the difference between using get and using entrySet?
2. How does one make sure that up-to-date data is available in the cache when not using a write-behind scenario?
I am a newbie... Gurus, please guide me.

sjohn wrote:
Hi,
I am in a dilemma: whether to use the NamedCache.get() or entrySet(filter) methods to query the cache. Please guide me.
My understanding is that when using
1. get() or getAll(), Coherence checks whether the entry is in the cache; if it does not exist in the cache, Coherence gets it from the DataStore.

That's not the relevant part.
In this case, because you specified the keys of the entries, Coherence knows exactly where each entry resides, optimally communicates with the owner nodes (one network call per owner node) to get the data, and returns the data without deserialization on the owner node.

2. entrySet(Filters), Coherence just checks the cache and returns results based on what is available in the cache.

In this case, depending on the actual filter (hierarchy), Coherence has to contact all nodes (if you did not narrow it with KeyAssociatedFilter or PartitionFilter), which is not scalable; then, depending on the filter(s) used, it may have to deserialize possibly all cached entries (which is expensive) to evaluate the filter. On the other hand, this method is usable for non-key-based access, which get/getAll is not able to do.

In that case, isn't it better to use get instead of entrySet when one is not sure whether up-to-date data is available in the cache?
1. What is the difference between using get and using entrySet?

Heaven and earth...
    2. How does one make sure that up to date data is available in the cache when not using a write-behind scenario?
    You have to preload it, or trigger fetching it from a cache-store.
    Best regards,
    Robert
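
To make the contrast concrete, a small sketch with assumed cache, key, and accessor names (orders, order-42, getStatus are all hypothetical); EqualsFilter is from com.tangosol.util.filter:

     NamedCache cache = CacheFactory.getCache("orders");

     // Key-based access: routed straight to the owning node; a read-through
     // cache store can be consulted for a missing key.
     Object order = cache.get("order-42");

     // Filter-based access: every storage node evaluates the filter against
     // the entries it currently holds; nothing is fetched from a backing store.
     Set entries = cache.entrySet(new EqualsFilter("getStatus", "OPEN"));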

  • When do the Filters turn into IndexAwareFilters ?

Hi, I've looked through the documentation and this forum and have probably found the answer already. But just to be sure...
Re: Aggregation with a Partitioned Cache: unnecessary serialization. This is the post I found useful for answering my question.
So am I right that any filters which implement the IndexAwareFilter interface (GreaterFilter, LessFilter, etc.) start behaving as IndexAwareFilters once the properties are indexed?
I've been looking for a solution which could optimize query performance, and it looks like there is only one choice. That's OK for me. When I started my search I thought about extractors which could deal with the POF format to extract specific values without deserializing the object. But in that case you still have to iterate over all values and cache the extracted values. Coherence's indexing solution does the same without requiring additional effort from you.

    Hi,
    IndexAwareFilters are EntryFilters, but before iterating over the entire candidate set, they can first check whether there are applicable indexes they could use, and they do this before deserializing any entries.
    This usually means that they check that an index exists for their extractor, and if yes, they use it, but there are exceptions, e.g. InKeySetFilter is an IndexAwareFilter, but it does not need any indexes to exist, it instead just filters the binary form of the keys.
So yes, most IndexAwareFilters give you a benefit only if your extractor is indexed, but that is just because those are the kinds of filters provided.
    As for extracting directly from the Binary form: one advantage is indeed that if an attribute is not indexed, then you don't need to pay the deserialization cost at querying, but there is a more significant advantage, too: you don't need to deserialize upon updates.
Coherence needs to deserialize your new entry value for an updated or inserted entry on the storage node if any of the following is true:
    - an index is created on the cache which does not directly go into the Binary
    - a map event filter-based cache listener or backing map listener is registered on the cache
    - a backing map listener wants to examine the new value
    - a map event transformer wants to transform the new value
    Possibly most of these can benefit from not having to deserialize the entry if only Binary-aware extractors are used.
    Best regards,
    Robert
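
As a sketch of the points above, assuming a POF-encoded value whose queried attribute sits at property index 0 (both assumptions hypothetical): a PofExtractor-backed index lets both index maintenance on updates and query evaluation avoid deserialization.

     // Extracts straight from the Binary form; no deserialization on
     // updates (index maintenance) or on query evaluation.
     ValueExtractor age = new PofExtractor(Integer.class, 0);
     cache.addIndex(age, /* fOrdered */ true, /* comparator */ null);

     // With the index present, GreaterFilter.applyIndex() consults it
     // instead of deserializing and evaluating every entry.
     Set keys = cache.keySet(new GreaterFilter(age, Integer.valueOf(30)));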

  • Coherence version mismatch

Hi,
I have a scenario where I run a bunch of Coherence JVMs in a cluster.
There is a slight mismatch between the Coherence version on the data nodes (3.5.3/465 p2) and that of the non-storage nodes (3.5.3/465).
I intermittently see issues which prevent the managed node from joining the cluster, with the following error:
This member could not join the cluster because of an incompatibility between the cluster protocol used by this member and the one being used by the rest of the cluster.  This is most likely caused by a Coherence version mismatch, or by mismatched protocol filters (e.g. compression, or encryption). Rejected by Member(Id=1 ....
All my nodes have the same config override file.
Is it mandatory that the non-storage and storage nodes have the same minor version of Coherence? (In this case it seems like the difference is just in the build number, right?)
I am using a unicast listener for WKA which explicitly points to the data nodes in the cluster.
Any pointers/suggestions are highly appreciated.

I was told by Coherence support that having things on different minor versions is "not really supported". They let you join them together to support a gradual rolling upgrade, but they really want them all on the same version.
That being said, I've never seen the issue you are talking about between nodes on the same minor version, since the protocol versions don't change within a minor version.
There is no source visibility, so it is impossible to say what exactly changed between those versions, but it is unlikely that the protocol changed. (I'm not super familiar with the nuances of 3.5.3; we were on Ye Olde Coherence 3.1 for far too long and then did a crazy jump straight to 3.7.1.)

  • Re: Filters in Weblogic 5.1?

    No, I seriously doubt that 5.1 will ever support Servlet 2.3 spec, so,
              you will not be able to use filters. On the bright side, 6.x doesn't support
              final Servlet 2.3 spec either, so, if you really want to use filters you can
              plan your upgrade to 7.0.
              Frank LaRosa <[email protected]> wrote:
              > Hi,
              > I would very much like to use the Filter mechanism introduced in the
              > Servlet 2.3 specification with my application. My server is Weblogic
              > 5.1, currently running SP 9 and being updated to SP 11 shortly. Will I
              > be able to use filters on this platform?
              > Unfortunately I have no control over the production environment so I
              > can't upgrade the server to WL 6.
              > Thanks.
              > Frank
              Dimitri

Why don't you simply write an HTTP proxy which does this, or write an Apache 2.0 filter,
              until you upgrade to a 2.3 container?
              Musafir <[email protected]> wrote:
              > Hi Cameron,
              > Thanks for the response. I took a quick look at the website. While it is an interesting
              > product, I do not see how it will solve the problem at hand. Can you have a customer
              > support engineer contact me? (I have sent you an email with the same subject but
              > real email address)
              > Also, the problem has taken a more generic shape, in that it would suffice to have
              > an application-level post-processor. By that, I mean a way to grab the application's
              > generated web page just before it is sent out to the client. We will make custom
              > changes and then send it out to the client browser.
              > Warm regards,
              > Musafir
              > "Cameron Purdy" <[email protected]> wrote:
              >>There are ways (like Tangosol's customization server product), but not
              >>cheap.
              >>
              >>Peace,
              >>
              >>Cameron Purdy
              >>Tangosol, Inc.
              >>http://www.tangosol.com/coherence.jsp
              >>Tangosol Coherence: Clustered Replicated Cache for Weblogic
              >>
              >>
              >>"Musafir" <[email protected]> wrote in message
              >>news:[email protected]...
              >>> Hi,
              >>>
              >>> I have exactly the same requirement. If WL 5.1 does not have the filter
              >>mechanism, can anyone suggest alternative ways to achieve post-processing
              >>of
              >>a servlet? I am looking for a solution that does not require us to modify
              >>the servlet's code (it is a third party application).
              >>>
              >>> Warm regards,
              >>> Musafir
              >>
              >>
              Dimitri
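
For reference, the Servlet 2.3 Filter mechanism under discussion is a small interface; a sketch of the kind of post-processing filter being asked for (class name hypothetical):

     import java.io.IOException;
     import javax.servlet.*;

     public class PostProcessingFilter implements Filter {
          public void init(FilterConfig config) throws ServletException {
          }

          public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                    throws IOException, ServletException {
               // A response wrapper here could capture and rewrite the generated
               // page before it reaches the client, which is the post-processing
               // a Servlet 2.2 container such as WebLogic 5.1 cannot do natively.
               chain.doFilter(req, res);
          }

          public void destroy() {
          }
     }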

  • Is there a listener fired when coherence client reads cache?

    Hello,
we have to check (in the Coherence grid) whether we can return a certain object from the cache for a given user.
For example, a user creates a filter that will return all cities for a given country. But according to the user's rights we can't show him cities with a population higher than 1 million. Those objects should not even be transferred from Coherence to the client application.
How can we achieve such filtering/security in Coherence? (We have millions of objects in the caches, and we need to do it on the Coherence side for performance.)
    Best Regards
    Jarek
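
One server-side approach, sketched with hypothetical getCountry/getPopulation accessors and an example country (filters from com.tangosol.util.filter): fold the entitlement rule into the query itself, so restricted objects are filtered out on the storage nodes and never serialized to the client.

     NamedCache cities = CacheFactory.getCache("cities");

     // The user's query AND the entitlement rule, both evaluated in the grid:
     Filter query = new AndFilter(
          new EqualsFilter("getCountry", "Poland"),
          new LessEqualsFilter("getPopulation", Integer.valueOf(1000000)));
     Set allowed = cities.entrySet(query);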

    Hi,
For your information, we have integrated FCKEditor (CKEditor's ancestor) into an ADF 11g application.
I know it is very different from CKEditor, but we were forced to create our own JSF component and make several hooks into the framework to make it behave as desired.
One of the problems we faced was the loss of input values when a partial submit was performed on the editor instance.
Just so you know, it might be really hard to integrate the component without creating a JSF component.
    Regards,
    JP

  • Coherence aggregate functions

I am new to Coherence. Need help on aggregation.
I need to get the min and max values of cached data using a Coherence aggregate function.
The list looks like: private static final String[] Data = {"001","002","003","004","005","006","007","008","009"};
Also, the code I used to get the max value is:
Double maxAge = (Double) cache.aggregate(new EqualsFilter("00%", Data),
     new DoubleMax("00"));
I am getting only null as a result in maxAge. Can you please help me with this?

918079 wrote:
[the question above, quoted in full]

Hi,
1. Please go through the JavaDocs of EqualsFilter here.
2. You are trying wildcards in EqualsFilter, which is not supported; use LikeFilter instead. Details are here.
3. Coherence aggregator/filter functions are used to aggregate/filter data managed in distributed maps, not local variables.
4. DoubleMax is an aggregator function that calculates the maximum of numeric values (not Strings) extracted from a set of map entries.
    Here is a simple example:
     NamedCache cache = CacheFactory.getCache("test");
     HashMap hm = new HashMap();
     for (int i = 0; i < 50; i++) {
          // Inserting double values in the map that will be stored in the Coherence cache
          hm.put("Key" + i, new Double(i + "." + i));
     }
     // Put all the data in the distributed cache
     cache.putAll(hm);
     // This will look for values in the cache that start with 1 and return the max of them
     Double maxAge = (Double) cache.aggregate(new LikeFilter(IdentityExtractor.INSTANCE, "1%", (char) 0, false),
          new DoubleMax(IdentityExtractor.INSTANCE));
     System.out.println("maxAge: " + maxAge);
    Hope this helps!
    Cheers,
    NJ

  • More on the pivot table processing in coherence

    hello, all:
I posted a question regarding how to create a pivot table in Coherence (coherence and pivot table).
Now I am on a project where my manager wants me to query a Coherence cache and return a subset in a "pivot" structure.
Suppose I send a request from the client with this info: String[] rows, String[] columns, String[] data. They represent the rows area, columns area, and data area as in an Excel pivot table. The aggregation is limited to sum for now.
I managed to create some code like this:
     public Map<Object, Object> sum(Filter filter, String properties, String[] targets) {
          EntryAggregator[] entryAggregators = new EntryAggregator[targets.length];
          for (int idex = 0; idex < targets.length; idex++) {
               entryAggregators[idex] = new DoubleSum(targets[idex]);
          }
          MultiExtractor me = new MultiExtractor(properties);
          EntryAggregator ca = CompositeAggregator.createInstance(entryAggregators);
          GroupAggregator ga = GroupAggregator.createInstance(me, ca);
          return aggregate(filter, ga);
     }
Here properties represents the rows, "getX,getY,getZ"; targets represents the data to aggregate on: targets[] {"getA","getB"}.
There are two relevant questions:
1. I am not able to put the columns info into the code;
2. the return type above is Map<Object, Object>; printing the map looks like
{[Parts, Appliance, Trash compactor], [214.5, 1.0]}...
with the 3 row values (X, Y, Z) in the first segment, and the 2 aggregated values (A, B) in the second segment.
How can I extract these values from the map structure (it seems the key/value are collections themselves), and design a "pivot" structure to host them?
    I hope I've made myself clear.
    Thanks,
    Johnny

    Hi Johnny,
Your question intrigued me, so I thought I would do a bit of digging into how you might replicate a pivot table using aggregators. Well, it has taken me a few days, and as it is quite a long subject I wrote it up on my blog rather than post it all in the forum: http://thegridman.com/coherence/oracle-coherence-pivot-table-queries/
You were on the right track with the GroupAggregator, but there is a bit more to do to allow all the different combinations of rows, columns and values, as well as filtering.
I hope what I have written up is clear enough; just ask if you have more questions.
    JK
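
For the extraction part of the question, a hedged sketch of unpacking the result of the sum() method above, on the assumption (matching the printed output) that MultiExtractor yields a List key and CompositeAggregator yields a List of sums:

     // key   = extracted row values, e.g. [Parts, Appliance, Trash compactor]
     // value = one sum per target,   e.g. [214.5, 1.0]
     Map<List<Object>, List<Object>> pivot =
          (Map<List<Object>, List<Object>>) (Map) sum(filter, properties, targets);
     for (Map.Entry<List<Object>, List<Object>> row : pivot.entrySet()) {
          List<Object> rowKeys = row.getKey();   // values of getX, getY, getZ
          List<Object> sums    = row.getValue(); // DoubleSum of getA, getB
          System.out.println(rowKeys + " -> " + sums);
     }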

  • Gzip filter causes start-up failure in coherence v3.1

I've just moved to Coherence version 3.1, and now my override XML file with the filter configuration causes an exception at start-up.
    This is the piece that causes the failure:
    <outgoing-message-handler>
    <use-filters>
    <filter-name>gzip</filter-name>
    </use-filters>
    </outgoing-message-handler>
    Simply commenting out the filter-name (as in the original file) allows the server to start up:
    <outgoing-message-handler>
    <use-filters>
    <!--filter-name>gzip</filter-name-->
    </use-filters>
    </outgoing-message-handler>
    How do I enable the gzip filter under v3.1?
    Thanks
    Walter

    Hi Walter,
    Please send the exception stack trace to support at tangosol.com
    Regards,
    Dimitri

  • During invokeAll() execution coherence calls both applyIndex and evaluateEntry methods

There is an odd behaviour of the invokeAll(Filter filter, ...) method: both applyIndex() and evaluateEntry() (for the retained entries) are called during the invocation.
I don't understand why Coherence calls the evaluateEntry() method when the set has already been filtered using indexes, since it is a redundant operation which could decrease performance.
This behaviour also causes problems with InKeySetFilter, because the underlying set contains binary keys, but evaluateEntry() receives a "plain" entry and returns false:
    cache.invokeAll(new InKeySetFilter(AlwaysFilter.INSTANCE, cache.keySet()), new ExtractorProcessor(new IdentityExtractor()));
    //returns no items
    alertCache.keySet(new InKeySetFilter(AlwaysFilter.INSTANCE, alertCache.keySet()));
    //but this call returns all items
Can someone explain this behaviour and advise a solution or workaround?
    PS Coherence 3.5.3
    Edited by: Serge Vinogradov on 23-Aug-2010 08:43
    Edited by: Serge Vinogradov on 23-Aug-2010 08:45

    Serge Vinogradov wrote:
    Robert,
both of these methods were actually called during the invokeAll execution (I checked with a debugger), and applyIndex returned null.
But I've found other calls to evaluateEntry in DistributedCache.onInvokeFilterRequest. Let me quote an extract from it:
public void onInvokeFilterRequest(InvokeFilterRequest msgRequest) {
     Object[] aEntry = null;
     aEntry = storage.query(filter, Storage.QUERY_INVOKE, partMask);
     ... // iterate entries
     Storage.BinaryEntry entry = (Storage.BinaryEntry) aEntry[iR];
     if ((filter != null) ? InvocableMapHelper.evaluateEntry(filter, entry) : true) {
          entry.ensureWriteable();
     } else {
          unlockKey(storage, binKey, false);
          aEntry[iR] = null;
          cEntries--;
          continue;
     }
     ... // then process the non-null entries
}
As you can see, InvocableMapHelper.evaluateEntry uses the original filter, not the filter returned after applying the index. As I understand it, this is done to deal correctly with locking and to ensure that the entry still complies with the filter.
We have overridden the InKeySetFilter class as a workaround:
public class FixedInKeySetFilter extends InKeySetFilter {
     private transient boolean isFiltered = false;

     public FixedInKeySetFilter() {
          super();
     }

     public FixedInKeySetFilter(Filter filter, Set set) {
          super(filter, set);
     }

     public Filter applyIndex(Map mapIndexes, Set setKeys) {
          Filter resFilter = super.applyIndex(mapIndexes, setKeys);
          isFiltered = (resFilter == null);
          return resFilter;
     }

     @Override
     public boolean evaluateEntry(Map.Entry entry) {
          return isFiltered || super.evaluateEntry(entry);
     }
}
Edited by: Serge Vinogradov on 24-Aug-2010 08:33

Hi Serge,
    you are right... I did not expect the code to intentionally lose the returned filter reference, but that clearly seems to be the case here...
    On the other hand, your evaluateEntry implementation does not seem to be correct as it does not delegate to a non-index-aware delegate filter...
    The following implementation should probably fix all problems listed so far:
    import java.util.Map;
    import java.util.Set;
    import java.util.Map.Entry;
    import com.tangosol.util.Converter;
    import com.tangosol.util.Filter;
    import com.tangosol.util.InvocableMapHelper;
    import com.tangosol.util.filter.InKeySetFilter;
    import com.tangosol.util.filter.IndexAwareFilter;
public class FixedInKeySetFilter extends InKeySetFilter {
     private transient boolean isConverted;
     private transient boolean filtered;

     public FixedInKeySetFilter() {
     }

     public FixedInKeySetFilter(Filter filter, Set setKeys) {
          super(filter, setKeys);
     }

     @Override
     public synchronized void ensureConverted(Converter keyToInternalConverter) {
          super.ensureConverted(keyToInternalConverter);
          this.isConverted = true;
     }

     @Override
     public Filter applyIndex(Map mapIndexes, Set setKeys) {
          if (setKeys.isEmpty()) {
               return null;
          }
          if (!isConverted) {
               Converter converter = getKeyToInternalConverterOurself();
               if (converter != null) {
                    ensureConverted(converter);
               } else {
                    return this;
               }
          }
          Filter filter = getFilter();
          Filter res = super.applyIndex(mapIndexes, setKeys);
          res = setKeys.isEmpty() ? null : filter instanceof IndexAwareFilter ? res : filter;
          filtered = res == null;
          return res;
     }

     @Override
     public boolean evaluateEntry(Entry entry) {
          if (filtered) {
               return true;
          }
          Filter filter = getFilter();
          if (isConverted) {
               // applyIndex is surely invoked in this case, so we need not consult the local setKeys
               return InvocableMapHelper.evaluateEntry(filter, entry);
          } else {
               return super.evaluateEntry(entry);
          }
     }

     /**
      * Override this method if we can acquire the key-to-internal converter, e.g.
      * if we acquired it from the backing map manager context using the service name.
      *
      * @return the key-to-internal converter of the cache service of the cache we are querying
      */
     protected Converter getKeyToInternalConverterOurself() {
          return null;
     }
}
Best regards,
    Robert
    Edited by: robvarga on Aug 24, 2010 11:35 PM

  • Problem with effectiveness of filters

I have on several occasions run into cases where the built-in "query optimizer" doesn't apply filters in the best order from a performance point of view when dealing with multiple filters joined together using AND filters.
As a workaround, I have subclassed the built-in filters, adding an extra constructor that allows setting the effectiveness as a hard-coded integer value which (if specified) is returned instead of the calculated value. This way I was able to force an AND filter to pick the filter I know (from application knowledge) is most likely the most efficient first.
Would it be possible to get this kind of constructor added to the standard filter classes?
I would also like to know a few things about how indexes are used by Coherence:
1. How do the various built-in filters calculate their effectiveness? Does Coherence, for instance, keep statistics about how many unique values an index contains, or how does it decide which index is more effective than another?
2. Can Coherence use more than one index at the same time (i.e. merge indexes)? From my experience it seems like only one index is used fully; other indexes are only used to avoid deserialization when performing a linear search of the hits from the first index. As far as I know, only the most advanced RDBMSs are able to use more than one index (instead of performing a linear search on the result of one index), so I am not really surprised if Coherence does the same...
3. Does making an index sorted improve performance in any way if the index is used as the second or third index applied in a query? (One could envision some sort of binary search being used, instead of a linear search of the hits from the first index, if a sorted index is available.)
/Magnus

MagnusE wrote:
I have on several occasions run into cases where the built-in "query optimizer" doesn't apply filters in the best order from a performance point of view when dealing with multiple filters joined together using AND filters.
As a workaround, I have subclassed the built-in filters, adding an extra constructor that allows setting the effectiveness as a hard-coded integer value which (if specified) is returned instead of the calculated value. This way I was able to force an AND filter to pick the filter I know (from application knowledge) is most likely the most efficient first.
Would it be possible to get this kind of constructor added to the standard filter classes?

You can simply create a wrapper filter which implements IndexAwareFilter and wraps another IndexAwareFilter. The calculateEffectiveness() should be implemented as you want it, and you can delegate applyIndex() and evaluateEntry() to the wrapped filter.
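
A sketch of that wrapper, with the effectiveness supplied at construction (class name hypothetical; a real version would also need to be serializable, e.g. PortableObject, to travel to the storage nodes):

     import java.util.Map;
     import java.util.Set;
     import com.tangosol.util.Filter;
     import com.tangosol.util.filter.IndexAwareFilter;

     public class FixedEffectivenessFilter implements IndexAwareFilter {
          private final IndexAwareFilter delegate;
          private final int effectiveness;

          public FixedEffectivenessFilter(IndexAwareFilter delegate, int effectiveness) {
               this.delegate = delegate;
               this.effectiveness = effectiveness;
          }

          // The hard-coded value drives the AND filter's ordering decision.
          public int calculateEffectiveness(Map mapIndexes, Set setKeys) {
               return effectiveness;
          }

          public Filter applyIndex(Map mapIndexes, Set setKeys) {
               return delegate.applyIndex(mapIndexes, setKeys);
          }

          public boolean evaluateEntry(Map.Entry entry) {
               return delegate.evaluateEntry(entry);
          }

          public boolean evaluate(Object o) {
               return delegate.evaluate(o);
          }
     }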
MagnusE wrote:
I would also like to know a few things about how indexes are used by Coherence:
1. How do the various built-in filters calculate their effectiveness?

I believe they return the number of reverse index accesses as the effectiveness if the index exists, and maybe the size of the candidate set if it does not.

MagnusE wrote:
Does Coherence, for instance, keep statistics about how many unique values an index contains, or how does it decide which index is more effective than another?

Since the reverse index is just a map between extracted values and a collection of keys, this information is practically available from MapIndex.getIndexContents().size(), if I understand your question correctly, where you can get the MapIndex for an extractor with mapIndexes.get(extractor).

MagnusE wrote:
2. Can Coherence use more than one index at the same time (i.e. merge indexes)? From my experience it seems like only one index is used fully; other indexes are only used to avoid deserialization when performing a linear search of the hits from the first index. As far as I know, only the most advanced RDBMSs are able to use more than one index (instead of performing a linear search on the result of one index), so I am not really surprised if Coherence does the same...

Yes, any number of indexes can be used, but the stock filters use only a single index (as you can specify a single extractor to a filter). Of course, if you form a logical expression between two filters using different extractors, indexes for both extractors will be used if they exist. The applyIndex() and calculateEffectiveness() methods receive all indexes in the mapIndexes parameter, so your custom index-aware filter can use any number of existing indexes at the same time.

MagnusE wrote:
3. Does making an index sorted improve performance in any way if the index is used as the second or third index applied in a query? (One could envision some sort of binary search being used, instead of a linear search of the hits from the first index, if a sorted index is available.)

If you need range queries on the extracted value, a sorted index can help a great deal, as you don't have to iterate over all the keys in the reverse index. This is independent of what other filters you use in your logical expression; the evaluation of the subexpression on the sorted index will still be more efficient than if the index were unsorted.
    Best regards,
    Robert
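
To illustrate the sorted-index point, a minimal sketch (cache and accessor names hypothetical): passing fOrdered = true to addIndex builds a sorted reverse index, which a range filter such as BetweenFilter can then exploit.

     NamedCache cache = CacheFactory.getCache("trades");

     // fOrdered = true keeps the reverse index sorted, so a range filter can
     // walk only the relevant sub-map instead of scanning every indexed value.
     cache.addIndex(new ReflectionExtractor("getPrice"), true, null);
     Set keys = cache.keySet(new BetweenFilter("getPrice", Double.valueOf(10.0), Double.valueOf(20.0)));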
