Friday, June 27, 2014

Simple Comparisons on some In-Memory Data Grids

 
| Features | Feature Description | GemFire 7 | Coherence 3.7 | Infinispan 5 | GigaSpaces 7 | Notes |
|---|---|---|---|---|---|---|
| 1. topologies | peer-peer; client-server | High | High | High | High | |
| 2. cross-site / WAN replication | datacenter-datacenter; region-region | High | Low | Med | High | Coherence support by incubator "Push replication"; Infinispan basic support in V5.2 |
| 3. read-most | scalability via replication | High | High | High | High | |
| 4. write-most | scalability via partitioning and replication between primary and backups | High | High | Med | High | |
| 5. high availability / HA | via replication and persistence into disk (DB) | High | High | High | High | |
| 6. asynchronous replication | useful for slow changing data and WAN replication | High | Med | High | High | Coherence support by incubator "Push replication" |
| 7. partition rehashing | consistent hashing to reduce data relocation | High | High | High | High | all seem to use consistent hashing-like algorithms |
| 8. dynamic clustering | adds or loses nodes | High | High | High | Med | |
| 9. updating among partition backups | master-backup: less deadlock prone and IO; master-master/update anywhere: more deadlock prone and IO | High | High | Low | High | Infinispan only has master-master that incurs deadlock for slow network and large caches |
| 10. cache loader and writer | 2 application interfaces: loads data from DB and saves data into Cache. With cache loader and writer, applications only need to interface with Cache! | High | High | Med | High | Infinispan writers only support JPA, not JDBC or Hibernate! |
| 11. read-through and read-ahead | populates cache with DB in batch mode | Med | High | Med | High | no prefetch/batch support from GemFire and Infinispan |
| 12. write-through and write-behind | persists cache into wherever they were loaded. write-behind persists data asynchronously | High | High | Med | High | |
| 13. event notification | Cache also works as a messaging and parallel processing bus like JMS/MDB | Med | Med | Med | High | |
| 14. continuous querying | register contents-based interests with Cache and receive updates continuously. | High | High | Low | High | |
| 15. cache querying | SQL like query: not only searches based on key matching | High | High | Low | High | |
| 16. locking and tx | JTA (global tx) and ACID properties | High | High | Med | High | |
| 17. off-heap | puts cache data off Java heap so that GC pause time is reduced! | Low | High | Low | Low | |
| 18. key affinity / colocation | colocates related cache objects on the same partition to reduce IO | High | High | High | High | |
| 19. customized partitioning | puts cache into specific cluster node to bypass cache hashing algorithm | High | Low | Med | Low | |
| 20. synchronous requests like RFQ | synchronous requests should go through the same partition routing as asynchronous messages. | Low | Low | Low | High | |
| 21. API | Map, Restful and appropriate interfaces | Med | Med | Med | High | |
| 22. language bindings | Java, C++ etc | Med | Med | Med | High | |
| 23. map-reduce / scatter-gather | submits aggregation tasks to run across multiple or all cluster nodes | High | High | High | High | |
| 24. monitoring | monitor the health of clusters | High | Med | Med | High | |
| 25. J2EE (JMS, Web, Remoting) | how much J2EE to support? | Low | Low | Low | High | |
| 26. migration effort | how to migrate to a different cache product in a J2EE app. server | High | High | High | Low | GigaSpaces is an app. server |

Monday, June 23, 2014

JMS Connection and Session on WebSphere MQ 7.1

The difference between the JMS API's Connection and Session will probably confuse many developers who are new to JMS, and the performance implications only make things worse.
This is primarily because we are used to thinking that a connection and a session are the same thing and can be used interchangeably to represent a single-threaded conversation between a server and a client. This is indeed the case for DB servers and clients.

The basic idea of the JMS API's connection and session is to first create a top-level, heavyweight physical connection, and then create several lightweight sessions (logical connections) under that single physical connection.
Because connections are heavyweight, we usually pool them just like DB connections. We hope the connection can create lightweight sessions very quickly and service its session operations (sending, receiving, tx etc) efficiently by multiplexing them through its single physical connection.
In addition, the following Spring configuration supports caching a single JMS connection and multiple sessions:

<bean id="cachedConnectionFactory"  class="org.springframework.jms.connection.CachingConnectionFactory"
        p:targetConnectionFactory-ref="myConnFactory" p:sessionCacheSize="10"/>


However, the JMS specification doesn't say how MOM vendors should implement connections and sessions. In WebSphere MQ versions before 7, each JMS session actually represents a brand new physical connection / channel instance, just like a JMS connection.
MQ version 7 introduces the so-called "sharing conversations" feature, which allows a configurable number of JMS connections and sessions (both are conversations in MQ terms) to share a single physical connection / channel instance.
The number of conversations that can be shared across a single physical connection / channel instance is determined by the WebSphere MQ channel property SHARECNV. The default value of this property for Server Connection Channels is 10. A value of 0 or 1 basically disables the sharing.
In order to use this feature, your client-side JMS connection factory also needs to enable the property SHARECONVALLOWED.
For example, assuming SHARECNV is 10, if you create 5 connections and 15 sessions (from a sharing perspective it doesn't matter which connections created which sessions), the total number of physical connections needed is (5+15)/10 = 2.
For more information, refer to this page.
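
The arithmetic above can be written down as a small helper. This is only a sketch of the post's own formula, not anything provided by WebSphere MQ: the class and method names are made up for illustration, and rounding up when the conversations do not divide evenly by SHARECNV is an assumption (the example above only covers the exact-division case). The real allocation of conversations to channel instances is decided by MQ and may differ.

```java
// Hypothetical helper; names are illustrative and not part of any MQ or JMS API.
final class ChannelEstimate {
    private ChannelEstimate() {}

    /**
     * Estimates how many physical connections / channel instances are needed,
     * treating every JMS connection and every JMS session as one conversation.
     */
    static int physicalChannels(int connections, int sessions, int shareCnv) {
        if (connections < 0 || sessions < 0 || shareCnv < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        int conversations = connections + sessions;
        if (shareCnv <= 1) {
            // SHARECNV of 0 or 1 disables sharing: one channel instance per conversation.
            return conversations;
        }
        // Assumption: a partially filled channel instance still counts as one, so round up.
        return (conversations + shareCnv - 1) / shareCnv;
    }

    public static void main(String[] args) {
        System.out.println(physicalChannels(5, 15, 10)); // 2, the example from the text
        System.out.println(physicalChannels(5, 16, 10)); // 3, one extra conversation spills over
        System.out.println(physicalChannels(5, 15, 1));  // 20, sharing disabled
    }
}
```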

If you are developing a low-latency, high-performance application, you must know how to tune JMS connections and sessions for your MOM vendor.
  • How to Create Sessions Quickly?
    If you use the previous Spring config, you should estimate the maximum number of sessions you need and then pre-populate some of them based on the SHARECNV value. Otherwise some session creations will be as slow as creating a new physical connection.
  • How Many Connections to Create?
    Due to the multiplexing nature, if most of the multiplexed sessions under a connection are busy most of the time, you should move those busy sessions to new connections. Of course, you should manage connections using pooling, such as retrieving them from an application server.
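
The idea of pre-populating sessions can be sketched without any JMS dependency. The class below is a hypothetical, generic pre-warmed pool, not Spring's CachingConnectionFactory and not an MQ API: the `Supplier` stands in for whatever creates a session in your application (for example a call to `connection.createSession(...)`), and the capacity and pre-warm counts are values you would choose yourself from your session estimate and the SHARECNV setting.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical pre-warmed pool; S stands in for a JMS Session.
final class PrewarmedPool<S> {
    private final BlockingQueue<S> idle;
    private final Supplier<S> factory;

    PrewarmedPool(int capacity, int prewarm, Supplier<S> factory) {
        if (capacity <= 0 || prewarm < 0 || prewarm > capacity) {
            throw new IllegalArgumentException("need 0 <= prewarm <= capacity and capacity > 0");
        }
        this.idle = new ArrayBlockingQueue<>(capacity);
        this.factory = factory;
        // Pay the creation cost up front, before latency-sensitive traffic starts.
        for (int i = 0; i < prewarm; i++) {
            idle.add(factory.get());
        }
    }

    /** Returns an idle item if one exists, otherwise creates a new one (the slow path). */
    S borrow() {
        S item = idle.poll();
        return item != null ? item : factory.get();
    }

    /** Returns an item to the pool; it is silently dropped if the pool is already full. */
    void giveBack(S item) {
        idle.offer(item);
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // The counter is a stand-in for real session creation.
        PrewarmedPool<Integer> pool = new PrewarmedPool<>(10, 4, created::incrementAndGet);
        System.out.println("created up front: " + created.get());
        pool.borrow();
        System.out.println("idle after one borrow: " + pool.idleCount());
    }
}
```

Keep in mind that a JMS session is meant to be used by one thread at a time, so a borrowed session should be handed back before another thread uses it.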