Yong Jiao's Blog about Technology and Life: June 2014

Friday, June 27, 2014

Simple Comparisons on some In-Memory Data Grids

Features	Feature Description	GemFire 7	Coherence 3.7	Infinispan 5	GigaSpaces 7	Notes
1. topologies	peer-peer; client-server	High	High	High	High
2. cross-site / WAN replication	datacenter-datacenter; region-region	High	Low	Med	High	Coherence support is via the incubator's "Push replication"; Infinispan has basic support in V5.2
3. read-most scalability	via replication	High	High	High	High
4. write-most scalability	via partitioning and replication between primary and backups	High	High	Med	High
5. high availability / HA	via replication and persistence into disk (DB)	High	High	High	High
6. asynchronous replication	useful for slow-changing data and WAN replication	High	Med	High	High	Coherence support is via the incubator's "Push replication"
7. partition rehashing	consistent hashing to reduce data relocation	High	High	High	High	all seem to use consistent hashing-like algorithms
8. dynamic clustering	adds or loses nodes	High	High	High	Med
9. updating among partition backups	master-backup: less deadlock-prone and less IO; master-master / update-anywhere: more deadlock-prone and more IO	High	High	Low	High	Infinispan only has master-master, which incurs deadlocks on slow networks and with large caches
10. cache loader and writer	2 application interfaces: the loader loads data from the DB into the cache and the writer saves cache data back to the DB. With a cache loader and writer, applications only need to interact with the cache.	High	High	Med	High	Infinispan writers only support JPA, not JDBC or Hibernate
11. read-through and read-ahead	populates the cache from the DB in batch mode	Med	High	Med	High	no prefetch/batch support in GemFire and Infinispan
12. write-through and write-behind	write-through persists cache data to wherever it was loaded from; write-behind persists data asynchronously	High	High	Med	High
13. event notification	Cache also works as a messaging and parallel processing bus like JMS/MDB	Med	Med	Med	High
14. continuous querying	registers content-based interests with the cache and receives updates continuously	High	High	Low	High
15. cache querying	SQL-like queries: searches are not limited to key matching	High	High	Low	High
16. locking and tx	JTA (global tx) and ACID properties	High	High	Med	High
17. off-heap	puts cache data off the Java heap so that GC pause time is reduced	Low	High	Low	Low
18. key affinity / colocation	colocates related cache objects on the same partition to reduce IO	High	High	High	High
19. customized partitioning	puts cache data on a specific cluster node, bypassing the cache hashing algorithm	High	Low	Med	Low
20. synchronous requests like RFQ	synchronous requests should go through the same partition routing as asynchronous messages	Low	Low	Low	High
21. API	Map, RESTful and other appropriate interfaces	Med	Med	Med	High
22. language bindings	Java, C++ etc	Med	Med	Med	High
23. map-reduce / scatter-gather	submits aggregation tasks to run across multiple or all cluster nodes	High	High	High	High
24. monitoring	monitor the health of clusters	High	Med	Med	High
25. J2EE (JMS, Web, Remoting)	how much of J2EE is supported?	Low	Low	Low	High
26. migration effort	how hard it is to migrate to a different cache product in a J2EE app. server	High	High	High	Low	GigaSpaces is an app. server

Monday, June 23, 2014

JMS Connection and Session on WebSphere MQ 7.1

The difference between the JMS API's Connection and Session probably confuses many new developers, and their performance implications only make things worse.
This is primarily because we are used to treating a connection and a session as the same thing: a single-threaded conversation between a server and a client, with the two terms used interchangeably. This is indeed the case for DB servers and clients.

The basic idea behind the JMS API's connection and session is to first create a top-level, heavyweight physical connection, and then create several lightweight sessions (logical connections) under that single physical connection.
Because connections are heavyweight, we usually pool them, just like DB connections. We expect a connection to create lightweight sessions very quickly and to service its sessions' operations (sending, receiving, transactions, etc.) efficiently by multiplexing them over its single physical connection.
In addition, the following Spring configuration caches a single JMS connection and multiple sessions:

<bean id="cachedConnectionFactory" class="org.springframework.jms.connection.CachingConnectionFactory"
p:targetConnectionFactory-ref="myConnFactory" p:sessionCacheSize="10"/>

However, the JMS specification doesn't say how MOM vendors should implement connections and sessions. In WebSphere MQ versions before 7, each JMS session actually represents a brand-new physical connection / channel instance, just like a JMS connection.
MQ version 7 introduced so-called "sharing conversations", which allow a configurable number of JMS connections and sessions (both are conversations in MQ terms) to share a single physical connection / channel instance.
The number of conversations that can share a single physical connection / channel instance is determined by the WebSphere MQ channel property SHARECNV. Its default value for server connection channels is 10. A value of 0 or 1 effectively disables sharing.
To use this feature, your client-side JMS connection factory also needs to enable the property SHARECONVALLOWED.
For example, assuming SHARECNV is 10 and you created 5 connections and 15 sessions (from a sharing perspective it doesn't matter which connections created which sessions), the total number of physical connections needed is (5+15)/10 = 2.
For more information, refer to this page.
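The arithmetic in the example above can be sketched as a small helper. This is only an illustration under the counting described in this post: every JMS connection and every JMS session is one conversation, and conversations are packed into channel instances up to SHARECNV. The class and method names (`ChannelEstimate`, `physicalChannels`) are made up for this sketch and are not part of any MQ or JMS API.

```java
// Sketch only: assumes each JMS connection and each JMS session counts as one
// MQ conversation, and that channel instances are filled up to SHARECNV.
final class ChannelEstimate {

    /** Estimated number of physical connections / channel instances. */
    static int physicalChannels(int connections, int sessions, int shareCnv) {
        if (connections < 0 || sessions < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        int conversations = connections + sessions;
        // SHARECNV of 0 or 1 disables sharing: one channel per conversation.
        int perChannel = Math.max(1, shareCnv);
        // Ceiling division, so a partly filled channel instance still counts.
        return (conversations + perChannel - 1) / perChannel;
    }

    public static void main(String[] args) {
        System.out.println(physicalChannels(5, 15, 10)); // 2, the example above
        System.out.println(physicalChannels(5, 16, 10)); // 3, one extra session spills over
        System.out.println(physicalChannels(5, 15, 1));  // 20, sharing disabled
    }
}
```

Note the ceiling division: (5+15)/10 happens to divide evenly, but 21 conversations would already need a third channel instance.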

If you are developing a low-latency, high-performance application, you must know how to tune JMS connections and sessions for your MOM vendor.

How to Create Sessions Quickly?
If you use the previous Spring config, you should estimate the maximum number of sessions you need and then pre-populate some of them based on the SHARECNV value. Otherwise, some session creations will be as slow as creating a new physical connection.
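To get a feel for which session creations are the slow ones, here is a rough sketch. It assumes, purely for illustration, that conversations fill channel instances strictly in order, so a new physical connection is opened whenever the previous one already holds SHARECNV conversations; real MQ allocation may differ. `SessionPrewarm` and its methods are hypothetical names, not part of Spring or MQ.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: assumes conversations fill channel instances strictly in order.
final class SessionPrewarm {

    /** 1-based indexes of the session creations that would open a new channel instance. */
    static List<Integer> slowSessionCreations(int connections, int maxSessions, int shareCnv) {
        int perChannel = Math.max(1, shareCnv); // 0 or 1 disables sharing
        List<Integer> slow = new ArrayList<>();
        for (int i = 1; i <= maxSessions; i++) {
            // Connections are counted as conversations too, so session i is
            // conversation number (connections + i).
            int conversation = connections + i;
            if ((conversation - 1) % perChannel == 0) {
                slow.add(i);
            }
        }
        return slow;
    }

    /** Sessions to create at startup so that no later creation opens a channel. */
    static int sessionsToPrewarm(int connections, int maxSessions, int shareCnv) {
        List<Integer> slow = slowSessionCreations(connections, maxSessions, shareCnv);
        return slow.isEmpty() ? 0 : slow.get(slow.size() - 1);
    }

    public static void main(String[] args) {
        // One cached connection, up to 25 sessions, SHARECNV = 10.
        System.out.println(slowSessionCreations(1, 25, 10)); // [10, 20]
        System.out.println(sessionsToPrewarm(1, 25, 10));    // 20
    }
}
```

Under these assumptions, creating the first 20 sessions at startup moves both slow channel openings out of the latency-sensitive path.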
How Many Connections to Create?
Because of multiplexing, if most of the sessions under a connection are busy most of the time, you should move those busy sessions to new connections. Of course, you should manage connections through pooling, for example by retrieving them from an application server.