Thursday, May 29, 2014

Class Loading Issue when Running GemFire 7.x OQL in WebLogic 10.3.x

I have a very simple OQL query, "SELECT * FROM /FxLimitsRegion", that ran successfully in my unit and integration tests. But I got the following exception when I ran it inside my WebLogic EAR (our GemFire cache and the WebLogic application server run in the same JVM):
com.gemstone.gemfire.cache.query.QueryInvalidException: Syntax error in query:  Invalid class or can't make instance, com.gemstone.gemfire.cache.query.internal.parse.ASTSelect
    at com.gemstone.gemfire.cache.query.internal.QCompiler.compileQuery(QCompiler.java:80)
    at com.gemstone.gemfire.cache.query.internal.DefaultQuery.<init>(DefaultQuery.java:156)
    at com.gemstone.gemfire.cache.query.internal.DefaultQueryService.newQuery(DefaultQueryService.java:107)
    at com.gemstone.gemfire.internal.cache.LocalDataSet.query(LocalDataSet.java:129)
    at com.jpmorgan.gcrm.sef.business.cache.impl.SEFCacheManagerGemfireImpl.queryPartitionedRegionLocally(SEFCacheManagerGemfireImpl.java:187)
    at com.jpmorgan.gcrm.fno.business.datamanager.impl.FxDataManager.selectActiveFxLimits(FxDataManager.java:57)
    ... 13 more
Caused by: java.lang.IllegalArgumentException: Invalid class or can't make instance, com.gemstone.gemfire.cache.query.internal.parse.ASTSelect
    at antlr.ASTFactory.createUsingCtor(ASTFactory.java:251)
    at antlr.ASTFactory.create(ASTFactory.java:210)
    at com.gemstone.gemfire.cache.query.internal.parse.OQLParser.selectExpr(OQLParser.java:1090)
    at com.gemstone.gemfire.cache.query.internal.parse.OQLParser.query(OQLParser.java:217)
    at com.gemstone.gemfire.cache.query.internal.parse.OQLParser.queryProgram(OQLParser.java:115)
    at com.gemstone.gemfire.cache.query.internal.QCompiler.compileQuery(QCompiler.java:76)
The class "com.gemstone.gemfire.cache.query.internal.parse.ASTSelect" that failed to load is included in the gemfire.jar in my EAR, so the exception was confusing at first.
The stack trace shows that the failure comes from the ANTLR jar. Both my EAR and WebLogic itself ship ANTLR, but my EAR uses version 2.7.7 while WebLogic 10.3.2 uses a much older version.
So we need to know which ANTLR version was actually used. Here is the call flow:
1. A thread in my EAR issued an OQL query ->
2. GemFire found ANTLR to parse the OQL into an AST ->
3. ANTLR tried to load the GemFire class "com.gemstone.gemfire.cache.query.internal.parse.ASTSelect"

By default, Step 2 found WebLogic's ANTLR through the WebLogic system class loader. But the GemFire class lives in the child EAR class loader, and a parent class loader cannot see classes that only its children can load. So Step 3 threw the exception.
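The visibility rule behind this failure can be checked with a small sketch. The class name LoaderVisibility and its helpers canSee and chain are made up for illustration; only standard java.lang calls are used. Run from code inside the EAR, it would be expected to report that the context (EAR) loader sees the GemFire class while its parent does not, assuming the thread context class loader is the EAR class loader as in my case.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderVisibility {

    /** True if the loader (or any of its parents) can resolve the class name. */
    static boolean canSee(ClassLoader loader, String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Describes the delegation chain from the given loader up to the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        String target = "com.gemstone.gemfire.cache.query.internal.parse.ASTSelect";
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader parent = (context == null) ? null : context.getParent();
        System.out.println("context loader chain: " + chain(context));
        System.out.println("context loader sees target: " + canSee(context, target));
        System.out.println("parent loader sees target:  " + canSee(parent, target));
    }
}
```

A null loader passed to canSee stands for the bootstrap loader, which is how Class.forName treats it.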
If we can tell WebLogic to use the ANTLR in my EAR, the exception should go away. To do so, I added the following excerpt to my weblogic-application.xml:
         <prefer-application-packages>
               <package-name>antlr.*</package-name>
         </prefer-application-packages>
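One way to confirm which ANTLR copy is actually picked up after this change is to print where antlr.ASTFactory was loaded from. The following is only a sketch: the class WhichJar and its helpers are hypothetical names, and it assumes the code runs on a thread inside the EAR so that the context class loader is the EAR class loader.

```java
import java.security.CodeSource;

final class WhichJar {

    /** Returns the location a class was loaded from, or a note when it is unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap loader (e.g. java.lang.String) have no code source
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Resolves the class by name through the context loader, then reports its origin. */
    static String locationOfName(String className) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        try {
            return locationOf(Class.forName(className, false, context));
        } catch (ClassNotFoundException e) {
            return "(not found: " + className + ")";
        }
    }

    public static void main(String[] args) {
        // Inside the EAR this should point at the EAR's ANTLR jar once the setting applies
        System.out.println(locationOfName("antlr.ASTFactory"));
    }
}
```

The class is resolved by name so the sketch compiles without ANTLR on the classpath.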
Newer versions of ANTLR load classes from the thread context class loader by default, which was the EAR class loader in my case. So if WebLogic had shipped a newer version of ANTLR, the exception would not have occurred.
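The context-loader-first lookup described above follows a common pattern, sketched below. This is not ANTLR's actual source. ContextFirstLookup is a hypothetical name, and the fallback to Class.forName is an assumption about how such a lookup is typically written.

```java
final class ContextFirstLookup {

    /** Tries the thread context class loader first, then falls back to this class's own loader. */
    static Class<?> load(String className) throws ClassNotFoundException {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        if (context != null) {
            try {
                return context.loadClass(className);
            } catch (ClassNotFoundException ignored) {
                // fall through to the loader that defined this class
            }
        }
        // Class.forName(String) resolves through the caller's defining class loader,
        // which for a library in the server classpath cannot see EAR classes
        return Class.forName(className);
    }
}
```

With a lookup like this, a library loaded by the server class loader can still resolve application classes, as long as the calling thread's context class loader is the application's.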