Codemesh® Technology vs. Handwritten JNI

Comparison of Codemesh Technology With Handwritten JNI

Introduction

In any integration solution you typically want to minimize the "friction" between the two sides. Friction can be introduced by many factors:

incompatible technologies,
users who are inexperienced in one of the technologies,
undocumented or unknown requirements of one technology.

Some friction is unavoidable. Joel Spolsky's excellent article "The Law of Leaky Abstractions" discusses some aspects of this problem. In a nutshell, it says that every abstraction breaks down for some users at some point and there's nothing you can do about it.

If we look at the problem of integrating Java with C++ or .NET, there are some very obvious areas where we can expect the integration abstraction to break down:

Ease of use
Performance
Reliability

If we look at the Java Native Interface (JNI) as the glue between Java and C (and by extension also .NET), we can easily make the following statements:

JNI is not easy to use and cannot serve as an abstraction layer for the JVM.
We will focus on this aspect in greater detail below.

JNI has great performance.
There were some faster integration approaches, but they were not JVM-portable and did not stand the test of time.

JNI is not very reliable.
This should really read: "Handwritten JNI is not very reliable." That's an important distinction because both Codemesh and all JVM implementors use JNI internally with great success. This is the area on which we will focus most here.

In the sections below, we will contrast the way JNI and Codemesh technology achieve their cross-language integration goals. Please also look at the higher level check list of things you might wish to worry about in connection with handwritten JNI.

Launching a Virtual Machine

JNI publishes the so-called "Invocation Interface" for launching a JVM in your native process. While the APIs by themselves are pretty simple (as illustrated by the snippet below), you are left with a lot of work that nobody talks about in a "Hello world" scenario. The snippet below demonstrates how you might launch a JVM in your C++ process via JNI:

JavaVMInitArgs   args;
JavaVMOption     opts[ 2 ];
JavaVM *         jvm = NULL;
JNIEnv *         env = NULL;

args.ignoreUnrecognized = JNI_FALSE;
args.version = JNI_VERSION_1_2;
args.nOptions = 2;
args.options = opts;

// optionString is a plain char *, hence the casts for string literals
opts[ 0 ].optionString = const_cast<char*>( "-Djava.class.path=myapp.jar" );
opts[ 1 ].optionString = const_cast<char*>( "-Xmx256m" );

JNI_CreateJavaVM( &jvm, (void**)&env, &args );

This looks fairly straightforward. It looks so straightforward because it neglects a lot of details:

Do we wish to deal with potentially pre-existing JVMs in the process?
Integration solutions should be able to deal with other integration solutions. It's never a good idea to assume that you own the world and can start a JVM (JNI only supports one JVM per process).

Where do the JNI_CreateJavaVM symbol and the other JNI types come from?
Are you hard-linking against a JVM library or do you query the symbol from a dynamically loaded JVM? Who writes that code (portably)? Do you require the presence of a JDK for C++ development purposes?

Are you going to deal with platform-portable path and file separators?
That's a little thing, but it can be a hassle for inexperienced users.

Where is the configuration information coming from?
Having a mature configuration API for the Java parts of your application is incredibly helpful.

How is security handled?
Java has a mature security model that's not too hard to use from within Java, but what about using it from a C++ application?

Will this work for all threads in the application?
Hint: it won't...

How are you going to handle errors and misconfigurations?
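To make the path-separator question concrete, here is a minimal sketch of how you might assemble the class path option portably. The helper name buildClassPathOption and the constant kPathSeparator are our own assumptions for illustration; only the -Djava.class.path option itself comes from the snippet above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Java separates class path entries with ';' on Windows and ':' elsewhere.
#ifdef _WIN32
const char kPathSeparator = ';';
#else
const char kPathSeparator = ':';
#endif

// Hypothetical helper: joins the entries into one JVM option string.
std::string buildClassPathOption(const std::vector<std::string>& entries) {
    std::string option = "-Djava.class.path=";
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (i > 0) option += kPathSeparator;
        option += entries[i];
    }
    return option;
}
```

Keep in mind that JavaVMOption::optionString only stores a pointer, so the returned string has to stay alive at least until JNI_CreateJavaVM returns.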

There are many hidden issues around JVM startup that you only become aware of once you've run into them. JunC++ion and JuggerNET have great support in place and handle these issues in a completely sensible and easy to understand way:

xmog_jvm_loader & loader = xmog_jvm_loader::get_jvm_loader();

loader.appendToClassPath( "myapp.jar" );
loader.setMaximumHeapSize( 256 );

try
{
    xmog_jvm *        jvm = loader.load();
}
catch( xmog_exception xe )
{
    ...
}

What you don't see in this snippet is all the work that went into setting up internal details so that Java code will work on all threads, that errors and exceptions are handled consistently, etc. You also don't see all the work that went into the configuration API, allowing you to create self-configuring integrated applications.

Creating a Java object

Once you have a JVM loaded, you will probably wish to create a Java object. Let's start by looking at how this is done the Codemesh way:

Hashtable      ht( 113 );
String         str = "test";

That doesn't look too hard, does it? Now let's look at the JNI way:

jclass         clsHT = env->FindClass( "java/util/Hashtable" );
jmethodID      mCtor = env->GetMethodID( clsHT, "<init>", "(I)V" );
jobject        ht = env->NewObject( clsHT, mCtor, 113 );
jstring        str = env->NewStringUTF( "test" );

In both cases we're neglecting error handling, but let's compare the two snippets before we start talking about that:

In the Codemesh case, you simply declare two proxy instances and initialize them as part of the constructor invocation. You then have two C++ objects on which you can invoke methods and whose fields (if any) you can query.
In the JNI case, you first have to discover the type, then you have to look up the method identifier, then you can invoke the constructor. The constructor invocation yields a jobject, an opaque object handle, that is very hard to use and has to obey several usage restrictions.
The string creation looks easier, but it glosses over the fact that your native string data might not be UTF-8.
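The encoding point deserves a quick illustration. Suppose your native strings are ISO-8859-1 (Latin-1) encoded, which is purely an assumption for this sketch; you would then have to convert them before handing them to NewStringUTF. The function name latin1ToUtf8 is ours and not part of any API:

```cpp
#include <string>

// Converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// For the code points U+0001..U+00FF produced here, standard UTF-8 and the
// "modified UTF-8" that NewStringUTF expects are byte-for-byte identical.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size());
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            // 7-bit ASCII is unchanged.
            utf8 += static_cast<char>(c);
        } else {
            // U+0080..U+00FF become a two-byte sequence.
            utf8 += static_cast<char>(0xC0 | (c >> 6));
            utf8 += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return utf8;
}
```

Other source encodings, and characters outside the Basic Multilingual Plane, need more care because modified UTF-8 encodes them differently from standard UTF-8.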

Let's add error handling to see what happens. We start again with the Codemesh case:

try {
    Hashtable      ht( 113 );
    String         str = "test";
}
catch( Throwable t ) {
    cerr << t.getMessage().to_chars() << endl;
}

Now the JNI case:

jthrowable     exc = NULL;
jclass         clsHT = env->FindClass( "java/util/Hashtable" );

if( clsHT == NULL ) {
   exc = env->ExceptionOccurred();
   env->ExceptionClear();
   throw exc;
}

jmethodID      mCtor = env->GetMethodID( clsHT, "<init>", "(I)V" );

if( mCtor == NULL ) {
   exc = env->ExceptionOccurred();
   env->ExceptionClear();
   throw exc;
}

jobject        ht = env->NewObject( clsHT, mCtor, 113 );

if( ht == NULL ) {
   exc = env->ExceptionOccurred();
   env->ExceptionClear();
   throw exc;
}

jstring        str = env->NewStringUTF( "test" );

if( str == NULL ) {
   exc = env->ExceptionOccurred();
   env->ExceptionClear();
   throw exc;
}

Needless to say, the JNI snippet is not nearly as neat and maintainable as the Codemesh snippet. We haven't even attempted to extract the exception message and do something with it. You might say: I can handle the boilerplate exception stuff in a utility method. You're right, you can! Unfortunately, it is a fact of life that very few people end up doing that, at least not consistently.
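For the record, such a utility could look roughly like the sketch below. Everything in it is hypothetical: checked, JniCallFailed and FakeEnv are names we made up. The helper is a template on the environment type so that the sketch compiles without jni.h; with a real JDK you would pass a JNIEnv *, whose ExceptionCheck and ExceptionClear member functions it relies on.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical C++ exception that records which JNI call failed.
struct JniCallFailed : std::runtime_error {
    explicit JniCallFailed(const std::string& call)
        : std::runtime_error("Java exception pending after " + call) {}
};

// Wraps the result of a JNI call. If a Java exception is pending, it is
// cleared and a C++ exception is thrown instead of returning the result.
// Intended use with a real JNIEnv * (assumption, not compiled here):
//   jclass cls = checked( env, env->FindClass( "java/util/Hashtable" ), "FindClass" );
template <typename Env, typename T>
T checked(Env* env, T result, const char* call) {
    if (env->ExceptionCheck()) {
        env->ExceptionClear();
        throw JniCallFailed(call);
    }
    return result;
}

// Stand-in for JNIEnv, present only so the helper can be exercised
// without a JVM.
struct FakeEnv {
    bool pending = false;
    bool ExceptionCheck() const { return pending; }
    void ExceptionClear() { pending = false; }
};
```

Note that this version throws away the Java exception itself. A fuller helper would grab it via ExceptionOccurred before clearing and extract its message, which is exactly the kind of extra work that tends not to get done.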

Also, what gets easily overlooked in this example is the fact that the Codemesh proxy instances clean up behind themselves when an exception occurs. The handwritten JNI snippet does not and risks leaking references in the JVM. You would have to create helper types and remember to use them consistently to duplicate the safe behavior of the Codemesh snippet.
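A helper type of the kind mentioned above might be sketched as follows. LocalRefGuard and CountingEnv are hypothetical names, and this is not how the Codemesh runtime is implemented. The guard is again a template so that it compiles without jni.h; with a real JDK, Env would be JNIEnv and Ref would be jobject, jclass or jstring, and the destructor would call the real DeleteLocalRef.

```cpp
// Owns one JNI local reference and releases it when the scope ends,
// including when a C++ exception unwinds the stack.
template <typename Env, typename Ref>
class LocalRefGuard {
public:
    LocalRefGuard(Env* env, Ref ref) : env_(env), ref_(ref) {}
    ~LocalRefGuard() {
        if (ref_) env_->DeleteLocalRef(ref_);
    }

    // Copying would release the same reference twice.
    LocalRefGuard(const LocalRefGuard&) = delete;
    LocalRefGuard& operator=(const LocalRefGuard&) = delete;

    Ref get() const { return ref_; }

private:
    Env* env_;
    Ref  ref_;
};

// Stand-in for JNIEnv that only counts releases, so the guard can be
// exercised without a JVM.
struct CountingEnv {
    int deleted = 0;
    void DeleteLocalRef(void*) { ++deleted; }
};
```

The guard only helps if every reference is wrapped in one, which brings us back to the consistency problem.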

Accessing a Field

Now that you have created an object, you might wish to access some of its fields. Accessing fields is one of the more annoying areas of the JNI API because you have to have so much information about the field. That information does not just translate into JNI function call arguments but also into the selection of the proper JNI method to call.

Let's look at the Codemesh way before we get into the JNI details. The following snippet demonstrates how you would access a static and an instance Java field from C++ code:

// access a static field of class Context
String    propName = Context::INITIAL_CONTEXT_FACTORY;

// create an object and access two integer fields
MyType    mt( 3, 4 );
int       i1 = mt.foo;
int       i2 = mt.bar;

The above code is easily readable and maintainable and safe to use; it will throw C++ exceptions when something goes wrong. The code below demonstrates the corresponding JNI code and does not include any error checking:

// access a static field of class Context
jclass    clsContext = env->FindClass( "javax/naming/Context" );
jfieldID  fidICF = env->GetStaticFieldID( clsContext,
                                          "INITIAL_CONTEXT_FACTORY",
                                          "Ljava/lang/String;" );
jstring   propName = (jstring)env->GetStaticObjectField( clsContext, fidICF );

// create an object and access two integer fields
jclass    clsMyType = env->FindClass( "com/myapi/MyType" );
jmethodID midCtor = env->GetMethodID( clsMyType, "<init>", "(II)V" );
jobject   mt = env->NewObject( clsMyType, midCtor, 3, 4 );
jfieldID  fidFoo = env->GetFieldID( clsMyType, "foo", "I" );
jfieldID  fidBar = env->GetFieldID( clsMyType, "bar", "I" );
jint      i1 = env->GetIntField( mt, fidFoo );
jint      i2 = env->GetIntField( mt, fidBar );

Other than the cryptic nature of the API calls and arguments, we want you to focus on a few particular aspects:

Name or type changes of Java fields do not cause compilation errors because that information is conveyed through string values.
The connection between the two sides is only made through data and not through typesafe APIs. Any maintenance work you're doing on your Java code risks breaking your JNI integration layer without a compiler warning or error to help you diagnose the problem.

Just about every single JNI API call that we used in this snippet could throw a Java exception.
You have to check them all or risk your JVM crashing.

There are several Java object references that will require explicit freeing.
If you forget to do it, you're leaking Java objects in the JVM and your application might crash minutes or hours into its lifetime.

Calling a Method

Calling a method is not substantially different from accessing a field; it's just somewhat more complicated due to the method arguments that you might have to pass. Just like a Java field, a Java method is identified by its declaring type, its name, and its type signature. In the case of a method, the signature can be much more complicated because it includes the method parameter types. Compare the following two snippets. Again, the Codemesh snippet first:

// call a static utility method that creates a string
String    id = MyType::create( 3L, "test", Date( 75000L ) );

Now the corresponding JNI snippet:

jclass    clsMyType = env->FindClass( "com/myapi/MyType" );
jmethodID midCreate = env->GetStaticMethodID( clsMyType,
              "create",
              "(JLjava/lang/String;Ljava/util/Date;)Ljava/lang/String;" );
jclass    clsDate = env->FindClass( "java/util/Date" );
jmethodID midCtor = env->GetMethodID( clsDate, "<init>", "(J)V" );
// variadic arguments must be passed as jlong; a plain long may be only 32 bits wide
jobject   dt = env->NewObject( clsDate, midCtor, (jlong)75000 );
jstring   test = env->NewStringUTF( "test" );
jstring   result = (jstring)env->CallStaticObjectMethod( clsMyType, midCreate, (jlong)3, test, dt );

Notice that we're neither performing cleanup nor error handling in the JNI snippet. If we did, it would be even more convoluted and error-prone. The Codemesh snippet does not require special error handling or cleanup because it's all included in the generated proxy classes and in the Codemesh runtime.

In practice, methods typically give you much more grief than fields because multiple arguments compound the cleanup problem as well as the maintenance problem: a method is much more likely to have its signature changed than a field is to have its type changed.
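The signature string passed to GetStaticMethodID in the JNI snippet follows a fixed grammar: one letter per primitive type, L...; around class names with slashes instead of dots, a leading [ per array dimension, and the parameter types in parentheses followed by the return type. The sketch below spells that grammar out in code. The function names typeDescriptor and methodDescriptor are our own, and the sketch assumes you pass binary class names (nested classes already written with $).

```cpp
#include <map>
#include <string>
#include <vector>

// Maps one Java type name (e.g. "long", "java.util.Date", "int[]") to its
// JNI type descriptor.
std::string typeDescriptor(std::string javaType) {
    static const std::map<std::string, std::string> primitives = {
        {"boolean", "Z"}, {"byte", "B"},  {"char", "C"},
        {"short", "S"},   {"int", "I"},   {"long", "J"},
        {"float", "F"},   {"double", "D"}, {"void", "V"}};

    // Each trailing "[]" becomes one leading '['.
    std::string prefix;
    while (javaType.size() >= 2 &&
           javaType.compare(javaType.size() - 2, 2, "[]") == 0) {
        prefix += '[';
        javaType.erase(javaType.size() - 2);
    }

    auto it = primitives.find(javaType);
    if (it != primitives.end()) return prefix + it->second;

    // Class names use '/' instead of '.' and are wrapped in L...;
    for (char& c : javaType) {
        if (c == '.') c = '/';
    }
    return prefix + "L" + javaType + ";";
}

// Builds "(<params>)<return>" from Java type names.
std::string methodDescriptor(const std::vector<std::string>& params,
                             const std::string& returnType) {
    std::string descriptor = "(";
    for (const std::string& p : params) descriptor += typeDescriptor(p);
    return descriptor + ")" + typeDescriptor(returnType);
}
```

Calling methodDescriptor with the parameter types long, java.lang.String and java.util.Date and the return type java.lang.String reproduces the signature string used for the create method. A helper like this removes typos, but it does nothing about the underlying problem: if the Java method changes, the C++ side still compiles.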

Callbacks

JNI is a very complete and well-designed API... which no human should ever have to use. When we started working on our integration solutions, we slowly became JNI experts and we were continuously amazed by the features that we discovered hidden in the JNI API. The designers of the JNI API, which now consists of over 200 functions, had foreseen just about every use case that we wished to support. We only unearthed one glaring hole in the design, and that hole involves callbacks.

In our use case, a callback is an asynchronous C++ entry point that is invoked from the Java side. This might sound like an obscure use case to you, but you really need it because a lot of Java APIs are designed around Listener interfaces that you are supposed to implement and register with event sources. When you're a C++ developer who is using such a Java API, you don't want to be forced to implement a piece of your application in Java because the integration technology you're using does not allow you to implement it in C++. That would be one of those areas of unexpected "friction" that we wish to avoid at all cost.

We spent half a man year on designing and implementing the callback feature and many weeks more on perfecting it over the years. We had to use many different JNI APIs in conjunction and ended up with a feature that has no counterpart in out-of-the-box JNI: you can extend a Java Listener interface in C++, register it with an event source and have your C++ methods called from Java!

If you think you can come up with your own callback design and implementation, you're probably right. Just don't be surprised if you end up spending a lot more time on it than you expected!
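To give you a feeling for just one ingredient of such a design, consider the question of routing a call that arrives from Java to the right C++ object. The sketch below is not the Codemesh implementation. It assumes a design in which a Java-side listener stub stores an integer handle and passes it to a single native entry point, which then looks up the C++ listener. The class name ListenerRegistry and the string-valued event are hypothetical simplifications.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical registry that maps opaque handles to C++ listeners.
class ListenerRegistry {
public:
    using Listener = std::function<void(const std::string& event)>;

    // Returns the handle that a Java-side stub would keep, e.g. in a long field.
    std::int64_t add(Listener listener) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::int64_t handle = nextHandle_++;
        listeners_[handle] = std::move(listener);
        return handle;
    }

    void remove(std::int64_t handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        listeners_.erase(handle);
    }

    // What the native entry point would call. Returns false for handles
    // that are unknown or were already removed.
    bool dispatch(std::int64_t handle, const std::string& event) {
        Listener target;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = listeners_.find(handle);
            if (it == listeners_.end()) return false;
            target = it->second;  // copy, so the lock is not held during the call
        }
        target(event);
        return true;
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::int64_t, Listener> listeners_;
    std::int64_t nextHandle_ = 1;
};
```

Even this small piece has to worry about threads and stale handles, and it does not yet touch argument conversion, exceptions thrown by the C++ listener, or the lifetime of the Java stub.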

Summary

JNI is a very nice and very well designed integration API that should not be used for larger integration projects unless you are using automated code generation technology.
Even a small amount of handwritten JNI code is a disaster that is waiting to happen.
Don't rely on the resident JNI expert: (s)he won't be there forever and will be extremely hard to replace!
For two languages like Java and C++, languages that have a lot in common, JNI introduces an awful lot of "friction."