Lesson 8: Using Array Proxy Types
Introduction
Arrays require some special facilities to be useful in a natural way. There are some basic differences between arrays in Java and arrays in C++. C++ inherits the C interpretation of arrays: an array is simply a pointer into memory. The memory holds (aligned) instances of the array element type.
A Java array on the other hand is a full object. It has a length property that allows you to query the array length and it has some memory that holds the actual array data. The memory used to hold the array elements might or might not be contiguous; the JVM does not have to make any guarantees about this.
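To make the C++ side of this contrast concrete, here is a minimal sketch in plain C++ with no framework types involved. The function names are illustrative only. It shows that once an array is passed to a function it is just a pointer, so the element count has to be supplied separately.

```cpp
#include <cstddef>

// Sums `count` ints starting at `data`. Inside the function the array is
// only a pointer, so the element count must travel as a separate argument.
long sum_elements(const int* data, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}

// Element count of a true array, computed while its type is still known.
// After decay to a pointer this information is no longer recoverable.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}
```

A Java array, by contrast, carries its length with it, which is why the proxy array types can offer a length field.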
Making C++ arrays and Java arrays interoperable involves trade-offs: some integration features are easy to provide, whereas others are very hard. A quick summary of the available integration features follows:
- there is a dedicated proxy type for every primitive array type.
- there is a template type for arrays of object references.
- all proxy array types extend xmog_java_array.
- all proxy array types have a read-only length field.
- all proxy array types have overloaded subscript operators that allow per-element access to the array. Please see the section on performance that discusses other access options.
Primitive Array Types
All primitive proxy array types follow a simple naming policy: xmog_java_primtype_array. Simply replace primtype with the C++ counterpart to a Java primitive type and you have the corresponding proxy array type, for example:
// create a Java boolean[] array of size 10
xmog_java_bool_array bArr( 10 );
// create a Java int[] array of the same size
xmog_java_int_array iArr( bArr.length );
for( int i=0, imax=bArr.length; i<imax; i++ )
    iArr[ i ] = i;
You can see that the creation of array instances looks different from what you would normally write in Java, but once created, a proxy array behaves exactly like its Java counterpart.
Take care with the length field if you use it in log statements or printf calls where it is passed as part of a variable-length argument list. While it converts to an integer when used in C++ expressions, its type is not int but an object that knows how to retrieve the array length. When in doubt, cast it to an int and you will be fine. The same caution applies to the elements of an array. If you assign them to a primitive variable or use them in a context where the compiler can infer that a primitive value is required, you will be fine, but take care with variable-length argument lists, where the compiler does not know that the conversion operator should be called.
xmog_java_int_array iArr( 10 );
// this works
printf( "The array length is %d\n", (int)iArr.length );
// this misbehaves or fails to compile because length is not an int
printf( "The array length is %d\n", iArr.length );
// this will be fine
jint i = iArr[ 3 ];
// this will also be fine
printf( "The #3 element is %d\n", (int)iArr[ 3 ] );
// this might be problematic
printf( "The #3 element is %d\n", iArr[ 3 ] );
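The reason for the cast can be reproduced without the framework. The following is a minimal sketch in which LengthProxy is a hypothetical stand-in for the type of the length field, not the framework's actual class. It contrasts a typed context, where the conversion operator is applied implicitly, with a variable-length argument list, where the conversion has to be requested explicitly.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a proxy field: not an int, but convertible to one.
class LengthProxy {
public:
    explicit LengthProxy(int value) : value_(value) {}
    operator int() const { return value_; }  // conversion operator
private:
    int value_;
};

// Typed context: the compiler knows an int is required and converts implicitly.
int doubled(const LengthProxy& length) {
    int n = length;
    return n * 2;
}

// Variadic context: the ellipsis of snprintf carries no type information, so
// the conversion operator is never chosen automatically. The cast forces it.
std::string describe_length(const LengthProxy& length) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "The array length is %d",
                  static_cast<int>(length));
    return std::string(buffer);
}
```

The sketch uses static_cast where the snippet above uses a C-style cast; both request the same conversion.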
Reference Type Arrays
All proxy array types for Java reference types, i.e. the interface and class types for which you generated proxy types, are derived from a type called xmog_java_object_array, but you typically use them via the template type xmog_java_array_template<>.
A one-dimensional array of proxy String instances can be declared as xmog_java_array_template<java::lang::String> or, more concisely, by using a typedef in the String proxy type, as String::array1D.
The code generator emits two typedefs for single- and two-dimensional array types as part of the proxy type declaration. If you need to deal with higher-dimensional arrays you can simply create your own typedefs for their types, for example:
typedef xmog_java_array_template<java::lang::String::array2D> StringArray3D;
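The nesting pattern behind such typedefs can be sketched in plain C++. Everything in the block below is a hypothetical stand-in: array_template mimics the role of the framework template, StringProxy mimics a generated proxy class, and the assumption that array2D is defined as an array of array1D is inferred from the typedef above rather than taken from the generated code.

```cpp
#include <type_traits>

// Hypothetical stand-in for the framework's array template.
template <typename Element>
struct array_template {
    typedef Element element_type;
};

// Hypothetical stand-in for a generated proxy class with its two typedefs.
struct StringProxy {
    typedef array_template<StringProxy> array1D;
    typedef array_template<array1D>     array2D;  // assumed nesting
};

// User-defined typedef for the next dimension, following the same pattern.
typedef array_template<StringProxy::array2D> StringArray3D;

// Counts how many array levels wrap the innermost element type.
template <typename T>
struct dimensions { static constexpr int value = 0; };

template <typename Element>
struct dimensions<array_template<Element> > {
    static constexpr int value = 1 + dimensions<Element>::value;
};

// Each element of the 3D type is a 2D array.
static_assert(std::is_same<StringArray3D::element_type,
                           StringProxy::array2D>::value,
              "elements of the 3D array are 2D arrays");
```

Each additional dimension is one more level of wrapping, so a four-dimensional type would wrap StringArray3D in the same way.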
Other than that, the reference type array types behave exactly like their primitive counterparts.
Performance
The subscript operators allow per-element access to the proxy array. Using the operators is very convenient, but it is hardly the highest-performing way to access the array: for each element, we are calling across the language boundary at least once, and because there is no way to provide the xmog_localenv pointer as part of the operator call, we also need to access thread-local storage once for each access. In array-heavy use cases, this combined overhead could completely dominate the application's execution speed if you choose ease of use over performance. The following snippet shows an example of such array usage. Please note that there's nothing wrong with using the framework array types like that. You should always write easily readable and maintainable code first and then optimize performance in the areas that profiling, rather than guessing, indicates to be problematic.
// create an integer array of size 10000
xmog_java_int_array iArr( 10000 );
...
// iterate over the array elements and double them
for( int i=0; i<iArr.length; i++ )
    iArr[ i ] *= 2;
There are other options that are a little harder to use, but offer much better performance. You might for example be better off performing the array traversal on a native array and then copying the native array into a Java array.
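The difference between the two approaches can be illustrated with a small mock. MockIntArray below is a hypothetical stand-in, not a framework type: instead of calling into a JVM, it increments a counter wherever a real proxy array would cross the language boundary. The cost model is an assumption and deliberately simplified (one crossing per element read, per element write, and per bulk creation), so it shows the shape of the trade-off rather than real timings.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a proxy array. Every method that would call into
// the JVM bumps a counter instead (a simplified cost model, not a measurement).
class MockIntArray {
public:
    // Bulk creation from native memory: modeled as a single crossing.
    MockIntArray(const int* source, std::size_t count)
        : data_(source, source + count), crossings_(1) {}

    int get(std::size_t index) { ++crossings_; return data_[index]; }
    void set(std::size_t index, int value) { ++crossings_; data_[index] = value; }

    std::size_t size() const { return data_.size(); }
    std::size_t crossings() const { return crossings_; }

private:
    std::vector<int> data_;
    std::size_t crossings_;
};

// Per-element approach: one read and one write crossing for every element.
std::size_t double_per_element(MockIntArray& array) {
    for (std::size_t i = 0; i < array.size(); ++i)
        array.set(i, array.get(i) * 2);
    return array.crossings();
}

// Native-first approach: do the traversal on native memory, then create the
// array once. Returns the crossings used by the newly created array.
std::size_t double_natively(std::vector<int>& native) {
    for (std::size_t i = 0; i < native.size(); ++i)
        native[i] *= 2;
    MockIntArray array(native.data(), native.size());
    return array.crossings();
}
```

Under this model the per-element version grows with the array size while the native-first version stays constant, which is the pattern to look for when profiling points at array access.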
Quite frequently, you are faced with the problem of having to convert a native array type to the corresponding Java array type. You might wish for conversion operators between the two sides, but we believe that conversion operators would encourage coding patterns that lead to hard-to-understand problems. Let's assume for a moment that we had trivially usable conversion operators. What if you wrote code like this:
jint nativeInputOutputArray[ 10 ];
JavaAlgorithm::calculate( nativeInputOutputArray );
In the above example, the native array would be transformed into a temporary Java array when it is passed to the calculate method. The calculate method reads the array elements, performs a calculation, and returns the result in the passed-in array. Now the problem is this: does the native array contain the output or still the input?
We really can't know what's being done to an array on the Java side, so to be safe, we would always have to update the native array completely with the contents of the temporary Java array if we wanted to make sure that the user is not surprised by runtime behavior differences between C++ code and Java code. For that reason (and other more technical reasons), we have chosen to make conversion between native and Java arrays an explicit operation. A proxy array type has a constructor that takes a pointer to some native memory and an array size as input and creates a Java array from that native array. The following code snippet illustrates this:
jint nativeInputOutputArray[] = { 0, 1, 2, 3, 4 };
xmog_java_int_array javaInputOutputArray( nativeInputOutputArray, 5 );
JavaAlgorithm::calculate( javaInputOutputArray );
Now, the intent is perfectly clear. The javaInputOutputArray instance is going to receive all updates (if any) and then it is up to the user to set them back into the native array.
jint nativeInputOutputArray[] = { 0, 1, 2, 3, 4 };
xmog_java_int_array javaInputOutputArray( nativeInputOutputArray, 5 );
JavaAlgorithm::calculate( javaInputOutputArray );
javaInputOutputArray.to_native( nativeInputOutputArray, 0, 5 );
As with all framework methods, you can pass in an optional xmog_localenv pointer to slightly improve the performance.
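The ambiguity that motivates the explicit design can be reproduced without the framework. In the sketch below, ManagedArray is a hypothetical stand-in for a Java array: it holds its own copy of the data behind a shared handle to mimic Java reference semantics, and, unlike the framework types, it deliberately offers the implicit conversion argued against above. The names ManagedArray, copy_to and calculate are illustrative assumptions, not framework API.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Java array: a handle to its own copy of the data.
class ManagedArray {
public:
    // Explicit creation from native memory plus an element count.
    ManagedArray(const int* source, std::size_t count)
        : data_(std::make_shared<std::vector<int> >(source, source + count)) {}

    // Implicit conversion from a native array: the convenience under discussion.
    template <std::size_t N>
    ManagedArray(const int (&source)[N])
        : data_(std::make_shared<std::vector<int> >(std::begin(source),
                                                    std::end(source))) {}

    int& operator[](std::size_t index) { return (*data_)[index]; }
    std::size_t size() const { return data_->size(); }

    // Explicit copy back into native memory (at most `count` elements).
    void copy_to(int* target, std::size_t count) const {
        for (std::size_t i = 0; i < count && i < data_->size(); ++i)
            target[i] = (*data_)[i];
    }

private:
    std::shared_ptr<std::vector<int> > data_;
};

// Stand-in for a Java method that writes its results into the passed-in array.
// The handle is copied, but it still refers to the caller's managed data.
void calculate(ManagedArray array) {
    for (std::size_t i = 0; i < array.size(); ++i)
        array[i] *= 2;
}
```

Calling calculate with a native array compiles thanks to the implicit conversion, but the results land in a temporary and the native array keeps its input values. With a named ManagedArray, the results stay reachable and reach the native array only when copy_to is called, which makes the data flow visible in the code.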