: junc++ion generate

Code Generator Introduction

The JunC++ion code generator is a sophisticated engine that

  1. takes compiled Java bytecode as input,

  2. builds a model representation of the Java types,

  3. transforms that Java model to a C++ model (governed by many options),

  4. renders the C++ model through a set of customizable templates (again governed by many options).

This multi-stage transformation process lets us react quickly to customers' requests for enhancements or fixes. The code generator is flexible enough that it could theoretically generate any kind of code, but we ship it with a particular set of templates and settings that generate C++ proxy types for your Java types. The generated proxy types allow you to use the original Java types in your C++ applications.

Prerequisites

The code generator is written in Java and is currently built to be backwards compatible with Java 7. The command-line (CLI) version should work with any Java version from Java 7 onward. The GUI version of the code generator uses Swing components from JIDESoft. Unfortunately, we relied on the JDAF product, which has been broken on macOS since Java 9. So far, there has been no fix.

Architecture

Let's take a look at the code generator architecture. The image below shows the flow from Java bytecode on the left to C++ on the right.

[Figure: Flow in code generator]

The code generator primarily takes Java bytecode as input. It accepts all current bytecode containers (.class, .jar, .jmod). Which Java types are imported is controlled primarily by a model file that you create either with the code generator GUI or, if you are an expert user, with a text editor. That file persists the import commands for your Java types as well as customizations to the properties that govern the import process. The model file is the only input file you are required to supply in addition to the Java types.

During the import phase, you can use model properties to customize, for example, which Java types, methods, and fields are excluded from the model and which types are enabled by default. Some import properties can also be supplied through properties files or as property values on the command line.
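The idea of supplying the same property from several sources can be sketched with plain java.util.Properties. This is only an illustration: the property names, the helper method, and in particular the precedence order (command line over properties file over model file) are assumptions made for this example, not documented code generator behavior.

```java
import java.util.Properties;

// Hypothetical illustration of layered property lookup; not the code generator's implementation.
class LayeredProperties {

    /** Creates a layer whose lookups fall back to {@code lower} for any key it does not define. */
    static Properties layer(Properties lower, String... keyValuePairs) {
        Properties layer = new Properties(lower); // 'lower' serves as the defaults table
        for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
            layer.setProperty(keyValuePairs[i], keyValuePairs[i + 1]);
        }
        return layer;
    }

    public static void main(String[] args) {
        // All property names below are invented for the example.
        Properties model = layer(null,
                "import.exclude.types", "sun.*",
                "import.enable.by.default", "false");
        Properties file = layer(model, "import.exclude.types", "sun.*,com.sun.*");
        Properties commandLine = layer(file, "import.enable.by.default", "true");

        // Each lookup walks from the top layer down until a value is found.
        System.out.println(commandLine.getProperty("import.exclude.types"));
        System.out.println(commandLine.getProperty("import.enable.by.default"));
    }
}
```

Chaining through the defaults table keeps each source untouched, so it stays possible to tell where an effective value came from.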

The result of the import process is a Java model that is internal to the code generator but can be visualized in the code generator GUI.

The next step is the transformation of this Java model into a proxy type model. Various transformation properties govern what happens during this step. You can, for example, specify a name transformation policy that translates the names of Java elements into the names of their corresponding proxy elements.
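To make the idea of a name transformation policy concrete, here is a minimal sketch of one possible policy. It assumes that Java packages and nested types map to C++ namespaces and that names colliding with C++ keywords get a trailing underscore. Both rules, the class name, and the method name are hypothetical and chosen only for illustration; the policies the code generator actually offers may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical name transformation policy; not the code generator's actual rules.
class NamePolicySketch {

    // A small, deliberately incomplete sample of C++ keywords that are legal Java identifiers.
    private static final Set<String> CPP_KEYWORDS = new HashSet<>(Arrays.asList(
            "delete", "namespace", "template", "typename", "union", "operator",
            "struct", "friend", "inline", "virtual", "typedef", "auto", "extern"));

    /** Maps a binary Java type name such as "java.util.Map$Entry" to a C++-style qualified name. */
    static String toCppName(String javaName) {
        StringBuilder cpp = new StringBuilder();
        for (String part : javaName.split("[.$]")) { // split on package dots and nested-type '$'
            if (cpp.length() > 0) {
                cpp.append("::");
            }
            // Assumed escaping rule: append an underscore to avoid keyword collisions.
            cpp.append(CPP_KEYWORDS.contains(part) ? part + "_" : part);
        }
        return cpp.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCppName("java.lang.Object"));         // java::lang::Object
        System.out.println(toCppName("java.util.Map$Entry"));      // java::util::Map::Entry
        System.out.println(toCppName("com.acme.template.Engine")); // com::acme::template_::Engine
    }
}
```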

The proxy model is purely internal to the code generator and usually only exists as a transient object during the code generation process. Its purpose is to represent the proxy type system in a way that lends itself to code generation.

The final step is the rendering of this proxy type model through a set of templates, controlled by properties such as the target directory or the selection of templates to render. In general, all required templates are supplied by the code generator, but you can override the default templates or supplement them with additional templates. A nice use case for this would be to generate a report of the types that were rendered, or additional build files that we do not support out of the box.

Either way, the end result is a set of output files that consists of compilable C++ source code as well as build files in various formats.

Why a Code Generator GUI?

At first, a GUI seemed like just icing on the cake. After all, real programmers use command-line interfaces, right? The particular characteristics of cross-language integration as described here forced us to reconsider. When you say that you want to use Java from C++, you usually don't mean that you want to use all 12,000 (or so) types that come bundled with your Java Runtime Environment. You typically have a particular integration in mind that relies on a few dozen to a few hundred Java types being available in C++. Unfortunately, these types rely on hundreds if not thousands of other Java types in their implementation details.

It is very hard to define the boundary between the types that you wish to actively use as proxy types and the types that are merely required at runtime to allow your application to run. Generating all of the types is not an option because it would create a huge, bloated code base, 99.9% of which would be unnecessary. On the other hand, generating only the minimal set of types might force you to regenerate frequently, whenever a user of the generated code discovers an API she wants to use that you did not include in the previous generation. Typically, there's a sweet spot between bloated and sparse that allows your developers to do what they need to without making the integration too large.

Let's look at a concrete example: take the java.lang.Object type. Let's say you just want to "use java.lang.Object" from C++. That means that we only have to generate one proxy type, right? Well, not so fast. Object relies on Class and String in its public API, so we also have to generate those two types if we want you to be able to use the methods with String or Class return types. But now we have to look at what the String and Class types rely on in turn, and so on.

It turns out that just importing Object pulls in hundreds of related types, most of which you probably don't need or want. The code generator has built-in, customizable rules that govern the automatic pruning that takes place, but only you can be the ultimate arbiter of which types you need. The GUI gives you the ability to interactively define exactly what you want. Afterwards, the CLI version gives you the ability to generate proxy types as part of your daily or CI build.
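The growth described for java.lang.Object can be observed with plain Java reflection. The sketch below is not how the code generator works internally. It is a rough approximation that follows superclasses, interfaces, and the types in public method, constructor, and field signatures, with none of the pruning rules mentioned above. The class and method names are made up for this example, and the counts it prints depend on the Java version you run it with.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: approximates how many types a single import can pull in.
class ApiClosure {

    /** Reference types that appear directly in the public API of {@code type}. */
    static Set<Class<?>> directDependencies(Class<?> type) {
        Set<Class<?>> raw = new LinkedHashSet<>();
        try {
            if (type.getSuperclass() != null) raw.add(type.getSuperclass());
            Collections.addAll(raw, type.getInterfaces());
            for (Method m : type.getMethods()) {
                raw.add(m.getReturnType());
                Collections.addAll(raw, m.getParameterTypes());
            }
            for (Constructor<?> c : type.getConstructors()) {
                Collections.addAll(raw, c.getParameterTypes());
            }
            for (Field f : type.getFields()) raw.add(f.getType());
        } catch (LinkageError e) {
            // A referenced class could not be loaded; keep whatever was collected so far.
        }
        Set<Class<?>> result = new LinkedHashSet<>();
        for (Class<?> dep : raw) {
            while (dep.isArray()) dep = dep.getComponentType(); // String[] counts as String
            if (!dep.isPrimitive()) result.add(dep);            // drops int, boolean, void, ...
        }
        result.remove(type);
        return result;
    }

    /** All types reachable from {@code root} in at most {@code maxDepth} steps, root included. */
    static Set<Class<?>> closure(Class<?> root, int maxDepth) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        seen.add(root);
        Set<Class<?>> frontier = new LinkedHashSet<>(seen);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Set<Class<?>> next = new LinkedHashSet<>();
            for (Class<?> current : frontier) {
                for (Class<?> dep : directDependencies(current)) {
                    if (seen.add(dep)) next.add(dep); // only expand types not visited before
                }
            }
            frontier = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("Direct dependencies of Object: " + directDependencies(Object.class));
        for (int depth = 1; depth <= 3; depth++) {
            System.out.println("Depth " + depth + ": " + closure(Object.class, depth).size() + " types");
        }
    }
}
```

Even with the depth capped at a few levels, the number of reachable types grows quickly, which is why some form of pruning and an interactive way to choose the boundary are useful.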