Sonntag, 8. April 2018

JDK 11 and proxies in a world past sun.misc.Unsafe

With JDK 11 the first methods of sun.misc.Unsafe are retired. Among them, the defineClass method was removed. This method has been commonly used by code generation frameworks to define new classes in existing class loaders. While this method was convenient to use, its existence also rendered the JVM inherently unsafe, just as the name of its defining class suggests. By allowing a class to be defined in any class loader and package, it became possible to gain package-scoped access to any package by defining a class within it, thus breaching the boundaries of an otherwise encapsulated package or module.

With the goal of removing sun.misc.Unsafe, the OpenJDK started offering an alternative for defining classes at runtime. Since version 9, the MethodHandles.Lookup class offers a method defineClass similar to the unsafe version. However, the class definition is only permitted for a class that resides in the same package as the lookup's hosting class. As a module can only resolve lookups for packages that are owned by a module or that are opened to it, classes can no longer be injected into packages that did not intend to give such access.

Using method handle lookups, a class foo.Qux can be defined during runtime as follows:
MethodHandles.Lookup lookup = MethodHandles.lookup();
MethodHandles.Lookup privateLookup = MethodHandles.privateLookupIn(foo.Bar.class, lookup);
byte[] fooQuxClassFile = createClassFileForFooQuxClass();
privateLookup.defineClass(fooQuxClassFile);

In order to perform a class definition, an instance of MethodHandles.Lookup is required which can be retrieved by invoking the MethodHandles::lookup method. Invoking the latter method is call site sensitive; the returned instance will therefore represent the privileges of the class and package from within the method is invoked. To define a class in another package then the current one, a class from this package is required to resolve against it using MethodHandles::privateLookupIn. This will only be possible if this target class's package resides in the same module as the original lookup class or if this package is explicitly opened to the lookup class's module. If those requirements are not met, attempting to resolving the private lookup throws an IllegalAccessException, protecting the boundaries that are implied by the JPMS.

Of course, code generation libraries are also constrained by this limitation. Otherwise they could be used to create and inject malicious code. And since the creation of method handles is call-site sensitive, it is not possible to incorporate the new class definition mechanism without requiring users to do some additional work by providing an appropriate lookup instance that represents the privileges of their module.

When using Byte Buddy, the required changes are fortunately minimal. The library defines classes using a ClassDefinitionStrategy which is responsible for the loading a class from its binary format. Prior to Java 11, a class could be defined using reflection or sun.misc.Unsafe using ClassDefinitionStrategy.Default.INJECTION. To support Java 11, this strategy needs to be replaced by ClassDefinitionStrategy.UsingLookup.of(lookup) where the provided lookup must have access to the package in which a classes would reside.

 

Migrating cglib proxies to Byte Buddy


As of today, other code generation libraries do not provide such a mechanism and it is uncertain when and if such capabilities are added. Especially for cglib, API changes have proven problematic in the past due to the libraries old age and widespread use in legacy applications that are no longer updated and would not adopt modifications. For users that want to adopt Byte Buddy as a more modern and actively developed alternative, the following segment will therefore describe a possible migration.

As an example, we generate a proxy for the following sample class with a single method:
public class SampleClass {
  public String test() { 
    return "foo"; 
  }
}

To create a proxy, the proxied class is normally subclassed where all methods are overridden to dispatch the interception logic. Doing so, we append a value bar to the return value of the original implementation as an example.

A cglib proxy is typically defined using the Enhancer class in combination with an MethodInterceptor. A method interceptor supplies the proxied instance, the proxied method and its arguments. Finally, it also provides an instance of MethodProxy which allows to invoke the original code.
Enhancer enhancer = new Enhancer();
enhancer.setSuperclass(SampleClass.class);
enhancer.setCallback(new MethodInterceptor() {
  @Override
  public Object intercept(Object obj, Method method, Object[] args, MethodProxy proxy) {
    return proxy.invokeSuper(obj, method, args) + "bar";
  }
});
SampleClass proxy = (SampleClass) enhancer.create();
assertEquals("foobar", proxy.test());

Note that the above code will cause a problem if any other method such as hashCode, equals or toString was invoked on the proxy instance. The first two methods would also be dispatched by the interceptor and therefore cause a class cast exception when cglib attemted to return the string-typed return value. In contrast, the toString method would work but return an unexpected result as the original implementation was prefixed to bar as a return value.

In Byte Buddy, proxies are not a dedicated concept but can be defined using the library’s generic code generation DSL. For an approach that is the most similar to cglib, using a MethodDelegation offers the easiest migration path. Such a delegation targets a user-defined interceptor class to which method calls are dispatched:
public class SampleClassInterceptor {
  public static String intercept(@SuperCall Callable<String> zuper) throws Exception {
    return zuper.call() + "bar";
  }
}

The above interceptor first invokes the original code via a helper instance that is provided by Byte Buddy on demand. A delegation to this interceptor is implemented using Byte Buddy’s code generation DSL as follows:
SampleClass proxy = new ByteBuddy()
  .subclass(SampleClass.class)
  .method(ElementMatchers.named("test"))
  .intercept(MethodDelegation.to(SampleClassInterceptor.class))
  .make()
  .load(someClassLoader, ClassLoadingStrategy.UsingLookup.of(MethodHandles
      .privateLookupIn(SampleClass.class, MethodHandles.lookup()))
  .getLoaded()
  .getDeclaredConstructor()
  .newInstance();
assertEquals("foobar", proxy.test());

Other than cglib, Byte Buddy requires to specify a method filter using an ElementMatcher. While filtering is perfectly possible in cglib, it is quite cumbersome and not explicitly required and therefore easily forgotten. In Byte Buddy, all methods can still be intercepted using the ElementMatchers.any() matcher but by requiring to specify such a matcher, users are hopefully reminded to make a meaningful choice.

With the above matcher, any time a method named test is invoked, the call will be delegated to the specified interceptor using a method delegation as discussed.

The interceptor that was introduced would however fail to dispatch methods that do not return a string instance. As a matter of fact, the proxy creation would yield an exception issued by Byte Buddy. It is however perfectly possible to define a more generic interceptor that can be applied to any method similar to the one offered by cglib's MethodInterceptor:
public class SampleClassInterceptor {
  @RuntimeType
  public static Object intercept(
      @Origin Method method,
      @This Object self,
      @AllArguments Object[] args,
      @SuperCall Callable<String> zuper
  ) throws Exception {
    return zuper.call() + "bar";
  }
}

Of course, since the additional arguments of the interceptor are not used in this case, they can be omitted what renders the proxy more efficient. Byte Buddy will only provide arguments on demand and if they are actually required.

As the above proxy is stateless, the interception method is defined to be static. Again, this is an easy optimization as Byte Buddy otherwise needs to define a field in the proxy class that holds a reference to the interceptor instance. If an instance is however required, a delegation can be directed to a member method of an instance using MethodDelegation.to(new SampleClassInterceptor()).

 

Caching proxy classes for performance


When using Byte Buddy, proxy classes are not automatically cached. This means that a new class is generated and loaded every time the above code is run. With code generation and class definition being expensive operations, this is of course inefficient and should be avoided if proxy classes can be reused. In cglib, a previously generated class is returned if the input is identical for two enhancements what is typically true when running the same code segment twice. This approach is however rather error prone and often inefficient since a cache key can normally be calculated much easier. With Byte Buddy, a dedicated caching library can be used instead, if such a library already is available. Alternatively, Byte Buddy also offers a TypeCache that implements a simple cache for classes by a user-defined cache key. For example, the above class generation can be cached using the base class as a key using the following code:
TypeCache<Class<?>> typeCache = new TypeCache<>(TypeCache.Sort.SOFT);
Class<?> proxyType = typeCache.findOrInsert(classLoader, SampleClass.class, () -> new ByteBuddy()
  .subclass(SampleClass.class)
  .method(ElementMatchers.named("test"))
  .intercept(MethodDelegation.to(SampleClassInterceptor.class))
  .make()
  .load(someClassLoader, ClassLoadingStrategy.UsingLookup.of(MethodHandles
      .privateLookupIn(SampleClass.class, MethodHandles.lookup()))
  .getLoaded()
});

Unfortunately, caching classes in Java brings some caveats. If a proxy is created, it does of course subclass the class it proxies what makes this base class ineligible for garbage collection. Therefore, if the proxy class was referenced strongly, the key would also be referenced strongly. This would render the cache useless and open for memory leaks. Therefore, the proxy class must be referenced softly or weakly what is specified by the constructor argument. In the future, this problem might be resolved if Java introduced ephemerons as a reference type. At the same time, if garbage collection of proxy classes is not an issue, a ConcurrentMap can be used to compute a value on absence.

 

Broaden the usability of proxy classes


To embrace reuse of proxy classes, it is often meaningful to refactor proxy classes to be stateless and to rather isolate state into an instance field. This field can then be accessed during the interception using the mentioned dependency injection mechanism, for example, to make the suffix value configurable per proxy instance:
public class SampleClassInterceptor {
  public static String intercept(@SuperCall Callable<String> zuper, 
        @FieldValue("qux") String suffix) throws Exception {
    return zuper.call() + suffix;
  }
}

The above interceptor now receives the value of a field qux as a second argument which can be declared using Byte Buddy’s type creation DSL:
TypeCache<Class<?>> typeCache = new TypeCache<>(TypeCache.Sort.SOFT);
Class<?> proxyType = typeCache.findOrInsert(classLoader, SampleClass.class, () -> new ByteBuddy()
    .subclass(SampleClass.class)
    .defineField(“qux”, String.class, Visibility.PUBLIC)
    .method(ElementMatchers.named("test"))
    .intercept(MethodDelegation.to(SampleClassInterceptor.class))
    .make()
    .load(someClassLoader, ClassLoadingStrategy.UsingLookup.of(MethodHandles
        .privateLookupIn(SampleClass.class, MethodHandles.lookup()))
    .getLoaded()
});

The field value can now be set on an every instance after its creation using Java reflection. In order to avoid reflection, the DSL can also be used to implement some interface that declares a setter method for the mentioned field which can be implemented using Byte Buddy's FieldAccessor implementation.

 

Weighting Proxy runtime and creation performance


Finally, when creating proxies using Byte Buddy, some performance considerations need to be made. When generating code, there exists a trade-off between the performance of the code generation itself and the runtime performance of the generated code. Byte Buddy typically aims for creating code that runs as efficiently as possible what might require additional time for the creation of such code compared to cglib or other proxing libraries. This bases on the assumption that most applications run for a long time but only creates proxies a single time what does however not hold for all types of applications.

As an important difference to cglib, Byte Buddy generates a dedicated super call delegate per method that is intercepted rather then a single MethodProxy. These additional classes take more time to create and load but having these classes available results in better runtime performance for each method execution. If a proxied method is invoked in a loop, this difference can quickly be crucial. If runtime performance is however not a primary goal and it is more important that the proxy classes are created in short time, the following approach avoids the creating of additional classes altogether:
public class SampleClassInterceptor {
  public static String intercept(@SuperMethod Method zuper, 
        @This Object target, 
        @AllArguments Object[] arguments) throws Exception {
    return zuper.invoke(target, arguments) + "bar";
  }
}

 

Proxies in a modularized environment


Using the simple form of dependency-injection for interceptors rather then relying on a library-specific type such as cglib's MethodInterceptor, Byte Buddy facilitates another advantage in a modularized environment: since the generated proxy class will reference the interceptor class directly rather then referencing a library specific dispatcher type such as cglib's MethodInterceptor, the proxied class’s module does not need to read Byte Buddy’s module. With cglib, the proxied class module must read cglib's module which defines the MethodInterceptor interface rather then the module that implements such an interface. This will most likely be non-intuitive for users of a library that uses cglib as a transitive dependency, especially if the latter dependency is treated as an implementation detail that should not be exposed.

In some cases, it might not even be possible or desirable that the proxied class's module reads the module of the framework which supplies the interceptor. For this case, Byte Buddy also offers a solution to avoid such a dependency altogether by using its Advice component. This component works on code templates such as that in the following example:
public class SampleClassAdvice {
  @Advice.OnMethodExit
  public static void intercept(@Advice.Returned(readOnly = false) String returned) {
    returned += "bar";
  }
}

The above code might not appear to make much sense as it stands and as a matter of fact, it will never be executed. The class merely serves as a byte code template to Byte Buddy which reads the byte code of the annotated method which is then inlined into the generated proxy class. To do so, every parameter of the above method must be annotated to represent a value of the proxied method. In the above case, the annotation defines the parameter to define the method’s return value to which bar is appended as a suffix given the template. Given this advice class, a proxy class could be defined as follows:
new ByteBuddy()
  .subclass(SampleClass.class)
  .defineField(“qux”, String.class, Visibility.PUBLIC)
  .method(ElementMatchers.named(“test”))
  .intercept(Advice.to(SampleClassAdvice.class).wrap(SuperMethodCall.INSTANCE))
  .make()

By wrapping the advice around a SuperMethodCall, the above advice code will be inlined after the call to the overridden method has been made. To inline code before the original method call, the OnMethodEnter annotation can be used.

Supporting proxies on Java versions prior to 9 and past 10


When developing applications for the JVM, one can normally rely on applications that run on a particular version to also run on later versions. This has been true for a long time, even if internal API has been used. However, as a consequence of removing this internal API, this is no longer true as of Java 11 where code generation libraries that have relied on sun.misc.Unsafe will no longer work. At the same time, class definition via MethodHandles.Lookup is not available to JVMs prior to version 9.

As for Byte Buddy, it is the responsibility of a user to use a class loading strategy that is compatible with the current JVM. To support all JVMs, the following selection needs to be made:
ClassLoadingStrategy<ClassLoader> strategy;
if (ClassInjector.UsingLookup.isAvailable()) {
  Class<?> methodHandles = Class.forName("java.lang.invoke.MethodHandles");
  Object lookup = methodHandles.getMethod("lookup").invoke(null);
  Method privateLookupIn = methodHandles.getMethod("privateLookupIn", 
      Class.class, 
      Class.forName("java.lang.invoke.MethodHandles$Lookup"));
  Object privateLookup = privateLookupIn.invoke(null, targetClass, lookup);
  strategy = ClassLoadingStrategy.UsingLookup.of(privateLookup);
} else if (ClassInjector.UsingReflection.isAvailable()) {
  strategy = ClassLoadingStrateg.Default.INJECTION;
} else {
  throw new IllegalStateException(“No code generation strategy available”);
}

The above code uses reflection to resolve a method handle lookup and to resolve it. Doing so, the code can be compiled and loaded on JDKs prior to Java 9. Unfortunately, Byte Buddy cannot implement this code as a convenience since MethodHandles::lookup is call site sensitive such that the above must be defined in a class that resides in the user’s module and not within Byte Buddy.

Finally, it is worth considering to avoid class injection altogether. A proxy class can also be defined in a class loader of its own using the ClassLoadingStrategy.Default.WRAPPER strategy. This strategy is not using any internal API and will work on any JVM version. However, one must keep in mind the performance costs of creating a dedicated class loader. And finally, even if the package name of the proxy class is equal to the proxied class, by defining the proxy in a different class loaders, their runtime packages will no longer be considered as equal by the JVM thus not allowing to override any package-private methods.

Final thoughts


On a final note I want to express my opinion that retiring sun.misc.Unsafe is an important step towards a safer, modularized JVM despite the costs of this migration. Until this very powerful class is removed, any boundaries set by the JPMS can be circumvented by using the privileged access that sun.misc.Unsafe still offers. Without this removal, the JPMS costs all the inconvenience of additional encapsulation without the benefit of being able to rely on it.

Most developers on the JVM will most likely never experience any problems with these additional restrictions but as described, code generation and proxying libraries need to adapt these changes. For cglib, this does unfortunately mean that the end of the road is reached. Cglib was originally modeled as a more powerful version of Java's built-in proxying API where it requires its own dispatcher API to be referenced by the proxy class similar to how Java's API requires referencing of its types. However, these latter types reside in the java.base module which is always read by any module. For this reason, the Java proxying API still functions whereas the cglib model was broken irreparably. In the past, this has already made cglib a difficult candidate for OSGi enviornments but with the JPMS, cglib as a library does no longer function. A similar problem exists for the corresponding proxying API that is provided by Javassist.

The upside of this change is that the JVM finally offers a stable API for defining classes during an application's runtime, a common operation that has relied on internal API for over twenty years. And with the exception of Javaagents that I think still require a more flexible approach, this means that future Java releases are guaranteed to always work once all users of proxies completed this final migration. And given that the development of cglib has been dormant for years with the library suffering many limitations, an eventual migration by today's users of the library was inevitable in any case. The same might be true for Javassist proxies, as the latter library has not seen commits in almost a half year either.

Mittwoch, 10. Mai 2017

Yet another Jigsaw opinion piece

In the past weeks there has been a heated debate around the imminent release of Java 9 and its most famous feature: the Java platform module system – the JPMS which is better known under its project umbrella‘s name Jigsaw. The module system is introduced into the Java ecosystem in form of a formal specification process a JSR which needs to be approved in its final form by its expert group. Among other members of this expert group, representatives of Red Hat and IBM have now voted to reject Java‘s module system in the first ballot which they believe is not yet ready for production.  

What‘s the fuzz all about?


Even today, Java developers are widely familiar with modularity. Build systems like Maven organize code as modules that are compiled against a declared set of dependencies. Only at runtime, these modules are put together on the class path where these compile-time module boundaries vanish. With Jigsaw, the module path is offered as an alternative to this class path for which the JVM retains such compile-time boundaries at runtime. By not using this module path, applications should function just as before. But this comes with the exception of applications that rely on APIs internal to the JVM. The Java standard library is always loaded as a collection of modules, even if the class path is used exclusively such that internal Java APIs are no longer accessible.

This latter limitation in compatibility has raised some concerns among the maintainers of both libraries and end-user applications. And in this context it can be a bit surprising that the recent objections do not relate too much to these concerns. While mentioning issues around compatibility, both Red Hat and IBM predominantly argue that the JPMS requires further extension to allow for a better integration with existing module systems such as JBoss modules and OSGi.

What problem does still need solving?


By jar hell, developers typically describe a situation where a Java application would require two different versions of a library to satisfy different transitive dependencies. Using the class path, this is impossible as one version of a library shadows a second copy. If a class of a given name is loaded for the first time, the system class loader scans jar files in their command line order and loads the first class file it discovers. In the worst case, this can result in Frankenstein functionality if the shadowed jar file contains some exclusive classes that link with classes of the shadowing jar. But more typically, it results in a runtime failure once a feature dependent on a specific version is triggered.

With OSGi and JBoss modules, this problem can be partially resolved. The latter module systems allow to load a library by each its own class loader, thus avoiding the system class loader that is responsible for the class path. With this approach, multiple versions of the same class can coexist by isolation within separate class loaders. Doing so, it is for example possible for two libraries to both depend on their specific version of the commonly breaking Guava API. With class loader isolation, any library would delegate calls to its required version when loading dependent classes.

When using the module path, the JPMS does not (currently) apply such class loader isolation. This means that jar hell is not solved by Java 9. In contrast to using the class path, the JVM does however detect the described version conflict and fails the application at startup, rather than speculating on accidental compatibility. To enforce this constraint, every Java package name is now exclusive to a specific module or the class path. Therefore, it is not possible for two modules to share a package. This restriction also holds if a private package is not meant to be exposed what is considered to be another flaw of the current module design by Jigsaw‘s critics.

A missed chance to escape jar hell?


For class loader isolation to work, it is necessary that versions of the same module never interact. And while two such versions would of course never interact directly, it is unfortunately more than common that two versions are part of the public API of different modules. For example, if two libraries return instances of Guava‘s Function type, a version conflict between each module‘s Guava version can no longer be solved using class loader isolation, even if the Function type did not change between those versions. At runtime, any loaded class is described as a tuple of its name and class loader but since two class loaders do now offer the Function type, which one should be resolved?

This described problem, as a matter of fact, cannot be solved by a module system. Instead, a module system can discover this conflict and inform the user of the need of an explicit resolution. This is accomplished by the current implementation of the JPMS and of course both OSGi and JBoss modules. At the end of the day, version conflicts can only be avoided by evolving APIs in a compatible manner.

Is Jigsaw too simple?


Despite the remaining limitations of a class loader-isolating module system, the current argument against Jigsaw is mainly revolving around this item. Additionally, the expert group members that reject Jigsaw point out the lacking support for circular module dependencies ("module A depends on B depends on C depends on A") and the inability to alter the module graph after its creation.

From a technical perspective, it would of course be possible to add these features. As a matter of fact, Java 9 already ships with a module builder API that allows for loading modules with exclusive class loaders. There is no technical limitation in choosing to retain a single class loader for the module path; rather this decision is considered to be the responsible choice for the JVM by Oracle. And before diving deeper into the arguments, I want to state that I fully agree with the company’s reasoning.

What is wrong with class loader isolation?


As mentioned before, even with class loader isolation, manual version management can often not be avoided. Also, library authors that rely on common APIs with version incompatibilities such as Guava do increasingly shade such dependencies. When shading, the code of a library is copied into a separate name space, thus allowing an application to refer to “its version“ by different names instead of by different class loaders. This approach has of course flaws of its own, especially when a shaded dependency uses JNI. On the other hand, this approach overcomes the just mentioned shortcoming of class loader isolation when using libraries with conflicting shared dependencies. Also, by shading a common dependency, a library author relieves its users from potential conflicts independently of a deployment method.

Allowing for circular dependencies would neither impose a big technical challenge. However, cyclic dependencies are rather uncommon and many build systems like Maven do neither support them. Typically, cyclic dependencies can be refactored to non-cyclic ones by splitting up at least one module into implementation and API. In this context, if a feature seems to be of such little common concern, I do not think corner cases justify its addition, especially when the class path still serves as a backup. And if this decision turns out to be wrong, cyclic dependencies can always be enabled in a future release. Taking this feature away would however not be possible.

Finally, dynamic modules render a feature that could be useful to more than a few applications. When you require the dynamic redeployment of modules with an active life cycle, from my experience in my last project, OSGi is a very good choice. That said, most applications are static and do not have a good reason for its use. But by adding support for a dynamic module graph, the complexity of this feature would translate into the JPMS. Therefore, I think it is the right decision to leave this feature out for now and wait until its use is better understood. Naturally, an approachable module system increases adoption.

Compatibility first.


Does this incompatibility mean the end for OSGi and JBoss modules? Of course not. Quite to the contrary, the introduction of standardized module descriptors gives opportunity to existing module systems. Missing manifest headers to describe bundles is one of the major pain points when using OSGi due to a significant number of libraries that do not consider the proprietary module descriptor. With the introduction of a standardized module descriptor, existing module systems can ease this limitation by using the latter descriptor as a secondary source for a module’s description.

I do not doubt for a second that Red Hat and IBM rejected the JSR with their best intentions. At the same time, I cannot agree with the criticism on the lacking reach of the module system. In my opinion, the existing changes are sufficiently challenging for the Java ecosystem to adopt and especially a last minute introduction of class loader isolation bears the potential of unwanted surprise. In this light, I find the arguments made against the current state of Jigsaw inconsistent as it criticizes the complexity of transition to modules but also demands its extension.

There is no perfect module system


Personally, I think that the current proposal for the JPMS bears two big challenges. Unfortunately enough, they got in the background due to the recent discussion.

Automatic modules

Without a module descriptor, modular code can only refer to a non-modular jar file in form of a so-called automatic module. Automatic modules do not impose any restrictions and are named by their jar file. This works well for developers of end user applications which never release their code for use by another application. Library developers do however lack a stable module name to refer to their dependant automatic modules. If released, they would rely on stable file names for their dependencies which are difficult to assume.

For the adoption of Jigsaw, this would imply a bottom-up approach where any library author can only modularize their software after all dependant code was already modularized. To ease the transition, a manifest entry was added which allows to publish a jar with a stable automatic module name without the need to modularize code or even migrate to Java 9. This allows other libraries users that depend on this first library with a stable name to modularize their code, thus breaking through the bottom-up requirement.

I think it is essential to allow library maintainers to state an explicit module name before their code is migrated to fully use the JPMS and I consider this a more than adequate way to deal with this problem which does unlikely offer a better solution.

Reflection and accessibility

With Jigsaw, it is no longer allowed to access non-public, non-exported members using reflection what is an opportunity that many frameworks currently assume. Of course, with a security manager being set, such access can be impossible even in today’s Java releases but since security managers are used so seldom, this is not thought of much. With Jigsaw, this default is reversed where one needs to explicitly open packages for such reflective access, therefore affecting many Java applications.

Generally, I think that Jigsaw’s encapsulation is a better default than the current general openness. If I want to give Hibernate access to my beans, the JPMS allows me to open my beans to Hibernate only by a qualified export. With a security manager, controlling such fine grained access was difficult if not impossible to implement. However, this transition will induce a lot of growing pain and many libraries are not maintained actively enough to survive adopting these new requirements. Thus, adding this restriction will definitely kill off some libraries that would otherwise still provide a value.

Also, there are use cases of reflection that are still uncovered. For the mocking library Mockito (that I help maintain) we do for example need a way of defining classes in any class loader. This was and still is only possible by the use of internal Java APIs for which no alternative is yet offered. As Mockito is only used in test environments, security should not be of concern in this context. But thanks to the retained openness of sun.misc.Unsafe on which we already rely for instantiating mock classes without constructor calls, we can simply open these APIs by changing their accessibility using its direct memory APIs.

This is of course no good enough solution for the years to come but I am convinced that those concerns can be addressed before removing the Unsafe class entirely. As one possibility, the JVM could be extended with a test module that needs to be resolved explicitly on the command line and which allows such extended access. Another option would be to require the attachment of a Java agent by any test runner because of their ability to break through module barriers. But as for now, any maintained software has an opportunity to resolve its non-standard Java use and continue the discussion on missing APIs in the upcoming years.

Finding consensus.


Considering the stereotype of the socially anxious computer nerd, software development can be a rather emotional business. Oracle has always been a company that Java developers love to hate and the current discussion partly jumps on this bandwagon. Looking at the success of Java as a language and a platform, I do however think that Oracle deserves credit for its objectively good job in its stewardship. Breaking software today with future success in mind is a delicate and grateless task. Anybody who refactored correct but complex code should be sympathetic to this challenge.

Project Jigsaw has often been criticized for being an unnecessary effort and I admit that this thought had crossed my own mind. Yet, it is thanks to the module systems that dead weight like CORBA or RMI can finally be removed from the JVM. With the implied reduction in size of modular Java applications, the JVM has become more attractive for the use within containerized applications and cloud computing what is surely no coincidence given Oracle’s market strategy. And while it would of course be possible to further postpone this effort to a later Java release, the JVM must address the removal of functionality at some point. Now is as good a time as any.

To ease the upcoming transition, it is important to keep the breaking changes down to a minimum. Therefore, I am convinced that extending the scope of Jigsaw is not in the best interest of the broader Java community. Many of the rejecting votes of the recent ballot asked for the involved parties to find consensus on the outstanding issues. Unfortunately, the features in question can either be implemented or discarded where consensus can only be reached by one party giving up their position.

With the typical Java application in mind, I do hope that Oracle does not answer to the demands with a scope extension only to secure a successful vote on the Jigsaw JSR. Rather, I want to appeal to the expert group members who were voting against the JSR to reconsider their vote with the needs of the entire Java ecosystem in mind where the requirements of existing enterprise module solutions are only one factor among many. With the broad usage of Java, ranging from business applications to low-latency systems, it is only natural that different parties identify different priorities for the evolution of the platform. I am convinced that Oracle has found a common denominator for a module system that serves most users.

Donnerstag, 27. Oktober 2016

Generational disparity in garbage collection

For the last year, I have been helping the startup Instana to create a Java agent that traces executions within a Java application. This execution data is collected and jointed to generate traces of user requests as well as the resulting communication between services within the system owner’s hemisphere. This way, unstructured communication can be visualized what significantly simplifies the operation of a distributed system that is composed of multiple interacting services.

In order to generate these traces, the Java agent rewrites all code that reads an external request or initiates one. Obviously, these entries and exits into or out of a system need to be recorded and additionally, meta data is exchanged to identify a request uniquely across systems. For example, when tracing HTTP requests, the agent adds a header containing a unique id which is then recorded by the receiving server as a proof of a request’s origin. Broadly speaking, it is similar to what Zipkin is modeling, but without requiring users to change their code.

In the most simple scenario, such tracing is straightforward to implement. Thanks to my library Byte Buddy which does the heavy lifting, all injected code is written in plain old Java and then copied to the relevant methods at runtime using the Java instrumentation API. For example, when instrumenting a servlet, we know that an entry to a JVM is made whenever the service method is invoked. We also know that the entry is completed when this very same method exits. Therefore, it suffices to add some code to the beginning and the end of the method to record any such entry into a VM process. And it has been the majority of my job to plow through the many Java libraries and frameworks to add support for their ways of communication. From Akka to Zookeeper, over the last year I have hello-worlded my way through the entire Java ecosystem; I even got to write EJBs for all the servers! And I had to make sense of Sun’s CORBA implementation. (Spoiler: There is no sense.)

Things do however quickly become more difficult when tracing asynchronous executions. If a request is received by one thread but is answered from within another thread, it does no longer suffice to only trace entries and exits. Therefore, our agent needs to also track all context switches in concurrent systems made via thread pools, fork join tasks or custom concurrency frameworks. And the same way that debugging asynchronous execution is difficult, this is quite a bit of work for us too. I think that I spend as much time dealing with concurrency as I do recording entries and exits.

The impact on garbage collection


But how does all this impact garbage collection? When implementing a performance monitor, one is facing a trade-off between interpreting the work of a Virtual Machine and causing work for this machine by doing so. While the majority of processing is done in the monitor back-end to which the agent reports its data, we have to do a minimum within the Java process that we share with the monitored application. And you can already guess it: by allocating objects, we inevitably have an impact on the VM’s garbage collection. Fortunately, modern garbage collection algorithms are doing excellent work and by mostly avoiding object allocation and by adaptively sampling our tracing efforts, the effect of our code changes is negligible for the vast majority of users. Ideally, we only burn a few unused processor cycles to do our work. As a matter of fact, very few applications do use their full processing potential and we are happy with grabbing a small portion of this excess.

Writing a garbage collection-friendly application is typically not too difficult. It is obvious that the easiest way of avoiding garbage is to avoid object allocation altogether. However, object allocation in itself isn’t too bad either. Allocating memory is a rather cheap operation and as any processor owns its own allocation buffer - a so-called TLAB - we do not impose an unnecessary synchronization when allocating only a bit of memory from our threads. If an object only lives in the scope of a method, the JVM can even erase the object allocation altogether as if the fields of the objects were put onto the stack directly. But even without this escape-analysis, short-lived objects are captured by a special garbage collection circle called the young generation collection that is processed quite efficiently. To be honest, this is where most of my objects end up as I often value code readability over the small improvements that escape analysis offers. Currently, escape analysis quickly hits its boundary. Yet, I hope for future HotSpots to improve to get the best of both worlds even without changing my code. Fingers crossed!

When writing Java programs, I do not typically think about the impact on garbage collection but the above guidelines tend to manifest in my code. For the majority of our agent, this has been working very well. We are running a whole bunch of example applications and integration tests to assure a good behavior of our agent and I also keep an eye on the GC when running examples. In our modern times, using tools like flight recorder and JIT watch, performance analysis has become quite approachable.

The relativity of short-lived


With an early version of our agent, I one day noticed an application to trigger tenured collection cycles that it did not trigger without it. As a consequence, collection pauses increased by a multitude. The objects that ended up in the tenured collection were however only objects of the monitored application itself. But since our agent runs mostly isolated from the application threads and at first, this did at first not make sense to me.

When digging deeper, I found that our analysis of user objects triggered some additional escapes of objects but the impact was minimal. The application did already produce a fair amount of objects, mostly by using NIO and by using fork join pools. One thing that the latter frameworks have in common is that they rely on the allocation of many short lived objects. For example, a fork-join task often splits itself into multiple subtasks which repeat this procedure until each task’s payload is small enough to be computed directly. Every such task is represented by a single, stateful object. An active fork join pool can spawn millions of such objects every minute. But since the tasks compute fast, the representing object is eligible for collection quickly and therefore captured by the young collector.

So how did these objects end up in the tenured collection all of a sudden? At this time, I was prototyping a new stitching instrumentation to track context switches between such fork join tasks. Following the path of a fork join tasks is not trivial. Each worker thread of a fork join pool applies work stealing and might grab tasks out of the queue of any other task. Also, tasks might provide a feedback to their parent task on completion. As a consequence, tracing the expansion and interaction of tasks is a rather complex process, also because of the existence of so-called continuation threads where a single task might bounce jobs to hundreds of threads within only a few milliseconds. I came up with a rather elegant solution which relied on the allocation of many short-lived objects which were allocated in bursts whenever backtracking a task to its origin. It turned out that these bursts triggered quite a few young collections themselves.

And this is what I did not consider: each young generation collection increases the age of any object that is not eligible for garbage collection at this point. An object does not age by time but by the amount of young collections triggered. This is not true for all collection algorithms but for many of them such as for all default collectors of HotSpot. And by triggering so many collections, the agent threads “prematurely matured” objects of the monitored application despite those objects being unrelated to the agent’s objects. In a way, running the agent “prematurely matured” the target application’s object.

Getting around the problem


I did at first not know how to solve this. In the end, there is no way of telling a garbage collector to treat “your objects” separately. As long as the agent threads were allocating shorter-lived objects at a faster rate than the host process, it would spoil the original objects into the tenured collection causing an increase of garbage collection pauses. In order to avoid this, I therefore started to pool the objects I was using. By pooling, I quickly matured my own objects into the tenured collection and the garbage collection behavior returned to its normal state. Traditionally, pooling was used to avoid the costs of allocation which became cheap in our days. I rediscovered it to erase the impact of our “foreign process” onto garbage collection for the cost of a few kilobytes of memory.

Our tracer is already pooling objects in other places. For example, we represent entries and exits as thread local values that contain a bunch of primitive values that we mutate without allocating a single object. And while such mutable, often procedural and object pooling programming is no longer fashionable, it turns out to be very performance friendly. In the end, mutating bits is closer to what a processor is actually doing. And by using preallocated arrays of a fixed size instead of immutable collections, we save us quite a few round-trips to memory while also preserving our state to be contained in only a few cache lines.

 

Is this a “real world” problem?


You might think that this is a rather specific problem that most people do not need to worry about. But as a matter of fact, the problem that I describe applies to a large number of Java applications. For example, within application containers, we typically deploy multiple applications in a single Java process. Just as in the above case, the garbage collection algorithm does not group objects by application as it has no notion of this deployment model. Therefore, object allocations by two isolated applications that share a container do interfere with the anticipated collection patterns of one another. If each application relies on its objects to die young, the sharing of a heap causes a strong relativity on the duration of short-lived.

I am not an advocate for microservices. As a matter of fact, I think they are a bad idea for most applications. In my opinion, routines that can only exist in interaction should ideally be deployed together unless there are good technical reasons not to. And even if isolated applications ease development, you quickly pay the price in operations. I am just mentioning this to avoid a misinterpretation of the moral of the above experience.

What this experience taught me was that deploying several applications in a single Java process can be a bad idea if those applications are heterogeneous. For example, when running a batch process parallel to a web server, you should consider running each in its own process rather than deploying both of them in the same container. Typically, a batch process is allocating objects at a very different rate than a web server. Yet, many enterprise frameworks still advertise all-in-one solutions for tackling such problems which should not share a process to begin with. In 2016, the overhead of an additional process is not typically a problem and since memory is cheap, rather upgrade your server instead of sharing a heap. Otherwise, you might end up with collection patterns that you did not anticipate when developing, running and testing your applications in isolation.

Donnerstag, 17. Dezember 2015

I am elected Java Champion. Thank you!

Today I was elected a Java Champion. I want to take this opportunity to say thank you to everybody who supported me and my work. Without you, I would not be the developer that I am today. I want to thank all of you and especially:

Rafał Świerzyńsk for his patience and contagious enthusiasm when teaching me many programming fundamentals during my first full-time programming job. Thank you for proving to me that writing quality code pays off both in terms of joy at work and value for a customer.

The team at RebelLabs who helped me to promote Byte Buddy to a broader audience before anybody even used my software. Especially, I want to thank Simon Maple and Oleg Šelajev from ZeroTurnaround and Oliver White who is now working at Typesafe.

Axel Fontaine and Lukas Eder, both developers of open-source libraries that often served me as specimen for how to promote Byte Buddy. I also want to thank you both for your advice and shared experiences.

Everybody at javaBin, the Norwegian Java user group. My first presentation at a JUG meeting bootstrapped my speaker career and speaking at JavaZone gave me a lot of confidence for the international stage. You are a great bunch and I would surely not be a Java Champion today if it were not for your hard work and dedication.

The Java community for its countless efforts of documenting insights into Java technology, for its free software and the many great discussions that I have had. Anything I know I have learned from others and I am endlessly grateful. Especially, I want to thank Brice Dutheil who had the confidence to integrate Byte Buddy into the very popular Mockito library what motivated me tremendously.

Finally, my girlfriend Eiril who never complained when I was away for weeks of conferences, when I programmed away an entire weekend or was fed up with a problem that I had not yet figured out.

Thank you all, wholeheartedly!

Mittwoch, 2. Dezember 2015

Project Jigsaw: an incomplete puzzle

Mark Reinhold just recently proposed a delay of Java 9 to buy more time for completing project Jigsaw as the major feature of the upcoming release. While this decision will surely bring the doomsayers of Java back onto stage, I am personally quite relieved and think this was a good and necessary decision. The milestone for feature completion of Java 9 is
currently set to the 10th of December, forbidding the introduction of new functionality after that date. But looking at early access builds of project Jigsaw, Java’s module system does not seem to be ready for this development stage.

Delays in project Jigsaw have become a habit over the latest Java release cycles. This must not be misinterpreted as incompetence but rather as an indicator for how difficult it is to introduce modules to Java that is currently a stranger to true modularization. Initially, the module system for Java was proposed in 2008 for inclusion in Java 7. But until today, Jigsaw’s implementation always turned out to be more difficult than anticipated. And after several suspensions and even a temporary abandonment, the stewards of Java are surely under pressure to finally succeed. It is great to see that this pressure did not push the Java team to rush for a release.

In this article, I try to summarize the state of project Jigsaw as I see it and as it is discussed publicly on the Jigsaw mailing list. I am writing this article as a contribution to the current discussion and to hopefully involve more people into the ongoing development process. I do no intend to downplay the hard work done by Oracle. I am stating this explicitly to avoid misinterpretation after the rather emotional discussions about Jigsaw following the concealment of sun.misc.Unsafe.

Modularized reflection


What exactly is it that makes project Jigsaw such a difficult endeavor? Today, visibility modifiers are the closest approximation to encapsulating a class’s scope. Package-privacy can serve as an imperfect retainer of a type to its package. But for more complex applications that span internal APIs over multiple packages, visibility modifiers are insufficient and true modules become necessary. With project Jigsaw, classes can be truly encapsulated what makes them unavailable to some code even if those classes were declared to be public. However, Java programs that build on the assumption that all classes are always available at runtime might need to change fundamentally.

This change is most likely less fundamental for developers of end-user applications than for the maintainers of Java libraries and frameworks. A library is typically not aware of its user’s code during its compilation. For overcoming this limitation, a library can fallback to using reflection. This way, a container for dependency-injection (such as Spring) can instantiate bean instances of an application without the bean types being known to the framework at compile-time. For instantiating such objects, the container simply delays its work until runtime when it scans the application’s classpath and discovers the bean types which are now visible. For any of these types, the framework then locates a constructor which is invoked reflectively after resolving all injected dependencies.

Runtime discovery paired with reflection is used by a long list of Java frameworks. But in a modularized environment, running the previous runtime resolution is no longer possible without addressing module boundaries. With project Jigsaw, the Java runtime asserts that every module only accesses modules that are declared as a dependency in the accessing module’s descriptor. Additionally, the imported module must export the classes in question to its accessor. A modularized version of the dependency-injection container cannot declare any user module as a dependency and it is then forbidden reflective access. This would result in a runtime error when instantiating a non-imported class.

To overcome this limitation, project Jigsaw adds a new API that allows to include additional module dependencies at runtime. After making use of this API and adding all user modules, the modularized dependency-injection container can now continue to instantiate bean types that it does not know at compile-time.

But does this new API really solve the problem? From a purely functional point of view, this additional API allows for the migration of a library to retain its functionality even after being repackaged as a module. But unfortunately, the runtime enforcement of module boundaries creates a requirement for a ceremonial dance preceding the use of most reflection code. Before a method is invoked, the caller needs to always assure that the corresponding module is already a dependency of the caller. If a framework forgets to add this check, a runtime error is thrown without any chance of discovery during compilation.

With reflection being used excessively by many libraries and frameworks, it is unlikely that this change in accessibility is going to improve runtime encapsulation. Even if a security manager would restrict frameworks from adding runtime module dependencies, enforcing such boundaries would probably break most existing applications. More realistically, most breaches of module boundaries will not indicate true errors but be caused by improperly migrated code. At the same time, the runtime restriction is neither likely to improve encapsulation if most frameworks preemptively attain access to most user modules.

This requirement does of course not apply when a module uses reflection on its own types but such use of reflection is rather rare in practice and can be substituted by the use of polymorphism. In my eyes, enforcing module boundaries when using reflection contradicts its primary use case and makes the already non-trivial reflection API even more difficult to use.

Modularized resources


Beyond this limitation, it is currently unclear how the dependency-injection container would even discover the classes that it should instantiate. In a non-modularized application, a framework can for example expect a file of a given name to exist on the classpath. This file then serves as an entry point for describing how user code can be discovered. This file is typically obtained by requesting a named resource from a class loader. With project Jigsaw, this might no longer be possible when the required resource is also encapsulated within a module’s boundaries. As far as I know, the final state of resource encapsulation is not yet fully determined. When trying current early access builds, resources of foreign modules can however not be accessed.

Of course, this problem is also addressed in project Jigsaw’s current draft. To overcome module boundaries, Java’s preexisting ServiceLoader class is granted super powers. For making specific classes available to other modules, a module descriptor provides a special syntax that enables leaking certain classes through module boundaries. Applying this syntax, a framework module declares that it provides a certain service. A user library then declares an implementation of the same service to be accessible to the framework. At runtime, the framework module looks up any implementation of its service using the service loader API. This can serve as way for discovering other modules at runtime and could substitute resource discovery.

While this solution seems elegant on a first glance, I remain sceptical of this proposal. The service loader API is fairly simple to use but at the same time, it is very limited in its capabilities. Furthermore, few people have adapted it for their own code which could be seen as an indicator for its limited scope. Unfortunately, only time can tell if this API accommodates all use cases in a sufficient manner. At the same time it is granted that a single Java class gets tied deeply into the Java runtime, making deprecation and substitution of the service loader API almost impossible. In the context of the history of Java which has already told many stories about ideas that seemed good but turned sour, I find it precarious to create such a magical hub that could easily turn out to be an implementation bottleneck.

Finally, it remains unclear how resources are exposed in modularized applications. While Jigsaw does not break any binary compatibility, returning null from a call to ClassLoader::getResource where a value was always returned previously might just bury applications under piles of null pointer exceptions. As an example, code manipulation tools require a means to locate class files which are now encapsulated what would at a minimum hinder their adoption process.

Optional dependencies


Another use case that the service loader API does not accommodate is the declaration of optional dependencies. In many cases, optional dependencies are not considered a good practice but in reality they offer a convenient way out if dependencies can be combined in a large number of permutations.

For example, a library might be able to provide better performance if a specific dependency is available. Otherwise, it would fall back to another, less optimal alternative. In order to use the optional dependency, the library is required to compile against its specific API. If this API is however not available at runtime, the library needs to assure that the optional code is never executed and fall back to the available default. Such an optional dependency cannot be expressed in a modularized environment where any declared module dependency is validated upon the application’s startup, even if the dependency was never used.

A special use-case for optional dependencies are optional annotation bundles. Today, the Java runtime treats annotations as optional metadata. This means that if an annotation’s type cannot be located by a class loader, the Java runtime simply ignores the annotation in question instead of throwing a NoClassDefFoundError. For example, the FindBugs application offers an annotation bundle for suppressing potential bugs after a user found the code in question to be a false-positive. During an application’s regular runtime, the FindBugs-specific annotations are not required and are therefore not included in the application bundle. However, when running FindBugs, the utility explicitly adds the annotation package such that the annotations become visible. In project Jigsaw, this is no longer possible. The annotation type is only available if a module declares a dependency to the annotation bundle. If this dependency is later missing at runtime, an error is raised, despite the annotation's irrelevance.

Non-modularization


Not bundling a framework as a module in Java 9 is of course the easiest way to avoid all of the discussed restrictions. The Java runtime considers any non-modularized jar-file to be part of a class loader’s so-called unnamed module. This unnamed module defines an implicit dependency on all other modules that exist within the running application and exports all of its packages to any other module. This serves as a fallback when mixing modularized and non-modularized code. Due to the implicit imports and exports of an unnamed module, all non-migrated code should continue to function as before.

While such an opt-out might be the best solution for a reflection-heavy framework, slow adoption of project Jigsaw does also defeat the purpose of a module system. With a lack of time being the major constraint of most open-source projects, this outcome is unfortunately quite probable. Furthermore, many open-source developers are bound to compiling their libraries to older versions of Java. Due to the different runtime behavior of modularized and non-modularized code, a framework would need to maintain two branches for being able to use Java 9 APIs to traverse module boundaries in the modularized bundle. It is unlikely that many open-source developers would make the time for such a hybrid solution.

Code instrumentation


In Java, reflective method access is not the only way of a library to interact with unknown user code. Using the instrumentation API, it is possible to redefine classes to include additional method calls. This is commonly used to for example implement method-level security or to collect code metrics.

When instrumenting code, the bytecode of a Java class is typically altered right before it is loaded by a class loader. This can cause unresolvable conflicts if the instrumenting code cannot access a loaded class before its first usage. Currently, there exists no solution to this problem.

Summary


Software estimates are difficult and we all tend to underestimate the complexity of our applications. Project Jigsaw imposes a fundamental change to the runtime behavior of Java applications and it makes perfect sense to delay the release until every eventuality is thoroughly evaluated. Currently, there are too many open questions and it is a good choice to delay the release date.

I would prefer that module boundaries were not enforced by the runtime at all but remain a compiler construct. The Java platform already implements compile-time erasure of generic types and despite some imperfections this solution has worked very well. Without runtime enforcement, modules would also be optional to adopt for dynamic languages on the JVM where the same form of modularization as in Java might not make sense. Finally, I feel that the current strict form of runtime encapsulation tries to solve a problem that does not exist. After working with Java for many years, I have rarely encountered situations where the unintentional usage of internal APIs has caused big problems. In contrast, I remember many occasions where abusing an API that was meant to be private has solved a problem that I could not have worked around. At the same time, other symptoms of lacking modules in Java, often referred to as jar hell, remain unsolved by Jigsaw which does not distinguish between different versions of a module.

Finally, I argue that backwards compatibility applies beyond the binary level. As a matter of fact, a binary incompatibility is usually easier to deal with than a behavioral change. Therefore, method contracts should be respected at least as highly as binary compatibility. In this context, Java has done a great job over the years. While project Jigsaw does not technically break method contracts by providing unnamed modules, modularization makes subtle changes to code behavior that is based on its bundling. In my opinion, this will be confusing to both experienced Java developers and newcomers and result in reappearing runtime errors.

This is why I find the price for enforcing runtime module boundaries too high compared to the benefits that it offers. OSGi, a runtime module system with versioning capabilities already exists for those that really require modularization. As a big benefit, OSGi is implemented on top of the virtual machine and can therefore not influence VM behavior. Alternatively, I think that Jigsaw could include a canonical way for libraries to opt-out of runtime constraints where it makes sense such as for reflection-heavy libraries.

Montag, 30. März 2015

Dismantling invokedynamic

Many Java developers regarded the JDK's version seven release as somewhat a disappointment. On the surface, merely a few language and library extensions made it into the release, namely Project Coin and NIO2. But under the covers, the seventh version of the platform shipped the single biggest extension to the JVM's type system ever introduced after its initial release. Adding the invokedynamic instruction did not only lay the foundation for implementing lambda expressions in Java 8, it also was a game changer for translating dynamic languages into the Java byte code format.

While the invokedynamic instruction is an implementation detail for executing a language on the Java virtual machine, understanding the functioning of this instruction gives true insights into the inner workings of executing a Java program. This article gives a beginner's view on what problem the invokedynamic instruction solves and how it solves it.

Method handles


Method handles are often described as a retrofitted version of Java's reflection API, but this is not what they are meant to represent. While method handles can represent a method, constructor or field, they are not intended to describe properties of these class members. It is for example not possible to directly extract metadata from a method handle such as modifiers or annotation values of the represented method. And while method handles allow for the invocation of a referenced method, their main purpose is to be used together with an invokedynamic call site. For gaining a better understanding of method handles, looking at them as an imperfect replacement for the reflection API is however a reasonable starting point.

Method handles cannot be instantiated. Instead, method handles are created by using a designated lookup object. These objects are themselves created by using a factory method that is provided by the MethodHandles class. Whenever this factory is invoked, it first creates a security context which ensures that the resulting lookup object can only locate methods that are also visible to the class from which the factory method was invoked. A lookup object can then be created as follows:

class Example {
  void doSomething() {
    MethodHandles.Lookup lookup = MethodHandles.lookup();
  }
  private void foo() { /* ... */ }
}

As argued before, the above lookup object could only be used to locate methods that are also visible to the Example class such as foo. It would for example be impossible to look up a private method of another class. This is a first major difference to using the reflection API where private methods of outside classes can be located just as any other method and where these methods can even be invoked after marking such a method as accessible. Method handles are therefore sensible of their creation context which is a first major difference to the reflection API.

Apart from that, a method handle is more specific than the reflection API by describing a specific type of method rather than representing just any method. In a Java program, a method's type is a composite of both the method's return type and the types of its parameters. For example, the only method of the following Counter class returns an int representing the number of characters of the only String-typed argument:

class Counter {
  static int count(String name) {
    return name.length();
  }
}

A representation of this method's type can be created by using another factory. This factory is found in the MethodType class which also represents instances of created method types. Using this factory, the method type for Counter::count can be created by handing over the method's return type and its parameter types bundled as an array:

MethodType methodType = MethodType.methodType(int.class, new Class<?>[] {String.class});

By using the lookup object that was created before and the above method type, it is now possible to locate a method handle that represents the Counter::count method as depicted in the following code:

MethodType methodType = MethodType.methodType(int.class, new Class<?>[] {String.class});
MethodHandles.Lookup lookup = MethodHandles.lookup();
MethodHandle methodHandle = lookup.findStatic(Counter.class, "count", methodType);
int count = methodHandle.invokeExact("foo");
assertThat(count, is(3));

At first glance, using a method handle might seem like an overly complex version of using the reflection API. However, keep in mind that the direct invocation of a method using a handle is not the main intent of its use.

The main difference of the above example code and of invoking a method via the reflection API is only revealed when looking into the differences of how the Java compiler translates both invocations into Java byte code. When a Java program invokes a method, this method is uniquely identified by its name and by its (non-generic) parameter types and even by its return type. It is for this reason that it is possible to overload methods in Java. And even though the Java programming language does not allow it, the JVM does in theory allow to overload a method by its return type.

Following this principle, a reflective method call is executed as a common method call of the Method::invoke method. This method is identified by its two parameters which are of the types Object and Object[]. In addition to this, the method is identified by its Object return type. Because of this signature, all arguments to this method need to always be boxed and enclosed in an array. Similarly, the return value needs to be boxed if it was primitive or null is returned if the method was void.

Method handles are the exception to this rule. Instead of invoking a method handle by referring to the signature of MethodHandle::invokeExact signature which takes an Object[] as its single argument and returns Object, method handles are invoked by using a so-called polymorphic signature. A polymorphic signature is created by the Java compiler dependant on the types of the actual arguments and the expected return type at a call site. For example, when invoking the method handle as above with

int count = methodHandle.invokeExact("foo");

the Java compiler translates this invocation as if the invokeExact method was defined to accept a single single argument of type String and returning an int type. Obviously, such a method does not exist and for (almost) any other method, this would result in a linkage error at runtime. For method handles, the Java Virtual Machine does however recognize this signature to be polymorphic and treats the invocation of the method handle as if the Counter::count method that the handle refers to was inset directly into the call site. Thus, the method can be invoked without the overhead of boxing primitive values or the return type and without placing the argument values inside an array.

At the same time, when using the invokeExact invocation, it is guaranteed to the Java virtual machine that the method handle always references a method at runtime that is compatible to the polymorphic signature. For the example, the JVM expected that the referenced method actually accepts a String as its only argument and that it returns a primitive int. If this constraint was not fulfilled, the execution would instead result in a runtime error. However, any other method that accepts a single String and that returns a primitive int could be successfully filled into the method handle's call site to replace Counter::count.

In contrast, using the Counter::count method handle at the following three invocations would result in runtime errors, even though the code compiles successfully:

int count1 = methodHandle.invokeExact((Object) "foo");
int count2 = (Integer) methodHandle.invokeExact("foo");
methodHandle.invokeExact("foo");

The first statement results in an error because the argument that is handed to the handle is too general. While the JVM expected a String as an argument to the method, the Java compiler suggested that the argument would be an Object type. It is important to understand that the Java compiler took the casting as a hint for creating a different polymorphic signature with an Object type as a single parameter type while the JVM expected a String at runtime. Note that this restriction also holds for handing too specific arguments, for example when casting an argument to an Integer where the method handle required a Number type as its argument. In the second statement, the Java compiler suggested to the runtime that the handle's method would return an Integer wrapper type instead of the primitive int. And without suggesting a return type at all in the third statement, the Java compiler implicitly translated the invocation into a void method call. Hence, invokeExact really does mean exact.

This restriction can sometimes be too harsh. For this reason, instead of requiring an exact invocation, the method handle also allows for a more forgiving invocation where conversions such as type castings and boxings are applied. This sort of invocation can be applied by using the MethodHandle::invoke method. Using this method, the Java compiler still creates a polymorphic signature. This time, the Java virtual machine does however test the actual arguments and the return type for compatibility at run time and converts them by applying boxings or castings, if appropriate. Obviously, these transformations can sometimes add a runtime overhead.

Fields, methods and constructors: handles as a unified interface


Other than Method instances of the reflection API, method handles can equally reference fields or constructors. The name of the MethodHandle type could therefore be seen as too narrow. Effectively, it does not matter what class member is referenced via a method handle at runtime as long as its MethodType, another type with a misleading name, matches the arguments that are passed at the associated call site.

Using the appropriate factories of a MethodHandles.Lookup object, a field can be looked up to represent a getter or a setter. Using getters or setters in this context does not refer to invoking an actual method that follows the Java bean specification. Instead, the field-based method handle directly reads from or writes to the field but in shape of a method call via invoking the method handle. By representing such field access via method handles, field access or method invocations can be used interchangeably.

As an example for such interchange, take the following class:

class Bean {
  String value;
  void print(String x) {
    System.out.println(x);
  }
}

For the above Bean class, the following method handles can be used for either writing a string to the value field or for invoking the print method with the same string as an argument:

MethodHandle fieldHandle = lookup.findSetter(Bean.class, "value", String.class);
MethodType methodType = MethodType.methodType(void.class, new Class<?>[] {String.class});
MethodHandle methodHandle = lookup.findVirtual(Bean.class, "print", methodType);

As long as the method handle call site is handed an instance of Bean together with a String while returning void, both method handles could be used interchangeably as shown here:

anyHandle.invokeExact((Bean) mybean, (String) myString);

Note that the polymorphic signature of the above call site does not match the method type of the above handle. However, within Java byte code, non-static methods are invoked as if they were static methods with where the this reference is handed as a first, implicit argument. A non-static method's nominal type does therefore diverge from its actual runtime type. Similarly, access to a non-static field requires an instance to be access.

Similarly to fields and methods, it is possible to locate and invoke constructors which are considered as methods with a void return value for their nominal type. Furthermore, one can not only invoke a method directly but even invoke a super method as long as this super method is reachable for the class from which the lookup factory was created. In contrast, invoking a super method is not possible at all when relying on the reflection API. If required, it is even possible to return a constant value from a handle.

Performance metrics


Method handles are often described as being a more performant as the Java reflection API. At least for recent releases of the HotSpot virtual machine, this is not true. The simplest way of proving this is writing an appropriate benchmark. Then again, is not all too simple to write a benchmark for a Java program which is optimized while it is executed. The de facto standard for writing a benchmark has become using JMH, a harness that ships under the OpenJDK umbrella. The full benchmark can be found as a gist in my GitHub profile. In this article, only the most important aspects of this benchmark are covered.

From the benchmark, it becomes obvious that reflection is already implemented quite efficiently. Modern JVMs know a concept named inflation where a frequently invoked reflective method call is replaced with runtime generated Java byte code. What remains is the overhead of applying the boxing for passing arguments and receiving a return values. These boxings can sometimes be eliminated by the JVM's Just-in-time compiler but this is not always possible. For this reason, using method handles can be more performant than using the reflection API if method calls involve a significant amount of primitive values. This does however require that the exact method signatures are already known at compile time such that the appropriate polymorphic signature can be created. For most use cases of the reflection API, this guarantee can however not be given because the invoked method's types are not known at compile time. In this case, using method handles does not offer any performance benefits and should not be used to replace it.

Creating an invokedynamic call site


Normally, invokedynamic call sites are created by the Java compiler only when it needs to translate a lambda expression into byte code. It is worthwhile to note that lambda expressions could have been implemented without invokedynamic call sites altogether, for example by converting them into anonymous inner classes. As a main difference to the suggested approach, using invokedynamic delays the creation of a similar class to runtime. We are looking into class creation in the next section. For now, bear however in mind that invokedynamic does not have anything to do with class creation, it only allows to delay the decision of how to dispatch a method until runtime.

For a better understanding of invokedynamic call sites, it helps to create such call sites explicitly in order to look at the mechanic in isolation. To do so, the following example makes use of my code generation framework Byte Buddy which provides explicit byte code generation of invokedynamic call sites without requiring a any knowledge of the byte code format.

Any invokedynamic call site eventually yields a MethodHandle that references the method to be invoked. Instead of invoking this method handle manually, it is however up to the Java runtime to do so. Because method handles have become a known concept to the Java virtual machine, these invocations are then optimized similarly to a common method call. Any such method handle is received from a so-called bootstrap method which is nothing more than a plain Java method that fulfills a specific signature. For a trivial example of a bootstrap method, look at the following code:

class Bootstrapper {
  public static CallSite bootstrap(Object... args) throws Throwable {
    MethodType methodType = MethodType.methodType(int.class, new Class<?>[] {String.class})
    MethodHandles.Lookup lookup = MethodHandles.lookup();
    MethodHandle methodHandle = lookup.findStatic(Counter.class, "count", methodType);
    return new ConstantCallSite(methodHandle);
  }
}

For now, we do not care much about the arguments of the method. Instead, notice that the method is static what is as a matter of fact a requirement. Within Java byte code, an invokedynamic call site references the full signature of a bootstrap method but not a specific object which could have a state and a life cycle. Once the invokedynamic call site is invoked, control flow is handed to the referenced bootstrap method which is now responsible for identifying a method handle. Once this method handle is returned from the bootstrap method, it is invoked by the Java runtime.

As obvious from the above example, a MethodHandle is not returned directly from a bootstrap method. Instead, the handle is wrapped inside of a CallSite object. Whenever a bootstrap method is invoked, the invokedynamic call site is later permanently bound to the CallSite object that is returned from this method. Consequently, a bootstrap method is only invoked a single time for any call site. Thanks to this intermediate CallSite object, it is however possible to exchange the referenced MethodHandle at a later point. For this purpose, the Java class library already offers different implementations of CallSite. We have already seen a ConstantCallSite in the example code above. As the name suggests, a ConstantCallSite always references the same method handle without a possibility of a later exchange. Alternatively, it is however also possible to for example use a MutableCallSite which allows to change the referenced MethodHandle at a later point in time or it is even possible to implement a custom CallSite class.

With the above bootstrap method and Byte Buddy, we can now implement a custom invokedynamic instruction. For this, Byte Buddy offers the InvokeDynamic instrumentation that accepts a bootstrap method as its only mandatory argument. Such instrumentations are then fed to Byte Buddy. Assuming the following class:

abstract class Example {
  abstract int method();
}

we can use Byte Buddy to subclass Example in order to override method. We are then going to implement this method to contain a single invokedynamic call site. Without any further configuration, Byte Buddy creates a polymorphic signature that resembles the method type of the overridden method. However, for non-static methods, the this reference is set as a first, implicit argument. Assuming that we want to bind the Counter::count method which expects a String as a single argument, we could not bind this handle to Example::method because of this type mismatch. Therefore, we need to create a different call site without the implicit argument but with an String in its place. This can be achieved by using Byte Buddy's domain specific language:

Instrumentation invokeDynamic = InvokeDynamic
 .bootstrap(Bootstrapper.class.getDeclaredMethod(“bootstrap”, Object[].class))
 .withoutImplicitArguments()
 .withValue("foo");

With this instrumentation in place, we can finally extend the Example class and override method to implement the invokedynamic call site as in the following code snippet:

Example example = new ByteBuddy()
  .subclass(Example.class)
   .method(named(“method”)).intercept(invokeDynamic)
   .make()
   .load(Example.class.getClassLoader(), 
         ClassLoadingStrategy.Default.INJECTION)
   .getLoaded()
   .newInstance();
int result = example.method();
assertThat(result, is(3));

As obvious from the above assertion, the characters of the "foo" string were counted correctly. By setting appropriate break points in the code, it is further possible to validate that the bootstrap method is called and that control flow further reaches the Counter::count method.

So far, we did not gain much from using an invokedynamic call site. The above bootstrap method would always bind Counter::count and can therefore only produce a valid result if the invokedynamic call site really wanted to transform a String into an int. Obviously, bootstrap methods can however be more flexible thanks to the arguments they receive from the invokedynamic call site. Any bootstrap method receives at least three arguments:

As a first argument, the bootstrap method receives a MethodHandles.Lookup object. The security context of this object is that of the class that contains the invokedynamic call site that triggered the bootstrapping. As discussed before, this implies that private methods of the defining class could be bound to the invokedynamic call site using this lookup instance.
The second argument is a String representing a method name. This string serves as a hint to indicate from the call site which method should be bound to it. Strictly speaking, this argument is not required as it is perfectly legal to bind a method with another name. Byte Buddy simply serves the the name of the overridden method as this argument, if not specified differently.
Finally, the MethodType of the method handle that is expected to be returned is served as a third argument. For the example above, we specified explicitly that we expect a String as a single parameter. At the same time, Byte Buddy derived that we require an int as a return value from looking at the overridden method, as we again did not specify any explicit return type.

It is up to the implementor of a bootstrap method what exact signature this method should portray as long as it can at least accept these three arguments. If the last parameter of a bootstrap method represents an Object array, this last parameter is treated as a varargs and can therefore accept any excess arguments. This is also the reason why the above example bootstrap method is valid.

Additionally, a bootstrap method can receive several arguments from an invokedynamic call site as long as these arguments can be stored in a class's constant pool. For any Java class, a constant pool stores values that are used inside of a class, largely numbers or string values. As of today, such constants can be primitive values of at least 32 bit size, Strings, Classes, MethodHandles and MethodTypes. This allows bootstrap methods to be used more flexible, if locating a suitable method handle requires additional information in form of such arguments.

Lambda expressions


Whenever the Java compiler translates a lambda expression into byte code, it copies the lambda's body into a private method inside of the class in which the expression is defined. These methods are named lambda$X$Y with X being the name of the method that contains the lambda expression and with Y being a zero-based sequence number. The parameters of such a method are those of the functional interface that the lambda expression implements. Given that the lambda expression makes no use of non-static fields or methods of the enclosing class, the method is also defined to be static.

For compensation, the lambda expression is itself substituted by an invokedynamic call site. On its invocation, this call site requests the binding of a factory for an instance of the functional interface. As arguments to this factory, the call site supplies any values of the lambda expression's enclosing method which are used inside of the expression and a reference to the enclosing instance, if required. As a return type, the factory is required to provide an instance of the functional interface.

For bootstrapping a call site, any invokedynamic instruction currently delegates to the LambdaMetafactory class which is included in the Java class library. This factory is then responsible for creating a class that implements the functional interface and which invokes the appropriate method that contains the lambda's body which, as described before, is stored in the original class. In the future, this bootstrapping process might however change which is one of the major advantages of using invokedynamic for implementing lambda expressions. If one day, a better suited language feature was available for implementing lambda expressions, the current implementation could simply be swapped out.

In order to being able to create a class that implements the functional interface, any call site representing a lambda expression provides additional arguments to the bootstrap method. For the obligatory arguments, it already provides the name of the functional interface's method. Also, it provides a MethodType of the factory method that the bootstrapping is supposed to yield as a result. Additionally, the bootstrap method is supplied another MethodType that describes the signature of the functional interface's method. To that, it receives a MethodHandle referencing the method that contains the lambda's method body. Finally, the call site provides a MethodType of the generic signature of the functional interface's method, i.e. the signature of the method at the call site before type-erasure was applied.

When invoked, the bootstrap method looks at these arguments and creates an appropriate implementation of a class that implements the functional interface. This class is created using the ASM library, a low-level byte code parser and writer that has become the de facto standard for direct Java byte code manipulation. Besides implementing the functional interface's method, the bootstrap method also adds an appropriate constructor and a static factory method for creating instances of the class. It is this factory method that is later bound to the invokedyanmic call site. As arguments, the factory receives an instance to the lambda method's enclosing instance, in case it is accessed and also any values that are read from the enclosing method.

As an example, consider the following lambda expression:

class Foo {
  int i;
  void bar(int j) {
    Consumer consumer = k -> System.out.println(i + j + k);
  }
}

In order to be executed, the lambda expression requires access to both the enclosing instance of Foo and to the value j of its enclosing method. Therefore, the desugared version of the above class looks something like the following where the invokedynamic instruction is represented by some pseudo-code:

class Foo {
  int i;
  void bar(int j) {
    Consumer consumer = <invokedynamic(this, j)>;
  }
  private /* non-static */ void lambda$foo$0(int j, int k) {
    System.out.println(this.i + j + k);
  }
}

In order to being able to invoke lambda$foo$0, both the enclosing Foo instance and the j variable are handed to the factory that is bound by the invokedyanmic instruction. This factory then receives the variables it requires in order to create an instance of the generated class. This generated class would then look something like the following:

class Foo$$Lambda$0 implements Consumer {
  private final Foo _this;
  private final int j;
  private Foo$$Lambda$0(Foo _this, int j) {
    this._this = _this;
    this.j = j;
  }
  private static Consumer get$Lambda(Foo _this, int j) {
    return new Foo$$Lambda$0(_this, j);
  }
  public void accept(Object value) { // type erasure
    _this.lambda$foo$0(_this, j, (Integer) value);
  }
}

Eventually, the factory method of the generated class is bound to the invokedynamic call site via a method handle that is contained by a ConstantCallSite. However, if the lambda expression is fully stateless, i.e. it does not require access to the instance or method in which it is enclosed, the LambdaMetafactory returns a so-called constant method handle that references an eagerly created instance of the generated class. Hence, this instance serves as a singleton to be used for every time that the lambda expression's call site is reached. Obviously, this optimization decision affects your application's memory footprint and is something to keep in mind when writing lambda expressions. Also, no factory method is added to a class of a stateless lambda expression.

You might have noticed that the lambda expression's method body is contained in a private method which is now invoked from another class. Normally, this would result in an illegal access error. To overcome this limitation, the generated classes are loaded using so-called anonymous class loading. Anonymous class loading can only be applied when a class is loaded explicitly by handing a byte array. Also, it is not normally possible to apply anonymous class loading in user code as it is hidden away in the internal classes of the Java class library. When a class is loaded using anonymous class loading, it receives a host class of which it inherits its full security context. This involves both method and field access rights and the protection domain such that a lambda expression can also be generated for signed jar files. Using this approch, lambda expression can be considered more secure than anonymous inner classes because private methods are never reachable from outside of a class.

Under the covers: lambda forms


Lambda forms are an implementation detail of how MethodHandles are executed by the virtual machine. Because of their name, lambda forms are however often confused with lambda expressions. Instead, lambda forms are inspired by lambda calculus and received their name for that reason, not for their actual usage to implement lambda expressions in the OpenJDK.

In earlier versions of the OpenJDK 7, method handles could be executed in one of two modes. Method handles were either directly rendered as byte code or they were dispatched using explicit assembly code that was supplied by the Java runtime. The byte code rendering was applied to any method handle that was considered to be fully constant throughout the lifetime of a Java class. If the JVM could however not prove this property, the method handle was instead executed by dispatching it to the supplied assembly code. Unfortunately, because assembly code cannot be optimized by Java's JIT-compiler, this lead to non-constant method handle invocations to "fall off the performance cliff". As this also affected the lazily bound lambda expressions, this was obviously not a satisfactory solution.

LambdaForms were introduced to solve this problem. Roughly speaking, lambda forms represent byte code instructions which, as stated before, can be optimized by a JIT-compiler. In the OpenJDK, a MethodHandle's invocation semantics are today represented by a LambdaForm to which the handle carries a reference. With this optimizable intermediate representation, the use of non-constant MethodHandles has become significantly more performant. As a matter of fact, it is even possible to see a byte-code compiled LambdaForm in action. Simply place a break point inside of a bootstrap method or inside of a method that is invoked via a MethodHandle. Once the break point kicks it, the byte code-translated LambdaForms can be found on the call stack.

Why this matters for dynamic languages


Any language that should be executed on the Java virtual machine needs to be translated to Java byte code. And as the name suggests, Java byte code aligns rather close to the Java programming language. This includes the requirement to define a strict type for any value and before invokedynamic was introduced, a method call required to specify an explicit target class for dispatching a method. Looking at the following JavaScript code, specifying either information is however not possible when translating the method into byte code:

function (foo) {
  foo.bar();
}

Using an invokedynamic call site, it has become possible to delay the identification of the method's dispatcher until runtime and furthermore, to rebind the invocation target, in case that a previous decision needs to be corrected. Before, using the reflection API with all of its performance drawbacks was the only real alternative to implementing a dynamic language.

The real profiteer of the invokedynamic instruction are therefore dynamic programming languages. Adding the instruction was a first step away from aligning the byte code format to the Java programming language, making the JVM a powerful runtime even for dynamic languages. And as lambda expressions proved, this stronger focus on hosting dynamic languages on the JVM does neither interfere with evolving the Java language. In contrast, the Java programming languages gained from these efforts.