Introduction of a MemoryAddress class to OpenJDK

Authored by Eloise Slater (Software Engineer at THG)

What is a MemoryAddress and why did we introduce it?

In Java, pointers are typically stored in 64-bit longs. These pointers can be used within Java code itself and passed between Java code and C/C++ native code. However, CHERI platforms implement a new type of 128-bit fat pointer called a CHERI capability. As a 64-bit long is the largest Java primitive, no primitive is large enough to store a CHERI capability. This is why we introduced the MemoryAddress class into the Java language: as a means of storing capabilities and passing them between Java code and native code.

There are a few intricacies of how the MemoryAddress class works. Its layout is shown below:

import java.lang.annotation.Native;
import java.util.Objects;

public class MemoryAddress {
    @Native private final long rawAddress; 

    public MemoryAddress(long address) { 
        this.rawAddress = address; 
    } 

    public long getRawAddress() { 
        return rawAddress; 
    } 

    public static long getRawAddress(MemoryAddress address) { 
        return MemoryAddress.isNull(address) ? 0 : address.getRawAddress(); 
    } 

    public boolean isNullPointer() { 
        return rawAddress == 0; 
    } 

    @Override 
    public boolean equals(Object o) { 
        if (this == o) return true; 
        if (o == null || getClass() != o.getClass()) return false; 
        MemoryAddress that = (MemoryAddress) o; 
        return getRawAddress() == that.getRawAddress(); 
    } 

    @Override 
    public int hashCode() { 
        return Objects.hash(getRawAddress()); 
    } 

    public MemoryAddress subtract(long offset) { 
        return add(-offset); 
    } 

    public native MemoryAddress add(long offset); 

    public MemoryAddress bitAnd(MemoryAddress rhs) { 
        return bitAnd(rhs.getRawAddress()); 
    } 

    public native MemoryAddress bitAnd(long rhs); 

    public static boolean isNull(MemoryAddress address) { 
        return address == null || address.isNullPointer(); 
    } 
} 
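To make the helper semantics above concrete, here is a minimal sketch using a hypothetical stand-in class, MemoryAddressSketch, which is not part of the port. It assumes plain long arithmetic in place of the native add, which is only meaningful on platforms with 64-bit pointers; on morello the real add runs in native code so that the capability stays valid.

```java
// Hypothetical stand-in for illustration only; the real class implements add() natively.
final class MemoryAddressSketch {
    private final long rawAddress;

    MemoryAddressSketch(long rawAddress) {
        this.rawAddress = rawAddress;
    }

    long getRawAddress() {
        return rawAddress;
    }

    // Assumption: plain long arithmetic stands in for the native add().
    MemoryAddressSketch add(long offset) {
        return new MemoryAddressSketch(rawAddress + offset);
    }

    // Mirrors the original: subtraction is addition of the negated offset.
    MemoryAddressSketch subtract(long offset) {
        return add(-offset);
    }

    boolean isNullPointer() {
        return rawAddress == 0;
    }

    // Null-safe: a null reference and a zero address are both treated as null.
    static boolean isNull(MemoryAddressSketch address) {
        return address == null || address.isNullPointer();
    }

    // Null-safe: returns 0 instead of throwing a NullPointerException.
    static long getRawAddress(MemoryAddressSketch address) {
        return isNull(address) ? 0 : address.getRawAddress();
    }

    public static void main(String[] args) {
        MemoryAddressSketch base = new MemoryAddressSketch(0x1000L);
        System.out.println(isNull(null));                        // true
        System.out.println(isNull(new MemoryAddressSketch(0)));  // true
        System.out.println(getRawAddress(null));                 // 0
        // 0x1000 + 0x20 - 0x10 = 0x1010
        System.out.println(Long.toHexString(base.add(0x20).subtract(0x10).getRawAddress())); // 1010
    }
}
```

The static helpers are the ones to reach for when a reference may itself be null, since the instance methods would throw a NullPointerException in that case.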

There are a few conditions that must be met for the MemoryAddress class to work:

  1. A MemoryAddress that contains a valid capability is never instantiated in Java code. This may seem at odds with the class shown above, since there is a public constructor that lets you instantiate MemoryAddress. However, this constructor can only create a valid MemoryAddress on platforms with 64-bit pointers; it cannot create a valid capability. Any MemoryAddress that contains a valid 128-bit capability must be instantiated in C++.
  2. A pointer cannot be dereferenced in Java code: Since it's not possible to dereference a pointer in Java, this is not a problem.
  3. Any addition or subtraction that returns a valid MemoryAddress should take place in native code: To increment a pointer in Java you must now call into native code using the add or subtract methods on the MemoryAddress class. You cannot just add to or subtract from the address bits of a CHERI capability, since capabilities cannot be forged (doing so would clear the validity tag of the capability). However, any arithmetic operation that does not need to return a valid capability can be performed on the rawAddress field of a MemoryAddress.
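As an illustration of the last point, the sketch below shows two operations whose results are a boolean and a length rather than a pointer, so they can work purely on raw address values. The class and method names are hypothetical, and the long arguments are assumed to come from getRawAddress().

```java
// Hypothetical helpers: arithmetic on raw addresses that never yields a pointer.
final class RawAddressArithmetic {

    // True if rawAddress is a multiple of a power-of-two alignment.
    static boolean isAligned(long rawAddress, long alignment) {
        if (alignment <= 0 || Long.bitCount(alignment) != 1) {
            throw new IllegalArgumentException("alignment must be a positive power of two");
        }
        return (rawAddress & (alignment - 1)) == 0;
    }

    // Byte distance between two addresses; the result is a length, not a pointer.
    static long distance(long fromRawAddress, long toRawAddress) {
        return toRawAddress - fromRawAddress;
    }

    public static void main(String[] args) {
        System.out.println(isAligned(0x1000L, 16));      // true
        System.out.println(isAligned(0x1004L, 16));      // false
        System.out.println(distance(0x1000L, 0x1040L));  // 64
    }
}
```

Anything that has to be dereferenced afterwards, such as the address of the next element in a native array, would still need to go through add or subtract.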

How does MemoryAddress get instantiated?

We have added the class java_lang_MemoryAddress to javaClasses inside HotSpot. A method, Handle java_lang_MemoryAddress::create(void *ptr, TRAPS), has been added to create a MemoryAddress object with the following two fields:

  1. address: This is an injected field (see Hui's blog post) that is only present on the native representation of MemoryAddress. On morello this field has the signature morellointptr, which is a T_ADDRESS, whilst on aarch64 it has the signature byte, since the field is not used there. On creation of a MemoryAddress instance, the address is set to the capability value on morello, whilst on aarch64 it is set to -1. This field is never exposed to Java code.
  2. rawAddress: This long field is present on the Java representation of the class. On aarch64 the field is set to the full value of the pointer, while on morello it is set to the value of the capability cast to a long. The Java function getRawAddress() returns this value.

We have also added the function: jobject java_lang_MemoryAddress::create_object(void *ptr, TRAPS), which wraps the function create but returns a jobject.

How do you retrieve the pointer in C++?

There are eight methods on java_lang_MemoryAddress that let you retrieve the pointer. Their signatures are shown below:

  static void *get_ptr(oop memoryAddress); 
  static void *get_ptr(oop memoryAddress, long offset); 
  static void *get_ptr(jobject obj, long offset); 
  static void *get_ptr(jobject obj) { return get_ptr(obj, (long) 0); } 

  static void *get_ptr(oop memoryAddress, TRAPS); 
  static void *get_ptr(oop memoryAddress, long offset, TRAPS); 
  static void *get_ptr(jobject obj, long offset, TRAPS); 
  static void *get_ptr(jobject obj, TRAPS) { return get_ptr(obj, 0, CHECK_NULL); } 

These methods are all very similar; the variations depend on:

  • The format of the MemoryAddress (oop or jobject).
  • Whether you want to add an offset to the pointer.
  • Whether you want to TRAP.

They all eventually call java_lang_MemoryAddress::ptr(oop memoryAddress). On morello this returns the address field that stores the capability, whilst on aarch64 it returns the rawAddress field that stores the long.

Use of rawAddress on morello

The rawAddress field is quite handy on morello and we have used it in the following scenarios:

  • As a key in a hash map for the JVMCI.
  • To perform arithmetic operations when a valid capability doesn't need to be returned, without additional native calls.
  • To allow for easy checking of whether the address is 0/null without an additional native call.
  • To allow incremental porting of Graal: JVMCI can use MemoryAddress instead of longs, while Graal pulls out the raw addresses and carries on as normal.
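The hash map use can be sketched as follows. This is a generic illustration rather than the JVMCI code itself: the registry class, its methods and the stored names are all hypothetical, and the long keys are assumed to come from getRawAddress().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry keyed by raw address; not taken from the JVMCI sources.
final class RawAddressKeyDemo {
    private final Map<Long, String> namesByRawAddress = new HashMap<>();

    // The raw address is only used for identity, so no valid capability is needed.
    void register(long rawAddress, String name) {
        namesByRawAddress.put(rawAddress, name);
    }

    // Returns the registered name, or null if the address is unknown.
    String lookup(long rawAddress) {
        return namesByRawAddress.get(rawAddress);
    }

    public static void main(String[] args) {
        RawAddressKeyDemo demo = new RawAddressKeyDemo();
        demo.register(0x1000L, "symbolA");
        System.out.println(demo.lookup(0x1000L)); // symbolA
        System.out.println(demo.lookup(0x2000L)); // null
    }
}
```

Because equals and hashCode on MemoryAddress are also defined in terms of rawAddress, a map keyed by MemoryAddress objects should behave the same way.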

Changes to Java and the Java API

Any Java class that stores a pointer that will be dereferenced later will need to be updated to use a MemoryAddress class.

Unsafe

There are two unsafe APIs in the JDK: the first is src/java.base/share/classes/jdk/internal/misc/Unsafe.java and the second is src/jdk.unsupported/share/classes/sun/misc/Unsafe.java, referred to from now on as internal-unsafe and sun-misc-unsafe, respectively. The sun-misc-unsafe is an externally accessible wrapper around the internal-unsafe that provides access to a subset of its methods. Whilst the sun-misc-unsafe is technically an unsupported API, it is used by many third-party libraries.

The Unsafe API allows direct access to memory in a more C/C++-like style that is normally off limits to a Java developer. The API makes heavy use of raw pointers to perform direct memory accesses, hence it needed updating to use MemoryAddress for the port to morello.

The method putInt is shown below:

    @IntrinsicCandidate 
    public native void putInt(Object o, long offset, int x); 

Previously you would call this method in one of two ways: putInt(myObject, offset_to_int, int_to_put) or putInt(null, raw_address_to_int, int_to_put). The first pathway is called with an object, so no raw addresses are used. The second pathway uses a raw address, which on morello needs to become a MemoryAddress. We have extended the Unsafe API so that both the MemoryAddress and the raw pathway are still available. We decided not to change the Java API signature but rather how it is used and how it is implemented in native code: src/hotspot/share/prims/unsafe.cpp. To call putInt on the morello platform, use putInt(MemoryAddress_object, offset_to_add_to_address, int_to_put). This form lets us increment the capability and perform the unsafe operation in one native call rather than two (increment, then call unsafe).

The code below shows how unsafe.cpp supports this new usage of the Unsafe.java API. The get/put functions are created via the macro below:

UNSAFE_ENTRY(void, Unsafe_Put##Type(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, java_type x)) { \ 
  MemoryAccess<java_type>(thread, obj, offset).put(x); \ 
} UNSAFE_END \ 

where we instantiate a MemoryAccess object:

  MemoryAccess(JavaThread* thread, jobject obj, jlong offset) 
    : _thread(thread), _obj(JNIHandles::resolve(obj)), _offset((uintptr_t)offset) { 
    assert_field_offset_sane(_obj, offset); 
  } 

We haven't updated either the macro or the constructor for MemoryAccess, but we have updated the put and get methods on the MemoryAccess class as shown below.

  void put(T x) { 
    if (_obj == NULL) { // the original pathway for raw address access 
      GuardUnsafeAccess guard(_thread); 
      RawAccess<>::store(addr(), normalize_for_write(x)); 
    } else { 
      if (_obj->is_a(vmClasses::MemoryAddress_klass())) { 
        GuardUnsafeAccess guard(_thread); 
        RawAccess<>::store(addr_from_obj(), normalize_for_write(x)); // morello pathway for raw address access 
      } else { 
        HeapAccess<>::store_at(_obj, _offset, normalize_for_write(x)); // access via offset from an object 
      } 
    } 
  } 

We have kept the original pathway, where _obj == NULL and a raw access is done using _offset. We have changed the else branch, which now checks whether the object is an instance of the MemoryAddress class. If it is, we do a RawAccess and call the method addr_from_obj():

  volatile T* addr_from_obj() { 
      void* ptr = java_lang_MemoryAddress::get_ptr(_obj, _offset); 
      return static_cast<volatile T*>(ptr); 
  } 

The call java_lang_MemoryAddress::get_ptr(_obj, _offset) is where we increment the pointer, as shown below:

void *java_lang_MemoryAddress::get_ptr(oop memoryAddress, long offset) { 
  void *ptr = get_ptr(memoryAddress); 
#ifdef MORELLO 
  // MOJO: Pointer stored is null, and you are trying to use offset as raw address. 
  if (ptr == nullptr && offset != 0) ShouldNotReachHere(); 
#endif 
  return (void*)((uintptr_t) ptr + offset); 
} 
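The control flow of put and get_ptr can be modelled in plain Java to make the three pathways easier to follow. This is only a sketch under stated assumptions: the class, the Address stand-in, the Path enum and the morello flag are all hypothetical, the pointer is modelled as a long, and the VM's ShouldNotReachHere() is modelled as an exception.

```java
// Hypothetical model of the dispatch in MemoryAccess::put and get_ptr(oop, offset).
final class UnsafePutDispatchSketch {

    // Stand-in for a MemoryAddress object; only the raw address is modelled.
    static final class Address {
        final long raw;

        Address(long raw) {
            this.raw = raw;
        }
    }

    enum Path { RAW_OFFSET, MEMORY_ADDRESS, HEAP_OBJECT }

    // Mirrors the branch structure of put(): null object, MemoryAddress, or any other object.
    static Path classify(Object obj) {
        if (obj == null) {
            return Path.RAW_OFFSET;      // original pathway: the offset is the raw address
        }
        if (obj instanceof Address) {
            return Path.MEMORY_ADDRESS;  // morello pathway: address plus offset
        }
        return Path.HEAP_OBJECT;         // field access at an offset inside a heap object
    }

    // Mirrors get_ptr(oop, offset): on morello a null pointer with a non-zero offset is
    // rejected, because the offset would be acting as a raw address.
    static long effectiveAddress(Address address, long offset, boolean morello) {
        long ptr = address.raw;
        if (morello && ptr == 0 && offset != 0) {
            throw new IllegalStateException("null address used with a non-zero offset");
        }
        return ptr + offset;
    }

    public static void main(String[] args) {
        System.out.println(classify(null));                 // RAW_OFFSET
        System.out.println(classify(new Address(0x1000L))); // MEMORY_ADDRESS
        System.out.println(classify("some heap object"));   // HEAP_OBJECT
        System.out.println(Long.toHexString(
                effectiveAddress(new Address(0x1000L), 8, true))); // 1008
    }
}
```

In the real code the addition happens on the capability itself in C++, which is what keeps the resulting pointer valid; the long arithmetic here only models the address value.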

We have added the following methods; the original methods have been kept for compatibility:

  • getMemoryAddress: this should replace getLong where the long returned is a raw pointer, and all calls to getAddress.
  • putMemoryAddress: this should replace putLong where the long stored is a raw pointer, and all calls to putAddress.
  • allocateMemoryObject: this should replace calls to allocateMemory.
  • reallocateMemoryObject: this should replace calls to reallocateMemory.
  • freeMemoryObject: this should replace calls to freeMemory.

We have also updated the method: compareAndSetAddress to use MemoryAddress.

Buffers

Java buffer code is located in the nio package. Buffers can be either direct or non-direct. Non-direct buffers are allocated on the Java heap, whilst direct buffers are allocated on the native heap. The word direct indicates that you are handling memory directly; the backing memory of a direct buffer is not managed by the GC and must be freed explicitly. Since direct buffers handle memory directly, they store the address of their allocation in a field on the object. We have updated this field to use MemoryAddress.

Currently, the Buffer class has both the long address and the MemoryAddress memoryAddress fields, to prevent inheritance issues.

public abstract class Buffer { 
    ... 
    long address; 
    MemoryAddress memoryAddress; 
    ... 

For brevity I will not go into the full details of how we have updated all the buffer code to include MemoryAddress, as it doesn't affect the API much since it's abstracted away. The majority of the changes can be found in the code related to direct buffers including: Direct-X-Buffer-bin.java.template and Direct-X-Buffer.java.template. In particular, most changes are in the constructor for the DirectBuffer either taking the MemoryAddress for the array location or using the unsafe.allocateMemoryObject to allocate the array. For all HeapBuffers the memoryAddress field is set to null.
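The direct/non-direct distinction is visible through the public buffer API on any stock JDK, which is why the MemoryAddress changes stay hidden from callers. The small sketch below uses only standard java.nio calls; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

// Illustrative only: distinguishes heap-backed buffers from direct (native memory) buffers.
final class BufferKinds {

    static String describe(ByteBuffer buffer) {
        return buffer.isDirect() ? "direct (native memory)" : "non-direct (Java heap)";
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // backed by natively allocated memory

        System.out.println(describe(heap));   // non-direct (Java heap)
        System.out.println(describe(direct)); // direct (native memory)
    }
}
```

Only buffers of the second kind carry a native address, so those are the ones whose memoryAddress field would be populated in the port.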

We also updated classes in the zip package to use .memoryAddress rather than .address, including: CRC32, Adler32, CRC32C, Deflater and Inflater. In the sun.nio package we have updated: SocketChannelImpl, NioSocketImpl, DatagramChannelImpl and UnixUserDefinedFileAttributeView. See the table below for more details of where MemoryAddress has propagated out.

Native buffers

The NativeBuffer.java class has been updated to hold a MemoryAddress instead of a long. This has not changed the Java API itself, as the NativeBuffer constructor takes the size to allocate and then calls unsafe.allocateMemoryObject. But it does mean that MemoryAddress has had to be propagated out to other classes; see the table below for details.

All the files updated to use MemoryAddress

At the time of writing, this is the list of all the Java files that now use MemoryAddress, with their Java or native counterparts. This list does not include any files for the JVMCI.

| Java class | Native or Java counterparts |
| --- | --- |
| share/classes/java/lang/MemoryAddress.java | share/native/libjava/MemoryAddress.c |
| bsd/classes/sun/nio/ch/KQueueSelectorImpl.java | bsd/classes/sun/nio/ch/KQueue.java |
| bsd/classes/sun/nio/ch/KQueue.java | bsd/native/libnio/ch/KQueue.c |
| bsd/classes/sun/nio/ch/KQueuePort.java | bsd/classes/sun/nio/ch/KQueue.java |
| bsd/classes/sun/nio/fs/BsdNativeDispatcher.java | bsd/native/libnio/fs/BsdNativeDispatcher.c |
| bsd/classes/sun/nio/fs/BsdFileSystem.java | bsd/classes/sun/nio/fs/BsdNativeDispatcher.java |
| share/classes/sun/nio/ch/DatagramChannelImpl.java | unix/native/libnio/ch/DatagramChannelImpl.c |
| share/classes/sun/nio/ch/FileChannelImpl.java | unix/native/libnio/ch/FileChannelImpl.c |
| unix/classes/sun/nio/ch/DatagramDispatcher.java | unix/native/libnio/ch/DatagramDispatcher.c |
| unix/classes/sun/nio/ch/SocketDispatcher.java | unix/native/libnio/ch/SocketDispatcher.c |
| unix/classes/sun/nio/ch/FileDispatcherImpl.java | unix/native/libnio/ch/FileDispatcherImpl.c |
| unix/classes/sun/nio/ch/NativeThread.java | unix/native/libnio/ch/NativeThread.c |
| share/classes/sun/nio/ch/SocketChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
| share/classes/sun/nio/ch/ServerSocketChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
| share/classes/sun/nio/ch/NativeThreadSet.java | unix/classes/sun/nio/ch/NativeThread.java |
| share/classes/sun/nio/ch/NioSocketImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
| unix/classes/sun/nio/ch/SourceChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
| unix/classes/sun/nio/ch/SinkChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
| share/classes/java/nio/MappedMemoryUtils.java | unix/native/libnio/MappedMemoryUtils.c |
| unix/classes/sun/nio/fs/UnixNativeDispatcher.java | unix/native/libnio/fs/UnixNativeDispatcher.c |
| unix/classes/sun/nio/fs/UnixCopyFile.java | unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
| unix/classes/sun/nio/fs/UnixUserDefinedFileAttributeView.java | sun.nio.ch.DirectBuffer, NativeBuffer, unix/classes/sun/nio/fs/UnixNativeDispatcher.java, share/classes/jdk/internal/misc/Unsafe.java |
| unix/classes/sun/nio/fs/UnixDirectoryStream.java | unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
| unix/classes/sun/nio/fs/UnixFileSystemProvider.java | unix/classes/sun/nio/fs/UnixNativeDispatcher.java, unix/classes/sun/nio/fs/UnixDirectoryStream.java |
| unix/classes/sun/nio/fs/UnixSecureDirectoryStream.java | unix/classes/sun/nio/fs/UnixDirectoryStream.java |
| linux/classes/sun/nio/fs/LinuxNativeDispatcher.java | linux/native/libnio/fs/LinuxNativeDispatcher.c |
| linux/native/libnio/fs/LinuxWatchService.c | share/classes/jdk/internal/misc/Unsafe.java, NativeBuffer, unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
| linux/classes/sun/nio/fs/LinuxFileSystem.java | linux/classes/sun/nio/fs/LinuxNativeDispatcher.java |
| linux/classes/sun/nio/fs/LinuxWatchService.java | share/classes/jdk/internal/misc/Unsafe.java, unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
| share/classes/jdk/internal/loader/NativeLibrary.java | N/A, interface |
| share/classes/jdk/internal/loader/NativeLibraries.java | share/classes/jdk/internal/loader/NativeLibrary.java, share/native/libjava/NativeLibraries.c |
| share/classes/java/lang/ClassLoader.java | share/classes/jdk/internal/loader/NativeLibraries.java |
| share/classes/java/lang/System.java | share/classes/java/lang/ClassLoader.java |
| share/classes/java/util/zip/Deflater.java | share/native/libzip/Deflater.c, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
| share/classes/java/util/zip/Inflater.java | share/native/libzip/Inflater.c, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
| share/classes/java/util/zip/CRC32.java | share/native/libzip/CRC32.c, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
| share/classes/java/util/zip/CRC32C.java | share/classes/jdk/internal/misc/Unsafe.java |
| share/classes/java/util/zip/Adler32.java | share/native/libzip/Adler32.c |
| share/classes/sun/nio/ch/NativeObject.java | share/classes/jdk/internal/misc/Unsafe.java |
| share/classes/sun/nio/ch/IOVecWrapper.java | share/classes/sun/nio/ch/NativeObject.java, share/classes/jdk/internal/misc/Unsafe.java |
| share/classes/sun/nio/ch/NativeSocketAddress.java | share/classes/jdk/internal/misc/Unsafe.java |
| share/classes/sun/nio/ch/Util.java | share/classes/jdk/internal/misc/Unsafe.java, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
| share/classes/sun/nio/ch/NativeDispatcher.java | |
| share/classes/jdk/internal/access/JavaNioAccess.java | N/A, interface |
| share/classes/sun/nio/ch/IOUtil.java | share/classes/jdk/internal/access/JavaNioAccess.java, share/classes/sun/nio/ch/NativeDispatcher.java, share/classes/sun/nio/ch/IOVecWrapper.java |
| share/classes/sun/nio/ch/DirectBuffer.java | share/classes/java/lang/MemoryAddress.java |
| share/classes/sun/nio/fs/NativeBuffer.java | share/classes/jdk/internal/misc/Unsafe.java |
| share/classes/jdk/internal/misc/Unsafe.java | share/prims/unsafe.cpp |
| share/classes/java/nio/Buffer.java | share/classes/java/nio/MappedMemoryUtils.java |
| share/classes/java/nio/X-Buffer.java.template | share/classes/java/nio/Buffer.java |
| share/classes/java/nio/Direct-X-Buffer-bin.java.template | share/classes/jdk/internal/misc/Unsafe.java, share/classes/sun/nio/ch/DirectBuffer.java, share/classes/java/nio/X-Buffer.java.template |
| share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template | share/classes/java/nio/Direct-X-Buffer.java.template, share/classes/java/nio/X-Buffer.java.template, share/classes/java/nio/Direct-X-Buffer-bin.java.template, share/classes/java/nio/Buffer.java |
| share/classes/java/nio/MappedByteBuffer.java | share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template, share/classes/jdk/internal/access/foreign/UnmapperProxy.java |
| share/classes/jdk/internal/access/JavaLangAccess.java | N/A, interface |
| share/classes/jdk/internal/access/foreign/UnmapperProxy.java | N/A, interface |

Helper functions added to jvm.cpp

Finally, we have also added JVM_NewMemoryAddress and JVM_GetMemoryAddress helper functions in jvm.cpp which wrap methods in javaClasses.cpp. These methods are mostly used in the .c files shown in the table above.

JVM_ENTRY(jobject, JVM_NewMemoryAddress(JNIEnv *env, void *ptr)) 
  Handle addr = java_lang_MemoryAddress::create(ptr, CHECK_NULL); 
  assert(java_lang_MemoryAddress::get_ptr(addr()) == ptr, "checking that stored pointer == target pointer"); 
  return JNIHandles::make_local(THREAD, addr()); 
JVM_END 

JVM_ENTRY(void*, JVM_GetMemoryAddress(JNIEnv *env, jobject obj)) 
  return java_lang_MemoryAddress::get_ptr(obj, CHECK_NULL); 
JVM_END