Authored by Eloise Slater (Software Engineer at THG)
What is a MemoryAddress and why did we introduce it?
In Java, pointers are typically stored in 64-bit longs. These pointers can be used within Java code itself and passed between Java code and C/C++ native code. However, CHERI platforms implement a new type of 128-bit fat pointer called a CHERI capability. As a 64-bit long is the largest Java primitive, there is no primitive large enough to store a CHERI capability. This is why we introduced the MemoryAddress class into the Java language as a means of storing capabilities and passing them between Java code and native code.
There are a few intricacies of how the MemoryAddress
class works. Its layout is shown below:
public class MemoryAddress {
    @Native private final long rawAddress;
    public MemoryAddress(long address) {
        this.rawAddress = address;
    }
    public long getRawAddress() {
        return rawAddress;
    }
    public static long getRawAddress(MemoryAddress address) {
        return MemoryAddress.isNull(address) ? 0 : address.getRawAddress();
    }
    public boolean isNullPointer() {
        return rawAddress == 0;
    }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        MemoryAddress that = (MemoryAddress) o;
        return getRawAddress() == that.getRawAddress();
    }
    @Override
    public int hashCode() {
        return Objects.hash(getRawAddress());
    }
    public MemoryAddress subtract(long offset) {
        return add(-offset);
    }
    public native MemoryAddress add(long offset);
    public MemoryAddress bitAnd(MemoryAddress rhs) {
        return bitAnd(rhs.getRawAddress());
    }
    public native MemoryAddress bitAnd(long rhs);
    public static boolean isNull(MemoryAddress address) {
        return address == null || address.isNullPointer();
    }
}
There are a few conditions that must be met for the MemoryAddress
class to work:
- A MemoryAddress that contains a valid capability is never instantiated in Java code: This may seem at odds with the class shown above, since there is a public constructor that lets you instantiate MemoryAddress. However, this constructor can only create a valid MemoryAddress on platforms with 64-bit pointers; it cannot create a valid capability. Any MemoryAddress that contains a valid 128-bit capability must be instantiated in C++.
- A pointer cannot be dereferenced in Java code: Since it's not possible to dereference a pointer in Java, this is not a problem.
- Any addition or subtraction that returns a valid MemoryAddress should take place in native code: To increment a pointer in Java you must now call into native code using the add or subtract methods on the MemoryAddress class. You cannot simply add to or subtract from the address bits of a CHERI capability in Java, since capabilities cannot be forged (doing so would clear the validity tag of the capability). However, any arithmetic operation that does not need to return a valid capability can be performed on the rawAddress field of a MemoryAddress.
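The last condition can be illustrated with a small stand-in class. This is only a sketch: RawAddressSketch and its methods are hypothetical names that are not part of the port, and the class models nothing but the rawAddress long. It shows the kind of arithmetic whose result is a plain number (a distance, an alignment check) rather than a new pointer, which is why it can stay in Java without a native call.

```java
// Hypothetical stand-in for MemoryAddress: it models only the rawAddress bits.
final class RawAddressSketch {
    private final long rawAddress;

    RawAddressSketch(long rawAddress) {
        this.rawAddress = rawAddress;
    }

    long getRawAddress() {
        return rawAddress;
    }

    // Byte distance between two addresses. The result is a number, not a pointer,
    // so no valid capability has to be produced.
    static long distance(RawAddressSketch from, RawAddressSketch to) {
        return to.getRawAddress() - from.getRawAddress();
    }

    // Alignment check on the address bits; alignment must be a power of two.
    static boolean isAligned(RawAddressSketch address, long alignment) {
        return (address.getRawAddress() & (alignment - 1)) == 0;
    }

    public static void main(String[] args) {
        RawAddressSketch start = new RawAddressSketch(0x1000L);
        RawAddressSketch end = new RawAddressSketch(0x1018L);
        System.out.println(distance(start, end));  // 24
        System.out.println(isAligned(start, 16));  // true
        System.out.println(isAligned(end, 16));    // false
    }
}
```

Anything that has to come back as a usable pointer, such as start plus 24 bytes, would still go through the native add method described above.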
How does MemoryAddress get instantiated?
We have added the class java_lang_MemoryAddress to javaClasses inside hotspot. A method, Handle java_lang_MemoryAddress::create(void *ptr, TRAPS), has been added to create a MemoryAddress object with the following two fields:
- address: This is an injected field (see Hui’s blog post) that is only present on the native representation of MemoryAddress. On morello this field has the signature morellointptr, which is a T_ADDRESS, whilst on aarch64 it has the signature byte, since the field is not used on aarch64. On creation of a MemoryAddress instance the address is set to the capability value on morello, whilst on aarch64 it is set to -1. This field is never exposed to Java code.
- rawAddress: This long field is present on the Java representation of the class. On aarch64 the field is set to the full value of the pointer, while on morello it is set to the value of the capability cast to a long. The Java function getRawAddress() returns this value.
We have also added the function jobject java_lang_MemoryAddress::create_object(void *ptr, TRAPS), which wraps create but returns a jobject.
How do you retrieve the pointer in C++?
There are 8 methods on java_lang_MemoryAddress that let you retrieve the pointer. Their signatures are shown below:
static void *get_ptr(oop memoryAddress);
static void *get_ptr(oop memoryAddress, long offset);
static void *get_ptr(jobject obj, long offset);
static void *get_ptr(jobject obj) { return get_ptr(obj, (long) 0); }
static void *get_ptr(oop memoryAddress, TRAPS);
static void *get_ptr(oop memoryAddress, long offset, TRAPS);
static void *get_ptr(jobject obj, long offset, TRAPS);
static void *get_ptr(jobject obj, TRAPS) { return get_ptr(obj, 0, CHECK_NULL); }
These methods are all very similar; the variations depend on:
- The format of the MemoryAddress (oop or jobject).
- Whether you want to add an offset to the pointer.
- Whether you want to TRAP.
They all eventually call java_lang_MemoryAddress::ptr(oop memoryAddress). On morello this returns the address field that stores the capability, whilst on aarch64 it returns the rawAddress field that stores the long.
Use of rawAddress on morello
The rawAddress field is quite handy on morello and we have used it in the following scenarios:
- As a key in a hashmap for the JVMCI.
- To perform arithmetic operations when a valid capability doesn't need to be returned, without additional native calls.
- To check easily whether the address is 0/null without an additional native call.
- To allow incremental porting of Graal: JVMCI can use MemoryAddress instead of longs, while Graal pulls out the raw addresses and carries on as normal.
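The hashmap and null-check uses can be sketched with a stand-in class. AddressKeySketch is a hypothetical name; it copies the equals, hashCode and isNull logic from the class layout shown earlier and models only the rawAddress long, so it says nothing about how the JVMCI code itself is written.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for MemoryAddress. Equality and hashing use only rawAddress,
// mirroring the equals/hashCode shown in the class layout.
final class AddressKeySketch {
    private final long rawAddress;

    AddressKeySketch(long rawAddress) {
        this.rawAddress = rawAddress;
    }

    boolean isNullPointer() {
        return rawAddress == 0;
    }

    // Null check done entirely in Java: no native call is needed.
    static boolean isNull(AddressKeySketch address) {
        return address == null || address.isNullPointer();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return rawAddress == ((AddressKeySketch) o).rawAddress;
    }

    @Override
    public int hashCode() {
        return Objects.hash(rawAddress);
    }

    public static void main(String[] args) {
        Map<AddressKeySketch, String> names = new HashMap<>();
        names.put(new AddressKeySketch(0x2000L), "code blob");
        // A different object with the same raw address finds the same entry.
        System.out.println(names.get(new AddressKeySketch(0x2000L))); // code blob
        System.out.println(isNull(new AddressKeySketch(0L)));         // true
        System.out.println(isNull(null));                             // true
    }
}
```

Because the key depends only on the address bits, two objects that wrap the same location behave as the same key even though the capability itself never leaves native code.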
Changes to Java and the Java API
Any Java class that stores a pointer that will be dereferenced later will need to be updated to use a MemoryAddress
class.
Unsafe
There are two Unsafe APIs in the JDK: src/java.base/share/classes/jdk/internal/misc/Unsafe.java and src/jdk.unsupported/share/classes/sun/misc/Unsafe.java, referred to from now on as internal-unsafe and sun-misc-unsafe, respectively. sun-misc-unsafe is an externally accessible wrapper around internal-unsafe that provides access to a subset of its methods. Although sun-misc-unsafe is technically an unsupported API, it is used by many third-party libraries.
The Unsafe API allows direct access to memory in a more C/C++-like manner that is normally off limits to a Java developer. The API makes heavy use of raw pointers to perform direct memory accesses, hence it needed updating to use MemoryAddress for the port to morello.
The method putInt is shown below:
@IntrinsicCandidate
public native void putInt(Object o, long offset, int x);
Previously you would call this method in one of two ways: putInt(myObject, offset_to_int, int_to_put) or putInt(null, raw_address_to_int, int_to_put). In the first pathway we call it with an object, so no raw addresses are used. The second pathway uses a raw address, which on morello needs to become a MemoryAddress. We have extended the Unsafe API so that both the MemoryAddress and the raw pathway are still available. We decided not to change the Java API signature but rather its use and its implementation in native code: src/hotspot/share/prims/unsafe.cpp. To call putInt on the morello platform you use putInt(MemoryAddress_object, offset_to_add_to_address, int_to_put). This form allows us to increment the capability and perform the unsafe operation in one native call rather than two (increment, then call unsafe).
The updated unsafe.cpp, reflecting the new way the Unsafe.java API is called, is shown below. The get/put functions are created via the following macro:
UNSAFE_ENTRY(void, Unsafe_Put##Type(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, java_type x)) { \
MemoryAccess<java_type>(thread, obj, offset).put(x); \
} UNSAFE_END \
where we instantiate a MemoryAccess object:
MemoryAccess(JavaThread* thread, jobject obj, jlong offset)
  : _thread(thread), _obj(JNIHandles::resolve(obj)), _offset((uintptr_t)offset) {
  assert_field_offset_sane(_obj, offset);
}
We haven't updated either the macro or the constructor for MemoryAccess
, but we have updated the put
and get
methods on the MemoryAccess
class as shown below.
void put(T x) {
  if (_obj == NULL) { // the original pathway for raw address access
    GuardUnsafeAccess guard(_thread);
    RawAccess<>::store(addr(), normalize_for_write(x));
  } else {
    if (_obj->is_a(vmClasses::MemoryAddress_klass())) {
      GuardUnsafeAccess guard(_thread);
      RawAccess<>::store(addr_from_obj(), normalize_for_write(x)); // morello pathway for raw address access
    } else {
      HeapAccess<>::store_at(_obj, _offset, normalize_for_write(x)); // access via offset from an object
    }
  }
}
We have kept the original pathway where _obj == NULL, which does a raw access using _offset. We have changed the else branch, where we now check whether the object is an instance of the MemoryAddress class. If it is, we do a RawAccess and call the method addr_from_obj():
volatile T* addr_from_obj() {
  void* ptr = java_lang_MemoryAddress::get_ptr(_obj, _offset);
  return static_cast<volatile T*>(ptr);
}
The call to java_lang_MemoryAddress::get_ptr(_obj, _offset) is where we increment the pointer, as shown below:
void *java_lang_MemoryAddress::get_ptr(oop memoryAddress, long offset) {
  void *ptr = get_ptr(memoryAddress);
#ifdef MORELLO
  // MOJO: Pointer stored is null, and you are trying to use offset as raw address.
  if (ptr == nullptr && offset != 0) ShouldNotReachHere();
#endif
  return (void*)((uintptr_t) ptr + offset);
}
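The three pathways in put, together with the null-pointer guard in get_ptr, can be modelled in plain Java. This is a toy under stated assumptions and not the HotSpot implementation: UnsafePutSketch, AddressSketch, resolve and readInt are hypothetical names, a ByteBuffer stands in for native memory so that byte indices play the role of addresses, the heap pathway is simplified to an int[] element index, and the guard is applied unconditionally rather than only on morello.

```java
import java.nio.ByteBuffer;

// Toy model of the three pathways in MemoryAccess::put. All names are hypothetical.
final class UnsafePutSketch {
    // Stand-in for MemoryAddress: rawAddress is an index into the simulated memory.
    static final class AddressSketch {
        final long rawAddress;

        AddressSketch(long rawAddress) {
            this.rawAddress = rawAddress;
        }
    }

    // Simulated native memory; byte indices play the role of addresses.
    private final ByteBuffer memory = ByteBuffer.allocate(64);

    // Mirrors get_ptr(oop, offset): base plus offset, rejecting a null base
    // that is combined with a non-zero offset.
    private static long resolve(AddressSketch base, long offset) {
        if (base.rawAddress == 0 && offset != 0) {
            throw new IllegalStateException("null address used with a non-zero offset");
        }
        return base.rawAddress + offset;
    }

    // Writes x and returns the name of the pathway that was taken.
    String putInt(Object obj, long offset, int x) {
        if (obj == null) {
            memory.putInt((int) offset, x);          // original raw pathway: offset is the address
            return "raw";
        }
        if (obj instanceof AddressSketch) {
            long target = resolve((AddressSketch) obj, offset);
            memory.putInt((int) target, x);          // MemoryAddress pathway: base + offset
            return "memory-address";
        }
        if (obj instanceof int[]) {
            ((int[]) obj)[(int) offset] = x;         // heap pathway, simplified to an element index
            return "heap";
        }
        throw new IllegalArgumentException("unsupported object in this sketch");
    }

    int readInt(long address) {
        return memory.getInt((int) address);
    }

    public static void main(String[] args) {
        UnsafePutSketch unsafe = new UnsafePutSketch();
        System.out.println(unsafe.putInt(null, 8, 11));                  // raw
        System.out.println(unsafe.putInt(new AddressSketch(16), 4, 22)); // memory-address
        System.out.println(unsafe.readInt(20));                          // 22
        System.out.println(unsafe.putInt(new int[4], 2, 33));            // heap
    }
}
```

The point of the model is the order of the checks: a null object keeps the old behaviour, a MemoryAddress-like object has its offset applied before the store, and everything else is treated as an ordinary heap object.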
We have added the following methods; the original methods have been kept for compatibility:
- getMemoryAddress: this should replace getLong where the long returned is a raw pointer, and all calls to getAddress.
- putMemoryAddress: this should replace putLong where the long stored is a raw pointer, and all calls to putAddress.
- allocateMemoryObject: this should replace calls to allocateMemory.
- reallocateMemoryObject: this should replace calls to reallocateMemory.
- freeMemoryObject: this should replace calls to freeMemory.
We have also updated the method compareAndSetAddress to use MemoryAddress.
Buffers
Java buffer code is located in the nio package. Buffers can be either direct or non-direct. Non-direct buffers are allocated on the Java heap, whilst direct buffers are allocated on the native heap. The word direct indicates that you are handling memory directly, hence direct buffers are not collected by the GC and must be manually freed. Since direct buffers handle memory directly, they store the address of where they have been allocated in a field on the object. We have updated this field to use MemoryAddress.
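The difference between the two kinds of buffer is visible through the standard java.nio API. The sketch below uses only ByteBuffer methods that exist in the regular JDK; the class name BufferKindsSketch is made up for the example, and nothing in it touches MemoryAddress.

```java
import java.nio.ByteBuffer;

final class BufferKindsSketch {
    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);          // non-direct: backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16);  // direct: backed by native memory

        System.out.println(heap.isDirect());    // false
        System.out.println(heap.hasArray());    // true
        System.out.println(direct.isDirect());  // true

        // Both kinds share the same read/write API; only the backing storage differs.
        direct.putInt(0, 42);
        System.out.println(direct.getInt(0));   // 42
    }
}
```

It is the direct case, where the buffer object has to remember where its native storage lives, that needs the address field discussed here.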
Currently the Buffer class carries both the long address and the MemoryAddress memoryAddress fields, to prevent inheritance issues.
public abstract class Buffer {
...
long address;
MemoryAddress memoryAddress;
...
For brevity I will not go into the full details of how we have updated all the buffer code to include MemoryAddress
, as it doesn't affect the API much since it's abstracted away. The majority of the changes can be found in the code related to direct buffers including: Direct-X-Buffer-bin.java.template
and Direct-X-Buffer.java.template
. In particular, most changes are in the constructor for the DirectBuffer either taking the MemoryAddress
for the array location or using the unsafe.allocateMemoryObject
to allocate the array. For all HeapBuffers the memoryAddress
field is set to null
.
We also updated classes in the zip package to use .memoryAddress rather than .address, including CRC32, Adler32, CRC32C, Deflater and Inflater. In the sun.nio package we have updated SocketChannelImpl, NioSocketImpl, DatagramChannelImpl and UnixUserDefinedFileAttributeView. See the table below for more details of where MemoryAddress has propagated to.
Native buffers
The NativeBuffer.java class has been updated to hold a MemoryAddress instead of a long. This has not changed the Java API itself, as the NativeBuffer constructor takes the size to allocate and then calls unsafe.allocateMemoryObject. It does mean, however, that MemoryAddress has had to be propagated out to other classes; see the table below for details.
All the files updated to use MemoryAddress
At the time of writing, this is a list of all the Java files that now use MemoryAddress, with their Java or native counterparts. The list does not include any files for the JVMCI.
Java class | Native or Java counterparts |
---|---|
share/classes/java/lang/MemoryAddress.java | share/native/libjava/MemoryAddress.c |
bsd/classes/sun/nio/ch/KQueueSelectorImpl.java | bsd/classes/sun/nio/ch/KQueue.java |
bsd/classes/sun/nio/ch/KQueue.java | bsd/native/libnio/ch/KQueue.c |
bsd/classes/sun/nio/ch/KQueuePort.java | bsd/classes/sun/nio/ch/KQueue.java |
bsd/classes/sun/nio/fs/BsdNativeDispatcher.java | bsd/native/libnio/fs/BsdNativeDispatcher.c |
bsd/classes/sun/nio/fs/BsdFileSystem.java | bsd/classes/sun/nio/fs/BsdNativeDispatcher.java |
share/classes/sun/nio/ch/DatagramChannelImpl.java | unix/native/libnio/ch/DatagramChannelImpl.c |
share/classes/sun/nio/ch/FileChannelImpl.java | unix/native/libnio/ch/FileChannelImpl.c |
unix/classes/sun/nio/ch/DatagramDispatcher.java | unix/native/libnio/ch/DatagramDispatcher.c |
unix/classes/sun/nio/ch/SocketDispatcher.java | unix/native/libnio/ch/SocketDispatcher.c |
unix/classes/sun/nio/ch/FileDispatcherImpl.java | unix/native/libnio/ch/FileDispatcherImpl.c |
unix/classes/sun/nio/ch/NativeThread.java | unix/native/libnio/ch/NativeThread.c |
share/classes/sun/nio/ch/SocketChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
share/classes/sun/nio/ch/ServerSocketChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
share/classes/sun/nio/ch/NativeThreadSet.java | unix/classes/sun/nio/ch/NativeThread.java |
share/classes/sun/nio/ch/NioSocketImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
unix/classes/sun/nio/ch/SourceChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
unix/classes/sun/nio/ch/SinkChannelImpl.java | unix/classes/sun/nio/ch/NativeThread.java |
share/classes/java/nio/MappedMemoryUtils.java | unix/native/libnio/MappedMemoryUtils.c |
unix/classes/sun/nio/fs/UnixNativeDispatcher.java | unix/native/libnio/fs/UnixNativeDispatcher.c |
unix/classes/sun/nio/fs/UnixCopyFile.java | unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
unix/classes/sun/nio/fs/UnixUserDefinedFileAttributeView.java | sun.nio.ch.DirectBuffer, NativeBuffer, unix/classes/sun/nio/fs/UnixNativeDispatcher.java, share/classes/jdk/internal/misc/Unsafe.java |
unix/classes/sun/nio/fs/UnixDirectoryStream.java | unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
unix/classes/sun/nio/fs/UnixFileSystemProvider.java | unix/classes/sun/nio/fs/UnixNativeDispatcher.java, unix/classes/sun/nio/fs/UnixDirectoryStream.java |
unix/classes/sun/nio/fs/UnixSecureDirectoryStream.java | unix/classes/sun/nio/fs/UnixDirectoryStream.java |
linux/classes/sun/nio/fs/LinuxNativeDispatcher.java | linux/native/libnio/fs/LinuxNativeDispatcher.c |
linux/native/libnio/fs/LinuxWatchService.c | share/classes/jdk/internal/misc/Unsafe.java, NativeBuffer, unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
linux/classes/sun/nio/fs/LinuxFileSystem.java | linux/classes/sun/nio/fs/LinuxNativeDispatcher.java |
linux/classes/sun/nio/fs/LinuxWatchService.java | share/classes/jdk/internal/misc/Unsafe.java, unix/classes/sun/nio/fs/UnixNativeDispatcher.java |
share/classes/jdk/internal/loader/NativeLibrary.java | N/A, interface |
share/classes/jdk/internal/loader/NativeLibraries.java | share/classes/jdk/internal/loader/NativeLibrary.java, share/native/libjava/NativeLibraries.c |
share/classes/java/lang/ClassLoader.java | share/classes/jdk/internal/loader/NativeLibraries.java |
share/classes/java/lang/System.java | share/classes/java/lang/ClassLoader.java |
share/classes/java/util/zip/Deflater.java | share/native/libzip/Deflater.c, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
share/classes/java/util/zip/Inflater.java | share/native/libzip/Inflater.c, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
share/classes/java/util/zip/CRC32.java | share/native/libzip/CRC32.c, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
share/classes/java/util/zip/CRC32C.java | share/classes/jdk/internal/misc/Unsafe.java |
share/classes/java/util/zip/Adler32.java | share/native/libzip/Adler32.c |
share/classes/sun/nio/ch/NativeObject.java | share/classes/jdk/internal/misc/Unsafe.java |
share/classes/sun/nio/ch/IOVecWrapper.java | share/classes/sun/nio/ch/NativeObject.java, share/classes/jdk/internal/misc/Unsafe.java |
share/classes/sun/nio/ch/NativeSocketAddress.java | share/classes/jdk/internal/misc/Unsafe.java |
share/classes/sun/nio/ch/Util.java | share/classes/jdk/internal/misc/Unsafe.java, share/classes/java/nio/Direct-X-Buffer-bin.java.template |
share/classes/sun/nio/ch/NativeDispatcher.java | |
share/classes/jdk/internal/access/JavaNioAccess.java | N/A, interface |
share/classes/sun/nio/ch/IOUtil.java | share/classes/jdk/internal/access/JavaNioAccess.java, share/classes/sun/nio/ch/NativeDispatcher.java, share/classes/sun/nio/ch/IOVecWrapper.java |
share/classes/sun/nio/ch/DirectBuffer.java | share/classes/java/lang/MemoryAddress.java |
share/classes/sun/nio/fs/NativeBuffer.java | share/classes/jdk/internal/misc/Unsafe.java |
share/classes/jdk/internal/misc/Unsafe.java | share/prims/unsafe.cpp |
share/classes/java/nio/Buffer.java | share/classes/java/nio/MappedMemoryUtils.java |
share/classes/java/nio/X-Buffer.java.template | share/classes/java/nio/Buffer.java |
share/classes/java/nio/Direct-X-Buffer-bin.java.template | share/classes/jdk/internal/misc/Unsafe.java, share/classes/sun/nio/ch/DirectBuffer.java, share/classes/java/nio/X-Buffer.java.template |
share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template | share/classes/java/nio/Direct-X-Buffer.java.template, share/classes/java/nio/X-Buffer.java.template, share/classes/java/nio/Direct-X-Buffer-bin.java.template, share/classes/java/nio/Buffer.java |
share/classes/java/nio/MappedByteBuffer.java | share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template, share/classes/jdk/internal/access/foreign/UnmapperProxy.java |
share/classes/jdk/internal/access/JavaLangAccess.java | N/A, interface |
share/classes/jdk/internal/access/foreign/UnmapperProxy.java | N/A, interface |
Helper functions added to jvm.cpp
Finally, we have also added JVM_NewMemoryAddress
and JVM_GetMemoryAddress
helper functions in jvm.cpp
which wrap methods in javaClasses.cpp
. These methods are mostly used in the .c
files shown in the table above.
JVM_ENTRY(jobject, JVM_NewMemoryAddress(JNIEnv *env, void *ptr))
  Handle addr = java_lang_MemoryAddress::create(ptr, CHECK_NULL);
  assert(java_lang_MemoryAddress::get_ptr(addr()) == ptr, "checking that stored pointer == target pointer");
  return JNIHandles::make_local(THREAD, addr());
JVM_END
JVM_ENTRY(void*, JVM_GetMemoryAddress(JNIEnv *env, jobject obj))
  return java_lang_MemoryAddress::get_ptr(obj, CHECK_NULL);
JVM_END