Issue
For example, the HotSpot JVM implements null-pointer detection by catching the SIGSEGV signal. So if we manually send a SIGSEGV from outside the process, will that also be recognized as a NullPointerException in some circumstances?
Solution
Will sending kill -11 to a Java process raise a NullPointerException?
It should not: a NullPointerException is a specific exception that occurs when an application tries to use an object reference that has the null value.
Yet, from the Java SE 17 Troubleshooting Guide, "Handle Signals and Exceptions":
The Java HotSpot VM installs signal handlers to implement various features and to handle fatal error conditions.
For example, in an optimization to avoid explicit null checks in cases where java.lang.NullPointerException will be thrown rarely, the SIGSEGV signal is caught and handled, and the NullPointerException is thrown.
In general, there are two categories where signal/traps happen:
When signals are expected and handled, like implicit null-handling. Another example is the safepoint polling mechanism, which protects a page in memory when a safepoint is required. Any thread that accesses that page causes a SIGSEGV, which results in the execution of a stub that brings the thread to a safepoint.
Unexpected signals. That includes a SIGSEGV when executing in VM code, Java Native Interface (JNI) code, or native code. In these cases, the signal is unexpected, so fatal error handling is invoked to create the error log and terminate the process.
That approach allows the JVM to optimize performance by reducing the overhead of explicit null checks in the code, relying instead on the operating system's memory protection mechanisms to detect access to null references. When such access occurs, the operating system generates a SIGSEGV signal, which the JVM then interprets as an attempt to dereference a null pointer, leading to the throwing of a NullPointerException.
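The "expected and handled" category can be illustrated outside the JVM. The following is a minimal C++ sketch, not HotSpot code: it maps a page with no access rights, reads from it, and lets a signal handler turn the resulting fault into an ordinary return value, which stands in for throwing a NullPointerException. All names (on_fault, make_guard_page, guarded_read) are hypothetical, and the sketch assumes a single-threaded POSIX program; SIGBUS is handled as well because some platforms report a protection fault that way.

```cpp
#include <cassert>
#include <csetjmp>
#include <csignal>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// All names below are hypothetical; this is not HotSpot code.
static sigjmp_buf g_recovery_point;                 // where execution resumes after a trapped fault
static volatile sig_atomic_t g_fault_expected = 0;  // set only around the guarded access

static void on_fault(int sig, siginfo_t* /*info*/, void* /*ucontext*/) {
  if (g_fault_expected) {
    g_fault_expected = 0;
    siglongjmp(g_recovery_point, 1);  // "expected" fault: recover instead of crashing
  }
  signal(sig, SIG_DFL);               // "unexpected" fault: restore the default action
  raise(sig);                         // and let it terminate the process
}

// Maps one page with no access rights; returns nullptr on failure.
void* make_guard_page() {
  long page_size = sysconf(_SC_PAGESIZE);
  void* p = mmap(nullptr, static_cast<std::size_t>(page_size), PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Returns true and stores the byte in *out if the read succeeded,
// false if the read faulted and the handler recovered.
bool guarded_read(const volatile char* p, char* out) {
  assert(out != nullptr);
  struct sigaction sa = {};
  struct sigaction old_segv = {};
  struct sigaction old_bus = {};
  sa.sa_sigaction = on_fault;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old_segv);
  sigaction(SIGBUS, &sa, &old_bus);

  bool ok;
  if (sigsetjmp(g_recovery_point, 1) == 0) {  // 1: also save and restore the signal mask
    g_fault_expected = 1;
    *out = *p;                                // may fault
    g_fault_expected = 0;
    ok = true;
  } else {
    ok = false;                               // reached via siglongjmp from the handler
  }

  sigaction(SIGSEGV, &old_segv, nullptr);     // put the previous handlers back
  sigaction(SIGBUS, &old_bus, nullptr);
  return ok;
}
```

Calling guarded_read on the page returned by make_guard_page yields false instead of terminating the process, while reading an ordinary variable yields true. The guide quoted above describes the JVM as resuming in a stub; siglongjmp is used here only to keep the sketch short.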
However, it is important to note that this is an internal mechanism of the JVM and is distinct from externally generated SIGSEGV signals, such as those sent using the kill command. SIGSEGV normally indicates a serious error such as an invalid memory access, so an externally sent one is more likely to result in a JVM crash or core dump than in a NullPointerException.
+---------------------+         +-----------------------------------+
|  External Process   |         |  Java Process running on HotSpot  |
|  sending SIGSEGV    | ------> |  JVM                              |
|  (kill -11)         |         |  Likely JVM Crash or Core Dump    |
+---------------------+         +-----------------------------------+
Is the JVM always capable of detecting that a SIGSEGV is external, or is it possible to confuse an external SIGSEGV for a null access when it happens at a specific time, i.e. when a potential null access is expected?
Again, it should not, but this is an implementation-specific aspect of JVM behavior.
That means the likelihood of such confusion happening in practice may vary depending on the JVM version, the specific code being executed, and the state of the JVM at the time of the signal.
See for instance "How does the JVM know when to throw a NullPointerException":
The JVM could implement the null check using virtual memory hardware. The JVM arranges that page zero in its virtual address space is mapped to a page that is unreadable + unwriteable.
Since null is represented as zero, when Java code tries to dereference null this will try to access a non-addressable page and will lead to the OS delivering a "segfault" signal to the JVM.
The JVM's segfault signal handler could trap this, figure out where the code was executing, and create and throw an NPE on the stack of the appropriate thread.
In that scenario, it should be easy to distinguish a signal trapped during the code's own execution from one delivered from outside the process.
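On POSIX systems, one piece of information that can help with this distinction is the si_code field passed to a handler installed with SA_SIGINFO: a signal sent with kill(2) carries SI_USER, whereas a genuine memory fault carries a code such as SEGV_MAPERR or SEGV_ACCERR. The sketch below is a generic C++ illustration, not an excerpt from HotSpot; the type and function names are made up for the example, and it makes no claim about how any particular JVM version uses this field.

```cpp
#include <cassert>
#include <csignal>
#include <unistd.h>

// Hypothetical names; a generic POSIX illustration, not HotSpot code.
enum class Origin { SentByProcess, MemoryFault, Other };

// Map the si_code of a SIGSEGV to a coarse origin.
Origin classify_segv_origin(int si_code) {
  if (si_code == SI_USER || si_code == SI_QUEUE) {
    return Origin::SentByProcess;   // kill(2) or sigqueue(3)
  }
  if (si_code == SEGV_MAPERR || si_code == SEGV_ACCERR) {
    return Origin::MemoryFault;     // unmapped address or access violation
  }
  return Origin::Other;             // platform-specific codes are not modeled here
}

static volatile sig_atomic_t g_observed_code = 0;
static volatile sig_atomic_t g_handler_ran = 0;

static void record_si_code(int /*sig*/, siginfo_t* info, void* /*ucontext*/) {
  g_observed_code = info->si_code;
  g_handler_ran = 1;
}

// Sends SIGSEGV to the current process, much as `kill -11 <pid>` would from
// a shell, and returns the si_code that the handler observed.
int si_code_of_self_sent_segv() {
  struct sigaction sa = {};
  struct sigaction old = {};
  sa.sa_sigaction = record_si_code;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old);

  g_handler_ran = 0;
  kill(getpid(), SIGSEGV);
  for (int i = 0; i < 1000 && !g_handler_ran; ++i) {
    usleep(1000);                   // wait briefly in case delivery is not immediate
  }

  sigaction(SIGSEGV, &old, nullptr);  // restore the previous handler
  assert(g_handler_ran);
  return g_observed_code;
}
```

Under these assumptions, si_code_of_self_sent_segv() returns SI_USER, which classify_segv_origin maps to Origin::SentByProcess. Because the handler simply returns, the process keeps running; nothing faulted, so there is no instruction to retry.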
Also: "Can a SIGSEGV in Java not crash the JVM?"
There are definitely scenarios where the JVM's SIGSEGV signal handler may turn the SIGSEGV event into a Java exception.
You will only get a JVM hard crash if that cannot happen; e.g. if the thread that triggered the SIGSEGV was executing code in a native library when the event happened.
HotSpot JVM deliberately generates SIGSEGV at startup to check certain CPU features. There is no switch to turn it off. I suggest skipping SIGSEGV in gdb altogether, because JVM uses it for its own purpose in many cases.
What if the thread happens to be executing an instruction that accesses an address when the SIGSEGV is triggered externally?
HotSpot had a major refactoring of its signal handling in JDK-8255711, resulting in commit dd8e4ff.
The current code is in os_linux_x86.cpp, in PosixSignals::pd_hotspot_signal_handler:
// decide if this trap can be handled by a stub
address stub = nullptr;
address pc = nullptr;
//%note os_trap_1
if (info != nullptr && uc != nullptr && thread != nullptr) {
  pc = (address) os::Posix::ucontext_get_pc(uc);
  if (sig == SIGSEGV && info->si_addr == 0 && info->si_code == SI_KERNEL) {
    // An irrecoverable SI_KERNEL SIGSEGV has occurred.
    // It's likely caused by dereferencing an address larger than TASK_SIZE.
    return false;
  }
  // Handle ALL stack overflow variations here
  if (sig == SIGSEGV) {
    address addr = (address) info->si_addr;
    // check if fault address is within thread stack
    if (thread->is_in_full_stack(addr)) {
      // stack overflow
      if (os::Posix::handle_stack_overflow(thread, addr, pc, uc, &stub)) {
        return true; // continue
      }
    }
  }
  if ((sig == SIGSEGV) && VM_Version::is_cpuinfo_segv_addr(pc)) {
    // Verify that OS save/restore AVX registers.
    stub = VM_Version::cpuinfo_cont_addr();
  }
  if (thread->thread_state() == _thread_in_Java) {
    // Java thread running in Java code => find exception handler if any
    // a fault inside compiled code, the interpreter, or a stub
    if (sig == SIGSEGV && SafepointMechanism::is_poll_address((address)info->si_addr)) {
      stub = SharedRuntime::get_poll_stub(pc);
    } else if (sig == SIGBUS /* && info->si_code == BUS_OBJERR */) {
      // BugId 4454115: A read from a MappedByteBuffer can fault
      // here if the underlying file has been truncated.
      // Do not crash the VM in such a case.
      CodeBlob* cb = CodeCache::find_blob(pc);
      CompiledMethod* nm = (cb != nullptr) ? cb->as_compiled_method_or_null() : nullptr;
      bool is_unsafe_arraycopy = thread->doing_unsafe_access() && UnsafeCopyMemory::contains_pc(pc);
      if ((nm != nullptr && nm->has_unsafe_access()) || is_unsafe_arraycopy) {
        address next_pc = Assembler::locate_next_instruction(pc);
        if (is_unsafe_arraycopy) {
          next_pc = UnsafeCopyMemory::page_error_continue_pc(pc);
        }
        stub = SharedRuntime::handle_unsafe_access(thread, next_pc);
      }
    }
    else
#ifdef AMD64
    if (sig == SIGFPE &&
        (info->si_code == FPE_INTDIV || info->si_code == FPE_FLTDIV)) {
      stub =
        SharedRuntime::
        continuation_for_implicit_exception(thread,
                                            pc,
                                            SharedRuntime::
                                            IMPLICIT_DIVIDE_BY_ZERO);
#else
    if (sig == SIGFPE /* && info->si_code == FPE_INTDIV */) {
      // HACK: si_code does not work on linux 2.2.12-20!!!
      int op = pc[0];
      if (op == 0xDB) {
        // FIST
        // TODO: The encoding of D2I in x86_32.ad can cause an exception
        // prior to the fist instruction if there was an invalid operation
        // pending. We want to dismiss that exception. From the win_32
        // side it also seems that if it really was the fist causing
        // the exception that we do the d2i by hand with different
        // rounding. Seems kind of weird.
        // NOTE: that we take the exception at the NEXT floating point instruction.
        assert(pc[0] == 0xDB, "not a FIST opcode");
        assert(pc[1] == 0x14, "not a FIST opcode");
        assert(pc[2] == 0x24, "not a FIST opcode");
        return true;
      } else if (op == 0xF7) {
        // IDIV
        stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_DIVIDE_BY_ZERO);
      } else {
        // TODO: handle more cases if we are using other x86 instructions
        // that can generate SIGFPE signal on linux.
        tty->print_cr("unknown opcode 0x%X with SIGFPE.", op);
        fatal("please update this code.");
      }
#endif // AMD64
    } else if (sig == SIGSEGV &&
               MacroAssembler::uses_implicit_null_check(info->si_addr)) {
      // Determination of interpreter/vtable stub/compiled code null exception
      stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_NULL);
    }
  } else if ((thread->thread_state() == _thread_in_vm ||
              thread->thread_state() == _thread_in_native) &&
             (sig == SIGBUS && /* info->si_code == BUS_OBJERR && */
              thread->doing_unsafe_access())) {
    address next_pc = Assembler::locate_next_instruction(pc);
    if (UnsafeCopyMemory::contains_pc(pc)) {
      next_pc = UnsafeCopyMemory::page_error_continue_pc(pc);
    }
    stub = SharedRuntime::handle_unsafe_access(thread, next_pc);
  }
  // jni_fast_Get<Primitive>Field can trap at certain pc's if a GC kicks in
  // and the heap gets shrunk before the field access.
  if ((sig == SIGSEGV) || (sig == SIGBUS)) {
    address addr = JNI_FastGetField::find_slowcase_pc(pc);
    if (addr != (address)-1) {
      stub = addr;
    }
  }
}
The JVM uses various checks to determine the context of a SIGSEGV signal. However, I do not see a straightforward mechanism to distinguish an externally sent SIGSEGV from one internally generated due to a null reference access.
The signal handler examines the execution context, including the program counter and the stack, to infer the cause of the SIGSEGV. In case of a null reference, it looks for specific patterns that suggest a null pointer exception. But if an external SIGSEGV happens to coincide precisely with a situation where the JVM's execution state resembles that of a null pointer access, distinguishing between the two can be challenging.
However, such a scenario is relatively unlikely due to the level of precision required in timing.
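To make the order of these checks easier to follow, here is a deliberately simplified C++ model of the SIGSEGV branches in the excerpt above. It is a sketch under stated assumptions, not HotSpot's logic: the enum and function names are invented, the polling page address and page size are arbitrary, and uses_implicit_null_check is approximated as "the fault address lies in the first page". Nothing in the model looks at who sent the signal, only at the thread state and the reported address, which is the source of the ambiguity described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model types; none of these names exist in HotSpot.
enum class ThreadState { InJava, InVm, InNative };
enum class Outcome { SafepointStub, ImplicitNullStub, NoStub };

constexpr std::uint64_t kPageSize = 4096;               // assumed page size
constexpr std::uint64_t kPollPage = 0x7f0000001000ULL;  // arbitrary stand-in for the polling page

// Stand-in for SafepointMechanism::is_poll_address.
bool is_poll_address(std::uint64_t addr) {
  return addr >= kPollPage && addr < kPollPage + kPageSize;
}

// Assumed approximation of MacroAssembler::uses_implicit_null_check:
// a small offset from address zero.
bool looks_like_null_access(std::uint64_t addr) {
  return addr < kPageSize;
}

// Mirrors only the SIGSEGV branches taken when the thread is in Java code.
Outcome dispatch_segv(ThreadState state, std::uint64_t fault_addr) {
  if (state == ThreadState::InJava) {
    if (is_poll_address(fault_addr)) {
      return Outcome::SafepointStub;
    }
    if (looks_like_null_access(fault_addr)) {
      return Outcome::ImplicitNullStub;  // would end up as a NullPointerException
    }
  }
  return Outcome::NoStub;                // no stub chosen: treated as unexpected
}

// Small usage example for the model.
void check_model() {
  assert(dispatch_segv(ThreadState::InJava, 0x10) == Outcome::ImplicitNullStub);
  assert(dispatch_segv(ThreadState::InNative, 0x10) == Outcome::NoStub);
  assert(dispatch_segv(ThreadState::InJava, kPollPage + 8) == Outcome::SafepointStub);
}
```

In this model, the same low fault address gives a different outcome depending only on the thread state, so a signal whose reported address happens to look like a null access while the thread is in Java code would take the implicit-null branch.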
Answered By - VonC Answer Checked By - Gilberto Lyons (WPSolving Admin)