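For example, the HotSpot JVM implements null-pointer detection by catching the SIGSEGV signal. So, if we generate a SIGSEGV manually from outside the process, will it also be recognized as a NullPointerException in some circumstances?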
Does sending kill -11 to a Java process raise a NullPointerException?
It should not: NullPointerException is a specific exception thrown when an application tries to use an object reference that holds the null value.
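However, from the Java SE 17 Troubleshooting Guide, "Handle Signals and Exceptions":
The Java HotSpot VM installs signal handlers to implement various features and to handle fatal error conditions.
For example, in an optimization to avoid explicit null checks in cases where java.lang.NullPointerException will be thrown rarely, the SIGSEGV signal is caught and handled, and the NullPointerException is thrown.
In general, there are two categories where signals/traps happen:
When signals are expected and handled, like implicit null handling. Another example is the safepoint polling mechanism, which protects a page in memory when a safepoint is required. Any thread that accesses that page causes a SIGSEGV, which results in the execution of a stub that brings the thread to a safepoint.
Unexpected signals. This includes a SIGSEGV when executing in VM code, Java Native Interface (JNI) code, or native code. In these cases, the signal is unexpected, so fatal error handling is invoked to create the error log and terminate the process.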
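This approach lets the JVM optimize performance by dropping explicit null checks from the generated code and relying instead on the operating system's memory protection to detect accesses through a null reference. When such an access occurs, the operating system generates a SIGSEGV signal, which the JVM interprets as an attempt to dereference a null pointer, so it throws a NullPointerException.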
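To make the "expected and handled" category concrete, here is a minimal sketch of a process recovering from a SIGSEGV on a page it protected on purpose, in the spirit of the implicit null check and the safepoint polling page. It assumes Linux and POSIX signals; every name in it (`g_guard_page`, `on_segv`, `fault_was_recovered`) is hypothetical and none of this is HotSpot code.

```cpp
#include <csetjmp>
#include <csignal>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical illustration only, not HotSpot code.
static sigjmp_buf g_recovery_point;     // where execution resumes after a handled fault
static char* g_guard_page = nullptr;    // page deliberately mapped with PROT_NONE
static long g_page_size = 0;

static void on_segv(int, siginfo_t* info, void*) {
  char* addr = static_cast<char*>(info->si_addr);
  // Expected fault: the address lies inside our own protected page.
  if (g_guard_page != nullptr && addr >= g_guard_page &&
      addr < g_guard_page + g_page_size) {
    siglongjmp(g_recovery_point, 1);    // plays the role of "jump to a stub"
  }
  _exit(70);                            // unexpected fault: the "fatal error" path
}

// Returns true if touching the protected page raised a SIGSEGV that was recovered.
bool fault_was_recovered() {
  g_page_size = sysconf(_SC_PAGESIZE);
  void* p = mmap(nullptr, static_cast<size_t>(g_page_size), PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  g_guard_page = static_cast<char*>(p);

  struct sigaction sa = {}, old_sa = {};
  sa.sa_sigaction = on_segv;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old_sa);

  volatile bool recovered = false;
  if (sigsetjmp(g_recovery_point, 1) == 0) {
    // This read faults because the page has no access permissions.
    volatile char c = *static_cast<volatile char*>(g_guard_page);
    (void)c;
  } else {
    recovered = true;                   // reached via siglongjmp from the handler
  }

  sigaction(SIGSEGV, &old_sa, nullptr); // restore the previous handler
  munmap(p, static_cast<size_t>(g_page_size));
  g_guard_page = nullptr;
  return recovered;
}
```

The handler only recovers when the fault address falls inside the page it expects; any other SIGSEGV takes the fatal path, which mirrors the two categories in the quote.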
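Note, however, that this is an internal JVM mechanism. It is different from an externally generated SIGSEGV signal, such as one sent with the kill command. An external SIGSEGV is normally used to indicate a serious error, such as an invalid memory access, and is more likely to cause a JVM crash or core dump than a NullPointerException.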
+---------------------+         +-----------------------------------+
| External Process    |         | Java Process running on HotSpot   |
| sending SIGSEGV     | ------> | JVM                               |
| (kill -11)          |         | Likely JVM Crash or Core Dump     |
+---------------------+         +-----------------------------------+
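At the OS level the two cases are not identical: POSIX reports a signal sent with kill() with si_code set to SI_USER, whereas a real memory fault carries a fault code such as SEGV_MAPERR. The sketch below, with hypothetical helper names, sends SIGSEGV to its own process the way `kill -11 <pid>` would and records the si_code the handler sees. It says nothing about what HotSpot does with that field.

```cpp
#include <csignal>
#include <unistd.h>

// Hypothetical illustration only, not HotSpot code.
static volatile sig_atomic_t g_seen = 0;
static volatile sig_atomic_t g_last_si_code = 12345;  // sentinel: nothing recorded yet

static void record_segv(int, siginfo_t* info, void*) {
  g_last_si_code = info->si_code;
  g_seen = 1;
  // Returning is safe here: no instruction faulted, so nothing gets retried.
}

// True when si_code says the signal was sent with kill() or sigqueue()
// rather than raised by the kernel for a memory fault.
// (Linux also has SI_TKILL for tgkill/pthread_kill; omitted for brevity.)
bool sent_by_user(int si_code) {
  return si_code == SI_USER || si_code == SI_QUEUE;
}

// Sends SIGSEGV to this process and returns the si_code the handler observed.
int si_code_of_external_segv() {
  struct sigaction sa = {}, old_sa = {};
  sa.sa_sigaction = record_segv;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old_sa);

  g_seen = 0;
  kill(getpid(), SIGSEGV);
  // Delivery may land on another thread, so wait briefly (bounded).
  for (int i = 0; i < 1000 && !g_seen; ++i) {
    usleep(1000);
  }

  sigaction(SIGSEGV, &old_sa, nullptr);  // restore the previous handler
  return g_last_si_code;
}
```

For a signal sent this way, si_addr does not describe a faulting address, so a handler that only looks at si_addr and the program counter has less to go on than one that also checks si_code.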
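Can the JVM always tell that an external SIGSEGV is external, or could an external SIGSEGV be mistaken for a null access if it arrives at a particular moment (that is, when a potential null access is expected)?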
Again, it should not, but this is an implementation-specific aspect of JVM behavior.
That means the likelihood of such confusion happening in practice may vary depending on the JVM version, the specific code being executed, and the state of the JVM at the time of the signal.
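For example, from "How does the JVM know when to throw a NullPointerException":
The JVM could implement the null check using virtual memory hardware. The JVM arranges that page zero in its virtual address space is mapped to a page that is unreadable and unwriteable.
Since null is represented as zero, when Java code tries to dereference null, this will try to access a non-addressable page and will lead to the OS delivering a "segfault" signal to the JVM.
The JVM's segfault signal handler could trap this, figure out where the code was executing, and create and throw an NPE on the stack of the appropriate thread.
In that case, it should be easy to distinguish a signal trapped during code execution from a signal received from the OS.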
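The "page zero" idea in that quote boils down to an address-range test on the faulting address. A rough sketch, assuming a 4096-byte page and using a hypothetical helper name (the real HotSpot check, MacroAssembler::uses_implicit_null_check, is only approximated here):

```cpp
#include <cstdint>

// Hypothetical simplification: a fault address inside the first page is
// treated as a null dereference. A field read through a null reference
// faults at the field's offset (for example 16), not at exactly 0, which is
// why a whole page is reserved instead of a single address.
bool is_in_null_page(const void* fault_addr, std::uintptr_t page_size = 4096) {
  return reinterpret_cast<std::uintptr_t>(fault_addr) < page_size;
}
```

A handler would call this with the si_addr reported for the fault. A fault address outside that range cannot come from an implicit null check under this simplified model.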
So, from "Can a SIGSEGV in Java not crash the JVM?":
There are definitely scenarios where the JVM's SIGSEGV signal handler may turn the SIGSEGV event into a Java exception.
You will only get a JVM hard crash if that cannot happen; e.g. if the thread that triggered the SIGSEGV was executing code in a native library when the event happened.
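Also:
HotSpot JVM deliberately generates SIGSEGV at startup to check certain CPU features. There is no switch to turn it off. I suggest skipping SIGSEGV altogether in gdb, since the JVM uses it for its own purposes in many cases.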
What happens when a SIGSEGV is triggered externally at the very moment the executing code happens to be accessing some address?
HotSpot went through a major refactoring of its signal handling with JDK-8255711, resulting in commit dd8e4ff.
The current code is os_linux_x86.cpp#PosixSignals::pd_hotspot_signal_handler:
// decide if this trap can be handled by a stub
address stub = nullptr;
address pc = nullptr;
//%note os_trap_1
if (info != nullptr && uc != nullptr && thread != nullptr) {
  pc = (address) os::Posix::ucontext_get_pc(uc);
  if (sig == SIGSEGV && info->si_addr == 0 && info->si_code == SI_KERNEL) {
    // An irrecoverable SI_KERNEL SIGSEGV has occurred.
    // It's likely caused by dereferencing an address larger than TASK_SIZE.
    return false;
  }
  // Handle ALL stack overflow variations here
  if (sig == SIGSEGV) {
    address addr = (address) info->si_addr;
    // check if fault address is within thread stack
    if (thread->is_in_full_stack(addr)) {
      // stack overflow
      if (os::Posix::handle_stack_overflow(thread, addr, pc, uc, &stub)) {
        return true; // continue
      }
    }
  }
  if ((sig == SIGSEGV) && VM_Version::is_cpuinfo_segv_addr(pc)) {
    // Verify that OS save/restore AVX registers.
    stub = VM_Version::cpuinfo_cont_addr();
  }
  if (thread->thread_state() == _thread_in_Java) {
    // Java thread running in Java code => find exception handler if any
    // a fault inside compiled code, the interpreter, or a stub
    if (sig == SIGSEGV && SafepointMechanism::is_poll_address((address)info->si_addr)) {
      stub = SharedRuntime::get_poll_stub(pc);
    } else if (sig == SIGBUS /* && info->si_code == BUS_OBJERR */) {
      // BugId 4454115: A read from a MappedByteBuffer can fault
      // here if the underlying file has been truncated.
      // Do not crash the VM in such a case.
      CodeBlob* cb = CodeCache::find_blob(pc);
      CompiledMethod* nm = (cb != nullptr) ? cb->as_compiled_method_or_null() : nullptr;
      bool is_unsafe_arraycopy = thread->doing_unsafe_access() && UnsafeCopyMemory::contains_pc(pc);
      if ((nm != nullptr && nm->has_unsafe_access()) || is_unsafe_arraycopy) {
        address next_pc = Assembler::locate_next_instruction(pc);
        if (is_unsafe_arraycopy) {
          next_pc = UnsafeCopyMemory::page_error_continue_pc(pc);
        }
        stub = SharedRuntime::handle_unsafe_access(thread, next_pc);
      }
    }
    else
#ifdef AMD64
    if (sig == SIGFPE &&
        (info->si_code == FPE_INTDIV || info->si_code == FPE_FLTDIV)) {
      stub =
        SharedRuntime::
        continuation_for_implicit_exception(thread,
                                            pc,
                                            SharedRuntime::
                                            IMPLICIT_DIVIDE_BY_ZERO);
#else
    if (sig == SIGFPE /* && info->si_code == FPE_INTDIV */) {
      // HACK: si_code does not work on linux 2.2.12-20!!!
      int op = pc[0];
      if (op == 0xDB) {
        // FIST
        // TODO: The encoding of D2I in x86_32.ad can cause an exception
        // prior to the fist instruction if there was an invalid operation
        // pending. We want to dismiss that exception. From the win_32
        // side it also seems that if it really was the fist causing
        // the exception that we do the d2i by hand with different
        // rounding. Seems kind of weird.
        // NOTE: that we take the exception at the NEXT floating point instruction.
        assert(pc[0] == 0xDB, "not a FIST opcode");
        assert(pc[1] == 0x14, "not a FIST opcode");
        assert(pc[2] == 0x24, "not a FIST opcode");
        return true;
      } else if (op == 0xF7) {
        // IDIV
        stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_DIVIDE_BY_ZERO);
      } else {
        // TODO: handle more cases if we are using other x86 instructions
        // that can generate SIGFPE signal on linux.
        tty->print_cr("unknown opcode 0x%X with SIGFPE.", op);
        fatal("please update this code.");
      }
#endif // AMD64
    } else if (sig == SIGSEGV &&
               MacroAssembler::uses_implicit_null_check(info->si_addr)) {
      // Determination of interpreter/vtable stub/compiled code null exception
      stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_NULL);
    }
  } else if ((thread->thread_state() == _thread_in_vm ||
              thread->thread_state() == _thread_in_native) &&
             (sig == SIGBUS && /* info->si_code == BUS_OBJERR && */
              thread->doing_unsafe_access())) {
    address next_pc = Assembler::locate_next_instruction(pc);
    if (UnsafeCopyMemory::contains_pc(pc)) {
      next_pc = UnsafeCopyMemory::page_error_continue_pc(pc);
    }
    stub = SharedRuntime::handle_unsafe_access(thread, next_pc);
  }
  // jni_fast_Get<Primitive>Field can trap at certain pc's if a GC kicks in
  // and the heap gets shrunk before the field access.
  if ((sig == SIGSEGV) || (sig == SIGBUS)) {
    address addr = JNI_FastGetField::find_slowcase_pc(pc);
    if (addr != (address)-1) {
      stub = addr;
    }
  }
}
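The JVM uses a series of checks to determine the context of a SIGSEGV signal. However, I do not see a direct mechanism for distinguishing an externally sent SIGSEGV from one generated internally by a null reference access.
The signal handler inspects the execution context, including the program counter and the stack, to infer the cause of the SIGSEGV. For a null reference, it looks for specific patterns that suggest a null pointer exception. But if an external SIGSEGV happens to coincide with a JVM execution state that resembles a null pointer access, telling the two apart could be challenging.
That said, given the timing precision this would require, such a scenario is relatively unlikely.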
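As a thought experiment only, here is what a more explicit disambiguation could look like if a handler also consulted si_code. This is a hypothetical decision function, not what the HotSpot code quoted above does; the type and function names and the 4096-byte page size are assumptions made for the illustration.

```cpp
#include <csignal>
#include <cstdint>

// Hypothetical illustration only, not HotSpot code.
enum class SegvAction { ThrowNullPointerException, FatalError };

struct FaultInfo {
  int si_code;               // as reported in siginfo_t
  std::uintptr_t fault_addr; // si_addr, meaningful only for real faults
  bool thread_in_java;       // whether the thread was executing Java code
};

SegvAction decide(const FaultInfo& f, std::uintptr_t page_size = 4096) {
  // Sent with kill()/sigqueue(): no memory access actually faulted,
  // so it cannot be an implicit null check.
  if (f.si_code == SI_USER || f.si_code == SI_QUEUE) {
    return SegvAction::FatalError;
  }
  // Real fault in Java code with an address in the first page: null access.
  if (f.thread_in_java && f.fault_addr < page_size) {
    return SegvAction::ThrowNullPointerException;
  }
  // Anything else is unexpected.
  return SegvAction::FatalError;
}
```

Under this model an external signal is never turned into a NullPointerException, whatever the thread happened to be doing when it arrived.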