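I'm trying to assemble and call code at runtime in an interpreter project written in Rust. I'm using the assembler crate for this. I'd like to have extern "C" fn wrappers for the important runtime functionality, so that the JIT code can call them directly. To keep the generated code as simple as possible, I'd like to hold the interpreter state in global variables.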

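However, the program crashes whenever I try to access and/or modify any global state. A simple println!("hello world") crashes with a general protection fault, apparently when stdio::_print() accesses stdout. A simple example involving a static mut variable also dumps core. Interestingly, while the former crashes under Valgrind with a neat stack trace, the latter runs fine there and even passes the assertions, which suggests the program works as intended. Note that the static variable doesn't need to be mutable; reading it is enough to cause the crash.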

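I know next to nothing about the details of mmap, which the assembler crate uses under the hood, and I couldn't find any clue as to why my approach crashes. Any guidance would be much appreciated.

I've created a Repl.it repl with an MRE.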

use anyhow::Result;
use assembler::*;
use assembler::mnemonic_parameter_types::{registers::*, immediates::*};

const CHUNK_LENGTH: usize = 4096;
const LABEL_COUNT: usize = 64;

static mut X: u32 = 0;

#[no_mangle]
unsafe extern "C" fn foo() {
    // printing here will lead to a coredump,
    // Valgrind will provide more insight (general protection fault)
  
    // println!("hello world");

    // modifying a global variable instead will also dump
    // core but will run without fail in Valgrind
    X += 1
}

fn main() -> Result<()> {
    let mut memory_map = ExecutableAnonymousMemoryMap::new(CHUNK_LENGTH, true, true)?;
    let mut instr_stream = memory_map.instruction_stream(&InstructionStreamHints {
        number_of_labels: LABEL_COUNT,
        ..Default::default()
    });

    let f = instr_stream.nullary_function_pointer::<i64>();
    instr_stream.call_function(foo as unsafe extern "C" fn());
    instr_stream.mov_Register64Bit_Immediate64Bit(Register64Bit::RAX, Immediate64Bit(0x123456789abcdef0));
    instr_stream.ret();
    instr_stream.finish();

    assert_eq!(unsafe { f() }, 0x123456789abcdef0);
    assert_eq!(unsafe { X }, 1);
    Ok(())
}

Here is the Valgrind output for both cases.

Global variable:

==4186== Memcheck, a memory error detector
==4186== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4186== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==4186== Command: target/debug/jit-ffi-fault-mre
==4186== 
==4186== 
==4186== HEAP SUMMARY:
==4186==     in use at exit: 0 bytes in 0 blocks
==4186==   total heap usage: 14 allocs, 14 frees, 199,277 bytes allocated
==4186== 
==4186== All heap blocks were freed -- no leaks are possible
==4186== 
==4186== For lists of detected and suppressed errors, rerun with: -s
==4186== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

println!:

==4341== Memcheck, a memory error detector
==4341== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4341== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==4341== Command: target/debug/jit-ffi-fault-mre
==4341== 
==4341== 
==4341== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==4341==  General Protection Fault
==4341==    at 0x13FB7A: std::io::stdio::_print (stdio.rs:1028)
==4341==    by 0x1174B0: foo (main.rs:15)
==4341==    by 0x4E58004: ???
==4341==    by 0x11A1BC: jit_ffi_fault_mre::main (main.rs:35)
==4341==    by 0x113FEA: core::ops::function::FnOnce::call_once (function.rs:248)
==4341==    by 0x11497D: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:122)
==4341==    by 0x114E70: std::rt::lang_start::{{closure}} (rt.rs:145)
==4341==    by 0x13CA95: call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (function.rs:280)
==4341==    by 0x13CA95: do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panicking.rs:492)
==4341==    by 0x13CA95: try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (panicking.rs:456)
==4341==    by 0x13CA95: catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panic.rs:137)
==4341==    by 0x13CA95: {closure#2} (rt.rs:128)
==4341==    by 0x13CA95: do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (panicking.rs:492)
==4341==    by 0x13CA95: try<isize, std::rt::lang_start_internal::{closure_env#2}> (panicking.rs:456)
==4341==    by 0x13CA95: catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (panic.rs:137)
==4341==    by 0x13CA95: std::rt::lang_start_internal (rt.rs:128)
==4341==    by 0x114E3F: std::rt::lang_start (rt.rs:144)
==4341==    by 0x11A37B: main (in /home/runner/UnsightlyAwfulPhases/jit-ffi-fault-mre/target/debug/jit-ffi-fault-mre)
==4341== 
==4341== HEAP SUMMARY:
==4341==     in use at exit: 85 bytes in 3 blocks
==4341==   total heap usage: 14 allocs, 11 frees, 199,277 bytes allocated
==4341== 
==4341== LEAK SUMMARY:
==4341==    definitely lost: 0 bytes in 0 blocks
==4341==    indirectly lost: 0 bytes in 0 blocks
==4341==      possibly lost: 0 bytes in 0 blocks
==4341==    still reachable: 85 bytes in 3 blocks
==4341==         suppressed: 0 bytes in 0 blocks
==4341== Rerun with --leak-check=full to see details of leaked memory
==4341== 
==4341== For lists of detected and suppressed errors, rerun with: -s
==4341== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
/tmp/nix-shell-4318-0/rc: line 1:  4341 Segmentation fault      (core dumped) valgrind target/debug/jit-ffi-fault-mre

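Recommended answer

I can see two problems with this code.

The first comes from this comment in the documentation of call_function:

In 64-bit mode the 32-bit displacement is sign extended to 64-bits.

Warning: The location of emitted code may be such that if it is more than 2Gb away from common library function calls (eg printf); it may be preferable to use an absolute address indirectly, eg call_Register64Bit or call_Any64BitMemory.

That is a fancy way of saying that the argument of this call is a 32-bit value relative to the current rip.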

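Normally, code that is compiled together ends up close together and a 32-bit displacement is enough. With a Rust-compiled function and a dynamically allocated anonymous mapping, however, there is no guarantee that the two are less than 2 GB apart.

On my system, for example, foo is at address 0x559a8039f5a0 while the anonymous memory is at 0x40000000. That is more than 87657 GB apart!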
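
The range check behind that warning can be sketched in a few lines. This is only an illustration: rel32_displacement is a hypothetical helper, not part of the assembler crate, and it assumes the 5-byte "call rel32" form, whose displacement is measured from the end of the instruction.

```rust
/// Hypothetical helper (not part of the assembler crate): computes the
/// displacement a 5-byte `call rel32` placed at `call_site` would need
/// in order to reach `target`, or None if the target is out of range.
fn rel32_displacement(call_site: u64, target: u64) -> Option<i32> {
    // Assumption: the displacement is relative to the address of the
    // instruction that follows the call, i.e. call_site + 5.
    const CALL_REL32_LEN: u64 = 5;
    let next_instruction = call_site.wrapping_add(CALL_REL32_LEN);

    // Signed distance from the next instruction to the target.
    let delta = target.wrapping_sub(next_instruction) as i64;

    // Only distances that fit in a signed 32-bit value can be encoded.
    if delta >= i64::from(i32::MIN) && delta <= i64::from(i32::MAX) {
        Some(delta as i32)
    } else {
        None
    }
}
```

With the addresses quoted above, rel32_displacement(0x40000000, 0x559a8039f5a0) returns None, so the relative call cannot encode that target.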

The solution is to do what the documentation says and make a 64-bit absolute call through a register, for example rax.

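The other problem is that the x86_64 ABI requires the stack to be aligned to 16 bytes at the point of a call. The call into the generated function, made through the pointer from nullary_function_pointer, pushes only 8 bytes (the return address) onto the stack, so the stack is misaligned by the time the generated code calls foo.

To fix that, the generated function has to realign the stack somehow. If a function has local automatic storage, this is done by reserving a number of bytes that is a multiple of 16 plus 8. Functions that don't use automatic storage, such as yours, usually just push an arbitrary register at the beginning and do the corresponding pop at the end.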
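
The alignment arithmetic can be checked with a small model. This is a sketch under stated assumptions, not code from the assembler crate: the function names are hypothetical, and the stack pointer is modelled as a plain integer.

```rust
/// The x86_64 ABI wants rsp to be a multiple of 16 right before a call.
const CALL_ALIGNMENT: u64 = 16;

/// True if a call instruction may be issued with this stack pointer.
fn is_call_aligned(rsp: u64) -> bool {
    rsp % CALL_ALIGNMENT == 0
}

/// Models any 8-byte push: the return address pushed by a call,
/// or an explicit `push %rax`.
fn rsp_after_8_byte_push(rsp: u64) -> u64 {
    rsp - 8
}

/// Bytes a function with `local_bytes` of automatic storage would
/// reserve: the locals rounded up to a multiple of 16, plus 8 to
/// compensate for the return address already on the stack.
fn frame_reservation(local_bytes: u64) -> u64 {
    ((local_bytes + 15) / 16) * 16 + 8
}
```

Starting from an aligned caller, one push (the return address) leaves the stack misaligned on entry to the generated code, and a second push restores the alignment before foo is called.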

The working code would be:

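// push %rax (the value is irrelevant; the push only re-aligns the stack to 16 bytes)
instr_stream.push_Register64Bit_r64(Register64Bit::RAX);
// movabs $foo, %rax (load the absolute 64-bit address of foo)
instr_stream.mov_Register64Bit_Immediate64Bit(
    Register64Bit::RAX,
    Immediate64Bit(foo as i64)
);
// call *%rax (indirect call, not limited to a 32-bit displacement)
instr_stream.call_Register64Bit(Register64Bit::RAX);
// movabs $0x123456789abcdef0, %rax (the return value)
instr_stream.mov_Register64Bit_Immediate64Bit(Register64Bit::RAX, Immediate64Bit(0x123456789abcdef0));
// pop %rcx (undo the alignment push without clobbering %rax)
instr_stream.pop_Register64Bit_r64(Register64Bit::RCX);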
// ret
instr_stream.ret();
instr_stream.finish();
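
For reference, here is a sketch of the machine code that sequence corresponds to, hand-encoded with the standard x86-64 opcodes. encode_aligned_call is a hypothetical helper written for illustration; it does not use the assembler crate, and I have not compared its output byte for byte with what the crate emits.

```rust
/// Hypothetical helper: hand-encodes the fixed sequence as raw x86-64
/// machine code. `target` is the absolute address to call and `result`
/// is the value left in rax when the generated function returns.
fn encode_aligned_call(target: u64, result: u64) -> Vec<u8> {
    let mut code = Vec::with_capacity(25);

    // push %rax: one byte, re-aligns the stack to 16 bytes.
    code.push(0x50);

    // movabs $target, %rax: REX.W + B8 followed by a 64-bit immediate.
    code.extend_from_slice(&[0x48, 0xB8]);
    code.extend_from_slice(&target.to_le_bytes());

    // call *%rax: FF /2 with ModRM selecting rax.
    code.extend_from_slice(&[0xFF, 0xD0]);

    // movabs $result, %rax: the return value of the generated function.
    code.extend_from_slice(&[0x48, 0xB8]);
    code.extend_from_slice(&result.to_le_bytes());

    // pop %rcx: undoes the alignment push without touching rax.
    code.push(0x59);

    // ret
    code.push(0xC3);

    code
}
```

The two 10-byte movabs instructions are what make the sequence independent of where the anonymous mapping ends up relative to foo.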
