I believe I have found a bug in GCC while implementing O'Neill's PCG PRNG. (Initial code on Godbolt's Compiler Explorer)
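After multiplying oldstate by MULTIPLIER (result stored in rdi), GCC does not add that result to INCREMENT. Instead it movabs'es INCREMENT into rdx, which is then used as the return value of rand32_ret.state.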
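For context, here is a minimal sketch of the kind of step function involved. It is not the exact code from the Compiler Explorer link: the function name `rand32`, the struct layout, the field name `value`, and the two constants (the defaults I recall from the PCG reference implementation) are assumptions for illustration. Only `oldstate`, `MULTIPLIER`, `INCREMENT` and the `state` member come from the description above.

```c
#include <stdint.h>

/* Assumed constants: believed to match the PCG reference defaults,
   but any 64-bit multiplier and odd increment would show the same pattern. */
#define MULTIPLIER 6364136223846793005ULL
#define INCREMENT  1442695040888963407ULL

/* Hypothetical return type: a 32-bit output followed by the 64-bit next state. */
struct rand32_ret {
    uint32_t value;
    uint64_t state;
};

/* Hypothetical step function: one PCG-XSH-RR style output plus the LCG state update. */
struct rand32_ret rand32(uint64_t oldstate)
{
    struct rand32_ret ret;

    /* Output permutation: xorshift the high bits, then rotate by the top 5 bits. */
    uint32_t xorshifted = (uint32_t)(((oldstate >> 18u) ^ oldstate) >> 27u);
    uint32_t rot = (uint32_t)(oldstate >> 59u);
    ret.value = (xorshifted >> rot) | (xorshifted << ((-rot) & 31u));

    /* State update: this is the multiply-then-add whose add goes missing. */
    ret.state = oldstate * MULTIPLIER + INCREMENT;
    return ret;
}
```

The unsigned multiply and add wrap modulo 2^64, so the state update itself is well defined for every input.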
A minimal reproducible example (Compiler Explorer):
#include <stdint.h>
struct retstruct {
    uint32_t a;
    uint64_t b;
};
struct retstruct fn(uint64_t input)
{
    struct retstruct ret;
    ret.a = 0;
    ret.b = input * 11111111111 + 111111111111;
    return ret;
}
Generated assembly (GCC 9.2, x86_64, -O3):
fn:
    movabs  rdx, 11111111111     # multiplier constant (doesn't fit in imm32)
    xor     eax, eax             # ret.a = 0
    imul    rdi, rdx
    movabs  rdx, 111111111111    # add constant; one more 1 than multiplier
    # missing  add rdx, rdi      # ret.b=... that we get with clang or older gcc
    ret
# returns RDX:RAX = constant 111111111111 : 0
# independent of input RDI, and not using the imul result it just computed
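The same problem can also be checked at run time instead of by reading the assembly. The following is a sketch under a few assumptions: `reference_b`, `fn_matches_reference` and `self_check` are hypothetical helper names, and `__attribute__((noinline))` is a GCC/Clang extension used here so that the call to `fn` is not folded away at the call site. Note that an input of 0 cannot reveal the bug, because the correct result for 0 is exactly the additive constant.

```c
#include <assert.h>
#include <stdint.h>

struct retstruct {
    uint32_t a;
    uint64_t b;
};

/* Same function as in the example; noinline keeps it as a real call. */
__attribute__((noinline))
struct retstruct fn(uint64_t input)
{
    struct retstruct ret;
    ret.a = 0;
    ret.b = input * 11111111111 + 111111111111;
    return ret;
}

/* Reference: the same arithmetic returned as a plain scalar, no struct involved. */
__attribute__((noinline))
uint64_t reference_b(uint64_t input)
{
    return input * 11111111111 + 111111111111;
}

/* Returns 1 if fn agrees with the scalar reference for this input. */
int fn_matches_reference(uint64_t input)
{
    struct retstruct r = fn(input);
    return r.a == 0 && r.b == reference_b(input);
}

/* Aborts if any sampled nonzero input disagrees with the reference. */
void self_check(void)
{
    assert(fn_matches_reference(1));
    assert(fn_matches_reference(2));
    assert(fn_matches_reference(UINT64_MAX));
}
```

With the assembly shown above, `fn` would return 111111111111 in `.b` for every input, so the check for an input of 1 would already fail.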
Interestingly, modifying the struct to have uint64_t as the first member produces correct code, as does changing both members to be uint64_t.
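x86-64 System V does return structs of up to 16 bytes in RDX:RAX when they are trivially copyable. In this case the second member, .b, is in RDX because .a is a narrower type and the high half of RAX is padding for alignment. (sizeof(retstruct) is 16 either way; we are not using __attribute__((packed)), so it respects alignof(uint64_t) = 8.)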
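The layout claims above can be checked at compile time. This is a small sketch assuming a C11 compiler targeting x86-64 System V; on a target where uint64_t is only 4-byte aligned (32-bit x86, for example) the assertions would fail. `padding_after_a` is a hypothetical helper added only to make the padding visible.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

struct retstruct {
    uint32_t a;
    uint64_t b;
};

/* Two eightbytes in total: the first holds .a plus padding, the second holds .b. */
static_assert(sizeof(struct retstruct) == 16, "struct occupies two eightbytes");
static_assert(offsetof(struct retstruct, a) == 0, ".a starts the first eightbyte (RAX)");
static_assert(offsetof(struct retstruct, b) == 8, ".b is the second eightbyte (RDX)");
static_assert(_Alignof(uint64_t) == 8, "uint64_t is 8-byte aligned on this target");

/* Number of padding bytes between the end of .a and the start of .b. */
size_t padding_after_a(void)
{
    return offsetof(struct retstruct, b) - sizeof(uint32_t);
}
```

On x86-64 System V `padding_after_a()` evaluates to 4, which corresponds to the unused high half of RAX.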
Does this code contain any undefined behaviour that would allow GCC to emit the "incorrect" assembly?
If not, this should get reported on https://gcc.gnu.org/bugzilla/