In case it matters, I am using 64-bit Python 3.11.5 with NumPy 1.26.4 on a Windows 11 Pro desktop.

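To better understand what NumPy does behind the scenes when I request an np.random.Generator object from a given SeedSequence, I decided to try to reconstruct, in pure Python, what happens when a SeedSequence is initialized from a given entropy value.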

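Based on the SeedSequence source code, my understanding of how uint32 overflow works, and the fact that (at least on my machine) np.dtype(np.uint32).itemsize is 4, so that XSHIFT, defined as np.dtype(np.uint32).itemsize * 8 // 2, is 16, I wrote the following code: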

seed = int(input('Please enter a seed: '))
Entropy = seed
Spawn_key = ()
Pool_size = 8
N_children_spawned = 0
Pool = [0 for _ in range(Pool_size)]
Assembled_entropy = []
Ent = Entropy + 0
while Ent > 0:
    Assembled_entropy.append(Ent & 0xffffffff)
    Ent >>= 32
if not Assembled_entropy:
    Assembled_entropy = [0]

hash_const = 0x43b0d7e5
for i in range(Pool_size):
    if i < len(Assembled_entropy):
        Assembled_entropy[i] ^= hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        Assembled_entropy[i] *= hash_const
        Assembled_entropy[i] &= 0xffffffff
        Assembled_entropy[i] ^= Assembled_entropy[i] >> 16
        Pool[i] = Assembled_entropy[i]
    else:
        value = hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        value *= hash_const
        value &= 0xffffffff
        value ^= value >> 16
        Pool[i] = value
for i_src in range(Pool_size):
    for i_dst in range(Pool_size):
        if i_src != i_dst:
            Pool[i_src] ^= hash_const
            hash_const *= 0x931e8875
            hash_const &= 0xffffffff
            Pool[i_src] *= hash_const
            Pool[i_src] &= 0xffffffff
            Pool[i_src] ^= Pool[i_src] >> 16
            x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
            y = (0x4973f715 * Pool[i_src]) & 0xffffffff
            Pool[i_dst] = x - y
            Pool[i_dst] &= 0xffffffff
            Pool[i_dst] ^= Pool[i_dst] >> 16
for i_src in range(Pool_size, len(Assembled_entropy)):
    for i_dst in range(Pool_size):
        Assembled_entropy[i_src] ^= hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        Assembled_entropy[i_src] *= hash_const
        Assembled_entropy[i_src] &= 0xffffffff
        Assembled_entropy[i_src] ^= Assembled_entropy[i_src] >> 16
        x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
        y = (0x4973f715 * Assembled_entropy[i_src]) & 0xffffffff
        Pool[i_dst] = x - y
        Pool[i_dst] &= 0xffffffff
        Pool[i_dst] ^= Pool[i_dst] >> 16
print(Pool)
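
The `& 0xffffffff` masks in the script above stand in for uint32 overflow, because Python integers never wrap on their own. A minimal sketch of that idea follows; the helper names `mul32` and `sub32` are made up for illustration and are not part of NumPy.

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits, like storing into a uint32

# XSHIFT as described above: uint32 itemsize (4 bytes) * 8 bits // 2
XSHIFT = 4 * 8 // 2


def mul32(a, b):
    """Multiply two integers and wrap the product to 32 bits."""
    return (a * b) & MASK32


def sub32(a, b):
    """Subtract and wrap; a negative difference becomes a large positive value."""
    return (a - b) & MASK32


print(XSHIFT)                      # 16
print(hex(mul32(0xFFFFFFFF, 2)))   # 0xfffffffe
print(hex(sub32(0, 1)))            # 0xffffffff
```

Masking after every multiplication and subtraction, as the script does, gives the same low 32 bits as masking once at the end of each expression.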

Below I have copied the shell output from a few test runs.

Please enter a seed: 0
[595626433, 3558985979, 200295889, 3864401631, 3155212474, 198111058, 4047350828, 373757291]
Please enter a seed: 1
[2396653877, 491222160, 2441066534, 3196981647, 1764919720, 3210735412, 1132315803, 1197535761]
Please enter a seed: 123456789
[2161290507, 266876805, 2694113549, 3306969538, 3218948428, 3543586554, 886289367, 3129292100]
Please enter a seed: 123456789123456789
[2628723507, 610487362, 209721652, 1960674985, 3519121735, 1259052354, 2097159984, 3934338599]
Please enter a seed: 123456789123456789123456789123456789
[2988668238, 798946769, 2484899198, 1005350017, 2633831484, 343737596, 1402961265, 3184558744]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789
[431881030, 3789410928, 218849910, 879851040, 1423068736, 85390627, 3721593143, 198649564]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789
[702225118, 2293461530, 514808704, 2115883586, 3179647446, 3197133803, 3807436730, 1822195906]

from numpy.random import SeedSequence
seed = int(input('Please enter a seed: '))
seedseq = SeedSequence(entropy=seed, spawn_key=[], pool_size=8, n_children_spawned=0)
print([int(value) for value in seedseq.pool])

However, feeding the same values to the version above, which calls NumPy's SeedSequence directly, produces very different results:

Please enter a seed: 0
[2043904064, 467759482, 3940449851, 2747621207, 4006820188, 4161973813, 800317807, 2622167125]
Please enter a seed: 1
[476219752, 3923368624, 2653737542, 2876255837, 1861759290, 3300511046, 3253139541, 2224879358]
Please enter a seed: 123456789
[480462800, 1421661229, 2686834002, 3365909768, 3295673516, 1830753151, 1249963727, 3680881655]
Please enter a seed: 123456789123456789
[3112345096, 1618497203, 2864025213, 3262672577, 379697145, 163816190, 1265228116, 2568065655]
Please enter a seed: 123456789123456789123456789123456789
[2197723902, 2868273012, 1547285866, 2772382071, 2016971656, 1130152919, 897020445, 135618137]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789
[3230290517, 251217303, 1180998335, 454107561, 4150025399, 1840013050, 1216833737, 89665521]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789
[902839167, 3446715647, 2106916613, 1578536987, 595141342, 3126308643, 400300642, 3659109886]

What is going on here?



UPDATE: Following @OskarHoffman's answer, I have fixed my code. It is included here in case anyone is interested.

seed = int(input('Please enter a seed: '))
Entropy = seed
Spawn_key = ()
Pool_size = 8
N_children_spawned = 0
Pool = [0 for _ in range(Pool_size)]
Assembled_entropy = []
Ent = Entropy + 0
while Ent > 0:
    Assembled_entropy.append(Ent & 0xffffffff)
    Ent >>= 32
if not Assembled_entropy:
    Assembled_entropy = [0]

hash_const = 0x43b0d7e5
for i in range(Pool_size):
    if i < len(Assembled_entropy):
        temp = Assembled_entropy[i] ^ hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        temp *= hash_const
        temp &= 0xffffffff
        temp ^= temp >> 16
        Pool[i] = temp
    else:
        value = hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        value *= hash_const
        value &= 0xffffffff
        value ^= value >> 16
        Pool[i] = value
for i_src in range(Pool_size):
    for i_dst in range(Pool_size):
        if i_src != i_dst:
            temp = Pool[i_src] ^ hash_const
            hash_const *= 0x931e8875
            hash_const &= 0xffffffff
            temp *= hash_const
            temp &= 0xffffffff
            temp ^= temp >> 16
            x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
            y = (0x4973f715 * temp) & 0xffffffff
            Pool[i_dst] = x - y
            Pool[i_dst] &= 0xffffffff
            Pool[i_dst] ^= Pool[i_dst] >> 16
for i_src in range(Pool_size, len(Assembled_entropy)):
    for i_dst in range(Pool_size):
        temp = Assembled_entropy[i_src] ^ hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        temp *= hash_const
        temp &= 0xffffffff
        temp ^= temp >> 16
        x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
        y = (0x4973f715 * temp) & 0xffffffff
        Pool[i_dst] = x - y
        Pool[i_dst] &= 0xffffffff
        Pool[i_dst] ^= Pool[i_dst] >> 16
print(Pool)

Recommended answer

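The difference is in your second for loop, where you implement the hashmix() function inline. You modify the Pool list at position i_src in order to compute the value of y. The NumPy implementation does not. It only copies the value Pool[i_src] (by passing it as an argument to the hashmix function) and modifies that copy, which is discarded afterwards.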

So, change the for loop to:

for i_src in range(Pool_size):
    for i_dst in range(Pool_size):
        if i_src != i_dst:
            # work with new variable instead of modifying Pool[i_src]
            temp = Pool[i_src] ^ hash_const
            hash_const *= 0x931e8875
            hash_const &= 0xffffffff
            temp *= hash_const
            temp &= 0xffffffff
            temp ^= temp >> 16
            x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
            y = (0x4973f715 * temp) & 0xffffffff
            Pool[i_dst] = x - y
            Pool[i_dst] &= 0xffffffff
            Pool[i_dst] ^= Pool[i_dst] >> 16

With this change I get the same results as the NumPy implementation.
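
One way to make this kind of in-place modification harder to introduce is to move the hashing step into its own function, so the value being hashed is always a local copy. The following is a rough sketch under that assumption, not NumPy's actual code: `hashmix` follows the function name mentioned above, while `mix`, `int_to_words` and `seed_pool` are names chosen here for illustration, and the sketch only covers an integer entropy with an empty spawn key.

```python
MASK32 = 0xFFFFFFFF
INIT_A = 0x43B0D7E5
MULT_A = 0x931E8875
MIX_MULT_L = 0xCA01F9DD
MIX_MULT_R = 0x4973F715
XSHIFT = 16


def hashmix(value, hash_const):
    """Hash a local copy of `value`; only hash_const[0] is updated in place."""
    value ^= hash_const[0]
    hash_const[0] = (hash_const[0] * MULT_A) & MASK32
    value = (value * hash_const[0]) & MASK32
    value ^= value >> XSHIFT
    return value


def mix(x, y):
    """Combine two 32-bit words (the `x - y` step of the loops above)."""
    result = (MIX_MULT_L * x - MIX_MULT_R * y) & MASK32
    result ^= result >> XSHIFT
    return result


def int_to_words(n):
    """Split a non-negative integer into 32-bit words, lowest word first."""
    words = []
    while n > 0:
        words.append(n & MASK32)
        n >>= 32
    return words or [0]


def seed_pool(entropy, pool_size=8):
    """Rebuild the pool for an integer entropy (hypothetical helper)."""
    words = int_to_words(entropy)
    hash_const = [INIT_A]  # one-element list so hashmix can update it

    # Fill the pool, hashing 0 once the entropy words run out.
    pool = []
    for i in range(pool_size):
        word = words[i] if i < len(words) else 0
        pool.append(hashmix(word, hash_const))

    # Mix every pool word into every other one; pool[i_src] is only read.
    for i_src in range(pool_size):
        for i_dst in range(pool_size):
            if i_src != i_dst:
                pool[i_dst] = mix(pool[i_dst], hashmix(pool[i_src], hash_const))

    # Mix in any entropy words beyond the pool size.
    for i_src in range(pool_size, len(words)):
        for i_dst in range(pool_size):
            pool[i_dst] = mix(pool[i_dst], hashmix(words[i_src], hash_const))
    return pool


print(seed_pool(0))
```

Because `hashmix` receives a plain integer, nothing it does can change `pool[i_src]`, which is exactly the behaviour the corrected loop needs. For the seeds tried in the question, the output should match what NumPy's SeedSequence printed.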
