在 Python 中搜索整数中的位序列

发布于03月30日

我有两个整数，让我们称它们为haystack和needle.我需要判断一下，如果needle的二进制表示出现在haystack[OPTIONALLY找到第一个出现的位置]

示例

haystack = 0b10101111010110010101010101110
needle = 0b1011001 # occurred in position 13
needle = 0b111011 # not occurred

我在寻找最低可能的时间复杂度，我不能写一个时间复杂度超过O(h)的代码，其中h是大海捞针中的比特数.您可以在下面看到我的代码.

我需要判断预定义的needle(它永远不会改变，它是一个odd数字)在数十亿个随机的haystack整数中出现(所以我们不能对haystack进行预处理以优化速度)

由于查找位置是可选的，如果您可以编写一段时间复杂度更高的代码，只返回一个指示发生的布尔值，那么它也是完美的.因为在数十亿次判断中，我知道它没有发生，当它发生时，我可以使用以下代码找到位置.

一个有假阳性结果的好的概率算法也是可以的.

def find_needle_in_haystack(haystack, needle):
    n = needle.bit_length()  # calculate the number of bits in needle
    mask = (1 << n) - 1  # create a mask with n bits
    i = 0
    while haystack != 0:
        x = haystack & mask  # bitwise AND with mask to get first n bits
        if x == needle:
            return i
        i += 1
        haystack >>= 1  # shift haystack to the right by 1 bit
    return -1  # needle not found in haystack

# Assuming that needle is an array of bits. def needle_to_lookup (needle): ord2bits = [] for j in range(256): bits = [] k = j for _ in range(8): bits.append(k % 2) k = k // 2 ord2bits.append(tuple(reversed(bits))) lookup = [] for i in range(len(needle) + 1): for bits in ord2bits: # Do we successfully keep matching? matched = i for j in range(8): if i + j == len(needle): matched = i+j break elif needle[i+j] == bits[j]: matched = i+j else: matched = 0 break if 0 == matched: # Failed to extend for a byte. for j in range(8): for k in range(8 - j): if k == len(needle): matched = k break elif needle[k] == bits[j+k]: matched = k+1 else: matched = 0 break if 0 < matched: break lookup.append(matched) return lookup def find_needle(needle, byte_list): lookup = needle_to_lookup(needle) matched = 0 for byte in byte_list: matched = lookup[256*matched + byte] if matched == len(needle): return True return False print(find_needle([1, 0, 1, 1, 0, 0, 1], bytes([175, 89, 85, 184]))) print(find_needle([1, 1, 1, 0, 1, 1], bytes([175, 89, 85, 184])))

在 Python 中搜索整数中的位序列

示例

推荐答案

Python相关问答推荐

在Pandas框架中截短至固定数量的列

使用polars .滤镜进行切片速度比pandas .loc慢

pandas DataFrame GroupBy.diff函数的意外输出

使用FASTCGI在IIS上运行Django频道

未删除映射表的行

如何访问所有文件，例如环境变量

Pandas - groupby字符串字段并按时间范围 Select

从dict的列中分钟

如何使用它？

driver. find_element无法通过class_name找到元素'""

Python列表不会在条件while循环中正确随机化'

如何使regex代码只适用于空的目标单元格

lityter不让我输入左边的方括号，'

用fft计算指数复和代替求和来模拟衍射？

Python：从目录内的文件导入目录

python3中np. divide(x，y)和x/y有什么区别？'

在matplotlib中重叠极 map 以创建径向龙卷风图

递归链表反转与打印语句挂起

Match-Case构造中的对象可调用性测试

Scipy.linprog的可行性有问题吗？(A_ub@x0<；=b_ub).all()为True-但是-linprog(np.zeros_like(X0)，A_ub=A_ub，b_ub=b_ub)不可行