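I am trying to come up with a regular expression pattern that captures the operation in a given assembly code snippet together with the two registers or addresses it operates on. This is what I have so far: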
import re
assembly_code = """
lea r8, [rcx + 8*rax]
movsd xmm0, qword ptr [rcx + 8*rax] ## xmm0 = mem[0],zero
mov rcx, r9
xor edi, edi
.p2align 4, 0x90
LBB0_12: ## Parent Loop BB0_2 Depth=1
## Parent Loop BB0_3 Depth=2
## Parent Loop BB0_4 Depth=3
## Parent Loop BB0_10 Depth=4
## Parent Loop BB0_11 Depth=5
## => This Inner Loop Header: Depth=6
movsd xmm1, qword ptr [r13 + 8*rdi] ## xmm1 = mem[0],zero
mulsd xmm1, qword ptr [rcx]
addsd xmm0, xmm1
movsd qword ptr [r8], xmm0
add rcx, 2048
lea r12, [rsi + rdi]
add r12, 1
add rdi, 1
cmp r12, r14
jl LBB0_12
## %bb.13: ## in Loop: Header=BB0_11 Depth=5
add rax, 1
add r9, 8
cmp rax, rbx
jl LBB0_11
"""
pattern = r"\b(mov|movaps|movups|movaps|movss|movsd|movlps|movhps|movlpd|movhpd|movd|movq)\b\s+(\S+)\s*,\s*(\S+(\s*\[.*?\])?)"
matches = re.findall(pattern, assembly_code)
for match in matches:
    print("Instruction: ", match[0])
    print("Operand 1: ", match[1])
    print("Operand 2: ", match[2])
    print("---")
But the output looks like this:
Instruction: movsd
Operand 1: xmm0
Operand 2: qword
---
Instruction: mov
Operand 1: rcx
Operand 2: r9
---
Instruction: movsd
Operand 1: xmm1
Operand 2: qword
---
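My goal is to capture operands such as qword ptr [r13 + 8*rdi] in their full form, rather than just qword. How can I modify the pattern so that it captures the complete string correctly?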
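
For reference, here is one possible direction, as a minimal sketch rather than a definitive answer. In the pattern above, `\S+` stops at the first whitespace, so the second operand ends at `qword`, and the optional `\s*\[.*?\]` part cannot match because `ptr` sits between `qword` and the bracket. The sketch below describes an operand as either an optional size prefix followed by a bracketed memory reference, or a bare token. The names `OPERAND`, `MNEMONICS`, `PATTERN`, and `parse_moves` are made up for illustration, and the sketch assumes Intel syntax with at most one `[...]` per operand and `#` as the comment character.

```python
import re

# One operand: either an optional size prefix such as "qword ptr" followed by
# a bracketed memory reference, or a bare token (register, immediate, label).
# Assumption: a memory operand contains exactly one [...] with no nesting.
OPERAND = r"(?:(?:\w+\s+ptr\s+)?\[[^\]]*\]|[^\s,#]+)"

# Same mnemonic list as in the question (hypothetical helper name).
MNEMONICS = "mov|movaps|movups|movss|movsd|movlps|movhps|movlpd|movhpd|movd|movq"

# [ \t] instead of \s keeps a match from running across a line break.
PATTERN = re.compile(
    rf"\b({MNEMONICS})\b[ \t]+({OPERAND})[ \t]*,[ \t]*({OPERAND})"
)


def parse_moves(text):
    """Return a list of (instruction, operand1, operand2) tuples."""
    return PATTERN.findall(text)


sample = """
movsd xmm0, qword ptr [rcx + 8*rax] ## xmm0 = mem[0],zero
mov rcx, r9
mulsd xmm1, qword ptr [rcx]
movsd qword ptr [r8], xmm0
"""

for instruction, operand1, operand2 in parse_moves(sample):
    print("Instruction:", instruction)
    print("Operand 1:", operand1)
    print("Operand 2:", operand2)
    print("---")
```

Because the inner groups are non-capturing, `findall` yields exactly three fields per match. The same operand sub-pattern is used on both sides of the comma, so a line such as `movsd qword ptr [r8], xmm0`, which the original pattern skips entirely, is picked up as well.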