I'm running into a problem with the padding mask in PyTorch's Transformer encoder. I'm trying to make sure that values at padded positions in the input sequence do not affect the model's output. However, even after setting the padded values in the input to zero, I still observe differences in the output.
Here is a simplified version of my code:
import torch as th
from torch import nn
# Data
batch_size = 2
seq_len = 5
input_size = 16
src = th.randn(batch_size, seq_len, input_size)
# Put large values at the positions that will be masked
src[0, 2, :] = 1000.0
src[1, 4, :] = 1000.0
# Build a padding mask (True marks a padded position)
padding_mask = th.zeros(batch_size, seq_len, dtype=th.bool)
padding_mask[0, 2] = True
padding_mask[1, 4] = True
# Build a single-layer Transformer encoder
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(
        d_model=input_size,
        nhead=1,
        batch_first=True,
    ),
    num_layers=1,
    norm=None,
)
# First pass: large values sit at the masked positions
out1000 = encoder(src, src_key_padding_mask=padding_mask)
# Zero out the masked positions so they cannot affect the output
src[0, 2, :] = 0.0
src[1, 4, :] = 0.0
# Pass the modified data through the model
out0 = encoder(src, src_key_padding_mask=padding_mask)
# Compare only the non-padded positions
assert th.allclose(
    out1000[~padding_mask],
    out0[~padding_mask],
    atol=1e-5,
)
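
For context, my mental model of src_key_padding_mask is that masked key positions receive -inf attention logits, so after the softmax they get exactly zero weight and contribute nothing to the outputs at non-padded positions. Here is a tiny standalone check of that semantics (a toy example of my own; scores and key_mask are made-up names, not part of the model above):

import torch as th

# Toy (batch, query, key) attention logits and a key padding mask
scores = th.randn(1, 5, 5)
key_mask = th.tensor([[False, False, True, False, True]])  # True = padded key

# Masked keys get -inf logits and therefore exactly zero softmax weight
scores = scores.masked_fill(key_mask[:, None, :], float("-inf"))
weights = th.softmax(scores, dim=-1)

# Every query assigns zero attention to the padded keys
assert weights[..., key_mask[0]].eq(0).all()

Given that, I would expect the 1000.0 values at the masked positions to be invisible to every non-padded query.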
Even though I set the padded values in the input sequence to zero, I still see differences in the Transformer encoder's output. Can anyone explain why this happens? How can I make sure that the values at padded positions do not affect the model's output?
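
One thing I suspect (an assumption on my part, not something I have confirmed): nn.TransformerEncoderLayer defaults to dropout=0.1, and a freshly constructed module is in training mode, so the two forward passes would draw different dropout masks even for identical inputs. Below is a condensed, self-contained variant of my script that switches the encoder to eval mode before comparing; src_a and src_b are my own names for the two input variants:

import torch as th
from torch import nn

th.manual_seed(0)
padding_mask = th.zeros(2, 5, dtype=th.bool)
padding_mask[0, 2] = True
padding_mask[1, 4] = True

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=16, nhead=1, batch_first=True),
    num_layers=1,
)
encoder.eval()  # disable dropout (TransformerEncoderLayer defaults to dropout=0.1)

src = th.randn(2, 5, 16)
src_a = src.clone()
src_a[padding_mask] = 1000.0  # large values at the masked positions
src_b = src.clone()
src_b[padding_mask] = 0.0     # zeros at the masked positions

with th.no_grad():
    out_a = encoder(src_a, src_key_padding_mask=padding_mask)
    out_b = encoder(src_b, src_key_padding_mask=padding_mask)

# If training-mode dropout was the only source of the mismatch,
# the non-padded positions should now agree.
print(th.allclose(out_a[~padding_mask], out_b[~padding_mask], atol=1e-5))

If dropout is indeed the culprit here, this version should print True; if the outputs still differ, something else must be leaking through the mask.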