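I am working on a Siamese network with TensorFlow and Keras. Because the dataset is large, I am also trying to use a generator to load it in batches. I have a custom contrastive loss function for training, but I get an error during gradient computation.

Error: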

Epoch 1/10
y_true shape: (None, 1)
y_pred shape: (None, 1)
loss: Tensor("contrastive_loss/Mean:0", shape=(), dtype=float32)
y_true shape: (None, 1)
y_pred shape: (None, 1)
loss: Tensor("contrastive_loss/Mean:0", shape=(), dtype=float32)
   1/1125 [..............................] - ETA: 7:56:52 - loss: 0.1281 - accuracy: 0.0625
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-17-d373b6e0f1af> in <cell line: 11>()
      9 #val_gen = data_generator(val_pairs, val_labels, batch_size, os.path.join(extract_path, 'train'))
     10 
---> 11 model.fit(
     12     train_gen,
     13     # validation_data=val_gen,

1 frames
/usr/local/lib/python3.10/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     51   try:
     52     ctx.ensure_initialized()
---> 53     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     54                                         inputs, attrs, num_outputs)
     55   except core._NotOkStatusException as e:

InvalidArgumentError: Graph execution error:

Detected at node 'gradient_tape/contrastive_loss/mul_1/BroadcastGradientArgs' defined at (most recent call last):
    File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
      return _run_code(code, main_globals, None,
------
------"Error messages condensed"
------
    File "/usr/local/lib/python3.10/dist-packages/keras/src/optimizers/optimizer.py", line 276, in compute_gradients
      grads = tape.gradient(loss, var_list)
Node: 'gradient_tape/contrastive_loss/mul/BroadcastGradientArgs'
Incompatible shapes: [0,1] vs. [32,1]
     [[{{node gradient_tape/contrastive_loss/mul/BroadcastGradientArgs}}]] [Op:__inference_train_function_11308]

Data generator:

def data_generator(pairs, labels, batch_size, img_dir):
    """
    Generate batches of images and labels for training/validation.

    Parameters:
    - pairs: List of tuples containing left image id, list of candidate right image ids,
             and index of ground truth right image.
    - batch_size: Number of pairs to load in each batch.
    - img_dir: Directory containing the images.

    Yields:
    Batch of images and labels.
    """
    num_samples = len(pairs)

    while True:
        # Shuffle pairs for randomness in each epoch
        # np.random.shuffle(pairs)

        # Create a list of sequence indices
        indices = np.arange(num_samples)

        # Shuffle the indices
        np.random.shuffle(indices)

        # Use the shuffled indices to shuffle the sequences
        pairs = np.array(pairs)
        pairs = pairs[indices]
        pairs = pairs.tolist()

        labels = np.array(labels)
        labels = labels[indices]
        labels = labels.tolist()

        for start_idx in range(0, num_samples, batch_size):
            end_idx = min(start_idx + batch_size, num_samples)
            batch_pairs = pairs[start_idx:end_idx]
            labels = labels[start_idx:end_idx]

            left_images = []
            right_images = []
            # labels = []

            for pair in batch_pairs:
                left_img_id, right_img_id = pair

                # Load left image
                left_img = load_and_preprocess_image(left_img_id, img_dir, 'left')
                left_images.append(left_img)

                # Load right images
                right_img = load_and_preprocess_image(right_img_id, img_dir, 'right')
                right_images.append(right_img)

            # Convert lists to numpy arrays
            left_images = np.array(left_images)
            right_images = np.array(right_images)
            labels = np.array(labels)

            yield [left_images, right_images], labels

Model training code:

batch_size = 32
train_gen = data_generator(train_pairs, train_labels, batch_size, os.path.join(extract_path, 'train'))
val_gen = data_generator(val_pairs, val_labels, batch_size, os.path.join(extract_path, 'train'))

model.compile(optimizer='rmsprop', loss=contrastive_loss)
model.fit(
    train_gen,
    validation_data=val_gen,
    steps_per_epoch=len(train_pairs) // batch_size,
    validation_steps=len(val_pairs) // batch_size,
    epochs=10
)

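I have verified that y_true and y_pred both have shape (None, 1) during the forward pass, as the print output at the top of the error log shows. I am not sure why the error reports a shape of [32,1].

Does anyone know what might cause this, or how to fix it?

I tried printing shapes inside the loss function and carefully checked the shapes produced by my generator function, and they appeared correct. I expected to find that y_true and y_pred had mismatched shapes, but they were the same, which left me puzzled. I searched all over the internet for a solution without success.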

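UPDATE: The problem is fixed. The issue was in the generator: I was reassigning labels inside the batch loop, so from the second iteration onward no labels were returned, which produced the shape mismatch error. The line labels = labels[start_idx:end_idx] should be changed to batch_labels = labels[start_idx:end_idx].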

Recommended answer

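The issue was in the generator: labels was being reassigned inside the batch loop. After the first batch, labels held only the first 32 entries, so the slice taken for the second batch was empty and no labels were returned, which produced the shape mismatch error. The line labels = labels[start_idx:end_idx] should be changed to batch_labels = labels[start_idx:end_idx], and the later np.array(labels) conversion and the yield statement should then use batch_labels as well.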
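
The effect of that one-line change can be seen in a minimal sketch that keeps only the label slicing from the generator. The function names buggy_batches and fixed_batches and the 96-element label list are hypothetical stand-ins for illustration, not part of the original code. With 96 labels and a batch size of 32, the buggy version yields an empty second batch, which appears consistent with the [0,1] vs. [32,1] shapes in the error.

```python
def buggy_batches(labels, batch_size):
    """Mirrors the original loop: the slice is assigned back to `labels`."""
    num_samples = len(labels)
    for start_idx in range(0, num_samples, batch_size):
        end_idx = min(start_idx + batch_size, num_samples)
        # Overwrites the full label list with the current slice, so the
        # next slice is taken from a list that is only batch_size long.
        labels = labels[start_idx:end_idx]
        yield labels


def fixed_batches(labels, batch_size):
    """Same loop, but the slice goes into a separate variable."""
    num_samples = len(labels)
    for start_idx in range(0, num_samples, batch_size):
        end_idx = min(start_idx + batch_size, num_samples)
        batch_labels = labels[start_idx:end_idx]  # full list stays intact
        yield batch_labels


# Hypothetical stand-in for the real labels: 96 dummy values.
all_labels = list(range(96))

buggy_sizes = [len(batch) for batch in buggy_batches(all_labels, 32)]
fixed_sizes = [len(batch) for batch in fixed_batches(all_labels, 32)]

print(buggy_sizes)  # [32, 0, 0]
print(fixed_sizes)  # [32, 32, 32]
```

Pulling two or three batches from a generator and checking the label length on each one, rather than only the first, would likely expose this kind of bug before training starts.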
