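I'm working on an AI problem in which I need to track several human body parts in a video. I built a data loader from images and apply several transforms when instantiating the Dataset class.
Here is a code sample: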
import torch
from torch.utils.data import DataLoader
from torchvision import transforms

transform = transforms.Compose(
    [
        transforms.Resize(img_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]
)
dataset = NamedClassDataset(annotation_folder_path=path, transform=transform, img_size=img_size, normalized=normalize)
train_set, validation_set = torch.utils.data.random_split(dataset, get_train_test_size(dataset, train_percent))
train_loader = DataLoader(dataset=train_set, shuffle=shuffle, batch_size=batch_size, num_workers=num_workers, pin_memory=pin_memory)
validation_loader = DataLoader(dataset=validation_set, shuffle=shuffle, batch_size=batch_size, num_workers=num_workers, pin_memory=pin_memory)
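The problem is: after running the model, I display the images with the predicted points to check their quality. But since the images have been resized and normalized, I cannot recover their original quality and colors. I would like to draw the points on the original image rather than on the transformed one, and I'd like to know what the usual way of doing this is.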
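For what it is worth, the Normalize step by itself loses no information: with mean 0.5 and std 0.5 it maps each channel value x to (x - 0.5) / 0.5, so the colors can be recovered with x * 0.5 + 0.5. A minimal sketch, where `denormalize` is a hypothetical helper name and the scalar mean and std are taken from the transform above. It uses plain arithmetic, so it should apply equally to a float or to an image tensor:

```python
def denormalize(x, mean=0.5, std=0.5):
    """Invert Normalize(mean, std): map values from [-1, 1] back to [0, 1].

    `x` can be a plain float or a tensor; only * and + are used.
    """
    return x * std + mean


# Plain floats standing in for pixel values after Normalize
darkest = denormalize(-1.0)
middle = denormalize(0.0)
brightest = denormalize(1.0)
```

This only restores the colors; the resolution lost in Resize is not recovered this way.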
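I have already thought of two solutions, each with its own drawback:
- Inverting the transforms, but that is not possible once Resize has been applied, because information is lost.
- Returning the index as a third output (along with the image and the label) from the __getitem__ method of NamedClassDataset. But the PyTorch methods seem to expect only two outputs from __getitem__, namely (image, associated label).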
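That two-output limit may not actually apply: as far as I can tell, a map-style dataset is free to return more than two items, and the default DataLoader collation batches each position of the returned tuple separately, so an integer index comes back as a batch of indices. A minimal sketch of a wrapper that appends the index; `WithIndex` is a hypothetical name, and the toy list is only a stand-in for NamedClassDataset:

```python
class WithIndex:
    """Wrap a map-style dataset so each sample also carries its index."""

    def __init__(self, base):
        # `base` is any object with __getitem__ and __len__
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, index):
        img, target = self.base[index]
        # The third item lets the caller look up the original image later
        return img, target, index


# Stand-in for NamedClassDataset: (image, flattened coords) pairs
toy = [("img0", [0.1, 0.2]), ("img1", [0.3, 0.4])]
indexed = WithIndex(toy)
sample = indexed[1]
```

If the wrapper is applied before random_split, the index should still refer to the row in the full annotation table, so the original file can be reopened from it. The training loop would then unpack three values per batch instead of two.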
Edit: here is the __getitem__ of my NamedClassDataset class:
def __getitem__(self, index):
    (img_path, coords) = self.annotations.iloc[index].values
    img = Image.open(img_path).convert("RGB")
    w, h = img.size
    # Normalize by img size
    if self.img_size is not None:
        if self.normalized:
            coords = coords / (w, h)  # Normalized
        else:
            n_h, n_w = self.img_size
            coords = coords / (w, h) * (n_w, n_h)  # Not normalized
    y_coords = torch.flatten(torch.tensor(coords)).float()  # Flatten outputs and convert from double to float32
    if self.transform is not None:
        img = self.transform(img)
    return (img, y_coords)
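Since the coordinates are scaled by known factors in this method, the scaling can be undone on the predictions without touching the image. A minimal sketch that mirrors the three branches above, assuming the flattened output is laid out as x0, y0, x1, y1, ... and that `img_size` is (height, width) as in the code; `to_original_pixels` is a hypothetical helper name:

```python
def to_original_pixels(flat_pred, orig_size, img_size=None, normalized=True):
    """Map flattened predictions back to pixel coordinates of the original image.

    flat_pred: sequence [x0, y0, x1, y1, ...] (assumed layout)
    orig_size: (w, h) of the original image, as given by PIL's Image.size
    img_size:  (n_h, n_w) passed to the dataset, or None if no scaling was done
    """
    w, h = orig_size
    if img_size is None:
        sx, sy = 1.0, 1.0          # coords were never scaled
    elif normalized:
        sx, sy = w, h              # undo coords / (w, h)
    else:
        n_h, n_w = img_size
        sx, sy = w / n_w, h / n_h  # undo coords / (w, h) * (n_w, n_h)
    xs, ys = flat_pred[0::2], flat_pred[1::2]
    return [(x * sx, y * sy) for x, y in zip(xs, ys)]


pts_norm = to_original_pixels([0.5, 0.25], (640, 480), img_size=(128, 256))
pts_pix = to_original_pixels([64, 32], (640, 480), img_size=(128, 256),
                             normalized=False)
```

The original size can be read the same way the method does it, with Image.open(img_path).size, and the returned points can then be drawn on the untransformed image.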