今天早些时候,这对我来说效果很好,但当我重新启动笔记本时,它突然表现得非常奇怪.
ds = ds.map(
lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
num_parallel_calls=tf.data.AUTOTUNE)
,其中read\u npy\u文件如下所示:
def read_npy_file(data):
# 'data' stores the file name of the numpy binary file storing the features of a particular sound file
# as a bytes string.
# decode() is called on the bytes string to decode it from a bytes string to a regular string
# so that it can passed as a parameter into np.load()
data = np.load(data.decode())
return data.astype(np.float32)
如您所见,映射应该创建另一个张量元组,其中第一个张量是numpy数组,第二个张量是标签,未触及.这在早些时候起到了很好的作用,但现在却出现了最奇怪的行为.我在read_npy_file()
函数中放置了print语句,以查看传入的数据是否正确.我希望它传递一个字节字符串,但当我在read_npy_file()
函数中调用print(data)
并从数据集中提取1项以使用ds.take(1)
触发一个映射时,它会生成此输出:
b'./challengeA_data/log_spectrogram/2603ebb3-3cd3-43cc-98ef-0c128c515863.npy'b'./challengeA_data/log_spectrogram/fab6a266-e97a-4935-a0c3-444fc4426fc5.npy'b'./challengeA_data/log_spectrogram/93014682-60a2-45bd-9c9e-7f3c97b83be9.npy'b'./challengeA_data/log_spectrogram/710f2430-5da3-4822-a252-6ad3601b92d9.npy'b'./challengeA_data/log_spectrogram/e757058c-91de-4381-8184-65f001c95647.npy'
b'./challengeA_data/log_spectrogram/38b12689-04ba-422b-a972-5856b05ca868.npy'
b'./challengeA_data/log_spectrogram/7c9ccc04-a2d2-4eec-bafd-0c97b3658c26.npy'b'./challengeA_data/log_spectrogram/c7cc3520-7218-4d07-9f0a-6bd7bb90a551.npy'
b'./challengeA_data/log_spectrogram/21f6060a-9766-4810-bd7c-0437f47ccb98.npy'
我没有修改输出的任何格式.
如果有任何帮助,我将不胜感激.与TFDS合作绝对是一场噩梦,哈哈.
这是完整的代码
def read_npy_file(data):
# 'data' stores the file name of the numpy binary file storing the features of a particular sound file
# as a bytes string.
# decode() is called on the bytes string to decode it from a bytes string to a regular string
# so that it can passed as a parameter into np.load()
print(data)
data = np.load(data.decode())
return data.astype(np.float32)
specgram_ds = tf.data.Dataset.from_tensor_slices((specgram_files, labels))
specgram_ds = specgram_ds.map(
lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
num_parallel_calls=tf.data.AUTOTUNE)
num_files = len(train_df)
num_train = int(0.8 * num_files)
num_val = int(0.1 * num_files)
num_test = int(0.1 * num_files)
specgram_ds = specgram_ds.shuffle(buffer_size=1000)
specgram_train_ds = specgram_ds.take(num_train)
specgram_test_ds = specgram_ds.skip(num_train)
specgram_val_ds = specgram_test_ds.take(num_val)
specgram_test_ds = specgram_test_ds.skip(num_val)
# iterating over one item to trigger the mapping function
for item in specgram_ds.take(1):
pass
谢谢