I am currently running a script in which I take whole audio files and save them in Python using the audiofile library (which in turn uses the soundfile library).

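I am trying to mimic the behavior of audiofile.read(), where I give it an offset and a duration (in seconds) and it returns only the NumPy array for that particular interval of sound. The only difference here is that instead of reading .wav files as the library expects, I already have the whole audio file as a NumPy array and need to extract the correct start and end interval from it.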

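I've tried copying the logic that calculates the start and end, and then slicing the NumPy array with sound_file[start:end], but this doesn't seem to work. I'm not very familiar with how signal processing handles audio files, so I'm a bit confused here. Any help would be greatly appreciated!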

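I want it to accept a NumPy array and return the same NumPy array, sliced to include only the specified start and duration. All the files I load were originally 96 kHz, resampled to 16 kHz, and saved as NumPy arrays.

Here is my code: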


import numpy as np  # np is used throughout the function below
from audiofile.core.utils import duration_in_seconds
import audmath

def read_from_np(
    file,
    duration,
    offset,
    sampling_rate = 16000
):

    if duration is not None:
        duration = duration_in_seconds(duration, sampling_rate)
        if np.isnan(duration):
            duration = None
    if offset is not None and offset != 0:
        offset = duration_in_seconds(offset, sampling_rate)
        if np.isnan(offset):
            offset = None

    # Support for negative offset/duration values
    # by counting them from end of signal
    #
    if offset is not None and offset < 0 or duration is not None and duration < 0:
        # Import duration here to avoid circular imports
        from audiofile.core.info import duration as get_duration

        signal_duration = get_duration(file)
    # offset | duration
    # None   | < 0
    if offset is None and duration is not None and duration < 0:
        offset = max([0, signal_duration + duration])
        duration = None
    # None   | >= 0
    if offset is None and duration is not None and duration >= 0:
        if np.isinf(duration):
            duration = None
    # >= 0   | < 0
    elif offset is not None and offset >= 0 and duration is not None and duration < 0:
        if np.isinf(offset) and np.isinf(duration):
            offset = 0
            duration = None
        elif np.isinf(offset):
            duration = 0
        else:
            if np.isinf(duration):
                offset = min([offset, signal_duration])
                duration = np.sign(duration) * signal_duration
            orig_offset = offset
            offset = max([0, offset + duration])
            duration = min([-duration, orig_offset])
    # >= 0   | >= 0
    elif offset is not None and offset >= 0 and duration is not None and duration >= 0:
        if np.isinf(offset):
            duration = 0
        elif np.isinf(duration):
            duration = None
    # < 0    | None
    elif offset is not None and offset < 0 and duration is None:
        offset = max([0, signal_duration + offset])
    # >= 0    | None
    elif offset is not None and offset >= 0 and duration is None:
        if np.isinf(offset):
            duration = 0
    # < 0    | > 0
    elif offset is not None and offset < 0 and duration is not None and duration > 0:
        if np.isinf(offset) and np.isinf(duration):
            offset = 0
            duration = None
        elif np.isinf(offset):
            duration = 0
        elif np.isinf(duration):
            duration = None
        else:
            offset = signal_duration + offset
            if offset < 0:
                duration = max([0, duration + offset])
            else:
                duration = min([duration, signal_duration - offset])
            offset = max([0, offset])
    # < 0    | < 0
    elif offset is not None and offset < 0 and duration is not None and duration < 0:
        if np.isinf(offset):
            duration = 0
        elif np.isinf(duration):
            duration = -signal_duration
        else:
            orig_offset = offset
            offset = max([0, signal_duration + offset + duration])
            duration = min([-duration, signal_duration + orig_offset])
            duration = max([0, duration])

    # Convert to samples
    #
    # Handle duration first
    # and returned immediately
    # if duration == 0
    if duration is not None and duration != 0:
        duration = audmath.samples(duration, sampling_rate)
    if duration == 0:
        from audiofile.core.info import channels as get_channels

        channels = get_channels(file)
        if channels > 1 or always_2d:
            signal = np.zeros((channels, 0))
        else:
            signal = np.zeros((0,))
        return signal, sampling_rate
    if offset is not None and offset != 0:
        offset = audmath.samples(offset, sampling_rate)
    else:
        offset = 0


    start = offset
    # duration == 0 is handled further above with immediate return;
    # stop stays None (slice to the end) when no duration is given
    stop = None
    if duration is not None:
        stop = duration + start

    return np.expand_dims(file[0, start:stop], 0)

Recommended answer

Your code boils down to

    return np.expand_dims(file[0, start:stop], 0)

which is correct.

So if you are unhappy with the result, it is because you computed the wrong (start, stop) pair, which is to say, the wrong (offset, duration) pair.

The sampling rate is apparently fixed at exactly 16_000 samples per second. The number of channels can be 1 or 2, which seems worrying.
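With a fixed rate, turning seconds into sample indices is plain arithmetic. The following is a minimal sketch, assuming the 16 kHz rate mentioned in the question; the name `seconds_to_samples` is hypothetical and not part of audiofile or audmath.

```python
SAMPLING_RATE = 16_000  # samples per second, as stated in the question


def seconds_to_samples(seconds: float, sampling_rate: int = SAMPLING_RATE) -> int:
    """Convert a time in seconds to a whole number of samples."""
    return int(round(seconds * sampling_rate))


# An interval starting at 1.5 s and lasting 0.25 s:
start = seconds_to_samples(1.5)          # 24000
stop = start + seconds_to_samples(0.25)  # 24000 + 4000 = 28000
```

If the array was not actually resampled to the rate used here, every index computed this way will point at the wrong place, so that is worth checking first.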

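There is a great deal of optional behavior associated with the offset / duration parameters. Throw it away. Focus on writing a simple helper that accepts an offset that is always a non-negative integer and a duration that is always a positive, finite integer. Not NaN. Use assert / raise, so that None or a negative number blows up with a fatal error.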
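One way such a strict helper might look is sketched below. It assumes the signal is stored as a 2-D array of shape (channels, samples) and that offset and duration are already expressed in samples; the name `slice_samples` and the choice to reject intervals that run past the end are illustrative assumptions, not something audiofile provides.

```python
import numpy as np


def slice_samples(signal: np.ndarray, offset: int, duration: int) -> np.ndarray:
    """Return `duration` samples of `signal` starting at sample `offset`.

    `signal` is assumed to have shape (channels, samples).
    """
    if signal.ndim != 2:
        raise ValueError("expected shape (channels, samples)")
    # None, NaN and other floats fail the isinstance checks and raise.
    if not isinstance(offset, (int, np.integer)) or offset < 0:
        raise ValueError(f"offset must be a non-negative integer, got {offset!r}")
    if not isinstance(duration, (int, np.integer)) or duration <= 0:
        raise ValueError(f"duration must be a positive integer, got {duration!r}")
    if offset + duration > signal.shape[1]:
        raise ValueError("requested interval runs past the end of the signal")
    return signal[:, offset:offset + duration]


signal = np.arange(10).reshape(1, 10)   # one channel, ten samples
print(slice_samples(signal, 2, 3))      # [[2 3 4]]
```

Because bad input fails loudly, a wrong result can only come from the values passed in, which makes the (offset, duration) computation easy to debug in isolation.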

Next, focus on always having the same number of channels.
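A small normalizing step can take care of this before any slicing happens. The sketch below assumes mono signals may arrive as flat 1-D arrays and converts everything to the (channels, samples) layout that the question's code uses; `as_channels_first` is a hypothetical name.

```python
import numpy as np


def as_channels_first(signal: np.ndarray) -> np.ndarray:
    """Return `signal` as a 2-D array of shape (channels, samples)."""
    signal = np.asarray(signal)
    if signal.ndim == 1:
        # Mono stored as a flat array: add a leading channel axis.
        return signal[np.newaxis, :]
    if signal.ndim == 2:
        return signal
    raise ValueError(f"expected a 1-D or 2-D array, got {signal.ndim} dimensions")


mono = as_channels_first(np.zeros(16_000))           # shape (1, 16000)
stereo = as_channels_first(np.zeros((2, 16_000)))    # shape (2, 16000)
```

With every array in the same layout, slicing along the last axis behaves identically for mono and multi-channel input.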

At that point, it's not hard to get this right.
