Python: how to extract a duration and offset from a NumPy array representing audio

Posted on February 22

I am currently running a script in which I take whole audio files and save them in Python using the audiofile library (which in turn uses the soundfile library).

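I am trying to mimic the behavior of audiofile.read(), where I give it an offset and a duration (in seconds) and it returns only the NumPy array for that particular interval of the sound. The only difference here is that, instead of taking .wav files as the library requires, I already have the whole audio file as a NumPy array and need to extract the correct start-to-end interval from it.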

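I've tried copying the logic that calculates the start and end and then slicing the NumPy array with sound_file[start:end], but this doesn't seem to work. I'm not very familiar with how signal processing handles audio files, so I'm a bit confused here; any help would be greatly appreciated!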

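I would like it to accept a NumPy array and return the same array, sliced to include only the specified start and duration. All the files I load were originally 96 kHz, resampled to 16 kHz, and saved as NumPy arrays.

Here is my code: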


import numpy as np
import audmath
from audiofile.core.utils import duration_in_seconds

def read_from_np(
    file,
    duration,
    offset,
    sampling_rate = 16000
):

    if duration is not None:
        duration = duration_in_seconds(duration, sampling_rate)
        if np.isnan(duration):
            duration = None
    if offset is not None and offset != 0:
        offset = duration_in_seconds(offset, sampling_rate)
        if np.isnan(offset):
            offset = None

    # Support for negative offset/duration values
    # by counting them from end of signal
    #
    if offset is not None and offset < 0 or duration is not None and duration < 0:
        # `file` is an in-memory array here, not a path, so the signal
        # duration is derived from the number of samples on the last axis
        signal_duration = file.shape[-1] / sampling_rate
    # offset | duration
    # None   | < 0
    if offset is None and duration is not None and duration < 0:
        offset = max([0, signal_duration + duration])
        duration = None
    # None   | >= 0
    if offset is None and duration is not None and duration >= 0:
        if np.isinf(duration):
            duration = None
    # >= 0   | < 0
    elif offset is not None and offset >= 0 and duration is not None and duration < 0:
        if np.isinf(offset) and np.isinf(duration):
            offset = 0
            duration = None
        elif np.isinf(offset):
            duration = 0
        else:
            if np.isinf(duration):
                offset = min([offset, signal_duration])
                duration = np.sign(duration) * signal_duration
            orig_offset = offset
            offset = max([0, offset + duration])
            duration = min([-duration, orig_offset])
    # >= 0   | >= 0
    elif offset is not None and offset >= 0 and duration is not None and duration >= 0:
        if np.isinf(offset):
            duration = 0
        elif np.isinf(duration):
            duration = None
    # < 0    | None
    elif offset is not None and offset < 0 and duration is None:
        offset = max([0, signal_duration + offset])
    # >= 0    | None
    elif offset is not None and offset >= 0 and duration is None:
        if np.isinf(offset):
            duration = 0
    # < 0    | > 0
    elif offset is not None and offset < 0 and duration is not None and duration > 0:
        if np.isinf(offset) and np.isinf(duration):
            offset = 0
            duration = None
        elif np.isinf(offset):
            duration = 0
        elif np.isinf(duration):
            duration = None
        else:
            offset = signal_duration + offset
            if offset < 0:
                duration = max([0, duration + offset])
            else:
                duration = min([duration, signal_duration - offset])
            offset = max([0, offset])
    # < 0    | < 0
    elif offset is not None and offset < 0 and duration is not None and duration < 0:
        if np.isinf(offset):
            duration = 0
        elif np.isinf(duration):
            duration = -signal_duration
        else:
            orig_offset = offset
            offset = max([0, signal_duration + offset + duration])
            duration = min([-duration, signal_duration + orig_offset])
            duration = max([0, duration])

    # Convert to samples
    #
    # Handle duration first
    # and returned immediately
    # if duration == 0
    if duration is not None and duration != 0:
        duration = audmath.samples(duration, sampling_rate)
    if duration == 0:
        # Return an empty (1, 0) array so the shape convention matches
        # the normal return value at the end of the function
        return np.zeros((1, 0), dtype=file.dtype)
    if offset is not None and offset != 0:
        offset = audmath.samples(offset, sampling_rate)
    else:
        offset = 0


    start = offset
    # duration == 0 is handled further above with immediate return;
    # stop=None slices through to the end of the signal
    stop = start + duration if duration is not None else None

    return np.expand_dims(file[0, start:stop], 0)
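
For an array that is already in memory, most of the file-oriented logic above may not be needed: an offset or duration in seconds becomes a sample index once it is multiplied by the sampling rate of the array (16 kHz after resampling here, not the original 96 kHz). The following is a minimal sketch of that idea, not audiofile's own implementation. The function name `slice_signal` is hypothetical, it assumes the samples lie on the last axis, and it simplifies the edge cases: a negative offset counts from the end of the signal, negative durations are not handled, and indices are clamped to the signal bounds.

```python
import numpy as np


def slice_signal(signal, offset=0.0, duration=None, sampling_rate=16000):
    """Return the part of `signal` that starts at `offset` seconds
    and lasts `duration` seconds.

    Hypothetical helper: `signal` is assumed to have its samples on the
    last axis, e.g. shape (samples,) or (channels, samples).
    """
    num_samples = signal.shape[-1]

    # Seconds -> sample index; a negative offset counts from the end
    start = int(round(offset * sampling_rate))
    if start < 0:
        start += num_samples
    # Clamp so that out-of-range offsets give an empty or shortened result
    start = min(max(start, 0), num_samples)

    if duration is None:
        # No duration given: read through to the end of the signal
        stop = num_samples
    else:
        stop = start + int(round(duration * sampling_rate))
        stop = min(max(stop, start), num_samples)

    # The ellipsis keeps any leading (channel) axes untouched
    return signal[..., start:stop]


# Usage with a dummy 3-second mono signal of shape (1, 48000)
sr = 16000
audio = np.arange(3 * sr, dtype=np.float32)[np.newaxis, :]
segment = slice_signal(audio, offset=0.5, duration=1.0, sampling_rate=sr)
print(segment.shape)  # (1, 16000)
```

If a plain `sound_file[start:end]` slice returns something unexpected, one thing worth checking is the array's shape: for a 2-D array of shape (channels, samples), slicing the first axis selects channels rather than samples, which is why the sketch slices the last axis.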
