I'm implementing an application which records and analyzes audio in real time (or at least as close to real time as possible), using the JDK Version 8 Update 201. While performing a test which simulates typical use cases of the application, I noticed that after several hours of recording audio continuously, a sudden delay of somewhere between one and two seconds was introduced. Up until this point there was no noticeable delay. It was only after this critical point of recording for several hours when this delay started to occur.

What I've tried so far

To check if my code for timing the recording of the audio samples is wrong, I commented out everything related to timing. This left me essentially with this update loop which fetches audio samples as soon as they are ready (Note: Kotlin code):

while (!isInterrupted) {
    val audioData = read(sampleSize, false)
    listener.audioFrameCaptured(audioData)
}

This is my read method:

fun read(samples: Int, buffered: Boolean = true): AudioData {
    //Allocate a byte array in which the read audio samples will be stored.
    val bytesToRead = samples * format.frameSize
    val data = ByteArray(bytesToRead)

    //Calculate the maximum amount of bytes to read during each iteration.
    val bufferSize = (line.bufferSize / BUFFER_SIZE_DIVIDEND / format.frameSize).roundToInt() * format.frameSize
    val maxBytesPerCycle = if (buffered) bufferSize else bytesToRead

    //Read the audio data in one or multiple iterations.
    var bytesRead = 0
    while (bytesRead < bytesToRead) {
        bytesRead += (line as TargetDataLine).read(data, bytesRead, min(maxBytesPerCycle, bytesToRead - bytesRead))
    }

    return AudioData(data, format)
}

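However, even without any timing code on my side, the problem persisted. I therefore went on to experiment by running the application with different audio formats, which led to very confusing results (unless stated otherwise, I use a signed 16-bit PCM stereo format, little endian, with a default sample rate of 44100.0 Hz):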

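  1. The critical time that has to pass before the delay appears seems to differ depending on the machine used. On my Windows 10 desktop PC, this time is between 6.5 and 7 hours. On my laptop (also running Windows 10), however, it is between 4 and 5 hours for the same audio format.
  2. The number of audio channels used seems to have an effect. If I change the number of channels from stereo to mono, the time before the delay starts to appear is doubled to somewhere between 13 and 13.5 hours on my desktop.
  3. Reducing the sample size from 16 bits to 8 bits also doubles the time before the delay starts to appear, to somewhere between 13 and 13.5 hours on my desktop.
  4. Changing the byte order from little endian to big endian has no effect.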
  5. Switching from stereomix to a physical microphone has no effect either.
  6. I tried opening the line using different buffer sizes (1024, 2048 and 3072 sample frames) as well as its default buffer size. This also didn't change anything.
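  7. Flushing the TargetDataLine after the delay has started to occur results in all bytes being zero for approximately one to two seconds. After that, I get non-zero values again. However, the delay is still there. If I flush the line before the critical point, I don't get those zero bytes.
  8. Stopping and restarting the TargetDataLine after the delay has appeared doesn't change anything either.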
  9. Closing and reopening the TargetDataLine, however, does get rid of the delay until it reappears after several hours from there on.
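  10. Automatically flushing the TargetDataLine's internal buffer every ten minutes does not help to resolve the issue. Therefore, an overflow of the internal buffer does not seem to be the cause.
  11. Using the parallel garbage collector to avoid application freezes doesn't help either.
  12. The sample rate used seems to matter. If I double the sample rate to 88200 Hz, the delay starts to appear after somewhere between 3 and 3.5 hours of runtime.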
  13. If I let it run under Linux using my "default" audio format, it still runs fine after about 9 hours of runtime.
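For anyone trying to reproduce the experiments above, here is a rough sketch of one way to put a number on the delay: compare how much audio the update loop has received with how much wall-clock time has passed since the line was started. The helper names `bytesToMillis` and `lagMillis` are made up for this illustration and are not part of the original code, and whether the delay described here actually shows up in this number is an assumption that would need to be checked.

```kotlin
import javax.sound.sampled.AudioFormat

// Converts a byte count in the given PCM format to milliseconds of audio.
// Assumes frameSize and frameRate are specified (not AudioSystem.NOT_SPECIFIED).
fun bytesToMillis(bytes: Long, format: AudioFormat): Double =
    bytes / (format.frameSize * format.frameRate.toDouble()) * 1000.0

// Hypothetical lag estimate: wall-clock time elapsed since the line was started,
// minus the duration of audio that has been delivered to the application so far.
fun lagMillis(totalBytesRead: Long, elapsedNanos: Long, format: AudioFormat): Double =
    elapsedNanos / 1_000_000.0 - bytesToMillis(totalBytesRead, format)
```

Inside the update loop one could add the size of each returned array to a running `Long` total and log `lagMillis(total, System.nanoTime() - startNanos, format)` every few minutes. The sound card clock and the system clock usually drift slightly apart, so a slow, steady change is expected; a sudden jump of one to two seconds would be the interesting event.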

Conclusions that I've drawn:

These results let me come to the conclusion that the time for which I can record audio before this issue starts to happen is dependent on the machine on which the application is run and dependent on the byte rate (i.e. frame size and sample rate) of the audio format. This seems to hold true (although I can't completely confirm this as of now) because if I combine the changes made in 2 and 3, I would assume that I can record audio samples for four times as long (which would be somewhere between 26 and 27 hours) as when using my "default" audio format before the delay starts to appear. As I didn't find the time to let the application run for this long yet, I can only tell that it did run fine for about 15 hours before I had to stop it due to time constraints on my side. So, this hypothesis is still to be confirmed or denied.
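To make the byte-rate hypothesis easier to check, the following sketch computes the byte rate of each tested format and the total number of bytes captured after a given recording time. It only restates the figures from the experiments as arithmetic; the function names are made up for this illustration.

```kotlin
import javax.sound.sampled.AudioFormat

// Bytes captured per second: frame size (bytes per frame) times frame rate (frames per second).
fun byteRate(format: AudioFormat): Double = format.frameSize * format.frameRate.toDouble()

// Total bytes captured after recording for the given number of hours.
fun bytesCaptured(format: AudioFormat, hours: Double): Double =
    byteRate(format) * hours * 3600.0

// Prints the captured byte totals for the desktop observations listed above.
fun printObservedTotals() {
    // Signed PCM, little endian; (format, earliest hours, latest hours) as reported for the desktop PC.
    val observations = listOf(
        Triple(AudioFormat(44100f, 16, 2, true, false), 6.5, 7.0),  // default format
        Triple(AudioFormat(44100f, 16, 1, true, false), 13.0, 13.5), // mono
        Triple(AudioFormat(44100f, 8, 2, true, false), 13.0, 13.5),  // 8 bit
        Triple(AudioFormat(88200f, 16, 2, true, false), 3.0, 3.5)    // doubled sample rate
    )
    for ((format, from, to) in observations) {
        println("${byteRate(format)} B/s: ${bytesCaptured(format, from)} to ${bytesCaptured(format, to)} bytes")
    }
}
```

With the desktop figures, every configuration ends up in roughly the same range of about 3.8 to 4.4 billion captured bytes, which is what one would expect if the critical point depends on the total number of bytes rather than on elapsed time. The laptop figures (4 to 5 hours with the default format) give a lower total, so the threshold, if there is one, does not seem to be the same on every machine.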

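According to the result of bullet point 13, it seems that this whole issue only appears when using Windows. Therefore, I think that it might be a bug in the platform-specific parts of the javax.sound.sampled API.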

Even though I think I might have found a way to change when this issue starts to happen, I'm not satisfied with the result. I could periodically close and reopen the line to prevent the problem from appearing at all. However, doing this would result in some arbitrary small amount of time during which I wouldn't be able to capture audio samples. Furthermore, the Javadoc states that some lines can't be reopened at all after being closed. Therefore, this is not a good solution in my case.
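For reference, the close-and-reopen workaround could look roughly like the sketch below. `ReopenPolicy` and `reopen` are hypothetical names, and the byte budget is something you would have to pick well below the point at which the delay shows up on your machine. The sketch requests a new line from `AudioSystem` instead of calling `open()` on the closed instance; whether that sidesteps the reopen limitation from the Javadoc depends on the mixer and is an assumption here.

```kotlin
import javax.sound.sampled.AudioFormat
import javax.sound.sampled.AudioSystem
import javax.sound.sampled.TargetDataLine

// Hypothetical helper: tracks bytes captured since the line was opened
// and reports when the configured budget has been used up.
class ReopenPolicy(private val maxBytesPerSession: Long) {
    private var bytesSinceOpen = 0L

    fun record(bytes: Int) { bytesSinceOpen += bytes }
    fun shouldReopen(): Boolean = bytesSinceOpen >= maxBytesPerSession
    fun reset() { bytesSinceOpen = 0L }
}

// Stops and closes the old line, then opens and starts a new one for the same format.
// Audio arriving between close() and start() is lost.
// Throws LineUnavailableException if no matching line can be opened.
fun reopen(old: TargetDataLine, format: AudioFormat): TargetDataLine {
    old.stop()
    old.close()
    val fresh = AudioSystem.getTargetDataLine(format)
    fresh.open(format)
    fresh.start()
    return fresh
}
```

In the update loop this would mean calling `record(audioData.size)` after each read and, once `shouldReopen()` returns true, swapping the line via `reopen` and calling `reset()`. The gap in the captured audio remains, which is exactly the drawback described above.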

Ideally, this whole issue shouldn't be happening at all. Is there something I am completely missing or am I experiencing limitations of what is possible with the javax.sound.sampled API? How can I get rid of this issue at all?

Edit: At the suggestion of Xtreme Biker and gidds I created a small example application. You can find it inside this GitHub repository.

Recommended answer

I have (fairly) extensive experience with the Java audio interface.

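  1. This is not a JVM version issue: the Java audio system has barely been upgraded since Java 1.3 or 1.5.
  2. The Java audio system is a poor man's wrapper around whatever audio interface API the operating system provides. On Linux it's the PulseAudio library; on Windows it's the DirectShow audio API (if I'm not mistaken about the latter).
  3. Again, the audio system API is a legacy API: some features don't work or aren't implemented, and others behave very strangely because they depend on an outdated design (I can provide examples if needed).
  4. This is not a garbage collection issue. If your definition of "delay" is what I understand it to be (the audio data is delayed by 1-2 seconds, meaning you start hearing things 1-2 seconds late), then the garbage collector cannot cause the target data line to magically capture blank data and then append data as usual at a byte offset equivalent to 2 seconds.
  5. What most likely happens here is that the hardware or driver hands you 2 seconds' worth of garbage data at some point and then streams the rest of the data as usual, resulting in the "delay" you are experiencing.
  6. The fact that it works perfectly on Linux means that it is not a hardware issue but a driver-related one.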
  7. To affirm that suspicion, you can try capturing audio via FFmpeg for the same duration and see if the issue is reproduced.
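  8. If you are using specialized audio capturing hardware, it is better to approach your hardware manufacturer and ask them about the issue you are facing on Windows.
  9. In any case, when writing an audio capturing application from scratch, I strongly suggest staying away from the Java audio system as much as possible. It is fine for a POC, but it is an unmaintained legacy API. JNA is always a viable option (I used it on Linux with ALSA/PulseAudio to control audio hardware properties that the Java audio system could not change), so you could look for Windows audio capture examples in C++ and translate them to Java. That will give you fine-grained control over the audio capture device, far more than what the JVM provides OOTB. If you want to look at a living, breathing JNA example, check out my JNA AAC encoder project.
  10. Again, if you use special capturing hardware, there is a good chance the manufacturer already provides its own low-level C API for interfacing with the hardware, and you should consider taking a look at that as well.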
  11. If that's not the case, maybe you and your company/client should consider using specialized capturing hardware (doesn't have to be that expensive).
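The FFmpeg cross-check from point 7 could be driven from Kotlin roughly as sketched below. The function name, the device name and the output file are placeholders for illustration; on Windows, FFmpeg can list the real DirectShow device names with `ffmpeg -list_devices true -f dshow -i dummy`. This is only a sketch and assumes `ffmpeg` is on the PATH.

```kotlin
// Builds an FFmpeg command line that records from a DirectShow audio device on Windows.
// deviceName is the DirectShow device name, duration uses FFmpeg's time syntax (e.g. "08:00:00").
fun ffmpegCaptureCommand(deviceName: String, duration: String, output: String): List<String> =
    listOf("ffmpeg", "-f", "dshow", "-i", "audio=$deviceName", "-t", duration, output)

// Runs the capture and returns FFmpeg's exit code once the recording has finished.
// Blocks for the whole recording duration.
fun runCapture(deviceName: String, duration: String, output: String): Int =
    ProcessBuilder(ffmpegCaptureCommand(deviceName, duration, output))
        .inheritIO()
        .start()
        .waitFor()
```

For example, `runCapture("Microphone (Example Device)", "08:00:00", "capture.wav")` would record for eight hours. If the resulting file shows the same one to two second disturbance around the critical time, that would point at the driver rather than at the Java audio system.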
