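I edit images with cv2 and use FFmpeg to create a video from the frames. See this post for more details.

The images are 3D RGB NumPy arrays (with a shape like [h, w, 3]), and they are stored in a Python list.

Yes, I know cv2 has a VideoWriter, and I have used it before, but it is not enough for my needs.

In short, it can only use the FFmpeg build bundled with it, which does not support CUDA. Generating a video takes all the CPU time and no GPU time at all, the output is too large, and I cannot pass many FFmpeg parameters when initializing VideoWriter.

I downloaded a precompiled FFmpeg binary for Windows with CUDA support from here. I am using Windows 10 21H1 x64, and my GPU is an NVIDIA GeForce GTX 1050 Ti.

Anyway, I need to play with all the parameters found here and there to find the best compromise between quality and compression, like this: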

command = '{} -y -stream_loop {} -framerate {} -hwaccel cuda -hwaccel_output_format cuda -i {}/{}_%d.png -c:v hevc_nvenc -preset 18 -tune 1 -rc vbr -cq {} -multipass 2 -b:v {} -vf scale={}:{} {}'
os.system(command.format(FFMPEG, loops-1, fps, tmp_folder, file_name, quality, bitrate, frame_width, frame_height, outfile))

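I need to use exactly the binary I downloaded, and to specify as many parameters as possible, to get the best result.

Currently I can only save the arrays to disk as images and use those images as the FFmpeg input. That is slow, but I need exactly that binary and all of those parameters.

After hours of googling I found ffmpeg-python, which seems well suited to the job, and I even found this: I can pass the binary path as an argument to the run function (this).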

import io

import ffmpeg
import numpy as np


def vidwrite(fn, images, framerate=60, vcodec='libx264'):
    if not isinstance(images, np.ndarray):
        images = np.asarray(images)
    _,height,width,channels = images.shape
    process = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=framerate)
            .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate)
            .overwrite_output()
            .run_async(pipe_stdin=True, overwrite_output=True, pipe_stderr=True)
    )
    for frame in images:
        try:
            process.stdin.write(
                frame.astype(np.uint8).tobytes()
            )
        except Exception as e: # should probably be an exception related to process.stdin.write
            for line in io.TextIOWrapper(process.stderr, encoding="utf-8"): # I didn't know how to get the stderr from the process, but this worked for me
                print(line) # <-- print all the lines in the processes stderr after it has errored
            process.stdin.close()
            process.wait()
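            return # cant run anymore so end the for loop and the function execution
    # All frames were written: close stdin so FFmpeg can finalize the file, then wait for it to exit.
    process.stdin.close()
    process.wait()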

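However, I need to pass all of those parameters, and possibly more, to the process, and I am not sure where they should go (where should stream_loop go? hwaccel, hwaccel_output_format, multipass, ...?).

How do I properly pipe a bunch of NumPy arrays to an FFmpeg process spawned from a CUDA-enabled binary, and pass all sorts of arguments to the initialization of that process?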

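Recommended answer

If you already know the FFmpeg CLI syntax, you can use the subprocess module as in my following answer (the arguments are then written exactly as on the FFmpeg command line).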
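
As a rough sketch of the subprocess route (not the code from the linked answer), the command can be built as a plain argument list and the frames written to FFmpeg's stdin. The function names and the default cq value are assumptions made for illustration. Repeating the frames in Python instead of using -stream_loop is also an assumption: as far as I know, -stream_loop needs an input it can seek back to the start of, which a pipe cannot do.

```python
import subprocess


def build_ffmpeg_command(ffmpeg_path, width, height, fps, outfile, cq=19):
    """Build the FFmpeg argument list for raw BGR frames arriving on stdin."""
    return [
        ffmpeg_path, "-y",
        # Input options: everything before -i describes the piped raw frames.
        "-f", "rawvideo", "-pix_fmt", "bgr24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "pipe:",
        # Output options: everything after -i configures the NVENC encoder.
        "-c:v", "hevc_nvenc", "-rc", "vbr", "-cq", str(cq), "-b:v", "0",
        "-pix_fmt", "yuv420p",
        outfile,
    ]


def write_frames(frames, command, loops=1):
    """Pipe each frame's raw bytes to FFmpeg, repeating the sequence `loops` times."""
    process = subprocess.Popen(command, stdin=subprocess.PIPE)
    for _ in range(loops):
        for frame in frames:
            # Each frame is assumed to be a uint8 array of shape (height, width, 3).
            process.stdin.write(frame.tobytes())
    process.stdin.close()  # End of input, so FFmpeg can finalize the file.
    return process.wait()  # FFmpeg's exit code (0 on success).
```

With the question's variables, this would be called as write_frames(images, build_ffmpeg_command(FFMPEG, frame_width, frame_height, fps, outfile), loops), where FFMPEG is the path of the downloaded binary. Any other CLI option can be added to the list in the position it would have on the command line.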

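When using the ffmpeg-python wrapper:

  • All input arguments (those before -i) go in the input(...) part.
  • Filters (-vf, -filter_complex) are chained as filter(...).filter(...).filter(...)
  • Output arguments (those after the input file name) go in the output(...) part.
  • When an argument contains a colon, such as 'b:v', we have to use dictionary notation, such as **{'b:v': '0'}.
  • .overwrite_output() is equivalent to -y.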

Notes:

  • hevc_nvenc applies the H.265 (HEVC) codec.
    In case you prefer the H.264 (AVC) codec, use h264_nvenc (it may require different parameters).
  • The input pixel format should be 'bgr24' (not 'rgb24'), because OpenCV uses BGR ordering.
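
The question describes the frames as RGB arrays. If that really is their channel order (rather than OpenCV's BGR), one option is to keep pix_fmt='rgb24', and another is to reverse the channel axis before writing. The helper below is a minimal sketch of the second option. Its name and the channel_order parameter are made up for illustration, and it also checks that each frame has the size and dtype a rawvideo input expects.

```python
import numpy as np


def to_bgr24_bytes(frame, width, height, channel_order="rgb"):
    """Return the raw bytes of one frame in the layout FFmpeg reads as bgr24."""
    frame = np.asarray(frame)
    if frame.shape != (height, width, 3):
        raise ValueError(f"expected shape {(height, width, 3)}, got {frame.shape}")
    if frame.dtype != np.uint8:
        raise ValueError(f"expected uint8 pixels, got {frame.dtype}")
    if channel_order == "rgb":
        frame = frame[..., ::-1]  # Reverse the channel axis: RGB -> BGR.
    # tobytes() copies in C order, so the reversed (non-contiguous) view is fine.
    return frame.tobytes()
```

Each frame then contributes exactly width * height * 3 bytes to the pipe. A frame with a different size or dtype would shift every following frame in the stream, so failing early is usually easier to debug.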

Using CUDA accelerated scaling (resize):
The standard scale filter uses CPU software scaling.
For GPU (CUDA accelerated) scaling we may use the scale_cuda filter.
Before using scale_cuda, we have to upload the frame from CPU memory to GPU memory using the hwupload_cuda filter.
We should also use the following input arguments (at the beginning): vsync=0, hwaccel='cuda', hwaccel_output_format='cuda'.
See: Using FFmpeg with NVIDIA GPU Hardware Acceleration.


Here is a Python code sample that demonstrates hevc_nvenc encoding and the scale_cuda filter (it writes numbered frames for testing):

import cv2
import numpy as np
import ffmpeg
   
width, height, n_frames, fps = 640, 480, 50, 25  # 50 frames, resolution 640x480, and 25 fps
out_width, out_height = 320, 240  # Downscale to 320x240 (for example).

output_filename = 'output.mp4'

# Set pix_fmt to bgr24, because OpenCV uses BGR ordering (not RGB).
# vcodec='hevc_nvenc' - Select hevc_nvenc codec for NVIDIA GPU accelerated H.265 (HEVC) video encoding.
# hwupload_cuda - upload the frame from CPU memory to GPU memory before using CUDA accelerated scaling filter.
# scale_cuda - Use CUDA (GPU accelerated) scaling filter
# Use dictionary notation due to arguments with colon.

# Execute FFmpeg sub-process using stdin pipe as input.
process = (
    ffmpeg
    .input('pipe:', vsync=0, hwaccel='cuda', hwaccel_output_format='cuda', format='rawvideo', pix_fmt='bgr24', s=f'{width}x{height}', r=f'{fps}')
    .filter('hwupload_cuda')  # https://docs.nvidia.com/video-technologies/video-codec-sdk/ffmpeg-with-nvidia-gpu/
    .filter('scale_cuda', w=out_width, h=out_height)  # CUDA accelerated scaling filter
    .filter('setsar', sar=1)  # Keep the aspect ratio
    .output(output_filename, vcodec='hevc_nvenc', **{'preset:v': '18', 'tune:v': '1', 'rc:v': 'vbr', 'cq:v': '19', 'b:v': '0'}, multipass=2)
    .overwrite_output()
    .run_async(pipe_stdin=True, overwrite_output=True)
)


# Build synthetic video frames and write them to ffmpeg input stream (for testing):
for i in range(n_frames):
    # Build synthetic image for testing ("render" a video frame).
    img = np.full((height, width, 3), 60, np.uint8)
    cv2.putText(img, str(i+1), (width//2-100*len(str(i+1)), height//2+100), cv2.FONT_HERSHEY_DUPLEX, 10, (255, 30, 30), 20)  # Blue number

    # Write raw video frame to input stream of ffmpeg sub-process.
    process.stdin.write(img.tobytes())

# Close and flush stdin
process.stdin.close()

# Wait for sub-process to finish
process.wait()

Note:
The above code was tested with NVIDIA GeForce GTX 1650, there is no guarantee that it's going to work with GTX 1050 Ti (due to hardware limitations).
