Python pytest, xdist, and a shared generated file dependency

Posted on March 14

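I have several tests that need an expensive generated file. I want the file to be regenerated on every test run, but no more than once per run. To complicate matters, both the tests and the file depend on an input parameter.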

from pathlib import Path

from pytest import mark


def expensive(param) -> Path:
    # Generate the file and return its path.
    ...


@mark.parametrize('input', TEST_DATA)
class TestClass:

    def test_one(self, input) -> None:
        check_expensive1(expensive(input))

    def test_two(self, input) -> None:
        check_expensive2(expensive(input))

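How can I make sure the file is not regenerated across threads, even when these tests run in parallel? For context, I am porting test infrastructure from Makefiles to pytest.

I could use file-based locks for synchronization, but I am sure others have run into this problem, and I would rather use an existing solution.

For a single thread, functools.cache works well. A fixture with scope="module" does not work at all, because the parameter input is function-scoped.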
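To illustrate the single-process case, here is a minimal sketch of the functools.cache approach. The body of expensive and the CALLS list are hypothetical stand-ins used only to make the caching visible. Note that the cache lives in the memory of one Python process, and pytest-xdist workers are separate processes, so each worker would still generate its own copy.

```python
import functools
import tempfile
from pathlib import Path

CALLS = []  # records each real generation, for illustration only


@functools.cache  # requires Python 3.9+; arguments must be hashable
def expensive(param: str) -> Path:
    # Hypothetical stand-in for the real, expensive generation step.
    CALLS.append(param)
    path = Path(tempfile.mkdtemp()) / f"{param}.out"
    path.write_text(f"generated for {param}\n")
    return path


first = expensive("a")
second = expensive("a")  # served from the cache, nothing is regenerated
other = expensive("b")   # a new argument triggers a new generation
```

Within one process the second call returns the very same Path object, and CALLS records only one generation per distinct argument.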

Recommended answer

There is an existing solution in the pytest-xdist documentation, in the section "Making session-scoped fixtures execute only once":

import json

import pytest
from filelock import FileLock


@pytest.fixture(scope="session")
def session_data(tmp_path_factory, worker_id):
    if worker_id == "master":
        # not running with multiple workers; just produce the data and let
        # pytest's fixture caching do its job
        return produce_expensive_data()

    # get the temp directory shared by all workers
    root_tmp_dir = tmp_path_factory.getbasetemp().parent

    fn = root_tmp_dir / "data.json"
    with FileLock(str(fn) + ".lock"):
        if fn.is_file():
            data = json.loads(fn.read_text())
        else:
            data = produce_expensive_data()
            fn.write_text(json.dumps(data))
    return data

Note that filelock is not part of the standard library, but it is available from PyPI. See the filelock documentation for details.
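The documented recipe caches a single value, while the question needs one file per input parameter. One way to adapt it, sketched here under assumptions, is to key both the file name and the lock on the parameter. The names generate_file, expensive_once, and shared_dir, as well as the hashed file-name scheme, are hypothetical and not part of pytest-xdist or filelock. The sketch also assumes that repr(param) is identical in every worker, which holds for strings and numbers but not for objects with a default repr.

```python
import hashlib
from pathlib import Path

import pytest
from filelock import FileLock


def generate_file(param: str, target: Path) -> None:
    # Hypothetical stand-in for the expensive generation step.
    target.write_text(f"generated for {param}\n")


def expensive_once(param: str, shared_dir: Path) -> Path:
    """Return the generated file for param, creating it at most once."""
    # Derive a filesystem-safe name from the parameter value.
    key = hashlib.sha256(repr(param).encode()).hexdigest()[:16]
    target = shared_dir / f"expensive-{key}.out"
    # One lock per parameter, so different inputs can be generated in parallel.
    with FileLock(str(target) + ".lock"):
        if not target.is_file():
            # Write under a temporary name first, so a crash never leaves a
            # half-written file that later callers would take for complete.
            tmp = target.with_name(target.name + ".tmp")
            generate_file(param, tmp)
            tmp.replace(target)
    return target


@pytest.fixture(scope="session")
def shared_dir(tmp_path_factory, worker_id) -> Path:
    base = tmp_path_factory.getbasetemp()
    # Under xdist each worker gets its own subdirectory of a common per-run
    # directory; without xdist the base directory itself is already per-run.
    return base if worker_id == "master" else base.parent
```

A test would then request shared_dir alongside its parametrized input and call expensive_once(input, shared_dir) in place of expensive(input). Because the shared directory is created fresh for each run, the files are regenerated on the next run but only once within it.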