FileRegression + LazyDatadir may conflict with user generated files => confusing errors
Issue occurs in pytest-regressions >= 2.8.0
Scenario:
You have your expected results generated to tests/test_spam/index.json as follows (e.g., from an older version):
{
"spam": "eggs"
}
The corresponding test in tests/test_spam.py looks like this:
import json
from pathlib import Path
from pytest_regressions.file_regression import FileRegressionFixture
def request_data_from_api(outfile: Path) -> None:
"""
Fake for actual program logic.
"""
outfile.write_text('{"spam":"eggs"}')
def test_spam(tmp_path: Path, file_regression: FileRegressionFixture) -> None:
output_file = tmp_path.joinpath("index.json")
request_data_from_api(output_file)
# Nicely format data
raw_data = json.loads(output_file.read_text())
result_data = json.dumps(raw_data, indent=4)
file_regression.check(result_data, basename="index", extension=".json")
What now happens:
- The test buffers the raw data in
{tmp_path}/index.json - The test passes the formatted data to
check() -
check()correctly derivestests/test_spam/index.jsonas thesource_filenamecontaining the expected data. -
LazyDatadirgeneratesfilename = {tmp_path}/index.jsonfor the actual comparison. - It wants to copy the contents from
source_filenametofilename, but since this file already exists (it contains the the raw data, see 2nd line oftest_spam()), it does not perform a copy. -
check()goes on and compares theresult_data(formatted json) with the contents offilename(raw json) which of course failes. - Bonus: If
--force-regenis passed, the formattedresult_datais written tosource_filename, which does not change anything.
Possible solutions:
- Use a random filename in
LazyDatadirso that it becomes very improbable thatfilenameoverwrites another the file that users may have created in their tests. - Print a warning / raise an error if this problem is detected
- Tell users that hey should not write to
{tmp_path}/{basename}.{extension}
Indeed, we can possibly detect this before writing the file, and fail the test explaining the situation (not a warning).
Would you like to contribute a PR?
I can try to provide a fix, but it might take some time. :)
The problem needs to be fixed. Until now, I came up with two possible solutions:
- Raise an error – easy to do but hacky and results in unnecessarry errors:
diff --git a/src/pytest_datadir/plugin.py b/src/pytest_datadir/plugin.py
index 86346cd..d3740aa 100644
--- a/src/pytest_datadir/plugin.py
+++ b/src/pytest_datadir/plugin.py
@@ -1,7 +1,7 @@
import os
import shutil
import sys
-from dataclasses import dataclass
+from dataclasses import dataclass, field
from pathlib import Path
from typing import Union
@@ -71,6 +71,7 @@ class LazyDataDir:
original_datadir: Path
tmp_path: Path
+ _copied: bool = field(default=False, init=False)
def joinpath(self, other: Union[Path, str]) -> Path:
"""
@@ -84,6 +85,12 @@ class LazyDataDir:
"""
original = self.original_datadir / other
target = self.tmp_path / other
+ if not self._copied and target.exists():
+ rel_target = target.relative_to(self.tmp_path)
+ raise RuntimeError(
+ f"{{tmp_path}}/{rel_target} was already created outside of a datadir "
+ "fixture!"
+ )
if original.exists() and not target.exists():
if original.is_file():
target.parent.mkdir(parents=True, exist_ok=True)
@@ -94,6 +101,7 @@ class LazyDataDir:
shutil.copytree(
_win32_longpath(str(original)), _win32_longpath(str(target))
)
+ object.__setattr__(self, "_copied", True)
return target
def __truediv__(self, other: Union[Path, str]) -> Path:
- Copy files into a unique sub-directory within
{tmp_path}. Cleaner solution but may be a breaking change?
diff --git a/src/pytest_datadir/plugin.py b/src/pytest_datadir/plugin.py
index 86346cd..a2eaef0 100644
--- a/src/pytest_datadir/plugin.py
+++ b/src/pytest_datadir/plugin.py
@@ -1,4 +1,5 @@
import os
+import uuid
import shutil
import sys
from dataclasses import dataclass
@@ -114,7 +115,9 @@ def lazy_datadir(original_datadir: Path, tmp_path: Path) -> LazyDataDir:
original_datadir: The original data directory.
tmp_path: Pytest's built-in fixture providing a temporary directory path.
"""
- return LazyDataDir(original_datadir, tmp_path)
+ temp_path = tmp_path / str(uuid.uuid4())
+ temp_path.mkdir(parents=True, exist_ok=True)
+ return LazyDataDir(original_datadir, temp_path)
@pytest.fixture
@@ -132,7 +135,7 @@ def lazy_shared_datadir(request: pytest.FixtureRequest, tmp_path: Path) -> LazyD
tmp_path: Pytest's built-in fixture providing a temporary directory path.
"""
original_shared_path = Path(os.path.join(request.fspath.dirname, "data"))
- temp_path = tmp_path / "data"
+ temp_path = tmp_path / f"data-{uuid.uuid4()}"
temp_path.mkdir(parents=True, exist_ok=False)
return LazyDataDir(original_shared_path, temp_path)
What do you preffer, @nicoddemus ?
I think the first one would be best, thanks!
Oh sorry @sscherfke, my bad, I didn't realize the code of the first solution was for pytest-datadir.
I don't think pytest-datadir has anything to do with this issue. The fix should be within pytest-regressions... not sure exactly what it should be yet though.