scons icon indicating copy to clipboard operation
scons copied to clipboard

Added error message when failure to delete partial cache retrieval occurs

Open dmoody256 opened this issue 3 years ago • 5 comments

Mongodb was running into cases where the partial cache file still existed in a bad state after a bad retrieval. The lack of any error message around the related code in this PR made it difficult to understand what was happening. This error message should give some more obvious insight into what took place and why the failed retrieval file still exists after a failed retrieval.

Contributor Checklist:

  • [ ] I have created a new test or updated the unit tests to cover the new/changed functionality.
  • [x] I have updated CHANGES.txt (and read the README.rst)
  • [ ] I have updated the appropriate documentation

dmoody256 avatar Jan 13 '22 15:01 dmoody256

@dmoody256 Should this be an error instead of a warning?

bdbaddog avatar Apr 02 '22 20:04 bdbaddog

Can you recover? That is, any "bad cache retrieve" operation could just mean "forget about the cache entry, build instead", but is that possible at this point, or is there no way back once you've already committed to getting the object from cache? If it's recoverable, it should only be a warning

mwichmann avatar Apr 02 '22 21:04 mwichmann

@mwichmann good point. I think the answer is "it depends", if the file can be overwritten when the actual builder runs, then all's fine, but if it can't then it should error out at that point, which should be right after the warning in this PR..?

bdbaddog avatar Apr 02 '22 21:04 bdbaddog

@mwichmann good point. I think the answer is "it depends", if the file can be overwritten when the actual builder runs, then all's fine, but if it can't then it should error out at that point, which should be right after the warning in this PR..?

Yeah, it is "it depends". Most cases the builder will be able to overwrite the file and all's good, there are cases, like AR, where its not an overwrite, or even more rare, filesystem issues or something. I usually always prefer hard fails, so am for switching to an error.

dmoody256 avatar Apr 03 '22 04:04 dmoody256

@dmoody256 - I agree. my motto for tools is almost always "Fail Fast and immediately" when there's an error.

Only quandry is how to test this.. thoughts?

bdbaddog avatar Apr 03 '22 20:04 bdbaddog

I think this is a way to create a unit test. TaskmasterTests.py has TaskmasterTestCase.test_cached_execute() test, the new code will only be tested by having 2 targets, one of which is cached, and the for the node which is cached, when trying to unlink() it, a mocked node.fs.unlink() will have to raise an IOError or OSError, and then check for the warning in question to be issued.

bdbaddog avatar Dec 03 '22 21:12 bdbaddog