OpenROAD make write_* commands to write out files atomically

@maliberty @oharboe this PR addresses the #3658 issue. Could you please take a look? The implementation is done according to the suggestion in the issue #3658 description. The changes have been locally tested.

May 15 '24 12:05 annapetrosyan26

This feels like a dangerous change to me. Not all users have a tmp folder, and using a time based file name is also a bit dangerous if you have multiple instances running on the same machine. Usually you would use a UUID for this purpose.

I would strongly advocate for making this an option.

May 15 '24 17:05 QuantamHD

This feels like a dangerous change to me. Not all users have a tmp folder, and using a time based file name is also a bit dangerous if you have multiple instances running on the same machine. Usually you would use a UUID for this purpose.

I would strongly advocate for making this an option.

Also, isn't rename broken if tmp and the destination folder are on different devices? I think the filename has to be alongside the destination file.

delete destination file, if it exists
write a foo.odb.tmp
rename foo.odb.tmp to foo.odb

Better?

May 15 '24 17:05 oharboe

I like that change a lot better.

This feels like a dangerous change to me. Not all users have a tmp folder, and using a time based file name is also a bit dangerous if you have multiple instances running on the same machine. Usually you would use a UUID for this purpose. I would strongly advocate for making this an option.

Also, isn't rename broken if tmp and the destination folder are on different devices? I think the filename has to be alongside the destination file.

delete destination file, if it exists

write a foo.odb.tmp

rename foo.odb.tmp to foo.odb

Better?

I like that flow much better

May 15 '24 17:05 QuantamHD

I like that change a lot better.

This feels like a dangerous change to me. Not all users have a tmp folder, and using a time based file name is also a bit dangerous if you have multiple instances running on the same machine. Usually you would use a UUID for this purpose. I would strongly advocate for making this an option.

Also, isn't rename broken if tmp and the destination folder are on different devices? I think the filename has to be alongside the destination file.

delete destination file, if it exists

write a foo.odb.tmp

rename foo.odb.tmp to foo.odb

Better?

I like that flow much better

I'll change the code according suggested flow. @QuantamHD do you still prefer to make this optional?

May 16 '24 10:05 annapetrosyan26

I like that change a lot better.

This feels like a dangerous change to me. Not all users have a tmp folder, and using a time based file name is also a bit dangerous if you have multiple instances running on the same machine. Usually you would use a UUID for this purpose. I would strongly advocate for making this an option.

Also, isn't rename broken if tmp and the destination folder are on different devices? I think the filename has to be alongside the destination file.

delete destination file, if it exists

write a foo.odb.tmp

rename foo.odb.tmp to foo.odb

Better?

I like that flow much better

I'll change the code according suggested flow. @QuantamHD do you still prefer to make this optional?

Answering for @QuantamHD due to time-zones, any mistakes answering on his behalf are mine: no options are required with the apove approach.

May 16 '24 10:05 oharboe

also, I think this code should be a utility fn using RAII so that we can create a stream that is written atomically everywhere.

May 16 '24 10:05 oharboe

@oharboe I'd like to summarize what I'm going to do, that you can confirm if that will work for you.

create an utility WriteFile function which takes as argument filename and a lambda function, the WriteFile will do the following: a. check for the filename and filename.tmp files existence, if they exist, then delete them. If deleting fails, then throw an exception. b. call the lambda function with the filename.tmp as an argument. c. rename filename.tmp to filename.
encapsulate the code into lambda functions in all places where currently writing is done into the filename and call utl::WriteFile with the filename and the lambdas.

May 16 '24 14:05 annapetrosyan26

@oharboe I'd like to summarize what I'm going to do, that you can confirm if that will work for you.

create an utility WriteFile function which takes as argument filename and a lambda function, the WriteFile will do the following: a. check for the filename and filename.tmp files existence, if they exist, then delete them. If deleting fails, then throw an exception. b. call the lambda function with the filename.tmp as an argument. c. rename filename.tmp to filename.

encapsulate the code into lambda functions in all places where currently writing is done into the filename and call utl::WriteFile with the filename and the lambdas.

I recommend using the RAII pattern here, not lambda functions.

May 16 '24 14:05 oharboe

You need to flush/fsync the file before the rename (see https://www.slideshare.net/nan1nan1/eat-my-data)

May 16 '24 15:05 maliberty

Please also note you are missing DCO and need to run clang-format before the merge.

May 16 '24 15:05 maliberty

nits + write unit-tests

@oharboe I propose to add unit test in the CFileUtilsTest.cpp file, could you please clarify if this is the right approach? Also could you please provide some info regarding nits?

May 21 '24 14:05 annapetrosyan26

nits + write unit-tests

@oharboe I propose to add unit test in the CFileUtilsTest.cpp file,

Should be fine.

could you please clarify if this is the right approach? Also could you please provide some info regarding nits?

See my inline comments in this review.

May 21 '24 14:05 oharboe

clang-tidy review says "All clean, LGTM! :+1:"

May 23 '24 17:05 github-actions[bot]

Does this PR ensure that the tmp file has a unique name which isn't already in use? If a user already has a manually generated file named filename.tmp it should not be overwritten.

May 24 '24 17:05 rovinski

Does this PR ensure that the tmp file has a unique name which isn't already in use? If a user already has a manually generated file named filename.tmp it should not be overwritten.

I disagree. Keep it simple. OpenROAD will "own" .odb.tmp file as tmp file, dont put anything there

OpenROAD and ORFS overwrites lots of files, because by convention those files are used: reports, logs, objects, results, etc.

May 24 '24 18:05 oharboe

I 100% disagree. I often add random extensions to files like .tmp .bak .old etc. in order to make backup files that don't get overwritten. If I do write_db file.odb I would have no expectation or knowledge that file.odb.tmp would be overwritten in this process. std::tmpnam() should be used - it was made for this purpose.

May 24 '24 22:05 rovinski

The same is done in ORFS to write logs atomically, BTW.

May 24 '24 22:05 oharboe

The same is done in ORFS to write logs atomically, BTW.

I disagree with that too. I would ask why a log would need to be written atomically, but that's unrelated to this issue. Temporary files should use mktemp.

May 25 '24 02:05 rovinski

Assuming this change is the direction we want to go (not sure why this wouldn't apply to all files and not a few select ones, except that this is a fix to a limitation in ORFS in which case this feels overkill), I would think the proper naming scheme would be to use a hidden file so file.odb, would first go to .file.odb.tmp and then renamed to file.odb. This would avoid @rovinski concern (I think) while preserving the need to keep the file writing to a specific directory (which @QuantamHD suggested and I agree with).

May 25 '24 02:05 gadfort

@gadfort the issue stems from not checking that a file exists before overwriting, especially when that file is not the user-provided filename. Again, here's an example of unexpected behavior from the user side:

% ls
file.odb
# User thinks they're making a backup
% cp file.odb file.odb.tmp
% ls
file.odb file.odb.tmp
% write_db file.odb
# Backup file is now gone
% ls
file.odb

There is no checking that the temp file already exists, so it could be overwritten without prompting the user. Using the std/coreutil functions will guarantee that a unique filename is generated. This is what e.g. Google Chrome does when downloading files.

Good behavior:

% ls
file.odb
# User thinks they're making a backup
% cp file.odb file.odb.tmp
% ls
file.odb
% write_db file.odb
# Process is interrupted
^C
% ls
file.odb file.odb.tmp file.odb.tmp.D0zMeJoW9g
# temp file with guaranteed unique string is written

May 25 '24 02:05 rovinski

I already made the same suggestion above: std::tmpnam or std::tmpfile should be used.

May 25 '24 04:05 maliberty

clang-tidy review says "All clean, LGTM! :+1:"

May 25 '24 04:05 github-actions[bot]

I already made the same suggestion above: std::tmpnam or std::tmpfile should be used.

I thought @gadfort had a good suggestion, though I think the code is fine as-is. Although I think the tmpfile solves a problem that doesn't exist and creates a problem in ORFS for log files(I want to tail -f a partially written log file), if tmpfile makes this change paletable to the group, then that works for me. Unlike log files that are tee'ed, there are no such files written by openroad.

The reason we need a tmp file name and write atomically is to avoid strange problems with dependencies with a partially written file. ORFS has dependencies on .log and .odb files. The dependency on a log file might be reworked in ORFS, I forget the details.

May 25 '24 07:05 oharboe

I want to tail -f a partially written log file

I think this is orthogonal because this PR doesn't concern log files. You wouldn't tail -f an odb, def, etc.

I already made the same suggestion above: std::tmpnam or std::tmpfile should be used.

I thought I remember seeing that comment but somehow I missed it.

May 25 '24 08:05 rovinski

Some more color on the motivation for atomic writes: it avoids partial files upon error/crashes, which improves the user experience

Also, I like fixed tmp names, because random files do not accrue. Not a biggie, but I thought I would mention it. /tmp is full of orphaned temporary files sometimes....

May 25 '24 11:05 oharboe

POSIX has tempnam (not tmpnam) which allows a directory specification if that is helpful.

May 25 '24 17:05 maliberty

POSIX has tempnam (not tmpnam) which allows a directory specification if that is helpful.

You can give directory and file name prefix which could be ".odb". It seems like it satisfies the various concerns.

May 26 '24 01:05 maliberty

Thanks for contributing!

May 29 '24 20:05 maliberty

The nit picking can migrate to #5181

May 29 '24 22:05 maliberty