pip
pip copied to clipboard
PERF: 7% faster pip install, remove fsync call in temporary file, not necessary
Hello,
could I have a review on this code?
profiling on a run of pip install jupyter
_install_wheel() function is 3670 ms. that's the installation phase.
fsync is 594 ms out of that, about 15% of the install phase (or 7% of the whole pip run).
It is called here 3 times https://github.com/pypa/pip/blob/8eadcab329a765571c849efe370afd5d5bba425c/src/pip/_internal/operations/install/wheel.py#L672
that's the end of the wheel installation, in _generate_file() decorator:
to create the dist-info/INSTALLERfile that always contains "pip" stringto create the dist-info/direct-url-something.jsonthat contains a one line json file with the path/url.to create the dist-info/RECORDfile that contains a list of all installed files (allegedly the most important file)
the way it works: it creates a temporary file in the same directory as the final path + flush() + fsync() + close() + atomic rename() the file to the final path. I think the fsync call is not necessary.
fsync calls are surprisingly expensive, I am always seeing multiple milliseconds on various types of disks on my machine and even on /tmp. there are only 3 files fsync'ed per extracted wheel, yet they take almost as long as everything else. I think the run has 5564 files calls to save() to extract files + 116 calls to fsync() for metadata files + for a total of 58 wheels to install.
This looks reasonable, but I don't know enough about filesystem management to be sure that the
fsynccall is unnecessary. The history of the code doesn't give any indication.
This was added in 2019 with a comment "The file is created securely and is ensured to be written to disk after the context reaches its end."
My best guess, the guy who wrote this code went over-the-top to both flush() and fsync() to ensure the content is written. He probably through the record file is very important and went all in! :D
(The initial commit might have been bugged because it flushed the file result.file.flush() instead of the buffer result.flush())
As far as I understand, the fsync() is unnecessary because the flush() will flush AND the end of the with block will autoclose and flush too (closing is flushing).
If something crashes like a power loss before the write is complete and the rename is done, it's okay because the file is still a temporary filename. It's the atomic rename() after write that ensures consistency.
(By the way, I don't think that fsync() provides any guarantee that the data will be written to the media on modern hardware)
The code itself was probably a copy/pasted snippet from somewhere else (that didn't do atomic rename) or from previous code in pip (the commit message implies pip was writing in place before).
I think this patch is ready to merge.
@ichard26 per the link you gave, ext4 filesystem is ensuring that a rename can only be written after pending writes to the file have completed. the fsync is not necessary.
from the messages in the mailing list, they fix the filesystem because yum, the package manager on centos/rhel, was also doing rename after writes to ensure atomic changes and the filesystem had to handle it.
Isn't that a kernel/filesystem level hack? You know that we have other operation systems and filesystems to support. I'm not saying that we need to support durable installations, but if we want to, then relying on an ext4 feature is not the way to go.
reping. forgot to follow up.
would be nice to merge the patch to get the performance improvement! :D
@morotti looks like you forgot to respond to Richard's concern? It sounds like said guarantee is file system-specific and other file systems might behave differently. This might be important in the light of pip's cross-platform nature. Even under GNU/Linux distros, people tend to use different file systems, but beyond that, it's rarely possible to use the same file systems under different operating systems. Moreover, on GNU/Linux it's quite common for block devices to be layered on top of others (LUKS/LVM/btrfs). Do you know of implications this would have on them and file systems like NTFS/HFS/exfat etc.? What are the implications of sync on rename for those? Note that a rename across file system boundaries is typically implemented as copy+remove.
It sounds safer to do that fsync syscall than not.
sorry for delay, there is no issue as far as I am aware.
the long discussion above is misleading, it was mainly around a bug ticket on RHEL/CentOS/ext4. early versions of ext4 started reordering filesystem operations in unsafe ways and data caused corruption. this was noticed with yum install leaving empty files and that was quite a problem. it was fixed forever ago after the issue was understood. I'm not aware of any filesystem being affected by a similar issue.
by the way, the reason this code does a fsync() in the first place is because it is decades old code that used to write files in place which was unsafe (irrelevant of calling fsync or not). the code was fixed years ago to write to a separate file + flush + atomic rename, but they forgot to remove the fsync call.
but they forgot to remove the fsync call
You've asserted this a few times now, but what's your proof? You said yourself that this is your "best guess". But isn't it just as likely that the fsync was left in because it addressed some corner case (possibly on an obscure platform) that you're not aware of?
Your arguments seem to suggest that the fsync syscall is essentially useless. If that's the case, why does it even exist? It's clearly not a no-op, as it takes a non-trivial amount of time to execute.
While performance improvements are nice, we shouldn't make them at the expense of correctness. So the blocking question here is twofold:
- Do we have documentation explaining why the fsync call in this situation is unnecessary? It's not sufficient for you to just assert that's the case.
- Can we check for the conditions that make it unnecessary, and only omit the fsync in those cases? Note, for example, that on Windows, the
os.fsyncfunction calls the MS CRT function_commit. Neither of the articles you linked to above discuss Windows at all, so why do you assume the behaviour there is the same as Linux?
As a reminder - I originally said that I was OK with this change as long as the other maintainers didn't have concerns. It turns out the other maintainers do have concerns, and if you don't address them, this PR will remain blocked. I'm simply trying to clarify what the blocking concern is, but it's not me you have to persuade, it's @ichard26 and @pradyunsg