check if Python packages install overlapping files
If you have two packages in a workspace that install files at the same path, the last one wins. More crucially, the user is not informed of this and will likely not notice until runtime.
It would be nice to add a user-visible warning or error if any of the files installed are either:
- Outside the expected installation prefix (can this happen? Maybe if
data_filescontains ".."?) - Present before the package is installed. This can happen if the build directory has been deleted between builds so
_undo_installdoesn't work, or if building with--merge-installand so the package prefixes overlap.
https://github.com/colcon/colcon-core/blob/af5b8862136ed18fef529745613e16998c3af888/colcon_core/task/python/build.py#L182-L186
Since colcon doesn't have any insight what files a specific build system is installing it is difficult to perform that check generically. While the information is available for Python packages it would still be difficult to implement such a sanity check.
Please consider to contribute such a feature if you think it is valuable. Due to its complexity it might be good to share implementation ideas before hand.
I think we're on the same page. Here are my thoughts:
Run setup.py install twice, the first time with --dry-run --single-version-externally-managed --record install.log and the second time with --skip-build --single-version-externally-managed --record install.log. In between, check the contents of install.log to see which files will be installed (and maybe throw an error to abort the build?).
How does the cross package checks come in here? Do you imagine a complete 2 pass build process?
No, I still imagine builds running asynchronously from each other. If two packages intend to write the same file, the first will find it, and the second will see that file there and die.
Heck I might as well draft up a PR if I'm already working out the implementation details.
If two packages intend to write the same file, the first will find it, and the second will see that file there and die.
How would you distinguish if an already existing file has been installed by a different package or by the same package in a previous build?
How would you distinguish if an already existing file has been installed by a different package or by the same package in a previous build?
Say you build package A twice: Nothing to worry about. We already uninstall the previous build's packages based on the contents of the previous build's install.log file.
Say you build package A and package B and during B's build we find F exists: We abort the build of B and truncate the install.log file so the next build of B doesn't uninstall F.
Sounds reasonable to me.
This can happen if the build directory has been deleted between builds so
_undo_installdoesn't work,
How do you want to handle this case?
if building with
--merge-installand so the package prefixes overlap.
A warning would still be valid. Installed files should never overlap - otherwise you wouldn't be able to install both as Debian packages at the same time.
This can happen if the build directory has been deleted between builds so
_undo_installdoesn't work,How do you want to handle this case?
I think the best thing to do if the install.log file is missing and there are files that would be overwritten is to kill the build and tell the user something like:
The package could not be installed because the following files already exist. Please delete them manually:
/path/to/file1.py
/path/to/file2.png
This could even be suppressed if all files in the newly-created dry-run install.log exist, but I'm erring on the side of not doing that. I think it's fair that if the user wants to delete their build folder, they should delete their install folder as well before they can rebuild.
As an added benefit, this makes it possible (at least in theory, with a python-only workspace) to freely switch back and forth between merged and isolated workspace layouts.
Ok, lets go with a warning at first to not cause unexpected failures and give users a heads up. We can always escalate it into an error in the future.
@rotu Just checking if you are still interested in implementing this feature?
I would probably get around to it eventually, but if you or someone else want to take lead on it, I’d appreciate that