planemo
consider exclude in shed update
This is done by considering exclude when:
- building and downloading tarballs
- computing the diff
shed update should now consider exclude in .shed.yml
I checked this (https://github.com/galaxyproject/planemo/pull/1106/commits/a1e2cae620bf195c70ea97ae09f3ea0db8488861) in the following way on the TTS.
- added a file (exclude.txt) to a repo (https://testtoolshed.g2.bx.psu.edu/repos/mbernt/fail)
- ran shed_update
- added the exclude to .shed.yml
- ran shed_update again
Afterwards the file was gone again (on the TTS).
This would be great to have for updating the OpenMS tools, where we have quite a few extra files in the TS that shouldn't be there. We would like to remove them from the TS and not consider them when deciding whether a repo should be updated.
TODO(?):
- [ ] I guess we could/should do the same for include, but I'm not sure what the default '**' does
- [ ] not sure if the use of exclude in diff, os.walk, and tar is equivalent (which would be nice). It should also be consistent with the docs (https://planemo.readthedocs.io/en/latest/standards/docs/best_practices/shed_yml.html#shed-upload-includes-excludes), and I think it is not yet
- [ ] which other planemo commands do not consider exclude/include but should?
may fix: https://github.com/galaxyproject/planemo/issues/554
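Regarding the second TODO item, here is a minimal sketch of one way to keep the directory walk, the tarball, and the diff input equivalent: route all of them through a single predicate and a single file listing. This is not planemo's actual code. The names `is_excluded`, `walk_files`, and `build_tarball_sketch` are hypothetical, and the `fnmatch`-based matching (against the full relative path or any single path component) is an assumption that may not agree with the semantics described in the docs.

```python
import fnmatch
import io
import os
import tarfile


def is_excluded(rel_path, exclude_patterns):
    """Return True if rel_path (relative, '/'-separated) matches any pattern.

    Assumption: a pattern matches either the whole relative path or any
    single path component. planemo's real rules may differ.
    """
    parts = rel_path.split("/")
    for pattern in exclude_patterns:
        if fnmatch.fnmatch(rel_path, pattern):
            return True
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return True
    return False


def walk_files(root, exclude_patterns):
    """List relative file paths under root, skipping excluded entries."""
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        rel_dir = "" if rel_dir == "." else rel_dir.replace(os.sep, "/")
        # Prune excluded directories in place so os.walk does not descend.
        dirnames[:] = [
            d for d in dirnames
            if not is_excluded(f"{rel_dir}/{d}".lstrip("/"), exclude_patterns)
        ]
        for name in filenames:
            rel = f"{rel_dir}/{name}".lstrip("/")
            if not is_excluded(rel, exclude_patterns):
                kept.append(rel)
    return sorted(kept)


def build_tarball_sketch(root, exclude_patterns):
    """Build an in-memory tar.gz from exactly the files walk_files returns."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for rel in walk_files(root, exclude_patterns):
            tar.add(os.path.join(root, rel), arcname=rel)
    buf.seek(0)
    return buf
```

Because the tarball is built from the same listing that a diff would compare, the two cannot disagree about which files are excluded. A test could build a small temporary repository containing `exclude.txt`, call both functions, and assert that the archive members equal the walked file list.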
This looks good to me, is it ready to be merged?
Wondering about the questions in the TODOs. In particular question 2.
Is there anything in these changes that deserves a test? If so, any ideas?
build_tarball is also used in planemo/commands/cmd_shed_build.py, so that needs to be updated as well, or exclude made optional.
No problem, but I'm still quite confused by the code. It seems that RawRepositoryDirectory already considers excludes (https://github.com/galaxyproject/planemo/blob/38227c1c17137de3bb085f941f7bfd3e9a08aa77/planemo/shed/init.py#L1025), but only partially (for instance, not in _realized_files), while RealizedRepositry does not (by the way, do we want to fix the typo in that name?).
Both commands essentially start by calling shed.for_each_repository -> _realize_effective_repositories -> _find_raw_repositories, which constructs RawRepositoryDirectory objects that are then turned into RealizedRepositry objects. So I guess the excludes could be ignored in the calls to build_tarball from the commands.
Is this correct?
I don't remember all the details - but the idea is that the raw directory structures are materialized into effective repository structures with includes and excludes handled and tools and tool-data demultiplexed. So should any of this be needed? Maybe exclude is only used for automatic demultiplexing though?
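To illustrate the materialization idea described above, here is a rough sketch in which excludes are applied once, while copying the raw directory into a realized one, so that later steps (tarball, diff) need no exclude argument at all. This is not how planemo implements it. The function name `realize` is hypothetical, and `shutil.ignore_patterns` matches on file and directory names only, which is an assumption about the intended exclude semantics.

```python
import os
import shutil
import tempfile


def realize(raw_dir, exclude_patterns, target_dir):
    """Copy raw_dir into target_dir, dropping entries that match an exclude.

    Hypothetical helper: excludes are glob patterns matched against each
    entry's name at every directory level (shutil.ignore_patterns behaviour).
    """
    shutil.copytree(
        raw_dir,
        target_dir,
        ignore=shutil.ignore_patterns(*exclude_patterns),
        dirs_exist_ok=True,  # requires Python 3.8+
    )
    return target_dir


def list_files(root):
    """Relative paths of all files under root, with no exclude logic."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```

With this layout, anything that consumes the realized directory sees only the effective file set. If the real code already works this way, passing exclude to build_tarball from the commands would be redundant. If excludes are only applied during demultiplexing, it would not be.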