community.general

galaxy.yml: exclude non-collection files from collection build

Open dmsimard opened this issue 3 years ago • 19 comments

SUMMARY

Files that are only relevant to the Git repository, such as Git, GitHub and Azure Pipelines configuration, should not be included in the published collection tarball.

Excluding tests removes the bulk of the collection, both in size and in number of files.

In aggregate, excluding these files shrinks the tarball from 2.2M to 1.6M, roughly halves the size on disk (18M to 9.1M), cuts the file count from 2127 to 738 and reduces the installation time of the collection from 10.44 to 6.75 seconds in the measurements below.
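
To check which parts of a checkout account for most of the files and bytes, a small tally per top-level entry can help. This is only a minimal sketch: the function name `tally_by_top_level` is made up for illustration, and it counts every file under the given directory rather than what `ansible-galaxy` would actually package.

```python
import os
from collections import Counter


def tally_by_top_level(root):
    """Return (file_counts, byte_counts), keyed by top-level entry under root.

    Hypothetical helper for illustration; it walks the directory tree as-is
    and knows nothing about what ansible-galaxy would include in a build.
    """
    file_counts = Counter()
    byte_counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for name in filenames:
            # Path relative to root, e.g. "tests/unit/test_x.py" or "README.md"
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            top = rel_path.split(os.sep, 1)[0]
            file_counts[top] += 1
            # lstat so that a broken symlink does not raise
            byte_counts[top] += os.lstat(os.path.join(dirpath, name)).st_size
    return file_counts, byte_counts
```

Calling `tally_by_top_level(".")` from the collection root and printing `file_counts.most_common()` lists the largest contributors first.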

ISSUE TYPE

  • Bugfix Pull Request

COMPONENT NAME

galaxy.yml

ADDITIONAL INFORMATION

Before

> du -sh community-general-3.7.0.tar.gz 
2.2M community-general-3.7.0.tar.gz

> du -sh community.general
18M	community.general

> find . -type f | wc -l
2127

> time ansible-galaxy collection install /home/dmsimard/Downloads/community-general-3.7.0.tar.gz 
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Installing 'community.general:3.7.0' to '/home/dmsimard/.ansible/collections/ansible_collections/community/general'
community.general:3.7.0 was installed successfully
________________________________________________________
Executed in   10.44 secs    fish           external
   usr time    9.91 secs    0.00 micros    9.91 secs
   sys time    0.35 secs  606.00 micros    0.35 secs

After

> du -sh community-general-3.7.0.tar.gz 
1.6M community-general-3.7.0.tar.gz

> du -sh community.general
9.1M community.general

> find . -type f | wc -l
738

> time ansible-galaxy collection install /home/dmsimard/dev/git/ansible_collections/community/general/community-general-3.7.0.tar.gz
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Installing 'community.general:3.7.0' to '/home/dmsimard/.ansible/collections/ansible_collections/community/general'
community.general:3.7.0 was installed successfully
________________________________________________________
Executed in    6.75 secs    fish           external
   usr time    6.43 secs    0.00 micros    6.43 secs
   sys time    0.20 secs  614.00 micros    0.20 secs
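
For a quick sense of scale, the before/after figures above can be turned into relative reductions. The snippet only does arithmetic on the numbers reported in the two transcripts; the helper name `percent_reduction` is made up for illustration.

```python
def percent_reduction(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


# (before, after) pairs copied from the transcripts above
measurements = {
    "tarball size (M, du -sh)": (2.2, 1.6),
    "installed size (M, du -sh)": (18.0, 9.1),
    "file count": (2127, 738),
    "install time (s)": (10.44, 6.75),
}

for label, (before, after) in measurements.items():
    reduction = percent_reduction(before, after)
    print(f"{label}: {before} -> {after} ({reduction:.1f}% less)")
```

This works out to roughly 27% for the tarball, 49% for the installed size, 65% for the file count and 35% for the install time. The timings come from a single run each, so they are indicative only.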

dmsimard avatar Oct 05 '21 20:10 dmsimard

Marked as WIP while we discuss https://github.com/ansible-community/community-topics/issues/29

dmsimard avatar Oct 05 '21 21:10 dmsimard

Has there been any progress on this? What are the legal issues here?

gotmax23 avatar Aug 02 '22 11:08 gotmax23

There is no progress (to my knowledge). The legal problem, if I remember correctly, is still that the collection tarball acts as both the source and the installable artefact, so excluding files used for development from it would violate the GPL.

felixfontein avatar Aug 02 '22 17:08 felixfontein

@felixfontein IANAL, but even including a URI to those files should be sufficient. The files and any changes must be available, but that does not mean every possible form of distribution of the product must include them; otherwise distributing binary packages would be untenable.

bcoca avatar Aug 02 '22 17:08 bcoca

I'm not sure what the exact problems are, the community team has been in (non-?)communication with RH legal on this matter for ... quite a long time now. I don't know exactly what they asked, but so far there was no answer AFAIK.

felixfontein avatar Aug 03 '22 20:08 felixfontein

I've created https://github.com/ansible-community/community-topics/issues/126 to discuss this.

gotmax23 avatar Aug 10 '22 16:08 gotmax23

> I'm not sure what the exact problems are, the community team has been in (non-?)communication with RH legal on this matter for ... quite a long time now. I don't know exactly what they asked, but so far there was no answer AFAIK.

This is a fair assessment. I opened a ticket with Red Hat open source legal describing this issue and asking for advice, but have not heard back. I do not remember when, because it's been so long and I'm no longer employed at Red Hat, so I cannot check my emails.

There is a ticket number somewhere if someone would like to follow up internally, but I do not know what it is. It is possible that @gundalow could find it.

dmsimard avatar Aug 11 '22 20:08 dmsimard

Looks like we can continue with this, according to https://github.com/ansible-community/community-topics/issues/131. I would still keep the .md files, and changelogs definitely must stay.

felixfontein avatar Nov 23 '22 18:11 felixfontein

Also this needs a changelog fragment. IMO this is a breaking change and has to wait for the next major release. (Someone could depend on importing from ansible_collections.community.general.tests.unit for example.)
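
One way to see whether an installed copy of the collection still ships the tests package is to probe for it without importing it. This is a minimal sketch, assuming the collection is importable as `ansible_collections` at all (for example under ansible-test, or with the parent directory of `ansible_collections` on `sys.path`); the helper name `is_importable` is made up for illustration.

```python
import importlib.util


def is_importable(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError)
        return False


# Module path taken from the comment above
TESTS_MODULE = "ansible_collections.community.general.tests.unit"
print(f"{TESTS_MODULE} importable: {is_importable(TESTS_MODULE)}")
```

If this prints `False` for a tarball built with the exclusions, anything that imports from that path would break, which is the compatibility concern raised here.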

felixfontein avatar Nov 23 '22 18:11 felixfontein

The list I use for the Fedora ansible-collection-community-general package is:

  • .azure-pipelines
  • .github
  • .gitignore
  • .pre-commit-config.yaml
  • changelogs/.gitignore
  • changelogs/fragments
  • tests
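
As a rough illustration of what such a list removes, the sketch below checks relative paths against it. It assumes each entry is a path relative to the collection root that excludes the entry itself and everything beneath it; the matching done by `ansible-galaxy` for `build_ignore` in galaxy.yml, or by the Fedora packaging, may differ in detail. The names `EXCLUDES` and `is_excluded` are made up for illustration.

```python
from fnmatch import fnmatchcase

# Entries copied from the list above
EXCLUDES = [
    ".azure-pipelines",
    ".github",
    ".gitignore",
    ".pre-commit-config.yaml",
    "changelogs/.gitignore",
    "changelogs/fragments",
    "tests",
]


def is_excluded(rel_path, patterns=EXCLUDES):
    """True if rel_path matches a pattern or lies beneath a matching directory.

    rel_path uses forward slashes and is relative to the collection root.
    """
    parts = rel_path.split("/")
    # Every leading prefix of the path: "tests", "tests/unit", ...
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(
        fnmatchcase(prefix, pattern)
        for prefix in prefixes
        for pattern in patterns
    )


sample_paths = [
    "plugins/modules/example.py",  # hypothetical file name
    "tests/unit/test_example.py",  # hypothetical file name
    "changelogs/changelog.yaml",
    "changelogs/fragments/0000-example.yml",  # hypothetical file name
]
for path in sample_paths:
    print(f"{'drop' if is_excluded(path) else 'keep'}  {path}")
```

With this list, `changelogs/changelog.yaml` is kept while the fragments directory is dropped, in line with the earlier remark that changelogs must stay.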

gotmax23 avatar Nov 23 '22 19:11 gotmax23

Feel free to open a new PR, close this one and link to it for context.

I've been out of the loop for long enough and don't mind.

dmsimard avatar Nov 23 '22 22:11 dmsimard

> Excluding tests removes the bulk of the collection, both in size and in number of files.

Sounds like collections really need an sdist vs a wheel separation too...

webknjaz avatar Nov 24 '22 18:11 webknjaz

> > Excluding tests removes the bulk of the collection, both in size and in number of files.
>
> Sounds like collections really need an sdist vs a wheel separation too...

I strongly agree. The lack of collection source distributions presents a problem for the ansible package, because these files are removed from the ansible sdist if they are removed from the Galaxy artifact.

gotmax23 avatar Nov 24 '22 18:11 gotmax23

I also strongly agree (and I think I mentioned more than once during community meetings that this would be really helpful and solve most of our problems :) ).

felixfontein avatar Nov 24 '22 19:11 felixfontein

> Sounds like collections really need an sdist vs a wheel separation too...

How can we move this forward? Should I create a community-topic? I guess we'd need agreement from the core team and the Galaxy team. IIRC, @jctanner mentioned that this would be difficult to implement with Galaxy NG's current architecture when we last discussed this.

gotmax23 avatar Nov 24 '22 21:11 gotmax23

@gotmax23 a community topic would be great. Another possible place would be https://github.com/ansible/proposals/, but since there should be something in https://github.com/ansible-community/community-topics/, even if just to coordinate everything, I guess we should start with that.

felixfontein avatar Nov 25 '22 20:11 felixfontein

I've opened https://github.com/ansible-community/community-topics/issues/161

gotmax23 avatar Nov 25 '22 22:11 gotmax23