conda-forge.github.io

Request for clarification and documentation on applying licenses to packages that dynamically link against downstream GPL licensed dependencies

Open moorepants opened this issue 5 years ago • 19 comments

Some conda packages depend on dynamically linking to other conda packages that happen to be GPL licensed. Conda Forge then distributes individual binaries for each package separately, and the interrelated packages are dynamically linked on an end user's computer when all the packages are installed and loaded.

The concern/question I have is whether a conda package that is upstream of a dynamically linked GPL package has to be licensed under the GPL or not. And secondly, if so, what the license should include (e.g. licenses for everything that is dynamically linked) and whether the source code has to be distributed with the conda package somehow.

This question arose here originally:

https://github.com/conda-forge/pyshtools-feedstock/issues/3

where we decided to apply the GPL to the pyshtools package even though the pyshtools source code is BSD licensed. pyshtools dynamically links against FFTW, which is licensed "GPL 2 or later".

I then opened some issues on non-GPL licensed upstream dependencies of FFTW to warn those maintainers that they may need to re-license as GPL.

The maintainer of OpenMM, @peastman, responded with some careful explanations in which he argues that the binaries that Conda Forge distributes do not have to be GPL licensed but only GPL compatible.

See those explanations here:

https://github.com/conda-forge/openmm-feedstock/issues/24

My head has exploded trying to think about this, as it is quite confusing. Thus, I think it would be helpful to get a ruling on this and then add some documentation that clearly states the consequences of upstream packages dynamically linking to downstream GPL packages.

Summary of Research [WIP]

Note that this summary is a work in progress and it may be inaccurate.

TODO

  • [ ] Add references to each claim.

There are at least two main interpretations of how to handle things:

  1. The GPL "infects" any software (source and binary) that is a derivative work of the GPL'd dependency. Using the API declared in the headers of a GPL'd library in your code constitutes a derivative work. The intent of the GPL authors [2] is that any software that links statically or dynamically to a GPL dependency, should have to be released under the GPL if distributed [c].

  2. Dynamic linking only occurs on the computer that loads the conda binary into memory alongside the GPL dependencies. The act of linking constitutes copyright "infection". Thus only if you distribute binaries together that are linked are you subject to the GPL. Since the conda binary does not include the GPL dependency when distributed, only references to the API, it can be licensed under licenses other than the GPL (the likely license being the one the primary binary's source code is using).

Points 1 and 2 are complicated by the fact that APIs may or may not be copyrightable [1]. The current ruling is that APIs can be copyrighted, but the case is still being appealed and open. It is very common that two pieces of software implement the same API but the implementations that back the API either are not derivatives of each other, or they are but the authors of the derivative library have permission to relicense. We get things like FFTW (GPL) and Intel's MKL, which includes FFTW code (specially licensed to Intel and implements some of the FFTW API), making them drop-in replacements of each other. If APIs are not copyrightable and you take the point 2 interpretation, then you are free to license your binary using any license you want, even though your binary may link to GPL'd binaries on someone else's computer. If APIs are not copyrightable and you take the point 1 interpretation, the FSF still intends for the GPL to infect because there is the intention to dynamically link. If APIs are copyrightable, then we are all in big trouble.

For Conda Forge, we specify the exact dependency in the meta.yaml file and thus know that when installed on a user's computer the binary will most certainly link against the GPL'd dependency that conda installs alongside the primary binary. [a] In some sense, Anaconda.org is even distributing these software packages as a collection based on the dependency tree of any given package. So even though we are not distributing the GPL dependency alongside the primary binary in the same downloaded file, we do intend for the binaries to link.

Current conclusion

The legally safe approach is to distribute Conda binaries that dynamically link to GPL'd dependencies (optional or not) under the GPL, as long as the license of the primary binary's source code is GPL compatible. [b] This also means we should be releasing the source code for these packages alongside the binary, re-licensed under the GPL.

The more dubious approach, that essentially relies on a loophole to the GPL wording about linking, is to license the conda binary as the associated package source code is licensed.

References

[1] https://en.wikipedia.org/wiki/Google_v._Oracle_America
[2] https://opensource.stackexchange.com/a/2163/17049

Footnotes

[a] It is possible that a user's conda install command pulls binaries from a variety of public and private conda channels, so we don't know what exactly will be installed by the user, but our intent in designing the meta.yaml file is that a particular binary in the conda forge channel is installed.
[b] The reason the license has to be GPL compatible is because you are required to release the source code along with the binary per the GPL. Any GPL compatible license can be relicensed as GPL.
[c] The conda forge build process links the primary binary to the GPL dependency binary in the build process, thus the primary binary is considered a "combined work" by the FSF: http://www.gnu.org/licenses/gpl-faq.en.html#GPLStaticVsDynamic

moorepants avatar May 19 '20 19:05 moorepants

There seem to be a few interrelated questions. Pulling them apart to hopefully move the conversation forward. Also IANAL.

  1. What should the license of a library be if it depends on another library that is GPL?
  2. Does this change if the dependency is optional?
  3. What should the license of a package be?
  4. What are the license implications for end users?

AIUI 1 means the library itself should be GPL, but I could be wrong (related discussion suggests this is unclear). AIUI 2 depends on whether the library is linked or not.

With 3, I would propose we just mirror the packaged library's licensing as-is. It's up to the library to correctly reflect this in the license in terms of handling 1 and 2. Our goal is then to make sure the end user has this information available.

With 4, I think this is out-of-scope for conda-forge. It's up to companies, government agencies, institutions, etc. to determine the license implications for them and handle as needed. Should add when a collection of packages is installed onto a user's system, they have all of the licenses and license information available to them to determine next steps.

jakirkham avatar May 19 '20 19:05 jakirkham

cc @conda-forge/core (for more/other thoughts :)

jakirkham avatar May 19 '20 19:05 jakirkham

With 4, I think this is out-of-scope for conda-forge. It's up to companies, government agencies, institutions, etc. to determine the license implications for them and handle as needed. Should add when a collection of packages is installed onto a user's system, they have all of the licenses and license information available to them to determine next steps.

:100: to this. We are not lawyers and should not ever give any impression that we are supplying advice to others.

beckermr avatar May 19 '20 19:05 beckermr

What should the license of a library be if it depends on another library that is GPL?

I think that this should be "What should the license of a library be if it depends on and dynamically links against another library that is GPL during its build process?"

I would propose we just mirror the packaged library's licensing as-is. It's up to the library to correctly reflect this in the license in terms of handling 1 and 2.

I'm not sure these are the safest interpretations, because Conda Forge and Anaconda.org (where the binaries are stored) are distributing the binaries. I've long been under the impression that if a person, company, or other entity distributes software, they have to abide by the license during that act of distributing. Maybe it's worth understanding who it is that distributes a conda-built binary?

We are not lawyers and should not ever give any impression that we are supplying advice to others.

These kinds of statements are always made on seemingly every discussion of licenses, but I'm not sure that statements like this are helpful. As a member of society that has laws, we have to interpret the meaning of the laws and choose actions that abide by them to the best of our ability. If every response to license questions is "no comment, we are not lawyers", we'll never make it anywhere because there aren't enough lawyers to go around interpreting life for us (pro-bono or not).

moorepants avatar May 19 '20 20:05 moorepants

With 4, I think this is out-of-scope for conda-forge. It's up to companies, government agencies, institutions, etc. to determine the license implications for them and handle as needed. Should add when a collection of packages is installed onto a user's system, they have all of the licenses and license information available to them to determine next steps.

Yes, what an end user does with the software is not our responsibility. If they use it or distribute it, they have to abide by the licenses and spend their effort interpreting the licenses.

moorepants avatar May 19 '20 20:05 moorepants

These kinds of statements are always made on seemingly every discussion of licenses, but I'm not sure that statements like this are helpful. As a member of society that has laws, we have to interpret the meaning of the laws and choose actions that abide by them to the best of our ability. If every response to license questions is "no comment, we are not lawyers", we'll never make it anywhere because there aren't enough lawyers to go around interpreting life for us (pro-bono or not).

The alternative is for us to be giving legal advice which is a huge problem. Hence no comment on how users should interpret and use the license info.

beckermr avatar May 19 '20 20:05 beckermr

The alternative is for us to be giving legal advice which is a huge problem.

I've opened this issue as a request for advice on a matter of legality, not a request for "legal advice", i.e. advice from a lawyer (although if lawyers want to chime in that'd be helpful). So I hope that we can discuss this and decide how we should license binary packages built and distributed using Conda Forge's resources.

For me, if I create the build recipe and initiate the action of building the software and distributing it, I want to have some personal confidence that I'm applying and including the licenses correctly in the binary I helped distribute, because I could be held liable for that action.

moorepants avatar May 19 '20 20:05 moorepants

That's fine. I don't think you'll find anyone here willing to provide advice of this nature, whatever that nature is.

beckermr avatar May 19 '20 20:05 beckermr

For me, if I create the build recipe and initiate the action of building the software and distributing it, I want to have some personal confidence that I'm applying and including the licenses correctly in the binary I helped distribute, because I could be held liable for that action.

Agreed 100%. We worry too!

beckermr avatar May 19 '20 20:05 beckermr

I found a conversation about this same issue in the pyFFTW library's issue tracker. I've read it, there are various opinions, and I don't yet have a good summary, but I'll post it here for informational purposes:

https://github.com/pyFFTW/pyFFTW/issues/229

moorepants avatar May 20 '20 03:05 moorepants

Also this thread:

https://github.com/xtensor-stack/xtensor-fftw/issues/36

moorepants avatar May 20 '20 03:05 moorepants

The author of the FFTW library left this comment on the Julia bindings to FFTW:

https://github.com/JuliaMath/FFTW.jl/pull/41#discussion_r141968395

And the resulting message in the README of the Julia bindings is here:

The FFTW library will be downloaded on versions of Julia where it is no longer distributed as part of Julia. Note that FFTW is licensed under GPLv2 or higher (see its license file), but the bindings to the library in this package, FFTW.jl, are licensed under MIT. This means that code using the FFTW library via the FFTW.jl bindings is subject to FFTW's licensing terms. Code using alternative implementations of the FFTW API, such as MKL's FFTW3 interface are instead subject to the alternative's license. If you distribute a derived or combined work, i.e. a program that links to and is distributed with the FFTW library, then that distribution falls under the terms of the GPL. If you just distribute source code that links to FFTW.jl, and users have to download FFTW or MKL to provide the backend, then the GPL probably doesn't have much effect on you.

So, this seems to indicate if FFTW is downloaded separately from the software that links to it (as is done with conda) then the GPL does not infect the software.

moorepants avatar May 20 '20 04:05 moorepants

More related issues:

  • https://github.com/conda-forge/conda-forge.github.io/issues/209
  • https://github.com/conda-forge/pyfftw-feedstock/issues/24

moorepants avatar May 20 '20 04:05 moorepants

@jakirkham @moorepants Redirecting from the conversation that started at https://github.com/conda-forge/libtiff-feedstock/issues/74

After putting some more thought into this

I agree that it is unreasonable to audit everything - but pragmatically each recipe is individually maintained, and it seems reasonable to verify that the package's license is compatible with the licenses of its immediate dependencies. If every node in the tree does its due diligence, no further problem exists.

This in fact seems to be a task that the cf-linter (or a new workflow) could manage, providing a non-critical message on PRs for maintainers to review. I mention the cf-linter because it already flags licenses it doesn't understand. In essence the following could be added:

If the package license is already GPL then do no work otherwise the following:

  • Collect the host & run dependencies for the package(s)
  • Check if any of them have a GPL license
  • Post a non-critical message asking for the reviewers to confirm the package licensing is correct

That approach should keep package dependency trees clean and then maintainers can easily make informed decisions on their recipes. It absolves conda-forge from having to police licensing and provides a generally clearer picture of what is going on in the tree to downstream packages.
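A minimal sketch of the steps above, in Python. This is only an illustration under stated assumptions: `Recipe`, `is_gpl`, and `gpl_dependency_hint` are hypothetical names and not part of the actual conda-forge linter, dependencies are assumed to be bare package names without version pins, and the license of each dependency is assumed to be available in a plain dictionary of SPDX-style identifiers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Recipe:
    """Hypothetical, simplified view of a recipe: name, license, dependencies."""
    name: str
    license: str
    host: list[str] = field(default_factory=list)
    run: list[str] = field(default_factory=list)


def is_gpl(license_id: str) -> bool:
    """Crude prefix check on an SPDX-style identifier.

    Matches "GPL-2.0-or-later" etc.; LGPL and AGPL are deliberately not matched.
    """
    return license_id.upper().startswith("GPL")


def gpl_dependency_hint(recipe: Recipe, licenses: dict[str, str]) -> Optional[str]:
    """Return a non-critical message for reviewers, or None if nothing to flag."""
    # A package that is already GPL needs no further work.
    if is_gpl(recipe.license):
        return None
    # Collect the host and run dependencies.
    deps = sorted(set(recipe.host) | set(recipe.run))
    # Check whether any of them has a GPL license (unknown licenses are skipped).
    gpl_deps = [d for d in deps if is_gpl(licenses.get(d, ""))]
    if not gpl_deps:
        return None
    # Ask the reviewers to confirm; this is a hint, not a failure.
    return (
        f"Hint: {recipe.name} ({recipe.license}) depends on GPL-licensed "
        f"package(s): {', '.join(gpl_deps)}. "
        "Please confirm the package license is correct."
    )


# Usage with made-up package names (only fftw's license comes from this thread).
licenses = {"fftw": "GPL-2.0-or-later", "some-bsd-lib": "BSD-3-Clause"}
recipe = Recipe("example-pkg", "BSD-3-Clause", host=["fftw"], run=["some-bsd-lib"])
print(gpl_dependency_hint(recipe, licenses))
```

The check only returns a message and never fails, which matches the idea of a non-critical note: whether the flagged combination is actually a problem is left to the maintainers.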

bryan-hunt avatar May 24 '22 14:05 bryan-hunt

If the package license is already GPL then do no work otherwise the following:

That should be, "If the package license is GPL compatible then do no work." As long as your license is GPL compatible (MIT, BSD, LGPL...), there's no problem with linking to GPL libraries.

peastman avatar May 24 '22 14:05 peastman

If the package license is already GPL then do no work otherwise the following:

That should be, "If the package license is GPL compatible then do no work." As long as your license is GPL compatible (MIT, BSD, LGPL...), there's no problem with linking to GPL libraries.

That is only true for LGPL and when there is a GPL exception in place for linking (such as the famous Linux userspace linking exception). The GPL is explicitly designed to propagate. You can include code under lesser licenses (MIT, BSD, LGPL, etc.) within a GPL project, but you can't relicense GPL code under a less restrictive license. Compatibility is not bi-directional, and many of the libraries that are GPL make this explicit as well. See https://www.gnu.org/licenses/gpl-faq.en.html#WhatDoesCompatMean

Any time a lesser license links to a GPL library you need to confirm that the author of that library does not intend for the license to propagate. If there is no explicit declaration that it does not supersede the license of the resulting linked code then you must consider that it does. Just because it's a library and not a full application does not mean you can get away with ignoring the GPL - if the resulting library is linked it is covered by the terms of the GPL that it is linked to.

Some additional references: https://wikipedia.org/wiki/GPL_linking_exception

bryan-hunt avatar May 24 '22 15:05 bryan-hunt

This in fact seems to be a task that the cf-linter (or a new workflow) could manage

Ultimately, you have to remember that conda-forge is a community project with many of us donating our time.

So the issue isn't that this is not a good idea, but rather that implementing it is not trivial and takes time.

hmaarrfk avatar May 24 '22 16:05 hmaarrfk

That's not correct. You are free to distribute your own code under any license you want. It doesn't include any GPL code, so there's no problem.

When your program gets loaded into memory, and links to a separately installed GPL library, that linked executable in memory becomes a "derived product" under the terms of the GPL and must be treated as GPL licensed. Your own library must therefore have a license that permits doing that. This is what's meant by "GPL compatible".

https://www.gnu.org/licenses/gpl-faq.html#LinkingWithGPL

peastman avatar May 24 '22 16:05 peastman

That's not correct. You are free to distribute your own code under any license you want. It doesn't include any GPL code, so there's no problem.

When your program gets loaded into memory, and links to a separately installed GPL library, that linked executable in memory becomes a "derived product" under the terms of the GPL and must be treated as GPL licensed. Your own library must therefore have a license that permits doing that. This is what's meant by "GPL compatible".

https://www.gnu.org/licenses/gpl-faq.html#LinkingWithGPL

Yes - under most circumstances the packages we're talking about are compiled libraries and not severable from their dependency. This is why there is the recipe license and the package license - the package license reflecting the source code license rather than the derived work license is a misrepresentation.

This gets weird when talking about Python - but when it comes to compiled libraries it ends up being pretty clear.

bryan-hunt avatar May 24 '22 16:05 bryan-hunt