[POC] license ApacheV2
After #558
Context
The current license for xdem is the MIT License. We would like to transition to the Apache License 2.0 for the following reasons:
- The Apache License 2.0 is a comprehensive license, whereas MIT and BSD licenses are more like copyright notices that need to be adjusted on a case-by-case basis.
- The Apache License 2.0 is a newer license compared to the older MIT and BSD licenses, which were created in the 1980s. The Apache License 2.0, released in 2004, is better structured and more explicit about the rights granted. The MIT and BSD licenses originated at a time when software legal frameworks were less developed, with international harmonization only occurring in 1996 through the WIPO Copyright Treaty.
- The Apache License 2.0 includes an "anti-patent" clause (a feature it shares with GNU GPL, LGPL, and AGPL v3.0 licenses released in 2007). This clause protects users from contributors who might hold patents and threaten legal action in the future.
Implementation
- [ ] Audit the tool and verify license compatibility
- [ ] Manage richdem [todo] #558
- [ ] Obtain consent from all rights holders
- [ ] Obtain copyright information from all rights holders
- [ ] Update the license file
- [ ] Update file headers, e.g.:
# Copyright (c) 2024 Centre National d'Etudes Spatiales (CNES).
# Copyright (c) 2024 xdem developers
#
# This file is part of xdem project:
# https://github.com/glaciohack/xdem
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- [ ] Update documentation
- [ ] Update README files
- [ ] Implement an information campaign
- [ ] Verify file compliance
- [ ] Create an AUTHORS.md file
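The "Verify file compliance" step could be automated. Below is a minimal sketch, assuming a compliant file contains the line `Licensed under the Apache License, Version 2.0` near its top. The names `has_apache_header` and `find_noncompliant` and the 20-line window are illustrative assumptions, not existing xdem tooling.

```python
from pathlib import Path

# Marker line taken from the Apache 2.0 header; assumed sufficient to detect it.
MARKER = "Licensed under the Apache License, Version 2.0"


def has_apache_header(text: str, max_lines: int = 20) -> bool:
    """Return True if the marker appears within the first `max_lines` lines."""
    head = text.splitlines()[:max_lines]
    return any(MARKER in line for line in head)


def find_noncompliant(root: str) -> list[Path]:
    """List the .py files under `root` that lack the header marker."""
    return sorted(
        path
        for path in Path(root).rglob("*.py")
        if not has_apache_header(path.read_text(encoding="utf-8"))
    )
```

Calling `find_noncompliant(".")` from the repository root would return the Python files still missing the header. The same check could run in CI, but that is a design choice left open here.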
Estimation
1d
Waiting for richdem issue
After discussing with @erikmannerfelt, we agreed that a long header at the top of each file, listing all contributors to that file, is a bit of a hassle, and we would prefer an easier solution if possible. Looking at other projects under the Apache v2 license, such as xarray or OpenCV, none of them seems to follow the structure suggested in this issue and used in demcompare.
- Regarding the header: xarray does not include headers, OpenCV has the Apache v2 header in some files (e.g., this one) but shorter headers simply pointing to the license file in others (e.g. this one)
- Regarding the copyright: In most open source projects that we use, listing individual contributors by name seems quite uncommon. For NumPy, some author names are explicitly stated, while elsewhere the copyright is attributed to the "NumPy developers" (see a list of copyright occurrences); xarray also seems to use "xarray Developers" in several places; pandas requires using "Copyright (c) 2012, PyData Development Team"; etc.
In short, our preference would be to use a short header like OpenCV pointing to the license file, and to have a copyright to "the xdem developers" (possibly with a list of contributors somewhere in a file) and CNES. We could discuss with Sebastien about the drawbacks/problems of this approach.
As @adehecq referred this question to me, I'll take the liberty of replying here.
Firstly, a distinction must be made between the contributor and the copyright holder / owner.
- Contributor: The author (individual) of a work (source code, documentation, sound or graphic work, etc.) provided to the project, so that it can be integrated into the project and distributed with it.
- Copyright holder / owner: The individual or legal body who holds the copyright on the contribution. In France and the European Union, when an employee develops software as part of his or her job, he or she is the author of the work, but not the copyright holder. The law automatically assigns copyright to the employer (researchers working in public research have a different legal status).
The Apache Foundation recommends that the copyright header should mention the names of the copyright holders, not the contributors (i.e. the authors of the contributions).
It is quite possible to group them together under a generic term such as:
- XXX development team (community approach, deliberately avoiding distinguishing merits, which may fluctuate over time).
- YYY and others (if YYY has had a truly decisive role in this project compared to the others).
The Apache Foundation recommends the copyright header below:
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
In my humble opinion, the header should also mention the project the file comes from, so that its origin can be traced and the file is self-contained:
Copyright [yyyy] [name of copyright owner]
This file is part of xdem project:
https://github.com/glaciohack/xdem
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
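If this header is adopted, adding it to existing Python files could be scripted. Below is a sketch under stated assumptions: the header is written as `#` comments, the year and owner are passed in by the caller, and `add_header` is a hypothetical helper rather than an existing tool. It leaves files that already mention the license untouched and keeps a shebang line first.

```python
HEADER_TEMPLATE = """\
# Copyright (c) {year} {owner}
#
# This file is part of xdem project:
# https://github.com/glaciohack/xdem
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""


def add_header(source: str, year: int, owner: str) -> str:
    """Return `source` with the header prepended, unless already present."""
    if "Licensed under the Apache License" in source:
        return source  # already licensed: leave the file unchanged
    header = HEADER_TEMPLATE.format(year=year, owner=owner)
    # Keep a shebang line first so scripts stay executable.
    if source.startswith("#!"):
        first, _, rest = source.partition("\n")
        return f"{first}\n{header}{rest}"
    return header + source
```

For example, `add_header(path.read_text(), 2024, "xdem developers")` would return the new file content. Writing it back to disk is left to the caller so the change can be reviewed first.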
The main authors and other contributors can be listed in an AUTHORS (or equivalent name) file, provided in the root folder of the Git repository. Cf.:
- https://github.com/psf/requests/blob/main/AUTHORS.rst
- https://github.com/nedbat/coveragepy/blob/master/CONTRIBUTORS.txt
- https://github.com/apache/xerces-c/blob/master/CREDITS
Last but not least, it is strongly recommended to announce the license in a dedicated section of the README file, which points to the license file:
- https://www.freecodecamp.org/news/how-to-write-a-good-readme-file/
Here are some examples:
- https://gitlab.orekit.org/orekit/orekit#license
- https://github.com/RasaHQ/rasa/blob/main/README.md#license
- https://github.com/create-go-app/cli/blob/main/README.md#%EF%B8%8F-license
@sdinot, you are of course more than welcome to write on this project! Thank you for the clear details. As discussed yesterday in a meeting with @adebardo and @duboise-cnes, the preferred option would indeed be to have an "AUTHORS.rst" file or equivalent. I like the example from requests that you gave, which groups contributors into current, past, and more minor categories. Something like this would be easy to add and maintain for geoutils/xdem.
What is still unclear to me is which copyright holders should appear in the file headers. It is clear for CNES or CS contributions, but what about contributions from current Glaciohack members, or future "independent" contributors? Ideally we would like something generic, that may point to a more detailed list of copyright holders, e.g. in the authors file. And of course it would be good to prevent future contributors from adding a copyright that would make future changes very difficult... Finally, in practice, in which situation would the copyright statement become important? With the permissive license, almost everything is allowed and we do not plan to switch to a less permissive license in the future.
And yes, I agree it would be good to mention the license in the README.
> Finally, in practice, in which situation would the copyright statement become important? With the permissive license, almost everything is allowed and we do not plan to switch to a less permissive license in the future.

There are many reasons for tracing copyright holders correctly, and they are all the more important in a world that is becoming increasingly litigious and that tends to treat intellectual property as sacred.
We must never forget that distributing a work under a free and open source license is not a surrender of the rights granted by law, but a sharing of some (not all) of those rights.
Even when you publish a project under a permissive license such as the Apache v2.0 license (or even the MIT or BSD license), you retain some of your rights exclusively, which allow you to make decisions and choices that are forbidden to others.
For example, let's imagine that one day a legal loophole or weakness is detected in version 2.0 of the Apache license, leading the Apache Foundation to publish a new version (3.0) of its license. In that case, it would certainly be worthwhile for the xdem project to adopt this new version of the Apache license and declare that from this day forward, xdem is distributed under the Apache v3.0 license. Only the copyright holders can do this.
Here's another example of how being recognized as a copyright holder helps, which CS encountered with the Orekit project (released under the Apache v2.0 license). A third party had copied Orekit's source code into his project (which he had decided to turn into a proprietary tool), removing the copyright notices and replacing them with his own, thus claiming to be the sole author and copyright holder of this library. We discovered this illegal action by chance and, because we were the copyright holders of Orekit, we were able to force this third party to respect the license. They had to restore the copyright notices in their software, and indicate in their software documentation that they were using Orekit. Here again, only the copyright holders can take legal action if necessary.
It is therefore in your interest to make it known that you are the copyright holder.
> Ideally we would like something generic, that may point to a more detailed list of copyright holders

Yes, you can do that. You can maintain two files: AUTHORS (or CONTRIBUTORS) and COPYRIGHTOWNERS. GitHub has attempted to create a de facto standard through the CODEOWNERS specification, but the purpose of that file is a little different from ours.