
Automate the process of updating the license

Open vchrombie opened this issue 5 years ago • 8 comments

As of now, updating the license header is a manual process, even when only the year changes.

Example: https://github.com/chaoss/grimoirelab-perceval/commit/076953e95735401b4d9266562f9ae406a30751a0

We can automate this, since it applies to all the components in GrimoireLab. Maybe a small script that updates the year, which can be run once a year.

This can be extended to updating the Authors in the license too.

Inspired by https://github.com/Bitergia/prosoul/issues/13#issuecomment-591940810. :slightly_smiling_face:

vchrombie avatar Feb 27 '20 12:02 vchrombie

This issue is open for discussion. Once we decide what needs to be automated, we can proceed with the implementation.

I would like to help with the implementation part too. :slightly_smiling_face:

vchrombie avatar Feb 27 '20 12:02 vchrombie

Thank you for opening this issue @vchrombie!

Please could you propose:

  • the metadata information to be updated within a file
  • a possible implementation/approach to update the file metadata information

We can use your proposal as a baseline for discussions.

valeriocos avatar Feb 27 '20 12:02 valeriocos

Hi @valeriocos

  • the metadata information to be updated within a file
  • We have to update the copyright year period in each file. perceval/init.py#L3

  • I am not sure about the Author field. How do you define an author here? Is it only the person who originally created the file, or anyone who has contributed to it? perceval/init.py#L18

    I would appreciate some input on this, if possible. cc @germonprez @GeorgLink @jsmanrique @jgbarah


  • a possible implementation/approach to update the file metadata information

I have two possible implementations for now.

  1. One practical approach is to replace the particular line. To be specific, replace old_text with new_text. You can refer to the gist. It is a somewhat weird approach, but it works fine. You can see the changes here: https://github.com/chaoss/grimoirelab-perceval/compare/master...vchrombie:license-automation.

  2. There is an existing project specifically for this purpose, johann-petrak/licenseheaders. It works really well if you only need to update the years.

python3 licenseheaders.py -y 2015-2020 -d perceval/backends

This would change the copyright year period in all the code files in the backends folder. There is no support for the Authors as of now, but I think we could fork the project and adapt it as required for the chaoss organization.
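To make approach 1 more concrete, here is a minimal sketch of the year substitution using a regular expression instead of a fixed old_text/new_text pair. The function name `update_copyright_year` and the header wording it matches ("Copyright (C) 2015-2019 ...") are assumptions for illustration; this is not the code from the gist.

```python
import re

# Matches headers such as "# Copyright (C) 2015-2019 Bitergia".
# The exact header wording is an assumption based on the sample in this thread.
COPYRIGHT_RE = re.compile(r"(Copyright \([cC]\) )(\d{4})(?:-(\d{4}))?")


def update_copyright_year(text, current_year):
    """Return text with the end of the copyright period set to current_year."""

    def _replace(match):
        prefix, start = match.group(1), match.group(2)
        if int(start) == current_year:
            # A file created this year keeps a single year, not "2020-2020".
            return f"{prefix}{start}"
        return f"{prefix}{start}-{current_year}"

    # Only touch the first match so other mentions of years are left alone.
    return COPYRIGHT_RE.sub(_replace, text, count=1)
```

Matching on the pattern rather than on an exact string means files whose period starts in different years can be handled in the same run.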

vchrombie avatar Feb 28 '20 09:02 vchrombie

I am not sure about the Author field. How do you define an author here? Is it only the person who originally created the file, or anyone who has contributed to it? perceval/init.py#L18

I would go for a simple and common definition: an author is anyone who at some point has edited/authored the file. For instance, in the case of perceval/init.py, we should have 3 authors (as pointed out by the GitHub UI).

I have two possible implementations for now.

Why is solution 1 weird? If I understand the approach correctly, it looks for some text in a given file and replaces it, right?

Solution 2. seems to be too much for what we are looking for (that tool uses templates and focuses on licences), however we can have a look at it as a source of ideas.

There is no support for the Authors as of now

Maybe we could use the Git backend of Perceval to get the commits of a repository and then extract the authors of the commits in a given file. Another option could be to use the GitHub commits API to get the same information. WDYT?
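As a rough illustration of the idea, the sketch below collects the authors of a single file from its commit history. Note that it swaps in a plain `git log` call through `subprocess` rather than the Perceval Git backend or the GitHub commits API, simply to keep the example dependency-free; the function names `unique_authors` and `file_authors` are hypothetical.

```python
import subprocess


def unique_authors(log_lines):
    """Deduplicate "Name <email>" lines, keeping first-seen order."""
    seen = set()
    authors = []
    for line in log_lines:
        author = line.strip()
        if author and author not in seen:
            seen.add(author)
            authors.append(author)
    return authors


def file_authors(repo_path, file_path):
    """List everyone who has authored a commit touching file_path."""
    result = subprocess.run(
        ["git", "log", "--follow", "--format=%aN <%aE>", "--", file_path],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    # git log prints the newest commit first; reverse to list the
    # earliest author first.
    return unique_authors(reversed(result.stdout.splitlines()))
```

The same person may still show up twice if they committed under different names or emails; a `.mailmap` file in the repository would be one way to merge those identities.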

valeriocos avatar Feb 28 '20 12:02 valeriocos

I would go for a simple and common definition: an author is anyone who at some point has edited/authored the file. For instance, in the case of perceval/init.py, we should have 3 authors (as pointed out by the GitHub UI).

Okay.

Why is solution 1 weird? If I understand the approach correctly, it looks for some text in a given file and replaces it, right?

The approach involves creating a temporary file, writing the contents of the source file into it with the string substituted, and finally replacing the source file with the temporary file. It just felt weird to me because it takes a lot of steps, nothing else. :sweat_smile:
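For reference, the temporary-file flow described above can be sketched in a few lines. This is a minimal sketch, not the PoC itself; the function name `replace_in_file` is hypothetical.

```python
import os
import shutil
import tempfile


def replace_in_file(path, old_text, new_text):
    """Rewrite path with old_text replaced by new_text via a temp file."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, encoding="utf-8") as source:
        content = source.read()

    # Create the temp file next to the target so that os.replace stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content.replace(old_text, new_text))
        # mkstemp creates a private file; copy the original permission bits.
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

The extra steps buy some safety: if the script is interrupted half-way, the original file is still intact, because it is only swapped out at the very end.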

Solution 2. seems to be too much for what we are looking for (that tool uses templates and focuses on licences), however we can have a look at it as a source of ideas.

Yes, exactly.

Maybe we could use the Git backend of Perceval to get the commits of a repository and then extract the authors of the commits in a given file. Another option could be to use the GitHub commits API to get the same information. WDYT?

This seems to be a perfect idea. :smiley:

vchrombie avatar Feb 28 '20 12:02 vchrombie

Thank you for the quick reply!

The approach involves creating a temporary file, writing the contents of the source file into it with the string substituted, and finally replacing the source file with the temporary file. It just felt weird to me because it takes a lot of steps, nothing else. :sweat_smile:

I see :) it's a PoC, it can be improved in the next iteration

valeriocos avatar Feb 28 '20 12:02 valeriocos

Hi @valeriocos

Could you please check this repository, vchrombie/grimoirelab-scripts, when you have time? I have written a script which fetches the authors' names and combines them with the copyright information using a template.

This is the result when I ran the script on backend.py.

# Copyright (c) 2015-2020 Bitergia
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Authors:
#     animesh <[email protected]>
#     Valerio Cosentino <[email protected]>
#     JJMerchante <[email protected]>
#     Santiago Dueñas <[email protected]>
#     Harshal Mittal <[email protected]>
#     Jesus M. Gonzalez-Barahona <[email protected]>

As of now, most of the values (such as the file paths) are hard-coded. This can be improved by iterating through the files, maybe using the os module.
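A minimal sketch of that iteration, assuming we only care about `.py` files and want to skip hidden directories such as `.git`; the function name `find_python_files` is hypothetical.

```python
import os


def find_python_files(root):
    """Yield the paths of all .py files below root, in a stable order."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk which directories to
        # descend into; sorting keeps the traversal order stable.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

Each yielded path could then be passed to the header update, instead of listing the files by hand.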

Also, the next step here is to remove the initial content (the existing copyright information in the file) and add the new content (the generated one). This should not matter much for the history, since git only shows the resulting additions and deletions, not how the change was produced.
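One possible way to do that swap is sketched below. It assumes the existing header is a contiguous block of `#` comment lines starting at the line that contains "Copyright", as in the sample above; the function name `replace_header` is hypothetical.

```python
def replace_header(source, new_header):
    """Swap the existing copyright comment block for new_header."""
    lines = source.splitlines(keepends=True)

    # Find where the existing header starts; leave the file alone if absent.
    start = next(
        (i for i, line in enumerate(lines)
         if line.startswith("#") and "Copyright" in line),
        None,
    )
    if start is None:
        return source

    # The header ends at the first line that is not a '#' comment.
    end = start
    while end < len(lines) and lines[end].startswith("#"):
        end += 1

    header = new_header if new_header.endswith("\n") else new_header + "\n"
    return "".join(lines[:start]) + header + "".join(lines[end:])
```

Anything before the copyright line, such as an encoding declaration, is kept as it is, and files without a header are returned unchanged.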

vchrombie avatar Mar 07 '20 21:03 vchrombie

Hi @valeriocos

A small update.

I completed the script. It works now, although it can still be improved. I also tried updating the source code files and opened a draft PR to test the script: https://github.com/chaoss/grimoirelab-perceval/pull/623.

vchrombie avatar Mar 08 '20 10:03 vchrombie