bee-agent-framework

copyright.sh could use some improvement for open source project

Open markstur opened this issue 1 year ago • 3 comments

Describe the bug

There are 3 (debatable) problems with how copyright.sh automatically adds copyright headers.

  1. Using "IBM Corp." in the copyright is not very open source community friendly. There are many opinions, but a good Linux Foundation suggestion is "Copyright The XYZ Authors." where XYZ is the project name (https://www.linuxfoundation.org/blog/blog/copyright-notices-in-open-source-software-projects)
  2. The one-line SPDX-License-Identifier is becoming preferred over text blobs to reduce the slight variations that happen in text blobs (potentially significant rewording).
  3. I got an error because "jq" was not installed. We can avoid that dependency (or else it should be documented in the dev setup somewhere).
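
For the third point, one way to avoid a confusing failure is to check for required tools up front and print an actionable message. A minimal sketch, assuming the script keeps depending on jq; the `require_tool` helper name is hypothetical and not part of the current copyright.sh:

```shell
#!/bin/sh
# Hypothetical helper: fail early with a clear message when a tool is missing.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    printf 'error: "%s" is required by this script; please install it first\n' "$1" >&2
    return 1
  fi
}
```

A script could then start with `require_tool jq || exit 1`, so a new contributor sees what to install instead of a bare "not found" error.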

Maybe this debate was already settled, but I'm suggesting we make those changes. It looks like this would affect our other repos (same script to change).

TBD: Whether to change existing copyrights. I don't like touching existing copyrights, so I'd focus on fixing this for added files, but the project is new enough that updating copyrights from IBM to project authors might be a good thing to do.

To Reproduce

Steps to reproduce the behavior:

  1. Add a new file w/o copyright and license
  2. Commit it
  3. See the copyright/license header that was added (if you have jq)
  4. Also you could just look in our existing files already added

Expected behavior

  1. An open source community project should ideally have an open license and avoid individual person or company copyrights.
  2. A per-file license might be a best practice, but it can be abbreviated to be succinct and consistent
  3. New devs should not get "jq" not found errors when doing a commit (unless we add that to our contributing guide)
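
Expected behaviors 1 and 2 could look roughly like the following in a header script. This is only a sketch, not the current copyright.sh: the `add_header` and `has_spdx_header` names, the `//` comment style (suitable for .ts/.js files), and the "The XYZ Authors" placeholder are all assumptions. It needs no jq, and it does not handle files that start with a shebang line.

```shell
#!/bin/sh
# Hypothetical sketch: prepend a short copyright + SPDX header to a file
# unless one is already present.

# Assumed values; replace with whatever the project decides on.
AUTHORS="The XYZ Authors"
LICENSE_ID="Apache-2.0"

has_spdx_header() {
  # Only look near the top of the file, where a header would live.
  head -n 10 "$1" | grep -q "SPDX-License-Identifier:"
}

add_header() {
  file="$1"
  # Leave files that already carry an SPDX identifier untouched.
  if has_spdx_header "$file"; then
    return 0
  fi
  tmp="$(mktemp)" || return 1
  {
    printf '// Copyright %s\n' "$AUTHORS"
    printf '// SPDX-License-Identifier: %s\n\n' "$LICENSE_ID"
    cat "$file"
  } > "$tmp" && cat "$tmp" > "$file"   # writing back in place keeps the file mode
  rm -f "$tmp"
}
```

Because files that already have an identifier are skipped, running it twice on the same file adds the header only once, and existing copyrights are left alone.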

Additional context

Feel free to have opinions on this. The current process was probably leveraged from other projects. There's more than one way.

markstur avatar Sep 20 '24 23:09 markstur

/assign me

markstur avatar Sep 20 '24 23:09 markstur

We could switch to license-eye (installable via brew, written in Go).

Usage:

.licenserc.yaml

header:
  license:
    spdx-id: Apache-2.0
    copyright-owner: IBM Corp.
  paths: ["{src,dist,tests,scripts}/**.{ts,js}"]

Commands:

license-eye header check
license-eye header fix

From the next library, I would expect the following features:

  • easy to install
  • easy integration into the pre-commit check (the CLI must exit with a non-zero code if any file lacks the appropriate header; unfortunately nwa does not support that, as it always exits with code zero)
  • easy integration into GitHub Actions
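
The exit-code requirement can be sketched as a small pre-commit wrapper. The `run_header_check` function name is hypothetical; it runs whatever checker command it is given and relies only on that command's exit status, which assumes the checker exits non-zero when a header is missing:

```shell
#!/bin/sh
# Hypothetical pre-commit helper: run the given header checker and
# block the commit (non-zero return) if the checker reports a failure.
run_header_check() {
  if "$@"; then
    return 0
  fi
  echo "License header check failed; fix the headers and commit again." >&2
  return 1
}
```

In a hook this would be called as `run_header_check license-eye header check || exit 1`. A tool that always exits with zero, as described for nwa above, would never block the commit this way.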

Tomas2D avatar Sep 21 '24 09:09 Tomas2D

Regarding jq and other required local tools (like nwa / license-eye) -- we could set up mise-en-place which is something like nvm but for everything (https://mise.jdx.dev/dev-tools/backends/). So one would only need brew install mise and then entering the folder would automatically put the correct set of tools into PATH (or prompt to install them).

For users who don't want to use mise, we would just document that they need to install the tools listed in .mise.toml in any way they like.

JanPokorny avatar Sep 24 '24 08:09 JanPokorny

This issue is lingering in the open status. Do we need a quick circle up to resolve, or can we close?

geneknit avatar Nov 05 '24 22:11 geneknit

@geneknit Maybe we should reevaluate whether we need the per-file licence header at all? Many projects just use a single license file per repository, including e.g. Microsoft projects (example: https://github.com/microsoft/autogen/blob/main/python/packages/autogen-core/src/autogen_core/components/_closure_agent.py)

JanPokorny avatar Nov 06 '24 08:11 JanPokorny

This issue is lingering in the open status. Do we need a quick circle up to resolve, or can we close?

I don't mind closing this if we have no immediate consensus, but the current IBM copyright being automatically applied to non-IBM contributions is very questionable. Apache2 makes the actual copyright have little legal effect anyway ("2. Grant of Copyright License") whether it is IBM or project or individual contributor.

InstructLab settled on just a one-line "# SPDX-License-Identifier: Apache-2.0" w/ no copyrights in files. caikit uses "# Copyright The Caikit Authors" followed by the Apache2 block comment.

markstur avatar Nov 06 '24 19:11 markstur

Does this sound okay to you @ismaelfaro?

geneknit avatar Nov 08 '24 01:11 geneknit