RFC: conventional commits

Open ritchie46 opened this issue 3 years ago • 4 comments

I believe @universalmind303 already suggested this once. The polars repo is very fast-paced, and keeping a changelog has proven to be too high a maintenance burden.

To make it possible to automate creating a changelog from commit messages, we need a standard format. Therefore, I want to propose using conventional commit messages.

Ideally they can even be enforced in CI.

Proposal

Types:

  • feat
  • fix
  • chore
  • ci
  • docs
  • perf
  • test
  • refactor
  • revert

Scopes:

May be comma delimited

Required:

  • python
  • rust

Optional:

...

Breaking changes

Add ! just before the colon : that precedes the description.

Add a BREAKING CHANGE footer, followed by an optional description.

<type>[optional scope(s)]: <description>

[optional body]

[optional footer(s)]
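To make the format concrete, here is a minimal sketch of how a commit header in this shape could be validated, for example in a CI check. It is only an illustration, not an agreed implementation. Assumptions: scopes are written in parentheses as in the Conventional Commits spec, "Required" is read as "at least one of python/rust must be present", and the names `parse_header` and `CommitHeader` are hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Types and required scopes as listed in the proposal.
TYPES = ("feat", "fix", "chore", "ci", "docs", "perf", "test", "refactor", "revert")
REQUIRED_SCOPES = {"python", "rust"}

# Assumption: scopes are comma delimited and wrapped in parentheses.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scopes>[^()]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>.+)$"
)


@dataclass
class CommitHeader:
    type: str
    scopes: list[str]
    breaking: bool
    description: str


def parse_header(header: str, body: str = "") -> CommitHeader:
    """Parse and validate a commit header; raise ValueError if it does not conform."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"not a conventional commit header: {header!r}")
    if match["type"] not in TYPES:
        raise ValueError(f"unknown type: {match['type']!r}")

    scopes = [s.strip() for s in (match["scopes"] or "").split(",") if s.strip()]
    # Assumption: at least one of the required scopes must be given.
    if not REQUIRED_SCOPES.intersection(scopes):
        raise ValueError("expected at least one of the scopes: python, rust")

    # Breaking if marked with '!' or if the body has a BREAKING CHANGE footer.
    breaking = match["breaking"] is not None or any(
        line.startswith("BREAKING CHANGE") for line in body.splitlines()
    )
    return CommitHeader(match["type"], scopes, breaking, match["description"])
```

For instance, `parse_header("feat(python,rust)!: add new join")` would yield a header with both scopes and `breaking=True`, while `parse_header("feat: no scope")` raises a `ValueError` under the assumption above.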

ritchie46 avatar Aug 13 '22 14:08 ritchie46

I am a fan of using conventional commits. I do this in all my repos, even if I have never used it to auto-generate a changelog.

Not sure how to enforce it though, since you are using the GitHub UI to squash & merge the changes. You'll have to manually specify the commit name. We can ask people to name their PR as they would name the commit - then GitHub autofills it I believe.

stinodego avatar Aug 13 '22 19:08 stinodego

> We can ask people to name their PR as they would name the commit - then GitHub autofills it I believe.

This is what I was going to suggest

hpux735 avatar Aug 17 '22 16:08 hpux735

An alternative is to mimic pandas -- they ask contributors to manually update a whatsnew/vX.Y.Z.rst file as part of a PR. This has some benefits over an auto-generated changelog because it is curated to contain only important changes. The auto-generated changelog will be filled with various chore-type commits as well.

An auto-generated changelog is definitely less of a burden for contributors. AFAIK commitizen is the standard tool for generating changelogs.

> Not sure how to enforce it though, since you are using the GitHub UI to squash & merge the changes.

Are we open to the idea of no longer squashing commits? If so, using commitizen would be fairly straightforward.

matteosantama avatar Aug 20 '22 11:08 matteosantama

> Are we open to the idea of no longer squashing commits? If so, using commitizen would be fairly straightforward.

That's quite intrusive IMO, and squashing removes a lot of commit bloat. I believe that being consistent with the commit tags should be a start. Then we could write a tool later that parses the commits and aggregates them to a markdown file.
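As a rough idea of what such a tool could look like, here is a minimal sketch that groups commit subjects by type and renders them as markdown. It is only a sketch under assumptions: the section titles in `SECTION_TITLES` and the function name `build_changelog` are hypothetical, scopes are assumed to be in parentheses, and types without a section (such as chore) are simply left out.

```python
import re

# Hypothetical mapping from commit type to changelog section title.
# Types not listed here (e.g. chore, ci) are left out of the changelog.
SECTION_TITLES = {
    "feat": "Features",
    "fix": "Bug fixes",
    "perf": "Performance",
    "docs": "Documentation",
}

# Assumption: subjects look like "type(scope)!: description".
SUBJECT_RE = re.compile(r"^(?P<type>[a-z]+)(?:\([^()]*\))?!?: (?P<description>.+)$")


def build_changelog(subjects):
    """Group commit subjects by type and return a markdown changelog string."""
    grouped = {}
    for subject in subjects:
        match = SUBJECT_RE.match(subject.strip())
        if match is None or match["type"] not in SECTION_TITLES:
            continue  # skip non-conforming or unlisted commit types
        grouped.setdefault(match["type"], []).append(match["description"])

    lines = []
    for type_, title in SECTION_TITLES.items():
        if type_ in grouped:
            lines.append(f"## {title}")
            lines.append("")
            lines.extend(f"- {description}" for description in grouped[type_])
            lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

The subjects could come from something like `git log --format=%s` over the range since the last release; since we squash, each merged PR would contribute exactly one line.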

ritchie46 avatar Aug 20 '22 12:08 ritchie46

I've seen release-drafter in action and it's super pretty. I'm going to see if we could make this work with the added complexity of having separate Rust and Python releases.

The idea is that a new draft release is created as new commits are being merged into the main branch, building the changelog as you go, and at some point you push 'release' and that's it.

stinodego avatar Sep 24 '22 16:09 stinodego

> I've seen release-drafter in action and it's super pretty. I'm going to see if we could make this work with the added complexity of having separate Rust and Python releases.
>
> The idea is that a new draft release is created as new commits are being merged into the main branch, building the changelog as you go, and at some point you push 'release' and that's it.

Wow... if that works, that would be a huge quality improvement.

ritchie46 avatar Sep 24 '22 16:09 ritchie46

Well, it works! I created a repo to test release-drafter for drafting multiple releases, and it works like a charm.

Here's how it works:

  • Everything works based on PR tags. No fiddling with conventional commits required.
  • An auto-labeler assigns rust/python tags based on files changed (we can use the one we already have).
    • These determine which release the PR ends up in. A PR can be in multiple releases.
  • Another auto-labeler assigns tags for changelog sections, based on branch names.
    • For example, a branch called feat/cool-stuff will be tagged feature and put under the "Features" header.
    • We should come up with sensible headers. Features / bug fixes / API changes / ...? Maybe we can use the latest pandas release as inspiration.
  • Multiple tags are allowed. For example, you could create a "Highlights" section and manually add a highlight tag to important PRs.
  • Versions are automatically incremented based on the previous release of the same type. Adding a breaking tag will trigger a minor version bump (as we are in pre-1.0.0 still).
  • Changelog entries contain the PR name, so this name should be descriptive (which is great; the conventional commit tags are not very readable).
  • When ready to release, go to Releases -> Edit release -> Publish release. This creates the release and the tag.
  • The tag will trigger the release workflow for publishing wheels/crates.
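
The labeling and versioning rules in the list above can be sketched in plain Python. This is not release-drafter's actual configuration or code, just an illustration of the logic. Assumptions: the prefix-to-label mapping beyond feat is hypothetical, versions are plain `major.minor.patch` strings, and a PR without a breaking tag results in a patch bump.

```python
from __future__ import annotations

# Hypothetical mapping from branch-name prefix to changelog label.
# Only "feat" -> "feature" comes from the example above; the rest is illustrative.
BRANCH_PREFIX_LABELS = {
    "feat": "feature",
    "fix": "fix",
    "docs": "documentation",
}


def label_for_branch(branch: str) -> str | None:
    """Return the changelog label for a branch like 'feat/cool-stuff', if any."""
    prefix, separator, _ = branch.partition("/")
    if not separator:
        return None  # no 'prefix/' part, so nothing to infer
    return BRANCH_PREFIX_LABELS.get(prefix)


def next_version(previous: str, labels: set[str]) -> str:
    """Compute the next pre-1.0.0 version from the previous release and PR labels."""
    major, minor, patch = (int(part) for part in previous.split("."))
    if "breaking" in labels:
        # Pre-1.0.0: a breaking change bumps the minor version.
        return f"{major}.{minor + 1}.0"
    # Assumption: everything else is a patch bump.
    return f"{major}.{minor}.{patch + 1}"
```

So `label_for_branch("feat/cool-stuff")` gives `"feature"`, and a release following a hypothetical `0.14.3` that includes a PR tagged breaking would become `0.15.0`.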

I will show you everything on Friday; but I believe this is exactly what we need!

stinodego avatar Sep 25 '22 09:09 stinodego

A small note going forward: we should start using (parentheses) instead of [brackets] for defining the scope (rust, python). This is the official convention, and the new autolabeler will only properly tag the scope if the convention is followed.
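
For anyone who wants to check their PR titles, here is a small sketch that rewrites a bracketed scope into the parenthesized form. The helper name `normalize_scope` is hypothetical and not part of the autolabeler; titles that already follow the convention are returned unchanged.

```python
import re

# Matches titles of the form "type[scope]: description" (optionally with '!').
BRACKET_SCOPE_RE = re.compile(
    r"^(?P<type>[a-z]+)\[(?P<scopes>[^\[\]]+)\](?P<rest>!?: .+)$"
)


def normalize_scope(title: str) -> str:
    """Rewrite 'feat[python]: ...' as 'feat(python): ...'; leave other titles as-is."""
    match = BRACKET_SCOPE_RE.match(title)
    if match is None:
        return title
    return f"{match['type']}({match['scopes']}){match['rest']}"
```

For example, `normalize_scope("feat[python]: add foo")` returns `"feat(python): add foo"`.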

stinodego avatar Oct 01 '22 03:10 stinodego