add CHANGELOG.md file
@rmzelle has mentioned this in the past, and it's time we figure it out.
The obvious first step is to add a CHANGELOG.md file, which I have just done, along with some basic content.
Questions are:
- how to populate and maintain it?
- what the precise format should be?
One option going forward is to automate based on the commit history.
But it might be easier to just manually maintain, and ask PRs to include updates to the file.
Please see these two links:
- https://keepachangelog.com/en/1.0.0/ (general)
- https://www.freecodecamp.org/news/a-beginners-guide-to-git-what-is-a-changelog-and-how-to-generate-it/ (discussion of tool options)
Yeah, a changelog is necessary. I'm not sure doing this automatically is the best way in that case.
See https://docs.citationstyles.org/en/1.0.1/release-notes.html and https://docs.citationstyles.org/en/1.0/release-notes.html for the existing changelogs.
Question @bdarcus: Is this about a changelog for the released version (like the linked release notes)? Or about a work in progress log?
Both, per recommendation on the "keep" site.
I just didn't add the earlier stuff before we settle details.
Oh wow; those are quite detailed!
If we're maintaining this file manually, I was hoping to just use something like this, at least as a starting point, with links to PRs.
❯ git log --pretty="- %s"
- Add content to CHANGELOG
- Create CHANGELOG.md
- Use variables.standard pattern, format
- Add trang script for rnc validation and formatting
- Simplify and add structure to csl.rnc formatting (#210)
- Add .rng to .gitgnore
- Add terms for "no-place" and "no-name" (#206)
- Add "document" type (#207)
- Add new item types; classic, hearing, etc (#194)
- Add additional locator variables (#201)
... which of course would convert to this in rendered markdown (edited to show what the fragment ideally looks like):
Unreleased
- Add terms for "no-place" and "no-name" (#206)
- Add "document" type (#207)
- Add new item types; classic, hearing, etc (#194)
- Add additional locator variables (#201)
The better and more consistent our commit and commit message approach, the easier this is to manually maintain going forward.
If we used an automated tool, we would settle on some conventions for prepending commit messages with certain tags (say "Add") to categorize changes (though other tools instead decide inclusion or sectioning from the labels associated with the commits).
Those could be valuable regardless, of course, to ensure consistency.
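To make the point about conventions concrete, here is a minimal Python sketch of how the log output above could be filtered into the "Unreleased" fragment. The selection rule (subject starts with "Add" and ends with a PR reference) is purely an assumption that happens to reproduce the example; it is not something we have agreed on.

```python
import re

# Subjects copied from the `git log --pretty="- %s"` output above.
SUBJECTS = [
    "Add content to CHANGELOG",
    "Create CHANGELOG.md",
    "Use variables.standard pattern, format",
    "Add trang script for rnc validation and formatting",
    "Simplify and add structure to csl.rnc formatting (#210)",
    "Add .rng to .gitgnore",
    'Add terms for "no-place" and "no-name" (#206)',
    'Add "document" type (#207)',
    "Add new item types; classic, hearing, etc (#194)",
    "Add additional locator variables (#201)",
]

# Hypothetical rule: keep subjects that start with "Add" and end with "(#NNN)".
KEEP = re.compile(r"^Add\b.*\(#\d+\)$")


def unreleased_section(subjects):
    """Return the lines of an 'Unreleased' changelog section."""
    kept = [s for s in subjects if KEEP.match(s)]
    return ["Unreleased"] + [f"- {s}" for s in kept]


if __name__ == "__main__":
    print("\n".join(unreleased_section(SUBJECTS)))
```

With a rule like this, a commit only shows up if its subject follows the convention, which is why consistent messages matter whether we maintain the file by hand or with a tool.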
To go to my question on the format, then, one question is whether we follow a flatter, traditional changelog format (as in this example), or do something more structured (as with what I put in the initial file).
The former is easier to manage, and I think we should go with easier.
Here's an example using auto-changelog.
Some of the links to issues are broken because I ran on my fork, and it assumes the main repo.
Below is the config file, which defines a couple of strings as keywords that will modify output based on the commit message.
So if you were to add "tweak" to a commit message, that commit would be suppressed from the output, and if you added "breaking", the commit would be highlighted as a breaking change.
The "release-summary" option would look in merge commits on a tagged version, and use the commit message as the release summary.
{
"_comment": "config file for CSL schema repo",
"unreleased": true,
"commitLimit": 15,
"template": "keepachangelog",
"starting-version": "v1.0.1",
"release-summary": true,
"breaking-pattern": "breaking",
"ignore-commit-pattern": "tweak"
}
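As a rough illustration of what the two pattern options do, here is a Python sketch that classifies a commit message using the same strings as the config above. It assumes the patterns are applied as case-sensitive regular expressions to the whole message, and that "ignore" wins over "breaking"; both are assumptions for illustration, not a description of auto-changelog's actual implementation.

```python
import re

# Values taken from the config above; the tool's real matching logic may differ.
BREAKING_PATTERN = re.compile("breaking")
IGNORE_PATTERN = re.compile("tweak")


def classify(message):
    """Return 'ignored', 'breaking', or 'normal' for a commit message."""
    if IGNORE_PATTERN.search(message):
        return "ignored"  # assumed precedence: ignored commits are dropped first
    if BREAKING_PATTERN.search(message):
        return "breaking"
    return "normal"


if __name__ == "__main__":
    for msg in (
        "tweak: fix whitespace",
        "Remove foo variable (breaking change)",
        'Add "document" type (#207)',
    ):
        print(classify(msg), "<-", msg)
```

Note that under these assumptions a message containing "Breaking" with a capital B would not be flagged, so the exact spelling of the keywords would need to be part of the convention.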
I think the target audience of a CSL schema/specification changelog should include CSL style authors, CSL processor implementors, and downstream implementations like Zotero. So I would try to spend some time to create something that's accessible.
(and IIRC, I wrote the original changelogs once each release was feature-frozen, by trawling through all the commits to make sure I hadn't missed anything, so a tool like "auto-changelog" might certainly come in handy and help link the changes to their PRs)
In that case, maybe we can maintain a clean, developer-oriented, changelog file here, created with auto-changelog, but manually-edited where we need.
You (and whoever else might have time and interest) can then build wider-facing release notes when time permits, based on that repo changelog (which as you note will include important links and such).
I've added the config file I used to the repo, and updated the file to use its output. When we do the releases, we can fix the content.
We can also leave this issue open as we figure out the details of customization and of commit message conventions going forward, and then include what we conclude in the contributing.md doc.
Key things to think about:
- for breaking changes in CSL style or input files, we should simply include the word "breaking" or phrase "breaking change" in the commit message, and let the tool pick that up to flag
- we need one-or-more keywords as well for minor changes to ignore; I think "tweak" might be a good one? See this commit message, for example.
- we otherwise might categorize areas of change; I saw @dhimmel used a convention in his PRs of prepending "CI:" and "CSL JSON." We could do that in the short summary line, and/or in keywords within the longer body of the message.
We probably want a custom template to organize the output for our needs, and the commit-list helper (see also this comment) to define the regex patterns to look for in the commit messages and pass to the template; like so (note the "Additions" section, which is matching on [New]):
{{#each releases}}
### [{{title}}]({{href}})
{{! List commits with `Breaking change: ` somewhere in the message }}
{{#commit-list commits heading='### Breaking Changes' message='Breaking change: '}}
- {{subject}} [`{{shorthash}}`]({{href}})
{{/commit-list}}
{{! List commits that add new features, but not those already listed above }}
{{#commit-list commits heading='### Additions' message='[New]'}}
- {{subject}} [`{{shorthash}}`]({{href}})
{{/commit-list}}
{{/each}}
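The grouping that this template describes can be sketched in Python as follows. This is only an approximation of the idea (each commit goes under the first heading whose pattern it matches), not the helper's real code. One caveat worth checking: if the helper treats `message` as a regular expression, which I am assuming here, an unescaped `[New]` is a character class matching any single "N", "e" or "w", so the sketch escapes it.

```python
import re

# (heading, pattern) pairs mirroring the two commit-list blocks in the template above.
# re.escape makes "[New]" match literally rather than as a character class.
SECTIONS = [
    ("### Breaking Changes", re.compile(re.escape("Breaking change: "))),
    ("### Additions", re.compile(re.escape("[New]"))),
]


def group_commits(messages):
    """Assign each message to the first section whose pattern it matches."""
    grouped = {heading: [] for heading, _ in SECTIONS}
    for message in messages:
        for heading, pattern in SECTIONS:
            if pattern.search(message):
                grouped[heading].append(message)
                break  # do not list the same commit under a later heading
    return grouped


if __name__ == "__main__":
    demo = [
        "[New] Add foo variable",
        "Breaking change: [New] rename bar",
        "Update docs",
    ]
    for heading, items in group_commits(demo).items():
        print(heading, items)
    # Unescaped, the pattern matches the "e" in "Update docs":
    print(bool(re.search("[New]", "Update docs")))
```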
Note, also, auto-changelog can dump the data as json (not using syntax-highlighting here because it will flag some of those keywords).
{
"id": "57",
"message": "Slight update to embedded citation JSON - now using ADDIN (wdFieldAddin) field type",
"href": "https://github.com/citation-style-language/schema/pull/57",
"author": "bdarcus",
"commit": {
"hash": "06f663099afcd619cd5edbb743a533952ebd74d0",
"shorthash": "06f6630",
"author": "bdarcus",
"email": "[email protected]",
"date": "2011-06-29T11:44:05.000Z",
"subject": "Merge pull request #57 from SteveRidout/master",
"message": "Merge pull request #57 from SteveRidout/master\n\nSlight update to embedded citation JSON - now using ADDIN (wdFieldAdd>
"fixes": null,
"href": "https://github.com/citation-style-language/schema/commit/06f663099afcd619cd5edbb743a533952ebd74d0",
"breaking": false
}
}
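Having the data as JSON means we could also post-process it ourselves. A small Python sketch, using a trimmed copy of the record above (the truncated "message" field under "commit" is left out, and the output format is just one I made up for illustration):

```python
import json

# Trimmed copy of the record above; only fields shown in full are kept.
RECORD = json.loads("""
{
  "id": "57",
  "message": "Slight update to embedded citation JSON - now using ADDIN (wdFieldAddin) field type",
  "href": "https://github.com/citation-style-language/schema/pull/57",
  "commit": {"shorthash": "06f6630", "breaking": false}
}
""")


def changelog_line(record):
    """Format one merge record as a markdown list item linking to its PR."""
    prefix = "**Breaking:** " if record["commit"]["breaking"] else ""
    return f"- {prefix}{record['message']} [#{record['id']}]({record['href']})"


if __name__ == "__main__":
    print(changelog_line(RECORD))
```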
Proposal for commit message conventions that will support automatic changelog maintenance.
Summary Format
The basic template would be [type](scope): [Keyword verb] [short detail] [(GH PR number)].
Types:
Per this (though I'm sure I could find better sources):
- feat (feature additions)
- chore (stuff related to the repo other than the core code, and test)
- refactor (internal schema changes that have no outward impact)
- deprecate (patterns or strings marked for removal from schemas in a future release)
- remove (patterns or strings previously marked deprecate, now deleted)
- fix (bug fix)
- style (strings, whitespace, etc.)
- test (CI-related)
An optional scope can also be added.
Examples:
- feat: Add 'foo' variable
- feat: Add 'bar' variable
- deprecate: A
- test(input): Add validation for x
Keyword Verbs
- add
- deprecate
- fix
- modify
- remove
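If we went this way, the summary line could be checked mechanically. Below is a minimal Python sketch of such a check, using the types and keyword verbs listed above. The regular expression, the `(#NNN)` form for the PR number, and the function name are my own assumptions for illustration, not part of the proposal.

```python
import re

TYPES = {"feat", "chore", "refactor", "deprecate", "remove", "fix", "style", "test"}
VERBS = {"add", "deprecate", "fix", "modify", "remove"}

SUMMARY = re.compile(
    r"^(?P<type>[a-z]+)"          # type, e.g. feat
    r"(?:\((?P<scope>[^)]+)\))?"  # optional (scope)
    r": (?P<verb>[A-Za-z]+)"      # keyword verb
    r" (?P<detail>.+?)"           # short detail
    r"(?: \(#(?P<pr>\d+)\))?$"    # optional (#PR number)
)


def parse_summary(line):
    """Return the parts of a summary line, or None if it breaks the convention."""
    m = SUMMARY.match(line)
    if not m:
        return None
    if m["type"] not in TYPES or m["verb"].lower() not in VERBS:
        return None
    return m.groupdict()


if __name__ == "__main__":
    print(parse_summary("feat: Add 'foo' variable"))
    print(parse_summary("test(input): Add validation for x (#12)"))
    print(parse_summary("Use variables.standard pattern, format"))
```

Something like this could run in CI or a commit hook, though whether that is worth the friction is a separate question.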
Alternative
We could follow a convention like org-mode uses, which puts the scope first: scope: [Keyword verb] [short detail] [(GH PR number)]. So
- schemas: Add foo variable
- schemas(input): Modify bar
Or simply remove the xxx: bit entirely, and do more or less as we've been doing, but just adopt consistent verbs in the commit message?
- deprecate (patterns or strings dropped from schemas, with processing implications)
We'll need to settle on the meaning of "deprecate"... Is it: "Removed and invalid from now on." Or: "Don't use this any longer, as we'll remove it in an upcoming release." (I'd say it is the latter, and I think that was also @bwiernik's take on this. Right?)
Good point.
As I've been using it in commit messages so far, it is the former.
But I'm not wedded to that. I agree, we need to decide.
Yes, it’s the latter. Deprecate means “still here for compatibility, but don’t use”. If it is being removed from the schema/spec, just remove it and the change log says “removed”.
OK, I updated the draft, but keeping as draft, since I'm still thinking about how best to do this.
Removed the draft on this post.
I'm not super sure myself, in part because there's duplication.
But I'm shooting for conventions that will yield clear git logs, milestones, and a clear changelog going forward.
I experimented with the ideas on some of the current open PRs.
Thoughts?
See also this GitHub Action, which would allow us to integrate auto-labeling of issues and PRs based on keywords as well.
https://github.com/marketplace/actions/issue-auto-labeling-and-assigning
The change log is for people who aren't closely following the development. It is fine for it to be somewhat redundant with the commit log--we shouldn't expect someone writing or updating a citation processor to read through commit logs.
Exactly. What I'm saying is if we adopt the right conventions for commit log messages, then the auto-updated changelog will actually be useful for those folks, and the auto-label assignment should also just work.