commitizen icon indicating copy to clipboard operation
commitizen copied to clipboard

Incorporate --version-files flag to cz bump instead of hard-reading it from the config

Open andres-fr opened this issue 5 years ago • 8 comments

Description

Bump mechanism requires version to be in multiple files.

Possible Solution

Incorporate --version-files flag to cz bump instead of hard-reading it from the config.

Here we can see that version_files is directly read from the settings: https://github.com/commitizen-tools/commitizen/blob/14b6c382ff96e37ba118ba1b75073b0d2313ee91/commitizen/commands/bump.py#L86

I see that Bump has a parameterless __call__, which may complicate things. I was thinking about extending it with an own class in a script, but again I would really like to keep open the chance of using the commitizen-action if possible.

Any suggestions? Maybe I'm missing the obvious? Thanks again and cheers, Andres

Additional context

Hi everyone! Python user here. Commitizen is amazing. I have a feature request regarding the bump command.

In Python it is usually not straightforward to keep the version in a unique place, for several reasons. Many just don't, but in my case I really would like to keep it that way for several reasons. I decided to go for this approach of having the metadata in a python file that is part of the package but can be easily parsed. That is the only place with the version, and so far all systems are working.

The problem comes when I want to auto-bump using commitizen. This seems to require a config file in the repo root (.pyproject.toml) and a set of hardcoded values with it. I would rather avoid that. See "possible solution" for more details.

Related Issue

andres-fr avatar Oct 26 '20 23:10 andres-fr

This MWE is a quick&dirty implementation of what I meant. I guess at this point it is not worth integrating it with the GH action and better just calling this script from the workflow.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
python ci_scripts/commitizen_bump.py -P ml_lib/_metadata.py -v 0.1.0
"""

import os
import argparse
#
from commitizen.config import BaseConfig
from commitizen.commands.bump import Bump


if __name__ == "__main__":
    parser = argparse.ArgumentParser("Bump with custom filepaths")

    parser.add_argument("-P","--v_paths", nargs="+", required=True, type=str,
                        help="Paths to the to-be-updated version files")

    parser.add_argument(
        "-v",
        "--current_version",
        type=str,
        required=True,
        help="Semantic version like 1.0.23"
    )

    args = parser.parse_args()
    #
    V_PATHS = [os.path.normpath(p) for p in args.v_paths]
    VERSION = args.current_version


    # Update config and args with our custom args
    CONFIG = BaseConfig()
    CONFIG._settings["version_files"].extend(V_PATHS)
    CONFIG.update({"version": VERSION})
    ARGUMENTS = {'tag_format': None,#
                 'prerelease': None,#
                 'increment': None,
                 'bump_message': None,#
                 'changelog': True,#
                 'no_verify': False,#
                 'check_consistency': True,
                 #'name': None,
                 # 'debug': False,
                 'dry_run': True,
                 'files_only': False,
                 'yes': False,
                 }

    # Run the bump action
    bmp = Bump(config=CONFIG, arguments=ARGUMENTS)
    bmp()
    # import pdb; pdb.set_trace()

A small remark is that BaseConfig.update() doesn't allow to extend the "version_files" attribute , since "lists are not hashable", so I grabbed the protected _settings attribute. Inheriting would be cleaner.

andres-fr avatar Oct 27 '20 00:10 andres-fr

Hi, thanks for the suggestion! If we want to add the flag --version-files, I think maybe we can handle it here like other arguments and add it here.

Lee-W avatar Oct 27 '20 01:10 Lee-W

Thanks for the very quick reaction!

For the moment, using my quickfix, it seems that the following is equivalent to the commitizen-action (if I understood correctly it just pushes the tag after the Bump call):

          python ci_scripts/cz_bump.py --yes --changelog -P "$LIB_NAME"/"$VERSION_FILE" -v "$VERSION"
          remote_repo="https://${GITHUB_ACTOR}:${INPUT_GITHUB_TOKEN}@github.com/${REPOSITORY}.git"
          git push "${remote_repo}" HEAD:${INPUT_BRANCH} --follow-tags --tags;

This way, everything can be functionally handled from the CI runner via the corresponding environment variables, and there is no room for inconsistencies. This will have to do for the moment. Thanks again for the project and the code, it was a pleasure to dig in, I will consider the PR when I have some time and nobody else jumps in.

andres-fr avatar Oct 27 '20 01:10 andres-fr

Hey @andres-fr thanks for the feedback. Can u expand it a bit more?

The approach you linked is kind of fine, but I'm not a fan.

The intention for commitizen is to be the source of truth for the version.

This means that yes, under pyproject.toml or .cz.toml you'll set the version, and from there, it's propagated wherever is needed, like a VERSION file.

For your case, you could have a .cz.toml file:

version = "0.1.0"
version_files = [
  "lib/VERSION"
]

and that's it.


But honestly, I think commitizen paradigm is a bit different.

Commitizen doesn't care about "reusing" a variable, which adds extra complexity and makes it hard to read, like this:

import re
VERSIONFILE="myniftyapp/_version.py"
verstrline = open(VERSIONFILE, "rt").read()
VSRE = r"^__version__ = ['\"]([^'\"]*)['\"]"
mo = re.search(VSRE, verstrline, re.M)
if mo:
    verstr = mo.group(1)
else:
    raise RuntimeError("Unable to find version string in %s." % (VERSIONFILE,))

at the end of the day, we are just specifying the version, it should be simple and clear. If the version appears in multiple places, the main concern is people forgetting to update it.

if you specify a version_files like this:

version_files = [
  "setup.py",
  "src/__version__.py"
]

They will all look simple and won't need maintainance:

# __version__.py
__version__ = "0.1.0"

I'm not sure if I'm willing to add the flag, the tools is a already quite opinionated and offers some flexibility (for our sanity to maintain it).

But what we can do is to make version_files from bump.py into a function so you can inherit Bump.

Thoughts?

woile avatar Oct 27 '20 11:10 woile

Hi @Woile, thanks for your detailed thoughts.

To expand a little more: apart from uniqueness location I would like the version number to be a part of the Python code. Also, the module cannot be imported from setup.py to avoid race conditions. This IMO leaves very little room but it works OK. Like you, I'm also concerned about people forgetting to update. This gets only worse the more places the version is in. You could argue that CZ checks automatically for consistency, but (maybe this is where my use case diverges) I am working on templated repositories via cookiecutter and then you still have the problem of forgetting to add all the sources into the template, or editing the project afterwards. Unique location is the only way to sleep peacefully in my book. I totally reckon that this is in the first place a problem with Python packaging and not with CZ.

Regarding your proposed solution, I don't think it quite solves it. I mean, if we get down to Python, this can already be solved via inheritance without modifying the lib.

A quick side-rant that you can totally ignore: I personally prefer explicit interfaces instead of passing "config" objects around since they create dependencies between interface and implementation, and IMO that difficults to add features like this one in a clean way. For example, in this case Bump requires a Config object and an arguments dictionary. To me, both are just part of the same parameter list, and I don't care where they come from. But I have no easy way to know what interface I'm expected to fulfill. BaseConfig lacks implementations for required methods and TomlConfig has dependencies that I don't want, like file I/O. If I had an exposed parameter list, I wouldn't have to

  1. Figure out what is the actual interface that I have to satisfy in my particular case
  2. Implement my own class, making sure that it provides methods for things that I may/may not know or need.

But I understand and respect that this is a matter of taste and paradigm, I can roll with that. Please don't take this as bashing, I really think CZ is an amazing tool, I am using it daily and if I have a more constructive proposal in terms of more work and less words I will gladly try something.

Thanks again! Andres

andres-fr avatar Oct 27 '20 21:10 andres-fr

No problem, I understand what you mean, I also prefer explicit interfaces, everything else in the code is mostly explicit, no moving parts. Commands and configs are not the easy part of the source code I agree, working with configs is also not easy haha (we have a refactor in progress but with no deadlines).

To me, both are just part of the same parameter list, and I don't care where they come from

But that's the place where we specifically need to merge and where we do care where things come from. The original intention was not to be inherited, I tried to keep it simple, we needed a place where these things happened:

  1. collect data from different sources (init), config and arguments parsed, skip some, etc.
  2. combine the moving parts (call)

If you move point 1 outside the command Bump, where do you move it and still make it easy to understand what needs to be collected for this particular command? Better approaches are more than welcome.

To expand a little more: apart from uniqueness location I would like the version number to be a part of the Python code. Also, the module cannot be imported from setup.py to avoid race conditions. This IMO leaves very little room but it works OK. Like you, I'm also concerned about people forgetting to update. This gets only worse the more places the version is in. You could argue that CZ checks automatically for consistency, but (maybe this is where my use case diverges) I am working on templated repositories via cookiecutter and then you still have the problem of forgetting to add all the sources into the template, or editing the project afterwards. Unique location is the only way to sleep peacefully in my book.

Yes, but the problem is over time, not over the number of locations, you can accept a threshold of errors, maybe once or twice, when you are building the "template", you can forget, until you fill them all, this is not the real issue to me. By over time I mean, if different people have to update different versions in an undefined period, you'll have more and more chances of people forgetting, with commitizen, you remove that task from humans. Communication is also important: "hey teams, do not update the version, don't even think about it"

You are adding code for each location where VERSION must be used, the more code -> the more bugs. I'd rather not think about race conditions. Instead of coding those parts, commitizen "proposes" to track them in .cz.toml.

I like this quote from Sandi Metz [0]:

duplication is far cheaper than the wrong abstraction

Think about maintaining and fixing, imagine if someone has to fix a bug in the I/O operations to read the VERSION vs just updating the string from wrong version to good version.

But yes, it's quite opinionated, of course I'd like it to be more like a generic tool, but it's a heavier burden. I hope anything I said was useful haha and feel free to discuss everything, no issue at all :+1:

woile avatar Oct 28 '20 09:10 woile

Thanks for your points and your time!


If you move point 1 outside the command Bump, where do you move it and still make it easy to understand what needs to be collected for this particular command?

I think the rule of thumb here is given by dependency inversion, in this cases it translates to keeping "lib-space" separated from "app-space". IMO Bump corresponds to a relatively abstract "purpose" that finds many applications, whereas Config is just one way among several of feeding data to an interface. A way of doing this is by enforcing an explicit interface onto Bump, Check, and the other actions, move them out of commands into the core lib, and then have a "dispatcher" very close to the entry point of the cz CLI that imports the lib classes and feeds them with arguments from config, CLI, and whatever else you may encounter in the future. Your lib code won't need to be touched for that.

I reckon that you know all of this already, and that the decision here was to keep everything in app-space since everything is in commands. I can see the advantage of having a thinner API, but if things like Bump aren't abstracted from the application, you are forcing people that have different applications to basically reimplement things or go hacky road.


Regarding redundancy:

In my case the lib version is in the python code. Every Python programmer worth a gram of salt will be able to track down things like from _metadata import __version__. For the CI, I just have 1 function that parses it, bugs can only appear there. Race conditions are not an issue because operations on the git DB are atomary, and CI pipelines are based on those. What kind of scenario do you envision that causes RCs?

In any case, maybe the trees don't let us see the forest here. The quote by Sandi Metz is genius, but with a little perspective, you would agree with me that it is ridiculous that the best option is to have redundant version sources. Again, big part of the fault is by Python's packaging system, one of the corners where I miss Java most.

andres-fr avatar Oct 29 '20 13:10 andres-fr

This is what is currently working for me (gist):

https://gist.github.com/andres-fr/823001f707d713c92c57980fe8c6e3e9

Maybe worth mentioning is that I renamed the is_yes parameter into the more informative autoconfirm_initial_tags and wrapped the BumpCommitFailedError in a try-except, in order to roll-back the update_version_in_files in case of a commit error (e.g. due to pre-commit issues). This allows to repeat the operation without having to manually edit down the version tag.

Apart from some details that I left due to lazyness/convenience, this is close to what I meant before. This could be lib code, and the "command" would simply grab the config parameters, the arguments, and call ConfiglessBump from the lib.

Hope this helps in any way! Cheers, Andres

andres-fr avatar Oct 31 '20 16:10 andres-fr

This should be possible with version providers: https://commitizen-tools.github.io/commitizen/config/#custom-version-provider

Though a .cz.toml would still be needed.

woile avatar Apr 28 '23 10:04 woile