Combine with other project, make better than either
Hi
I'm the author of the (an?) other git-test (spotify/git-test), and I think this is great. Interestingly, it has completely different strengths from the monstrosity I wrote, and I think there's a very nice opportunity to make something better than either tool on its own.
To wit,
- The caching-by-tree should combine with the test-tree skip feature, to connect tests with trees other than the repo root.
- Checking out tests separately from checking out code, to facilitate running a new test against history.
- Implementation in Python (yours, to start, I think); using shell script was a real lesson.
- Test-output caching link farm (mine), but redone to make it less of a monster; maybe keep the last three by default?
...and more. I don't really have to think about all of it right now, but I wanted to make sure I try to establish the contact before either forgetting about it, or going overboard :-)
cheers Anders
@aes this is an amazing project but it's fairly inactive. If you look in [the network graph] there are quite a few forks and pull requests with additional features that are unmerged.
I'm extremely grateful to @mhagger for building this tool. I use it every day. I know open source maintainers have no obligation.
@aes I think that if you want to create an active fork, there's a lot of enthusiasm, and you could get a lot of help. For myself, I'm interested in adding features to run different tests on different commits in parallel, each in its own git worktree.
> I'm extremely grateful to @mhagger for building this tool. I use it every day. I know open source maintainers have no obligation.
Aside from not making foot-guns and attractive nuisances, sure.
@motlin That's... hmm.. 🤔 I should think about this, and jot down some notes. I think (re-)implementing a lot of the features is surprisingly manageable. An ungodly fraction of the effort in my git-test went into shell compatibility. Let's not do that again. Ever.
@aes: Thanks for the contact. As you can tell, I haven't had much time at all to work on this project recently (though I use it daily and it pretty much does the things that I need of it). I don't really expect that situation to change anytime soon.
On top of that, I am not using much Python anymore day-to-day and am not keeping up to date on that ecosystem, which means friction for me whenever version compatibility or packaging issues come up. At the same time I find Python packaging pretty kludgy. If I were going to resume work on this project, then it would probably be to add parallelism features, because that would be pretty cool. And in that case, a conversion to Go might be my first step.
I have no objection at all to your forking this project and adding new features, or cannibalizing code or ideas from it into your project. Note that this project is under the GPL-2.0 license, which I think prevents importing this code directly into an Apache-2.0 licensed project. I personally have no problem with my own contributions being relicensed as Apache-2.0, but you'd have to consider other contributors as well. It might be easier to license the new tool as GPL-2.0 if you want to copy code wholesale.
I noticed that the spotify/git-test repo is archived. Would it be re-activated, or would you work on the combined project somewhere else? Would the work be done under the aegis of Spotify?
One meta-issue that I've had in the back of my mind is that I think it would be nice to have a standard convention for defining tests in gitconfig, that would be recognized across tools. This project uses `test.$NAME.command` for defining shell-based tests and `--test=$NAME` to select which test to run, with `test.default.command` being the default. For example, I think it would be cool if `git bisect` had a feature to run tests that are defined like this. So if you work on a new tool, consider that.
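For anyone picking this up, here is a minimal sketch of how another tool might look up a test defined under that convention. The helper names `config_key` and `get_test_command` are hypothetical and not part of git-test; the only external call is `git config --get`, and the sketch assumes `git` is on the `PATH`.

```python
import subprocess


def config_key(name="default"):
    """Build the gitconfig key for a named test, e.g. test.default.command."""
    return f"test.{name}.command"


def get_test_command(name="default", cwd=None):
    """Return the shell command configured for a test, or None if unset.

    Hypothetical helper: it only reads the key, it does not run the test.
    """
    result = subprocess.run(
        ["git", "config", "--get", config_key(name)],
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    # `git config --get` exits non-zero when the key is missing.
    if result.returncode != 0:
        return None
    return result.stdout.rstrip("\n")
```

A caller would treat `None` as "no such test defined" and could fall back to the `default` test or report an error.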
(Somehow two weeks went by... 'copious free time', indeed.)
In random order:
Project meta
- It's been a long time since I worked at Spotify, and while I do know people currently in I/O, I think the most I could accomplish is to get that readme updated.
- Python packaging is a disaster, always has been. That could be changing, but don't hold your breath. I suggest a single script written in a conservative style.
- I think Python is easy to get things done in and I don't like Golang, but it's not a big deal. I do like Rust, if that makes any difference? Anyway, I don't think implementation language will make much difference, as long as it's something reasonably pragmatic.
- Parallelism in Python is sad, but not necessarily as terrible as you might think, especially if it's mostly waiting for an external command.
- License issues are nonsense, for any number of reasons, good and bad, but ultimately, re-implementation is sufficient to make it moot.
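To illustrate the point about parallelism when the work is mostly waiting on external commands, here is a rough sketch using only the standard library. The function names `run_command` and `run_all` are made up for this example, and it deliberately ignores the work-tree isolation question: it only shows that threads are enough when each one spends its time blocked on a child process.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(argv):
    """Run one external command and return (argv, exit code)."""
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return argv, completed.returncode


def run_all(commands, max_workers=4):
    """Run commands concurrently; results come back in input order.

    Each thread spends nearly all its time blocked in subprocess.run,
    so the GIL is not the bottleneck here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Stand-in "tests": two tiny Python processes with different exit codes.
    demo = [
        [sys.executable, "-c", "raise SystemExit(0)"],
        [sys.executable, "-c", "raise SystemExit(3)"],
    ]
    for argv, code in run_all(demo):
        print(code, argv)
```

Capping `max_workers` is the crude knob for the resource-exhaustion worry; anything smarter would need to know what each test actually consumes.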
Feature plans
- I agree with the test definition and selection point. I think there's more win to be found. For example, an idea I've been thinking about is the case where you discover a new bug, make a test for it, and ask, "How long was it like this?" In that case, you'd want to run a new test against old code, checking out different versions of subject and test code.
- I would like to make it possible to tighten subject and test tree mappings, so that changes in another part of the repo don't trigger re-testing unchanged tests of unchanged subjects. So, for example, writing more documentation should not trigger several minutes of testing. Ideally, we'd all be using something like Bazel or Buck, but until those become easier to use, ahem, no.
- I like parallelization, but controlling it is maybe non-trivial? I'm thinking of false positives from resource exhaustion or contention. An important safety versus optimization point here would be to know if they can run in the same work tree or not, so this might be incompatible with keeping complexity manageable. Describing how to first build an artefact, and then running multiple tests in parallel also sounds like a YAML blivet.
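As a rough sketch of the subject/test tree mapping idea: a cache key could be derived from the object ids of only the directories a test depends on, so that a commit touching nothing but documentation produces the same key and the test is skipped. Everything here is an assumption for illustration: the names `tree_id`, `cache_key` and `needs_run`, the use of SHA-256, and the in-memory dict standing in for whatever storage a real tool would use. The one git feature relied on is `git rev-parse COMMIT:PATH`, which resolves to the object id of that path in that commit.

```python
import hashlib
import subprocess


def tree_id(commit, path, cwd=None):
    """Return the object id of `path` as of `commit` (git rev-parse COMMIT:PATH)."""
    out = subprocess.run(
        ["git", "rev-parse", "--verify", f"{commit}:{path}"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


def cache_key(test_name, tree_ids):
    """Derive a stable key from the test name and the tree ids it depends on."""
    digest = hashlib.sha256()
    digest.update(test_name.encode())
    for oid in tree_ids:
        # NUL separator so adjacent ids cannot run together ambiguously.
        digest.update(b"\0" + oid.encode())
    return digest.hexdigest()


def needs_run(cache, test_name, tree_ids):
    """True when no result is recorded for this combination of trees."""
    return cache_key(test_name, tree_ids) not in cache
```

With hypothetical `src` and `tests` directories, the ids passed in would be `[tree_id(commit, "src"), tree_id(commit, "tests")]`; using a different commit for the `tests` id is one way to express "new test against old code".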
I think I'll need to start feeling my way ahead before I add anything more, but perspectives are more than welcome.
(Sorry to reopen the issue, but I'm suspicious of GitHub managing communications correctly otherwise.)