Remove git-commit-id-maven-plugin
What is the problem the feature request solves?
Currently building comet on environment without access to github.com (like an enterprise environment, closed by FW) is very hard because of git-commit-id-maven-plugin that requires having content of .git folder and an access to github.com.
Describe the potential solution
Remove git-commit-id-maven-plugin to allow build from zip disctribution, downloaded from github (without .git folder and without an access to github.com).
Additional context
To build comet for myself I just deleted this plugin from common/pom.xml and it worked fine.
git-commit-id-maven-plugin is an optional feature so we can consider removing it, cc @snmvaughan
Can I make a PR?
Sure, go ahead. Thanks @SemyonSinchenko
The version information is reported by the library, and used by groups to track the specific version. I would not want to see it removed unless it is replaced by equivalent functionality.
@SemyonSinchenko This is a common function. Spark itself uses a script build/spark-build-info for this purpose. We should look to enable local compilation but without removal of the functionality
What if I create something like spark build info? It looks like only branch, build, commit and remote information are included. In this case such a bash script may be an optional step, but not a mandatory call of the plugin from maven.
These are the currently used values:
val COMET_VERSION = CometBuildInfo.cometVersion
val COMET_BRANCH = CometBuildInfo.cometBranch
val COMET_REVISION = CometBuildInfo.cometRevision
val COMET_BUILD_USER_EMAIL = CometBuildInfo.cometBuildUserEmail
val COMET_BUILD_USER_NAME = CometBuildInfo.cometBuildUserName
val COMET_REPO_URL = CometBuildInfo.cometRepoUrl
val COMET_BUILD_TIMESTAMP = CometBuildInfo.cometBuildTimestamp
See common/src/main/scala/org/apache/comet/package.scala
This short bash script can generate everything like in common/target/classes/comet-git-info.properties:
#!/usr/bin/bash
echo_build_properties() {
echo "git.branch=$(git rev-parse --abbrev-ref HEAD)"
echo "git.build.host=$(uname -n)"
echo "git.build.time=$(date -u +%Y-%m-%dT%H:%M:%S%z)"
echo "git.build.user.email=$(git config user.email)"
echo "git.build.user.name=$(git config user.name)"
echo "git.build.version=${1}"
echo "git.commit.id.abbrev=$(git rev-parse --short HEAD)"
echo "git.commit.id.full=$(git rev-parse HEAD)"
echo "git.remote.origin.url=$(git config --get remote.origin.url | sed 's|https://\(.*\)@\(.*\)|https://\2|')"
}
echo_build_properties $1
In this case VERSION should be passed into the script. It may be got from mvn but it is like a magic: ./mvnw -q -Dexec.executable=echo -Dexec.args='${project.version}' --non-recursive exec:exec. I can add into the script, but in spark bash script that was my inspiration, version is passed.
@snmvaughan Does it look OK for you? If so, I can open a PR that adds this bash-script into, for example bin/ and name it comet-build-info. Also, I can add a call of that script into Makefile as something like .PHONY build_props and also add it to all targets that build JAR (like jvm: -> jvm: build_props, release: -> release: build_props, etc.)
Instead of removing git-commit-id-maven-plugin, I think a more practical approach would be conditional generating the build property files.
You can move the git-commit-id-maven-plugin plugin into a separate profile which should be activated by default. The maven build system could be setup to automatically disable that profile if a targeting build properties already existed. In that way you can pre create the build property file via the script above and disable the git-commit-id-maven-plugin plugin.
I found an example for profiles. Also, an author of the plugin mentioned, that it may be disabled via -Dmaven.gitcommitid.skip=true. But I have no idea how to make it works like "skip only if file exists"... Anyway, with this command -Dmaven.gitcommitid.skip=true the issue is resolved, I'm sorry, I did not know about that option. In this case anyone, who wants to build Comet from source distribution (without .git folder) can do it with this single option.
May I make a PR with updates to README.md related to building from source (not from cloned repo)? It may save someone couple of hours of life :)