datafusion-comet icon indicating copy to clipboard operation
datafusion-comet copied to clipboard

Remove git-commit-id-maven-plugin

Open SemyonSinchenko opened this issue 1 year ago • 10 comments

What is the problem the feature request solves?

Currently building comet on environment without access to github.com (like an enterprise environment, closed by FW) is very hard because of git-commit-id-maven-plugin that requires having content of .git folder and an access to github.com.

Describe the potential solution

Remove git-commit-id-maven-plugin to allow build from zip disctribution, downloaded from github (without .git folder and without an access to github.com).

Additional context

To build comet for myself I just deleted this plugin from common/pom.xml and it worked fine.

SemyonSinchenko avatar Mar 11 '24 18:03 SemyonSinchenko

git-commit-id-maven-plugin is an optional feature so we can consider removing it, cc @snmvaughan

sunchao avatar Mar 12 '24 04:03 sunchao

Can I make a PR?

SemyonSinchenko avatar Mar 12 '24 06:03 SemyonSinchenko

Sure, go ahead. Thanks @SemyonSinchenko

sunchao avatar Mar 12 '24 15:03 sunchao

The version information is reported by the library, and used by groups to track the specific version. I would not want to see it removed unless it is replaced by equivalent functionality.

snmvaughan avatar Mar 12 '24 16:03 snmvaughan

@SemyonSinchenko This is a common function. Spark itself uses a script build/spark-build-info for this purpose. We should look to enable local compilation but without removal of the functionality

snmvaughan avatar Mar 12 '24 16:03 snmvaughan

What if I create something like spark build info? It looks like only branch, build, commit and remote information are included. In this case such a bash script may be an optional step, but not a mandatory call of the plugin from maven.

SemyonSinchenko avatar Mar 12 '24 17:03 SemyonSinchenko

These are the currently used values:

  val COMET_VERSION = CometBuildInfo.cometVersion
  val COMET_BRANCH = CometBuildInfo.cometBranch
  val COMET_REVISION = CometBuildInfo.cometRevision
  val COMET_BUILD_USER_EMAIL = CometBuildInfo.cometBuildUserEmail
  val COMET_BUILD_USER_NAME = CometBuildInfo.cometBuildUserName
  val COMET_REPO_URL = CometBuildInfo.cometRepoUrl
  val COMET_BUILD_TIMESTAMP = CometBuildInfo.cometBuildTimestamp

See common/src/main/scala/org/apache/comet/package.scala

snmvaughan avatar Mar 12 '24 17:03 snmvaughan

This short bash script can generate everything like in common/target/classes/comet-git-info.properties:

#!/usr/bin/bash

echo_build_properties() {
	echo "git.branch=$(git rev-parse --abbrev-ref HEAD)"
	echo "git.build.host=$(uname -n)"
	echo "git.build.time=$(date -u +%Y-%m-%dT%H:%M:%S%z)"
	echo "git.build.user.email=$(git config user.email)"
	echo "git.build.user.name=$(git config user.name)"
	echo "git.build.version=${1}"
	echo "git.commit.id.abbrev=$(git rev-parse --short HEAD)"
	echo "git.commit.id.full=$(git rev-parse HEAD)"
	echo "git.remote.origin.url=$(git config --get remote.origin.url |  sed 's|https://\(.*\)@\(.*\)|https://\2|')"
}

echo_build_properties $1

In this case VERSION should be passed into the script. It may be got from mvn but it is like a magic: ./mvnw -q -Dexec.executable=echo -Dexec.args='${project.version}' --non-recursive exec:exec. I can add into the script, but in spark bash script that was my inspiration, version is passed.

@snmvaughan Does it look OK for you? If so, I can open a PR that adds this bash-script into, for example bin/ and name it comet-build-info. Also, I can add a call of that script into Makefile as something like .PHONY build_props and also add it to all targets that build JAR (like jvm: -> jvm: build_props, release: -> release: build_props, etc.)

SemyonSinchenko avatar Mar 12 '24 19:03 SemyonSinchenko

Instead of removing git-commit-id-maven-plugin, I think a more practical approach would be conditional generating the build property files.

You can move the git-commit-id-maven-plugin plugin into a separate profile which should be activated by default. The maven build system could be setup to automatically disable that profile if a targeting build properties already existed. In that way you can pre create the build property file via the script above and disable the git-commit-id-maven-plugin plugin.

advancedxy avatar Mar 13 '24 13:03 advancedxy

I found an example for profiles. Also, an author of the plugin mentioned, that it may be disabled via -Dmaven.gitcommitid.skip=true. But I have no idea how to make it works like "skip only if file exists"... Anyway, with this command -Dmaven.gitcommitid.skip=true the issue is resolved, I'm sorry, I did not know about that option. In this case anyone, who wants to build Comet from source distribution (without .git folder) can do it with this single option.

May I make a PR with updates to README.md related to building from source (not from cloned repo)? It may save someone couple of hours of life :)

SemyonSinchenko avatar Mar 13 '24 16:03 SemyonSinchenko