strimzi-kafka-operator

KafkaConnect build does not use custom repository for parent maven dependency resolution

Open moronito opened this issue 1 year ago • 12 comments

Describe the bug When specifying a custom repository for a Maven dependency in the build section of a KafkaConnect resource, the custom repository is not used to resolve the parent pom.

To Reproduce Steps to reproduce the behavior:

  1. Create a KafkaConnect resource like the following:
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaConnect
    metadata:
      name: test
    spec:
      replicas: 1
      bootstrapServers: kafka-bootstrap:9092
      build:
        output:
          type: docker
          image: registry.intesys.it/test/test-connect:latest
          pushSecret: registry-test-creds
        plugins:
          - name: protobuf
            artifacts:
              - type: maven
                repository: https://packages.confluent.io/maven
                group: io.confluent
                artifact: kafka-connect-protobuf-converter
                version: 7.2.1
    
  2. See error:
    INFO[0015] Running: [/bin/sh -c 'curl' '-f' '-L' '--create-dirs' '--output' '/tmp/protobuf/112dfc4c/pom.xml' 'https://packages.confluent.io/maven/io/confluent/kafka-connect-protobuf-converter/7.2.1/kafka-connect-protobuf-converter-7.2.1.pom'       && 'mvn' 'dependency:copy-dependencies' '-DoutputDirectory=/tmp/artifacts/protobuf/112dfc4c' '-f' '/tmp/protobuf/112dfc4c/pom.xml'       && 'curl' '-f' '-L' '--create-dirs' '--output' '/tmp/artifacts/protobuf/112dfc4c/kafka-connect-protobuf-converter-7.2.1.jar' 'https://packages.confluent.io/maven/io/confluent/kafka-connect-protobuf-converter/7.2.1/kafka-connect-protobuf-converter-7.2.1.jar'] 
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
     6877  100  6877    0     0   291k      0 --:--:-- --:--:-- --:--:--  291k
    [INFO] Scanning for projects...
    Downloading from central: https://repo.maven.apache.org/maven2/io/confluent/kafka-schema-registry-parent/7.2.1/kafka-schema-registry-parent-7.2.1.pom
    [ERROR] [ERROR] Some problems were encountered while processing the POMs:
    [FATAL] Non-resolvable parent POM for io.confluent:kafka-connect-protobuf-converter:7.2.1: Could not find artifact io.confluent:kafka-schema-registry-parent:pom:7.2.1 in central (https://repo.maven.apache.org/maven2) and 'parent.relativePath' points at wrong local POM @ line 6, column 13
     @ 
    [ERROR] The build could not read 1 project -> [Help 1]
    [ERROR]   
    [ERROR]   The project io.confluent:kafka-connect-protobuf-converter:7.2.1 (/tmp/protobuf/112dfc4c/pom.xml) has 1 error
    [ERROR]     Non-resolvable parent POM for io.confluent:kafka-connect-protobuf-converter:7.2.1: Could not find artifact io.confluent:kafka-schema-registry-parent:pom:7.2.1 in central (https://repo.maven.apache.org/maven2) and 'parent.relativePath' points at wrong local POM @ line 6, column 13 -> [Help 2]
    

Expected behavior I would expect the custom repository to be used for all the resolutions of the maven artifacts, including the parent pom and the transitive dependencies. A better alternative could be to allow specifying an array of repositories, all of which are used for artifact resolution.

Environment (please complete the following information):

  • Strimzi version: 0.30.0
  • Installation method: YAML files
  • Kubernetes cluster: Kubernetes 1.24.4
  • Infrastructure: On-premise cluster created with kubeadm

moronito avatar Sep 01 '22 09:09 moronito

I don't think this feature really works the way you expect it to. It pulls the pom.xml file from the custom location and then pulls its dependencies through Maven. You can check the generated Dockerfile to see the exact commands.
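To illustrate the flow described above, here is a minimal sketch of how the download URLs seen in the build log follow from the Maven coordinates in the KafkaConnect resource. The helper name `maven_artifact_url` is hypothetical and is not taken from the Strimzi code; it only applies the standard Maven repository layout (group dots become path segments). The commented steps at the end mirror the commands from the log and are not executed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Strimzi): builds the URL of an artifact file
# from its Maven coordinates, following the standard repository layout.
maven_artifact_url() {
  repo="$1"; group="$2"; artifact="$3"; version="$4"; ext="$5"
  # io.confluent -> io/confluent
  group_path=$(printf '%s' "$group" | tr '.' '/')
  printf '%s/%s/%s/%s/%s-%s.%s\n' \
    "${repo%/}" "$group_path" "$artifact" "$version" "$artifact" "$version" "$ext"
}

REPO="https://packages.confluent.io/maven"
POM_URL=$(maven_artifact_url "$REPO" io.confluent kafka-connect-protobuf-converter 7.2.1 pom)
JAR_URL=$(maven_artifact_url "$REPO" io.confluent kafka-connect-protobuf-converter 7.2.1 jar)

echo "$POM_URL"
echo "$JAR_URL"

# The build in the log then runs three steps (shown here as comments only):
#   1. curl -f -L --create-dirs --output /tmp/protobuf/112dfc4c/pom.xml "$POM_URL"
#   2. mvn dependency:copy-dependencies -DoutputDirectory=/tmp/artifacts/protobuf/112dfc4c -f /tmp/protobuf/112dfc4c/pom.xml
#   3. curl -f -L --create-dirs --output /tmp/artifacts/protobuf/112dfc4c/kafka-connect-protobuf-converter-7.2.1.jar "$JAR_URL"
# Only steps 1 and 3 use the custom repository; step 2 runs Maven with its default
# repositories, which is why the parent pom is looked up in Maven Central.
```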

scholzj avatar Sep 01 '22 10:09 scholzj

It seems indeed to do so, but before trying to download the dependencies it tries to resolve the parent pom. While doing this the operator does not use the custom repository, but instead uses Maven Central (as you can see in the logs). At least this is what I understood by looking at the logs.

I am using the feature in order to build a Kafka Connect image with Protobuf serialization support. In order to do so I could manually specify the main jar (referenced by the artifact kafka-connect-protobuf-converter) and its dependencies, or I could just specify kafka-connect-protobuf-converter as the only artifact and let the operator download its dependencies as specified in its pom.xml file. Is my expectation correct or am I missing something?

moronito avatar Sep 01 '22 11:09 moronito

Well, given how it works, the support for third-party repositories is clearly limited. I think this is a valid point and something that could be enhanced in the future. But as the code looks today, it will not work the way you are trying to use it. I think @sknot-rh implemented this originally. I'm not sure if this is a bug or if the use case was originally different, TBH.

scholzj avatar Sep 01 '22 11:09 scholzj

Ok, now I understand your point. Since full support for third party repositories would be very useful for me, I will try to look at the code and see if I can do something. In the meantime thanks for your help.

moronito avatar Sep 01 '22 11:09 moronito

I have the same issue when trying to get a JAR from a third-party repository. @scholzj, you said the support for third-party repositories was designed to cover only the retrieval of the initial pom/library, not the whole Maven process? If so, it is incomplete, as Maven libraries often have their parent pom and dependencies in the same repository.

Is a fix for this planned ?

loicmathieu avatar Sep 15 '22 14:09 loicmathieu

If you want to contribute some improvements, we are for sure open to it. I'm not aware of anyone else planning to work on this anytime soon, but I would expect this to be kept as enhancement or bug until someone finds some time for it.

scholzj avatar Sep 15 '22 14:09 scholzj

@scholzj can you point me to the code so I can have a look ? I'm not sure I'll be able to work on it but I can try.

loicmathieu avatar Sep 15 '22 14:09 loicmathieu

I do not know off the top of my head where the logic for this is, sorry. The Dockerfile used in the build is generated here: https://github.com/strimzi/strimzi-kafka-operator/blob/main/cluster-operator/src/main/java/io/strimzi/operator/cluster/model/KafkaConnectDockerfile.java

scholzj avatar Sep 15 '22 14:09 scholzj

I guess this part does the whole thing for Maven with repositories: https://github.com/strimzi/strimzi-kafka-operator/blob/main/cluster-operator/src/main/java/io/strimzi/operator/cluster/model/KafkaConnectDockerfile.java#L145-L169

scholzj avatar Sep 15 '22 14:09 scholzj

Hi all, I have developed something partially working. @loicmathieu if you want I can share with you what I've found so far.

moronito avatar Sep 15 '22 14:09 moronito

@moronito yes please

loicmathieu avatar Sep 15 '22 14:09 loicmathieu

@loicmathieu the only way I've found to make Maven always consider a third-party repo is to create an appropriate maven-settings.xml with the extra repos and then run the mvn command with the -s flag. See this gist for an example of a pom and a related maven-settings.xml. Unfortunately, I didn't have enough time to integrate the generation of the maven-settings.xml file into the Strimzi code.

I hope this can be of help to you.

Edit: this approach could possibly be extended to include multiple third party repositories, by including all of them into the generated settings file.
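A minimal sketch of that approach, under assumptions: the helper name `write_maven_settings`, the profile id `extra-repos`, the repository ids `custom-N`, and the output path are all made up for illustration and are not taken from the gist or from Strimzi. It writes a settings file with one always-active profile that lists every repository URL passed in, then prints (without running) the mvn command from the log with `-s` added.

```shell
#!/bin/sh
# Hypothetical helper: writes a Maven settings file whose single active profile
# declares every repository URL given after the output path.
write_maven_settings() {
  out="$1"; shift
  {
    printf '%s\n' '<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">' \
      '  <profiles>' '    <profile>' '      <id>extra-repos</id>' '      <repositories>'
    i=0
    for url in "$@"; do
      printf '        <repository>\n          <id>custom-%s</id>\n          <url>%s</url>\n        </repository>\n' "$i" "$url"
      i=$((i + 1))
    done
    printf '%s\n' '      </repositories>' '    </profile>' '  </profiles>' \
      '  <activeProfiles>' '    <activeProfile>extra-repos</activeProfile>' '  </activeProfiles>' \
      '</settings>'
  } > "$out"
}

# Assumed location for the generated file; any writable path works.
SETTINGS_FILE="${TMPDIR:-/tmp}/maven-settings-example.xml"
write_maven_settings "$SETTINGS_FILE" "https://packages.confluent.io/maven"

# Printed rather than executed: the mvn step from the build log, with -s added so
# that the parent pom and transitive dependencies can also be looked up in the
# repositories listed in the settings file.
echo "mvn -s $SETTINGS_FILE dependency:copy-dependencies -DoutputDirectory=/tmp/artifacts/protobuf/112dfc4c -f /tmp/protobuf/112dfc4c/pom.xml"
```

Passing more URLs to the helper adds one `<repository>` entry per URL, which corresponds to the multiple-repository extension mentioned in the edit above.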

moronito avatar Sep 15 '22 17:09 moronito

Triaged on the community call on 3rd November: This should be improved / fixed, and all artifacts should be pulled from the custom repo.

scholzj avatar Nov 03 '22 16:11 scholzj