gatk
DDO-4137 update to read artifacts from GAR
JIRA ticket: https://broadworkbench.atlassian.net/browse/DDO-4137
Summary of changes: Updated the artifact source for this service to read artifacts from Google Artifact Registry (GAR) instead of JFrog.
What: Modified the config to pull artifacts from GAR. No publishing changes were made.
Why: This is part of Phase 2 of the artifact migration effort to deprecate JFrog and consolidate artifacts in GAR.
Testing these changes: Relied on passing CI checks and tests. Any issues with reading artifacts are expected to surface as pipeline failures. For example, these jobs here and here failed due to access issues (which have since been resolved).
@ungwudik Thanks for this. It doesn't work without more changes, though. Do you know how authentication is handled for it? Is this a Google repository? Is it public? I can't view anything at https://us-central1-maven.pkg.dev/dsp-artifact-registry/libs-snapshot or https://us-central1-maven.pkg.dev/dsp-artifact-registry/. Both just give me a 404.
I tried publishing there and I get:
Execution failed for task ':publishGatkPublicationToArtifactoryRepository'.
> Failed to publish publication 'gatk' to repository 'Artifactory'
   > Could not write to resource 'https://us-central1-maven.pkg.dev/dsp-artifact-registry/libs-snapshot/org/broadinstitute/gatk/4.6.2.0-7-g342fa26-SNAPSHOT/gatk-4.6.2.0-7-g342fa26-20250605.165244-1.jar'.
      > Broken pipe
Sorry, I had not intended to change the publishing part. This PR migrates only the "read" part. Publishing for services/repos will be addressed in a later phase by the teams that own them, with guidance from DevOps. There is a lot more detail and context in this document.
To check read access, please try curling a specific artifact, such as https://us-central1-maven.pkg.dev/dsp-artifact-registry/libs-snapshot/org/broadinstitute/gatk/4.6.0.0-20-g2f15726-SNAPSHOT/gatk-4.6.0.0-20-g2f15726-20241011.180722-1-javadoc.jar
OR
try accessing the repositories via the cloud console and let me know if you can view the pages. Please note that access may be delayed for some people until this ticket is completed.
https://console.cloud.google.com/artifacts/maven/dsp-artifact-registry/
https://console.cloud.google.com/artifacts/maven/dsp-artifact-registry/us-central1/libs-snapshot/
Thanks for reviewing!
@ungwudik If I understand correctly, it sounds like publishing doesn't work at this time? Why would we want to read from a repo if we can't publish other artifacts to it? We rely on the ability to publish temporary pre-release artifacts in order to test cross-repo dependencies before we do releases. Unless I'm misunderstanding, these PRs seem to break that. If this is correct, is there a reason we can't wait until the publish part is working?
Publishing should continue to work as expected. Nothing is expected to change on that front at this stage. We are approaching the artifact migration in phases due to the complexity of the overall project. Reading from GAR is relatively straightforward, whereas publishing involves more coordination and setup. That's why this phase focuses solely on migrating reads, while publishing will still go to JFrog for now.
This approach works because, in the previous phase of the migration, we set up a Kubernetes job that automatically replicates new artifacts published to JFrog into GAR within 5 minutes. This ensures consistency between the two systems during the transition.
We don't want to wait until the publish migration is ready because, once all services are confirmed to be reading from GAR, we want to safely delete historical artifacts from JFrog. This lets us begin realizing cost savings (one of the drivers for this migration project) sooner, even while publishing is still in transition.
This PR covers only the read migration. Updating publishing configurations will be handled in the next phase by the respective service or repository owners, with guidance from DSP DevOps.
This document may also be helpful for providing additional context.
Please let me know if you have any further questions or need more clarification.
@ungwudik I see - thanks for the explanation - I didn't see anything mentioning that replication job. It might be helpful to include a comment in the build.gradle files in these PRs stating that the interim state (continuing to publish artifacts to JFrog, but consuming them from GAR) works because of that out-of-band process.
Oh, I had not included the job link in that doc, but I've done so now. Here's the link as well: https://console.cloud.google.com/kubernetes/cronjob/us-central1-a/dsp-tools/artifactmigration/artifactmigration-migrate-cronjob/details?inv=1&invt=Ab0TtA&project=dsp-tools-k8s
I'll add some comments as suggested.
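For illustration, here is a minimal sketch of how the suggested comment might look in a build.gradle. It is not the actual change in this PR. The GAR URL and the repository name 'Artifactory' come from the thread above; the JFrog publishing URL is a hypothetical placeholder, and credentials are omitted.

```groovy
plugins {
    id 'java'
    id 'maven-publish'
}

repositories {
    mavenCentral()
    // INTERIM STATE (artifact migration, read phase):
    // Dependencies are READ from Google Artifact Registry (GAR), while
    // artifacts are still PUBLISHED to JFrog (see the publishing block).
    // This works because an out-of-band Kubernetes cron job replicates
    // newly published JFrog artifacts into GAR within about 5 minutes.
    maven {
        url = uri("https://us-central1-maven.pkg.dev/dsp-artifact-registry/libs-snapshot/")
    }
}

publishing {
    repositories {
        maven {
            name = "Artifactory"
            // Placeholder URL (assumption): keep whatever JFrog URL the
            // build already publishes to. Publishing is unchanged in this phase.
            url = uri("https://jfrog.example.invalid/libs-snapshot-local/")
        }
    }
}
```

Keeping the explanation next to the read repository means anyone who wonders why the read and publish URLs differ finds the reason without having to locate the migration document.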