ASF CI snapshot publishing fails occasionally with network failures

Open dweiss opened this issue 7 months ago • 4 comments

Description

This is odd. The publishing component does have an internal retry (3 times, 1 sec. interval):

https://github.com/gradle/gradle/blob/aeccb00494ab0b279603de9eeb709f7aa899e5fe/platforms/software/dependency-management/src/main/java/org/gradle/api/internal/artifacts/repositories/transport/NetworkOperationBackOffAndRetry.java#L26

I can't see the retry messages in the output because they're logged at info level. I'll enable info-level logging to confirm it's really retrying the publish. If the retry is working as expected, I'll manually tweak the interval and/or the number of retries.

dweiss • May 14 '25 05:05

It does retry uploading. https://ci-builds.apache.org/job/Lucene/job/Lucene-Maven-Snapshots-main/2217/console

Uploading lucene-expressions-11.0.0-20250514.070312-231.jar to /content/repositories/snapshots/org/apache/lucene/lucene-expressions/11.0.0-SNAPSHOT/lucene-expressions-11.0.0-20250514.070312-231.jar
Uploading lucene-expressions-11.0.0-20250514.070312-231.pom to /content/repositories/snapshots/org/apache/lucene/lucene-expressions/11.0.0-SNAPSHOT/lucene-expressions-11.0.0-20250514.070312-231.pom
Uploading lucene-expressions-11.0.0-20250514.070312-231-sources.jar to /content/repositories/snapshots/org/apache/lucene/lucene-expressions/11.0.0-SNAPSHOT/lucene-expressions-11.0.0-20250514.070312-231-sources.jar
Uploading lucene-expressions-11.0.0-20250514.070312-231-javadoc.jar to /content/repositories/snapshots/org/apache/lucene/lucene-expressions/11.0.0-SNAPSHOT/lucene-expressions-11.0.0-20250514.070312-231-javadoc.jar
Error in 'PUT https://repository.apache.org/content/repositories/snapshots/org/apache/lucene/lucene-expressions/11.0.0-SNAPSHOT/lucene-expressions-11.0.0-20250514.070312-231-javadoc.jar'. Waiting 1000ms before next retry, 2 retries left
Error in 'PUT https://repository.apache.org/content/repositories/snapshots/org/apache/lucene/lucene-expressions/11.0.0-SNAPSHOT/lucene-expressions-11.0.0-20250514.070312-231-javadoc.jar'. Waiting 2000ms before next retry, 1 retries left
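
The two messages above (1000ms with 2 retries left, then 2000ms with 1 retry left) are consistent with a retry loop that allows three attempts and doubles the wait after each failure. A minimal sketch of such a loop, for illustration only: this is not Gradle's code, and run_with_retry, its parameters, and the choice of OSError as the retryable error are all hypothetical.

```python
import time


def run_with_retry(operation, max_attempts=3, initial_backoff_ms=1000, sleep=time.sleep):
    """Call operation(); on a network-style error wait, double the wait, retry.

    Hypothetical helper: the defaults mirror the values visible in the log
    above, not a documented API.
    """
    backoff_ms = initial_backoff_ms
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OSError as error:
            retries_left = max_attempts - attempt
            if retries_left == 0:
                # Out of attempts: let the last error fail the build.
                raise
            print(f"Error: {error}. Waiting {backoff_ms}ms before next retry, "
                  f"{retries_left} retries left")
            sleep(backoff_ms / 1000)
            backoff_ms *= 2  # assumed doubling, matching 1000ms -> 2000ms in the log
```

With these defaults a persistent failure gives up after about three seconds of waiting in total, which would explain why a short repository outage is enough to fail the job.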

dweiss • May 14 '25 07:05

The other build did succeed after a retry:

Error in 'PUT https://repository.apache.org/content/repositories/snapshots/org/apache/lucene/lucene-backward-codecs/maven-metadata.xml.sha256'. Waiting 1000ms before next retry, 2 retries left
Successfully ran 'PUT https://repository.apache.org/content/repositories/snapshots/org/apache/lucene/lucene-backward-codecs/maven-metadata.xml.sha256' after 1 retries

dweiss • May 14 '25 07:05

I've added -Dorg.gradle.internal.network.retry.initial.backOff=5000 -Dorg.gradle.internal.network.retry.max.attempts=20 to these configurations on Jenkins. Let's see if this helps.
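
To get a feel for what these two values mean in wall-clock time, here is a rough calculation. It assumes the wait simply doubles after every failed attempt with no upper cap, which is what the 1000ms and 2000ms log lines suggest but which this thread does not confirm; backoff_schedule_ms is a hypothetical helper, not part of Gradle.

```python
def backoff_schedule_ms(initial_backoff_ms, max_attempts):
    """Waits between consecutive attempts, assuming uncapped doubling."""
    # There is one wait between each pair of attempts: max_attempts - 1 waits.
    return [initial_backoff_ms * 2 ** i for i in range(max_attempts - 1)]


defaults = backoff_schedule_ms(1000, 3)   # the behaviour seen in the logs
tuned = backoff_schedule_ms(5000, 20)     # the values now set on Jenkins

print("default waits (ms):", defaults, "total:", sum(defaults))
print("tuned, first 8 waits (ms):", tuned[:8], "total:", sum(tuned[:8]))
print("tuned, all waits, total (ms):", sum(tuned))
```

Under that assumption the first eight waits of the tuned schedule add up to about 21 minutes (5000 × 255 ms), while all nineteen would take roughly 30 days, so in practice the job timeout rather than max.attempts would end a persistently failing upload.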

dweiss • May 14 '25 07:05

I also let infra know: https://issues.apache.org/jira/browse/INFRA-26821

dweiss • May 14 '25 07:05