nexus3-cli
Allow full downloads on large repos
Thanks for writing this. It has made repo transfers between instances a lot easier.
Added this around line 176 of nexus_client.py
try:
    content = response.json()
except json.decoder.JSONDecodeError:
    raise exception.NexusClientAPIError(response.content)
and got the following:
nexuscli.exception.NexusClientAPIError: b'ERROR: (ID 40b7714e-41ac-416a-8cbc-7fe8b7a0b639)
Failed to execute phase [query], all shards failed;
shardFailures {[pUq2BG92Raedvpiju7ztMg][215bfafbc6963d8c9f7ec9a57f88c6223b00dfa6][0]:
RemoteTransportException[[8A042D9D-8A28015E-AF29C30B-EC2AE858-7B66110B][local[1]][indices:data/read/search[phase/query]]]; nested:
QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [10050].
See the scroll api for a more efficient way to request large data sets.
This limit can be set by changing the [index.max_result_window]
index level parameter.]; }
I think this is an Elasticsearch issue inside Nexus, and I doubt they will fix it anytime soon.
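The limit in the error message can be illustrated with a little arithmetic. This is only a sketch: `fits_in_window` is a hypothetical helper (not part of nexus3-cli or Elasticsearch), and the page size of 50 is an assumption inferred from the reported value 10050 = 10000 + 50.

```python
# Value reported in the error message: from + size must be <= 10000.
MAX_RESULT_WINDOW = 10_000

# Assumed page size, inferred from "was [10050]"; not confirmed by the source.
PAGE_SIZE = 50


def fits_in_window(offset: int, size: int, window: int = MAX_RESULT_WINDOW) -> bool:
    """Return True if a from/size query stays within the result window."""
    return offset + size <= window


if __name__ == "__main__":
    # Last page that still fits under the limit.
    print(fits_in_window(9_950, PAGE_SIZE))
    # The request from the error: 10000 + 50 = 10050 exceeds the limit.
    print(fits_in_window(10_000, PAGE_SIZE))
```

Under these assumptions, any listing that has to page past the 10,000th result fails, regardless of how the files are laid out in directories.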
I wrote a quick wrapper that uses the CLI to download every subdirectory. It would be cool if you could implement a way to do this within the CLI,
e.g. grab the directory structure and download each subdirectory one by one.
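A wrapper along those lines might look like the sketch below. It is not the wrapper mentioned above: `build_dl_commands` and `download_all` are hypothetical names, the list of subdirectories is assumed to be supplied by the caller, and the only CLI form used is the `nexus3 dl <source>/ <destination>` invocation that appears in this thread. Whether the local directory layout is preserved per subdirectory is not verified here.

```python
import subprocess
from typing import Iterable, List


def build_dl_commands(repository: str, subdirs: Iterable[str], dest: str) -> List[List[str]]:
    """Build one `nexus3 dl` invocation per subdirectory (hypothetical helper)."""
    commands = []
    for subdir in subdirs:
        # Keep the trailing slash, mirroring the `nexus3 dl raw/ .` form in this thread.
        source = f"{repository}/{subdir.strip('/')}/"
        commands.append(["nexus3", "dl", source, dest])
    return commands


def download_all(repository: str, subdirs: Iterable[str], dest: str = ".") -> None:
    """Run the downloads one by one, stopping at the first failure."""
    for cmd in build_dl_commands(repository, subdirs, dest):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; the subdirectory names are made up.
    for cmd in build_dl_commands("raw", ["a", "b/c"], "."):
        print(" ".join(cmd))
```

Splitting command construction from execution keeps the sketch easy to dry-run before pointing it at a real instance.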
Thanks for the bug report @pbsladek. I'm looking to reproduce this locally - how big is too big? I'm assuming 10,001 files.
I have a feeling that the directory iteration strategy might still fail for directories with a number of files above 10k.
Nexus issue: https://issues.sonatype.org/browse/NEXUS-16917
To reproduce:
nexus3 repository create raw raw
from nexuscli import nexus_client, nexus_config
config = nexus_config.NexusConfig()
config.load()
c = nexus_client.NexusClient(config)
for i in range(10001):
    c.upload('/dev/null', f'raw/a{i}')
nexus3 dl raw/ .
@pbsladek the error can happen on a single directory with more than 10k files, so the strategy of breaking up downloads per directory won't always work, although it would probably cover most cases.
I might implement your suggestion but I'd like to think about this for a bit.
Hey, no problem. Thanks for taking a look.
I ran into the same issue you mentioned on our repos. I generated a list of file names and downloaded them individually. It probably wouldn't make sense for the CLI to manage that, though.
Hi, could this issue also affect migrating repositories from server A to server B, for example when doing a full download of all your components for a migration? Thanks.
Hi, @forgondolin. Yes, you would probably see this in an operation like the one you describe. I just looked at the [upstream issue](https://issues.sonatype.org/browse/NEXUS-16917) and Sonatype won't be fixing this any time soon. It might help if everyone who sees this issue goes there and upvotes the bug.
Meanwhile, use the workaround that @pbsladek suggested: break up your downloads into chunks of up to 10,000 files.
I'm happy to review PR contributions for a workaround, but I'm unlikely to implement one myself.
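The chunking part of that workaround could be sketched as follows. `chunked` is a hypothetical helper, and the path list is assumed to have been generated separately (as described earlier in the thread). How each batch is then downloaded is left out, since that depends on how the list was obtained.

```python
from typing import Iterator, List, Sequence

# Upper bound suggested in this thread.
MAX_CHUNK = 10_000


def chunked(paths: Sequence[str], size: int = MAX_CHUNK) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


if __name__ == "__main__":
    # Same shape as the reproduction above: 10,001 made-up paths.
    paths = [f"raw/a{i}" for i in range(10001)]
    print([len(batch) for batch in chunked(paths)])  # → [10000, 1]
```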