athens icon indicating copy to clipboard operation
athens copied to clipboard

List endpoint excessively slow

Open 56KBs opened this issue 2 years ago • 15 comments

Describe the bug When completing a /list lookup on a dependency to identify all versions of that dependency, the lookup is consistently very slow.

To Reproduce Steps to reproduce the behaviour:

export ATHENS_STORAGE=/tmp/athens-storage
mkdir -p ${ATHENS_STORAGE}
docker run --rm -v $ATHENS_STORAGE:/var/lib/athens \
   -e ATHENS_DISK_STORAGE_ROOT=/var/lib/athens \
   -e ATHENS_STORAGE_TYPE=disk \                                                          
   --name athens-proxy \
   -p 3000:3000 \
   gomods/athens:v0.11.0
time curl http://localhost:3000/github.com/aws/aws-sdk-go-v2/@v/list -s -o /dev/null
0.01s user 0.01s system 0% cpu 31.045 total

Expected behaviour I would have expected the time for this to run to be similar to that of querying via go list -m -versions -json github.com/aws/aws-sdk-go-v2@latest, or querying proxy.golang.org directly (https://proxy.golang.org/github.com/aws/aws-sdk-go-v2/@v/list)

Environment (please complete the following information):

  • OS: [e.g. Linux 64bit]
  • Go version : 1.14.15
  • Proxy version : v0.11.0
  • Storage (fs/mongodb/s3 etc.) : fs

Additional context This in particular is causing us pain internally as we are using https://github.com/renovatebot/renovate to update dependencies internally.

Once we configured renovate to use our Athens proxy we experienced scan times increasing from ~12 minutes to ~2 hours - I believe this is due at least in part because of this /list endpoint.

56KBs avatar Oct 26 '21 15:10 56KBs

Having looked at this further, it appears to be the expected behaviour when running GOPROXY=direct go list -m -versions -json github.com/aws/aws-sdk-go-v2@latest. Proxying this request via proxy.golang.org reduces query time down to ~3 seconds, which is still seems excessive - Having looked at other issues in this project, it appears that caching the /list results is not a simple solution so likely won't happen?

56KBs avatar Oct 27 '21 08:10 56KBs

I can see the same issue trying to do a go get -u X for a private module hosted on github. Here are the results both with and without using athens (This is an athens server with 1CPU serving only this request).

Using athens: 25 seconds

< time GOPRIVATE=github.com/COMPANY GOPROXY=https://athens.internal.COMPANY go get -v -u github.com/COMPANY/PROJECT

real    0m25.567s
user    0m1.248s
sys     0m0.368s

Not using athens: 3 seconds

< time GOPRIVATE=github.com/COMPANY go get -v -u github.com/COMPANY/PROJECT

real    0m3.033s
user    0m1.057s
sys     0m0.403s

cep21 avatar Dec 13 '21 23:12 cep21

Having looked at other issues in this project, it appears that caching the /list results is not a simple solution so likely won't happen?

@56KBs caching is tricky in the most common (as of this writing) usages of Athens, which is what you saw in the issues you mentioned. However, when Athens is running in "offline" mode (which we recently introduced), caching /list is done by default. If you can run in that mode, this problem would be greatly reduced.

@cep21 in the test you did, did Athens have github.com/COMPANY/PROJECT in its storage backend?

arschles avatar Dec 14 '21 17:12 arschles

@arschles you were right I've modifed my command to the following

time GONOSUMDB=github.com/X GOPROXY=https://athens.internal.X.ai go get -u github.com/X/Y

Inside my S3 bucket I see the files for the module.

However, the time is still much slower.

Athens: real 0m31.357s No athens: real 0m2.293s

I did some experimentation: helm is pretty complex and downloads fast

< time GOPROXY=https://athens.X.internal.Y.ai go get -u helm.sh/helm/v3

real    0m3.657s
user    0m0.490s
sys     0m0.169s

But not as fast as no proxy by a factor of 10x

< time go get -u helm.sh/helm/v3

real    0m0.277s
user    0m0.478s
sys     0m0.126s

When I look at my athens logs while fetching go get -u a private module I see a spam of messages that look like this (but for a large number of modules):

        http-method=GET http-path=/golang.org/x/net/context/@v/list kind=Not Found module= operation=download.ListHandler ops=[download.ListHandler pool.List protocol.Li
st vcsLister.List] request-id=c6444470-74dc-4a8f-9618-eb0c6ae3d4c0 version=
INFO[5:46PM]: incoming request  http-method=GET http-path=/golang.org/x/net/context/@v/list http-status=404 request-id=c6444470-74dc-4a8f-9618-eb0c6ae3d4c0              INFO[5:46PM]: exit status 1: go list -m: module golang.org/x/net/http/httpguts: no matching versions for query "latest"
        http-method=GET http-path=/golang.org/x/net/http/httpguts/@v/list kind=Not Found module= operation=download.ListHandler ops=[download.ListHandler pool.List proto
col.List vcsLister.List] request-id=598783d9-3954-40d2-a795-dcf544686a6b version=
INFO[5:46PM]: incoming request  http-method=GET http-path=/golang.org/x/net/http/httpguts/@v/list http-status=404 request-id=598783d9-3954-40d2-a795-dcf544686a6b
INFO[5:46PM]: exit status 1: go list -m: module golang.org/x/net/http: no matching versions for query "latest"
        http-method=GET http-path=/golang.org/x/net/http/@v/list kind=Not Found module= operation=download.ListHandler ops=[download.ListHandler pool.List protocol.List
vcsLister.List] request-id=ac678d63-a26f-49de-8e5b-b95f614eb008 version=                                                                                                 INFO[5:46PM]: incoming request  http-method=GET http-path=/golang.org/x/net/http/@v/list http-status=404 request-id=ac678d63-a26f-49de-8e5b-b95f614eb008
INFO[5:46PM]: exit status 1: go list -m: module golang.org/x/net/idna: no matching versions for query "latest"
        http-method=GET http-path=/golang.org/x/net/idna/@v/list kind=Not Found module= operation=download.ListHandler ops=[download.ListHandler pool.List protocol.List
vcsLister.List] request-id=c528a115-f3f2-44fc-95ef-3fba0a277903 version=                                                                                                 INFO[5:46PM]: incoming request  http-method=GET http-path=/golang.org/x/net/idna/@v/list http-status=404 request-id=c528a115-f3f2-44fc-95ef-3fba0a277903
INFO[5:46PM]: exit status 1: go list -m: github.com/google@latest: invalid github.com/ import path "github.com/google"
        http-method=GET http-path=/github.com/google/@v/list kind=Not Found module= operation=download.ListHandler ops=[download.ListHandler pool.List protocol.List vcsL
ister.List] request-id=c62adce9-55a1-400f-a990-031e8aa9ad16 version=                                                                                                     INFO[5:46PM]: incoming request  http-method=GET http-path=/github.com/google/@v/list http-status=404 request-id=c62adce9-55a1-400f-a990-031e8aa9ad16
INFO[5:46PM]: exit status 1: go list -m: module golang.org/x/time/rate: no matching versions for query "latest"
        http-method=GET http-path=/golang.org/x/time/rate/@v/list kind=Not Found module= operation=download.ListHandler ops=[download.ListHandler pool.List protocol.List
 vcsLister.List] request-id=7723047b-36bd-4ff3-85a4-9cee69242562 version=

General question: When you run athens, and try to run go get -u on a complex private module with many dependencies, do you see a similar slowdown?

cep21 avatar Dec 14 '21 17:12 cep21

I think the delays go back to /list since I see these timings when I curl via helm, and I see the /list calls in my logs

Athens:

< time curl https://athens.X.internal.Y.ai/helm.sh/helm/@v/list > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2129    0  2129    0     0    588      0 --:--:--  0:00:03 --:--:--   588

real    0m3.629s
user    0m0.014s
sys     0m0.007s

Regular:

< time curl https://proxy.golang.org/helm.sh/helm/@v/list > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2267  100  2267    0     0  42082      0 --:--:-- --:--:-- --:--:-- 42773

real    0m0.063s
user    0m0.014s
sys     0m0.009s

cep21 avatar Dec 14 '21 17:12 cep21

Reading a bit about the /list difficulties, is it possible to do all /list calls via proxy.golang.org first, and if that's a 404, then use private auth from the proxy's configuration? I think that would give me the best of both worlds.

cep21 avatar Dec 14 '21 18:12 cep21

Maybe another actionable artifact is a FAQ around how to performantly run athens for private setups?

cep21 avatar Dec 14 '21 18:12 cep21

My first thought was to try something like this

downloadURL = "https://proxy.golang.org"
mode = "async_redirect"
download "github.com/COMPANY/*" {
    mode = "sync"
}

But the problem is that the proxy still tries to do list requests rather than sending them to proxy.golang.or

cep21 avatar Dec 15 '21 04:12 cep21

@56KBs caching is tricky in the most common (as of this writing) usages of Athens, which is what you saw in the issues you mentioned. However, when Athens is running in "offline" mode (which we recently introduced), caching /list is done by default. If you can run in that mode, this problem would be greatly reduced.

From the documentation this suggests that in offline mode it'd never reach out to VCS to identify new versions - Is this correct?

56KBs avatar Dec 17 '21 09:12 56KBs

Have you found any solution aroudn this @56KBs ? I run athens offline, and get exactly the same issue. It always forces to use the list on every dependency and wait for it to time out. It takes an hour, what should be done in a matter of seconds

elamre avatar Mar 09 '22 10:03 elamre

Hi everyone 👋🏼 apologies for the delay here.

The /list endpoint goes through the VCS every time to see if there's a new version that's been published. That's why it's slow. Proxy.golang.org caches those results which means often you don't see new versions for however long the cache is (unless you explicitly request the new version directly and the proxy will add it to the list).

Within Athens, you have a couple of solutions:

  1. Athens sets GOPROXY=direct by default, which you can set to be proxy.golang.org|direct via this config. This will mean you get the same performance as proxy.golang.org (let me know if you don't and we can debug why).
  2. If you are in an offline mode like @elamre , you can set this config to offline and the VCS will never be reached out, and therefore you should see the /list endpoint to be very fast, or as fast as your storage listing method is. Again, let me know if that's not the case.

marwan-at-work avatar Mar 09 '22 19:03 marwan-at-work

Thanks for the suggestions @marwan-at-work. I'm not sure either of those is a solution for my team though.

We're currently utilizing a combination of Athens and Artifactory. What we've found to be the case is that the size of the repo is impacting Artifactories ability to retrieve the modules from GitLab via Athens. Due to the size, Artifactory hangs up after several seconds and never gets the module. This results in requests for specific versions of modules to work fine, but un-versioned requests force it to iterate the entire tree to get the version list and @latest, and ultimately leads Artifactory to hang up, even though Athens internally is still processing the request and eventually returns a 200.

By cleaning up the repository history some, I was able to get response times down somewhat, and now un-versioned calls for modules appears to mostly work. What still doesn't work is versioned requests for modules that are branch name versions. Athens eventually completes the @branchname request and caches the pseudo version, but by then Artifactory has already hung up. So if there's a way to improve the performance in this scenario that isn't the above two options, that would be great to hear.

Thanks

umdstu avatar Jul 07 '22 15:07 umdstu

This should be fixed now. Can you try with https://github.com/gomods/athens/releases/tag/v0.12.0-rc1

manugupt1 avatar Mar 29 '23 04:03 manugupt1

@manugupt1 FWIW this is not fixed, we still observe excessive list times with generous resources.

Screenshot 2023-09-14 at 17 26 26

uhthomas avatar Sep 14 '23 16:09 uhthomas