conan
conan copied to clipboard
[bug] Conan fails return error if server returns HTTP/502 (Bad Gateway) or HTTP/503 (Service Unavail)
My conan server frequently returns 502 (Bad Gateway) error. Sometimes 503 too. It's due to low memory I think. However, conan command in this case fails to return not zero.
In order to get to success. I run it several times in the loop. Like this:
for i in $(seq 1 5); do conan upload $PKG_NAME_FULL --all -r=psl-conan -c --retry 5 --retry-wait 15 && break sleep 15 && false done
This set of commands shall have failed status in the end if 502 or 503 error happens. However, in the log I'm getting this:
+for i in $(seq 1 5)
+conan upload openssh/8.4p1 --all -r=psl-conan -c --retry 5 --retry-wait 15
Uploading to remote 'psl-conan':
Uploading openssh/8.4p1 to remote 'psl-conan'
Compressing recipe...
Recipe is up to date, upload skipped
Uploading package 1/1: 141d689e77b3d2f372bf3bcf3398ce7b577a789f to 'psl-conan'
Compressing package...
Uploading conan_package.tgz -> openssh/8.4p1:141d
Waiting 15 seconds to retry...
Waiting 15 seconds to retry...
Uploading conaninfo.txt -> openssh/8.4p1:141d
Uploading conanmanifest.txt -> openssh/8.4p1:141d
ERROR: 502 Server Error: Bad Gateway for url: https://artifactory-cpp.dev.prismsl.net/api/conan/conan/v2/conans/openssh/8.4p1///revisions/79b6d3b2792a65b1e4d5a8f05804fe64/packages/141d689e77b3d2f372bf3bcf3398ce7b577a789f/revisions/bbd3ab2ef2e4d373cf5e43081acb86dd/files/conan_package.tgz
ERROR: 503 Server Error: Service Unavailable for url: https://artifactory-cpp.dev.prismsl.net/api/conan/conan/v2/conans/openssh/8.4p1///revisions/79b6d3b2792a65b1e4d5a8f05804fe64/packages/141d689e77b3d2f372bf3bcf3398ce7b577a789f/revisions/bbd3ab2ef2e4d373cf5e43081acb86dd/files/conan_package.tgz
+break
The command "./upload.sh" exited with 0.
Environment Details (include every applicable attribute)
- Operating System+version: Ubuntu 18.04
- Compiler+version: GCC, but Not applicable
- Conan version: 1.29.2
- Python version: 3.6.9 Conan server: Artifactory Community Edition
Steps to reproduce (Include if Applicable)
See script above.
Logs (Executed commands with output) (Include/Attach if Applicable)
See above.
I think for some reason handling of these errors (502/503) is not done properly. Executable shall return not zero code if such error happens, but it returns zero for some reason.
Hi @Yurmusp, Thanks a lot for reporting.
- For the Artifactory errors, have you tried to change the JVM Parameters by setting JAVA_OPTIONS in
$JFROG_HOME/artifactory/var/etc/system.yaml? If that's memory related maybe increasing the maximum size would help with that. - For the return code, I have not been able to reproduce the issue as for me the return code seems correct. Are you running the
upload.shin a CI environment like Jenkins or Travis? Running the conan command outside that environment is producing the same return code?
Hi Carlos,
- I need to increase physical instance RAM. So, it's a bit more work and that's why it's not done yet. However, it's not the point of the issue and I think the server side is out of scope besides that it is failing in it's own way. I complain not about the server side, but client side.
- Return code is correct when everything is successful. However, when web request returns an error then this error is not propagated to proper process exit code. It's a bug in logic. Maybe not all http error codes are handled or exception is hidden or something similar.
Bug is definitely in conan client executed code and it's frustrating that it is not working properly.
I experience this code both in Travis (where CI build is) and in my vagrant vm when I build locally on my laptop. This bug is pretty consistent if the server side fails consistently. I would say it is predetermined. I did not notice unstable behavior. It stably fails on the client side if server side fails. I'v run into it many times.
In all cases everything is deployed by the same script in Ubuntu 18.04 VM.
On Tue, Jan 5, 2021 at 5:32 AM Carlos Zoido [email protected] wrote:
Hi @Yurmusp https://github.com/Yurmusp, Thanks a lot for reporting.
- For the Artifactory errors, have you tried to change the JVM Parameters by setting JAVA_OPTIONS in $JFROG_HOME/artifactory/var/etc/system.yaml? If that's memory related maybe increasing the maximum size https://www.jfrog.com/confluence/display/JFROG/Installing+Artifactory would help with that.
- For the return code, I have not been able to reproduce the issue as for me the return code seems correct. Are you running the upload.sh in a CI environment like Jenkins or Travis? Running the conan command outside that environment is producing the same return code?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/conan-io/conan/issues/8287#issuecomment-754638245, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFQTGQWXKC6JLKIWZLXVXYTSYMIG7ANCNFSM4VUH5NJQ .
Hi @Yurmusp
Sorry, it was not fully clear. Can you confirm that when you execute this command directly in the command line, without script, you don't get an error exit code?
$ conan upload $PKG_NAME_FULL --all -r=psl-conan -c
This is what we cannot reproduce, it seems that Conan is returning a non zero exist code, and is working correctly.
From the output you are copying above, it seems it is your script which is not exiting with an error code: The command "./upload.sh" exited with 0. This could be because the shell script doesn't seem to be forwarding the error code from Conan to the caller of the upload.sh script.
Hi James,
In the end of conan command-line, please observe this part: ' && break '
This part only executes if conan was successful. i.e. exited with 0 exit code. In the original log above with shell tracing, you can clearly see that 'break' part of command is executed. It is shown as +break
'break' shall NOT execute in case if conan failed. When it fails, it shall do sleep 15 && false and when it does it last time it will make whole script to fail. This worked in multiple scenarios, but not with scenario I'm describing.
Another peculiarity here, is that recipe is not uploading (it's already up to date), but package is starting to upload. See log for details. May be this part complicates things and make it fail.
Yuriy
Ok, thanks for the explanation, let's us have another look then.
Hmm,
May be it's because of retry and one of retries succeeds... So, it prints error when it gets it, but finally succeeds. However, it's not clear if it is due to order of output. Which likely is not realtime as errors might go to stderr.
Last scenario is harder to check.
I have been able to reproduce the behavior above:
def test_upload_error_503():
# https://github.com/conan-io/conan/issues/8287
class Connection503Error(TestRequester):
counter = 0
def put(self, *args, **kwargs):
self.counter += 1
if self.counter == 3:
from requests.exceptions import HTTPError
raise HTTPError(503, "This is an error msg!!!")
else:
return super(Connection503Error, self).put(*args, **kwargs)
client = TestClient(requester_class=Connection503Error, default_server_user=True)
client.save({"conanfile.py": GenConanfile()})
client.run('user -p password -r default user')
client.run('create . lib/1.0@user/channel')
client.run('upload * -c --all -r default --retry-wait=0')
print(client.out)
assert "ERROR: [Errno 503] This is an error msg!!!" in client.out
It is that, one of the retries fails, but the next is able to succeed, so the whole command is considered successful and exits with return code 0.
You are right it is printing some "Error" messages, even if the final result is not an error. This is a known issue, but can't be easily changed at the moment (Conan 1.X), without breaking, but it will definitely be changed in Conan 2.0 (some work has already be done to solve this).
If this is true, it is not a bug, it is behaving correctly, and it would be a UX issue to improve, sending it to stderr instead (already planned for Conan 2.0)
Ok. Thanks. Sorry for bothering. Did not understand it right away.
On Tue, Jan 5, 2021 at 2:24 PM James [email protected] wrote:
I have been able to reproduce the behavior above:
def test_upload_error_503(): # https://github.com/conan-io/conan/issues/8287 class Connection503Error(TestRequester): counter = 0
def put(self, *args, **kwargs): self.counter += 1 if self.counter == 3: from requests.exceptions import HTTPError raise HTTPError(503, "This is an error msg!!!") else: return super(Connection503Error, self).put(*args, **kwargs) client = TestClient(requester_class=Connection503Error, default_server_user=True) client.save({"conanfile.py": GenConanfile()}) client.run('user -p password -r default user') client.run('create . lib/1.0@user/channel') client.run('upload * -c --all -r default --retry-wait=0') print(client.out) assert "ERROR: [Errno 503] This is an error msg!!!" in client.outIt is that, one of the retries fails, but the next is able to succeed, so the whole command is considered successful and exits with return code 0.
You are right it is printing some "Error" messages, even if the final result is not an error. This is a known issue, but can't be easily changed at the moment (Conan 1.X), without breaking, but it will definitely be changed in Conan 2.0 (some work has already be done to solve this).
If this is true, it is not a bug, it is behaving correctly, and it would be a UX issue to improve, sending it to stderr instead (already planned for Conan 2.0)
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/conan-io/conan/issues/8287#issuecomment-754938896, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFQTGQTD3DQ2RC4B7XBZBKLSYOGQ5ANCNFSM4VUH5NJQ .
I ran into a similar issue (periodical 502 or 503 HTTP errors in CI jobs). Artifactory docs indicate that it handles at most 200 active HTTP connections by default:
Artifactory runs on Tomcat which uses the Apache HTTP client to handle HTTP requests. Tomcat maintains a pool of threads and allocates a single thread to handle each incoming request. As the pool of threads gets depleted, Tomcat will create more (up to the maximum specified by the maxThreads attribute) to handle additional requests that come in. If the number of requests exceeds the maxThreads value, requests will be queued (up to the maximum specified by the acceptCount attribute). Should this parameter be too high, requests waiting for a free thread may experience delays until threads become available to handle them. Should the acceptCount limit also be reached, Tomcat will throw “connection refused” errors for any additional requests until the required resources are eventually released. In turn, any new request, uploads, downloads, updates, etc., which could not acquire a thread, will fail. To avoid these performance hits and errors, you can increase the values of these attributes in the Tomcat’s server.xml file. The default value for maxThreads is 200, while acceptCount is set to 100.
Using monitoring method described in https://jfrog.com/blog/monitoring-and-optimizing-artifactory-performance/ I see that conan search -r creates 12 concurrent HTTP connections while single conan install generates over 300(!) connections:
Sat Jul 23 05:08:00 UTC 2022 HTTP Connections: 5
Sat Jul 23 05:08:05 UTC 2022 HTTP Connections: 48
Sat Jul 23 05:08:10 UTC 2022 HTTP Connections: 71
Sat Jul 23 05:08:15 UTC 2022 HTTP Connections: 114
Sat Jul 23 05:08:20 UTC 2022 HTTP Connections: 178
Sat Jul 23 05:08:25 UTC 2022 HTTP Connections: 192
Sat Jul 23 05:08:30 UTC 2022 HTTP Connections: 308
Sat Jul 23 05:08:35 UTC 2022 HTTP Connections: 376
Sat Jul 23 05:08:40 UTC 2022 HTTP Connections: 381
Sat Jul 23 05:08:45 UTC 2022 HTTP Connections: 381
Sat Jul 23 05:08:50 UTC 2022 HTTP Connections: 381
Sat Jul 23 05:08:55 UTC 2022 HTTP Connections: 381
Sat Jul 23 05:09:00 UTC 2022 HTTP Connections: 380
Sat Jul 23 05:09:05 UTC 2022 HTTP Connections: 342
Sat Jul 23 05:09:10 UTC 2022 HTTP Connections: 309
Sat Jul 23 05:09:15 UTC 2022 HTTP Connections: 268
Sat Jul 23 05:09:20 UTC 2022 HTTP Connections: 228
Sat Jul 23 05:09:25 UTC 2022 HTTP Connections: 204
Sat Jul 23 05:09:30 UTC 2022 HTTP Connections: 110
Sat Jul 23 05:09:35 UTC 2022 HTTP Connections: 6
Sat Jul 23 05:09:40 UTC 2022 HTTP Connections: 1
I have general.parallel_download=8 in Conan but 380 connections seems to be to much.
Hi @db4
I think this high number of connections would deserve its own issue. Would you like to create a new issue with the above, please?
It would be useful to detail how did you obtain those traces, and also to test the same, but without general.parallel_download. Also, detail the conan version, just in case, and any other detail that could be relevant, if you have some non-standard installation, or something like that, both in the client or in the server. Thanks!
@memsharded I think I've sorted it down. It's not Conan client but nginx that I use as a HTTP redirector to the actual Artifactory instance. For the record, I've configured it the following way:
upstream artifactory_backend {
server artifactory1.my_internal_domain:8082;
}
server {
server_name artifactory.my_internal_domain;
client_max_body_size 1G;
location / {
resolver dns.my_internal_domain ipv6=off;
proxy_buffering off;
proxy_request_buffering off;
proxy_pass http://artifactory_backend;
proxy_set_header Host $host;
keepalive_timeout 0;
}
}
nginx seems to generate a lot of concurrent HTTP connections...
Thanks for the feedback @db4 and following up. I am not nginx expert, difficult for me to help there.
Couple of possibilities:
- Maybe @davidsanfal knows a bit more than me, can you please have a quick look?
- If you use an Artifactory that has included support, I'd submit a ticket to that support, because this sounds mostly an advanced Artifactory setup issue.
@memsharded The Recommended Nginx Reverse Proxy Configuration for Artifactory 7 seems to works for me without issues. I wish I looked into the Jfrog knowledge base before asking questions here...
As the above issue has been solved in 2.0 (messages are directed to stderr by default), and the recent issue seems to be solved with Artifactory recommended configuration, I think this issue can be closed, thanks!