ucx
UCT/TCP: Use SIOCGIFCONF ioctl when /sys/class/net is missing.
What
This change provides alternative code that uses the SIOCGIFCONF
ioctl to get the names of the available TCP network interfaces.
Why ?
In some cases, such as isolated build environments (as found in GNU Guix), containers, or non-Linux systems, /sys is missing.
How ?
Using the old, portable SIOCGIFCONF ioctl.
It may be that the SIOCGIFCONF ioctl can in fact replace the /sys-based code, since the information returned should be the same. WDYT?
Can one of the admins verify this patch?
Hi @dmitrygx,
Thanks for your feedback. I've amended the patch following your suggestions, except one:
pls, use ucs_netif_ioctl instead
AFAICS, ucs_netif_ioctl is not applicable here because if_name would be NULL. However, I've changed this bit to use ucs_socket_create instead of socket.
Let me know what you think!
AFAICS, ucs_netif_ioctl is not applicable here because if_name would be NULL. However, I've changed this bit to use ucs_socket_create instead of socket.
@civodul yes, you're right
ok to test
Mellanox CI: FAILED on 25 of 25 workers (click for details)
Note: the logs will be deleted after 25-Nov-2019
Agent/Stage | Status |
---|---|
_main | :x: FAILURE |
hpc-arm-cavium-jenkins_W0 | :x: FAILURE |
hpc-arm-cavium-jenkins_W1 | :x: FAILURE |
hpc-arm-cavium-jenkins_W2 | :x: FAILURE |
hpc-arm-cavium-jenkins_W3 | :x: FAILURE |
hpc-arm-hwi-jenkins_W0 | :x: FAILURE |
hpc-arm-hwi-jenkins_W1 | :x: FAILURE |
hpc-arm-hwi-jenkins_W2 | :x: FAILURE |
hpc-arm-hwi-jenkins_W3 | :x: FAILURE |
hpc-test-node-gpu_W0 | :x: FAILURE |
hpc-test-node-gpu_W1 | :x: FAILURE |
hpc-test-node-gpu_W2 | :x: FAILURE |
hpc-test-node-gpu_W3 | :x: FAILURE |
hpc-test-node-legacy_W0 | :x: FAILURE |
hpc-test-node-legacy_W1 | :x: FAILURE |
hpc-test-node-legacy_W2 | :x: FAILURE |
hpc-test-node-legacy_W3 | :x: FAILURE |
hpc-test-node-new_W0 | :x: FAILURE |
hpc-test-node-new_W1 | :x: FAILURE |
hpc-test-node-new_W2 | :x: FAILURE |
hpc-test-node-new_W3 | :x: FAILURE |
r-vmb-ppc-jenkins_W0 | :x: FAILURE |
r-vmb-ppc-jenkins_W1 | :x: FAILURE |
r-vmb-ppc-jenkins_W2 | :x: FAILURE |
r-vmb-ppc-jenkins_W3 | :x: FAILURE |
Hi @dmitrygx,
Not sure I understand what the build failures are about. Let me know if you need anything else from me.
/scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/tcp/tcp_iface.c:646:5: error: passing argument 2 of ‘ucs_malloc’ makes pointer from integer without a cast [-Werror]
conf.ifc_req = ucs_malloc(1, conf.ifc_len, "ifreq");
^
In file included from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/ucs/sys/sys.h:19:0,
from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/base/uct_iface.h:21,
from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/tcp/tcp.h:10,
from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/tcp/tcp_iface.c:11:
/scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/ucs/debug/memtrack.h:102:7: note: expected ‘const char *’ but argument is of type ‘int’
void *ucs_malloc(size_t size, const char *name);
^
/scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/tcp/tcp_iface.c:646:5: error: too many arguments to function ‘ucs_malloc’
conf.ifc_req = ucs_malloc(1, conf.ifc_len, "ifreq");
^
In file included from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/ucs/sys/sys.h:19:0,
from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/base/uct_iface.h:21,
from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/tcp/tcp.h:10,
from /scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/uct/tcp/tcp_iface.c:11:
/scrap/jenkins/workspace/hpc-ucx-pr/label/hpc-arm-cavium-jenkins/worker/0/contrib/../src/ucs/debug/memtrack.h:102:7: note: declared here
void *ucs_malloc(size_t size, const char *name);
The following has to be changed from
conf.ifc_req = ucs_malloc(1, conf.ifc_len, "ifreq");
to
conf.ifc_req = ucs_calloc(1, conf.ifc_len, "ifreq");
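The difference between the two call shapes can be illustrated with a small stand-alone sketch. The wrappers below are hypothetical stand-ins built on libc, not the UCX functions: `tracked_malloc` mirrors the `ucs_malloc(size_t size, const char *name)` prototype shown in the compiler note, while the three-argument form of `tracked_calloc` is assumed from the corrected call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in with the same argument order as the ucs_malloc
 * prototype quoted in the build log: (size, name). */
void *tracked_malloc(size_t size, const char *name)
{
    (void)name; /* a real tracker would record the allocation name */
    return malloc(size);
}

/* Hypothetical stand-in whose argument order is assumed from the corrected
 * call: (nmemb, size, name). The memory is zero-initialized. */
void *tracked_calloc(size_t nmemb, size_t size, const char *name)
{
    (void)name;
    return calloc(nmemb, size);
}

/* Returns 1 if a buffer of `len` bytes obtained from tracked_calloc is
 * all zeros, 0 if not, and -1 if the allocation failed. */
int calloc_buffer_is_zeroed(size_t len)
{
    unsigned char *buf;
    size_t i;
    int zeroed = 1;

    assert(len > 0);
    buf = tracked_calloc(1, len, "ifreq");
    if (buf == NULL) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            zeroed = 0;
            break;
        }
    }
    free(buf);
    return zeroed;
}
```

Passing three arguments to the two-parameter function is what triggered both compiler errors; the calloc-style call takes the element count first and also hands back zeroed memory.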
Indeed... Done, thanks.
Mellanox CI: FAILED on 4 of 25 workers (click for details)
Note: the logs will be deleted after 26-Nov-2019
Agent/Stage | Status |
---|---|
_main | :x: FAILURE |
hpc-test-node-legacy_W0 | :x: FAILURE |
hpc-test-node-legacy_W2 | :x: FAILURE |
hpc-test-node-legacy_W3 | :x: FAILURE |
hpc-arm-cavium-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W3 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W0 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W1 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W2 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W3 | :heavy_check_mark: SUCCESS |
Hello! The messages I received from Mellanox's CI system show that the 3 test failures are about:
Fatal: transport error: Endpoint timeout
It's unclear to me how this could relate to this patch. Thoughts?
unrelated
bot:mlx:retest
Mellanox CI: FAILED on 2 of 25 workers (click for details)
Note: the logs will be deleted after 27-Nov-2019
Agent/Stage | Status |
---|---|
_main | :x: FAILURE |
hpc-test-node-legacy_W2 | :x: FAILURE |
hpc-arm-cavium-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W3 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W0 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W1 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W2 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W3 | :heavy_check_mark: SUCCESS |
bot:mlx:retest
Mellanox CI: FAILED on 3 of 25 workers (click for details)
Note: the logs will be deleted after 28-Nov-2019
Agent/Stage | Status |
---|---|
_main | :x: FAILURE |
hpc-test-node-legacy_W0 | :x: FAILURE |
hpc-test-node-legacy_W3 | :x: FAILURE |
hpc-arm-cavium-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W3 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W0 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W1 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W2 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W3 | :heavy_check_mark: SUCCESS |
infra issues bot:mlx:retest
Mellanox CI: PASSED on 25 workers (click for details)
Note: the logs will be deleted after 29-Nov-2019
Agent/Stage | Status |
---|---|
_main | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W3 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W0 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W1 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W2 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W3 | :heavy_check_mark: SUCCESS |
Mellanox CI: UNKNOWN on 17 workers (click for details)
Note: the logs will be deleted after 29-Nov-2019
Agent/Stage | Status |
---|---|
_main | :question: ABORTED |
r-vmb-ppc-jenkins_W0 | :question: ABORTED |
r-vmb-ppc-jenkins_W3 | :question: ABORTED |
hpc-arm-cavium-jenkins_W0 | :question: UNKNOWN |
hpc-arm-cavium-jenkins_W1 | :question: UNKNOWN |
hpc-arm-cavium-jenkins_W2 | :question: UNKNOWN |
hpc-arm-cavium-jenkins_W3 | :question: UNKNOWN |
hpc-test-node-gpu_W0 | :question: UNKNOWN |
hpc-test-node-gpu_W1 | :question: UNKNOWN |
hpc-test-node-gpu_W2 | :question: UNKNOWN |
hpc-test-node-gpu_W3 | :question: UNKNOWN |
hpc-test-node-new_W0 | :question: UNKNOWN |
hpc-test-node-new_W1 | :question: UNKNOWN |
hpc-test-node-new_W2 | :question: UNKNOWN |
hpc-test-node-new_W3 | :question: UNKNOWN |
r-vmb-ppc-jenkins_W1 | :question: UNKNOWN |
r-vmb-ppc-jenkins_W2 | :question: UNKNOWN |
Mellanox CI: FAILED on 2 of 25 workers (click for details)
Note: the logs will be deleted after 29-Nov-2019
Agent/Stage | Status |
---|---|
_main | :x: FAILURE |
hpc-test-node-gpu_W3 | :x: FAILURE |
hpc-arm-cavium-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W3 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W0 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W1 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W2 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W3 | :heavy_check_mark: SUCCESS |
Mellanox CI: FAILED on 2 of 25 workers (click for details)
[----------] 1 test from st/test_profile_perf
[ RUN ] st/test_profile_perf.overhead/0 <1>
[ INFO ] overhead: 51.7127 nsec
[ INFO ] overhead: 51.8367 nsec
[ INFO ] overhead: 51.7635 nsec
[ INFO ] overhead: 51.6434 nsec
[ INFO ] overhead: 51.8108 nsec
[ INFO ] overhead: 51.6247 nsec
[ INFO ] overhead: 51.668 nsec
[ INFO ] overhead: 51.7209 nsec
[ INFO ] overhead: 51.6302 nsec
[ INFO ] overhead: 51.7486 nsec
[ INFO ] overhead: 51.7306 nsec
/scrap/jenkins/workspace/hpc-ucx-pr-6/label/hpc-test-node-gpu/worker/3/contrib/../test/gtest/ucs/test_profile.cc:393: Failure
Expected: (overhead_nsec) < (EXP_OVERHEAD_NSEC), actual: 51.7306 vs 50
Profiling overhead is too high
[ FAILED ] st/test_profile_perf.overhead/0, where GetParam() = 1 (28584 ms)
[----------] 1 test from st/test_profile_perf (28584 ms total)
bot:retest
Mellanox CI: PASSED on 25 workers (click for details)
Note: the logs will be deleted after 29-Nov-2019
Agent/Stage | Status |
---|---|
_main | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-cavium-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W0 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W1 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W2 | :heavy_check_mark: SUCCESS |
hpc-arm-hwi-jenkins_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-gpu_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-legacy_W3 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W0 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W1 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W2 | :heavy_check_mark: SUCCESS |
hpc-test-node-new_W3 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W0 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W1 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W2 | :heavy_check_mark: SUCCESS |
r-vmb-ppc-jenkins_W3 | :heavy_check_mark: SUCCESS |
@civodul - I just wanted to check whether you have signed the CLA. What organization do you represent? Thanks!
@shamisp I haven't signed the CLA; where can I find it? Thanks in advance!
See https://www.openucx.org/license, "Contributor License Agreement"
Hi @yosefe,
As I understand it, this "Contributor License Agreement" equates to copyright assignment. The problem for me is that it fails to guarantee that my contributions will remain free software: UCX is currently distributed under the 3-clause BSD license, which is fine by me, but nothing in the CLA says that the "Copyright Holders" (capital letters) are committed to keeping code under that license. Is this correct?
Thanks, Ludo'.
@civodul Response from our legal counsel:
The UCF Contributor license agreement requires that Contributors license (not transfer) their copyrights to Copyright Holders and recipients of distributed software in order to allow UCF to freely license the specification contributions made by its Contributors. Subject to the scope of the definition of Contribution in the agreement.
If you have any additional questions, I can put you in touch with our legal counsel. Thanks!
adding WIP until CLA is signed