volk
volk copied to clipboard
index: Fix volk_get_index
This function results in an infinite loop on Debian 11 for some impls. This is a first step to fix it.
@mbr0wn , @marcusmueller does that fix the reported issue? If you could point me to a CI script for Debian 11 that we can integrate here, that'd be very helpful.
Fix #516
EDIT This is a minimal Debian 11 Container to test things.
FROM debian:bullseye
RUN apt-get update \
&& apt-get install -y \
build-essential git cmake python3-mako python3-distutils liborc-dev
GNU Radio CI container for Debian 11 where volk_profile
hangs on volk_32f_8u_polarbutterflypuppet_32f
.
The corresponding Dockerfile: Dockerfile.
Don't we need to handle -1 in the calling functions to avoid calling into random memory?
Don't we need to handle -1 in the calling functions to avoid calling into random memory?
+1 from me on this: looks like that's just in tmpl/volk.tmpl.c : 191
I fixed the "file local comments". Thus, we need to discuss how to deal with the -1
return value.
Why did we avoid to deal with it until now? We ended up in an infinite loop and thus terminated useful execution at this point.
Considering this, the actual bug report should be fixed now.
volk_get_index
is used in 2 functions:
-
lib/volk_rank_archs.c:volk_rank_archs
: return values propagate here. -
tmpl/volk.tmpl.c:186
, i.e. in all_manual
kernel calls.
volk_rank_archs
is used in "one" place.
-
tmpl/volk.tmpl.c:151-152
: kernel init. Thus, the first call to any kernel.
Both functions in tmpl/volk.tmpl.c
return void. We can't signal errors that way. We'd have to change the API.
We go back and forth between signed and unsigned variables in these functions. But that's a separate issue.
We could treat -1
in both functions as: use the first impl in the list and hope it is supported on your system.
I'd really like to raise a NotImplementedError
but that's obviously not an option either.
I know this is an unsatisfactory but we could ignore this problem for now and add it to the list of issues to fix for our next major release. Would that be a viable option?
I reverted the change where !
is replaced with != 0
. It caused issues with most systems except Ubuntu x86. I rebased to the latest main
as well.
I reverted the change where
!
is replaced with!= 0
. It caused issues with most systems except Ubuntu x86. I rebased to the latestmain
as well.
interesting ... I wonder why? Anyway, !
is fine
This PR fixes the direct issue, which is important. It does change the API unfortunately by now returning -1 where it didn't before. I guess as an interim we could instead have it return the first entry (if there is one) & print a warning noting as such. Of course if there is no entry then we'd want to return -1 ... so, does this help? Hmmm ...
This PR fixes the direct issue, which is important. It does change the API unfortunately by now returning -1 where it didn't before. I guess as an interim we could instead have it return the first entry (if there is one) & print a warning noting as such. Of course if there is no entry then we'd want to return -1 ... so, does this help? Hmmm ...
It sounds like every solution somehow breaks the API. However, at the moment the broken API won't work at all. After this change, it will. If we'd do another 2.x release, I'd argue this fix should go in there.
I added a message "Volk ERROR ...". If we'd return 0;
, we'd most probably cause an "illegal instruction error".
Besides, it might be time to add a Debian test. However, Debian 'bullseye' ships GCC 10. We already have a test with GCC 10. The question would be: Is this distro specific?
By the way, how do we figure out what breaks finding that generic
implementation? even if this PR was perfect in every way, it would not solve the underlying issue that leads to https://github.com/gnuradio/volk/issues/516 . And maybe fixing that would completely rewrite this part of code, anyway...
Updated the initial PR description. I started some tests with:
FROM debian:bullseye
RUN apt-get update \
&& apt-get install -y \
build-essential git cmake python3-mako python3-distutils liborc-dev
I couldn't find liborc-dev
and python3-distutils
in the GNU Radio CI Docker file.
liborc-dev
is essentially a dependency and should be available anyways.
python3-distutils
is only a build dependency. So it won't be installed with a binary.
Besides, I can't reproduce the issue in my minimal container with the current main
. Debian 11 ships VOLK 2.4.1 which I didn't test yet.
liborc-dev is essentially a dependency and should be available anyways.
liborc-dev should only be a build-time dependency, right?
I converted this PR to a draft. At the moment I doubt it really fixes the issue it is supposed to fix. We need a better understanding of the issue here.
This draft PR was introduced because of a bug that turned out to be outside of VOLK. I'm closing this PR now.