fix bucket_sorter bug affecting min_degree_ordering
removing an element from a stack does not always work, since the predecessor of a stack element is not (always) stored. this patch merges the array head into next, so a top element in a stack can point to the head of the stack it is in.
remarks:
- i am guessing the intended use (see test below), as it works like that for elements deeper down in a stack
- the fix abuses pointers/iterators and infers offsets from address differences. (yes it's a bit ugly.)
- memory needs and complexity are unaffected, size_type is probably big enough.
test case. B is a bucketsorter operating on a vector V.
V[0]=0; V[1]=1; B.push(0); B.push(1);
// now, stacks 0 and 1 are singletons. // try to move 0 to stack 1, should result in // head_1->0->1->end, but calls remove first, then
V[0]=1; B.update(0); // <- BOOM
// the update calls remove, it "removes" 0 from stack V[0]=1. // it's not there yet (!). // instead 1 (top of bucket 1) must die. // the result is head_1->0->end, and wrong.
need to redesign my patch.. don't know how to do it yet.
the value index datatype may be too small to represent twice the range. so actually the longer head array may be outside of the addressable range (if too many indexes are used). otoh adding an extra bit will affect run time.
at the same time i'ts unclear to me, why the addressed bug does not affect algorithms such
as min_degree_ordering. it does use remove.
I think the best thing is if you can make the simplest unit test that fails when it should pass according to the intended behaviour of the component.
On Wed, Jul 25, 2018 at 03:37:49AM -0700, Jeremy W. Murphy wrote:
I think the best thing is if you can make the simplest unit test that fails when it should pass according to the intended behaviour of the component.
Thank you.
We do have a unit test against the upstream header in treedec (former tdlib). It hits the issue, and it is flagged as fail [1]. It is pretty simple, but it might not be the simplest possible. See the comments in the code. Please modify and use it for any purpose whatsoever.
NB: treedec ships a modified header, which i think fixes the issue. similar tests [2,3] witness this.
Please let me know, if/how I can help any further.
[1] https://github.com/freetdi/tdlib/blob/develop/tests/bucket_sorter_2u.cpp [2] https://github.com/freetdi/tdlib/blob/develop/tests/bucket_sorter_1.cpp [3] https://github.com/freetdi/tdlib/blob/develop/tests/bucket_sorter_2.cpp
I think it's best if you put the relevant unit test into this PR. Whoever is maintaining Boost.Graph won't want to look around at code elsewhere on the web. Ideally (if I was the maintainer, which I'm not) I would want to see that a) the unit test is correct and that b) it fails on master but c) passes with the change to the algorithm.
On Wed, Jul 25, 2018 at 04:14:59AM -0700, Jeremy W. Murphy wrote:
I think it's best if you put the relevant unit test into this PR. Whoever is maintaining Boost.Graph won't want to look around at code elsewhere on the web. Ideally (if I was the maintainer, which I'm not) I would want to see that a) the unit test is correct and that b) it fails on master but c) passes with the change to the algorithm.
I understand that everybodys time is limited, and I consider doing what you suggest, once I get to it.
To determine whether the unit test is correct, somebody will have to read it in any case. so, in effect I cannot really save somebody elses time here, other than the time it took me to find the bug (go figure). I fold in contributions into several other projects, and I know how little time is saved by trying to convince contributors to do things differently. some can't write code, some cant use git, some don't understand what they do at all.
If you find that this bug is more important to you, go ahead, bring it to the attention to somebody who is willing to spend time on it, or fill the forms -- as you suggest. I will eventually, but it's not my priority, and it hasn't been for a while.
I do appreciate the reminder, in any case. I forgot about this one.
I'm helping out with the PR backlog. Looks like you have a bugfix, incomplete tests, and no new examples will be needed. Please make your PR complete. This is to let you know and help me prioritize PR's.
thanks. I commited the test from treedec mentioned above.
another bug in min_degree_ordering seems related to this. min_degree_ordering assert fails on some empty and on some complete graphs. see the test.
This may be tangled then. The PR needs to be clean and not affect other parts of the implementation. I'm going to hold off on messing with this until after the first two batches of merges.
On Fri, Aug 31, 2018 at 12:01:02PM -0700, jrmarsha wrote:
This may be tangled then. The PR needs to be clean and not affect other parts of the implementation. I'm going to hold off on messing with this until after the first two batches of merges.
the fix is unchanged and only touches include/boost/pending/bucket_sorter.hpp.
the min_degree_ordering test shows a side effect only.
There really shouldn't be unintended side effects. I'm not going to promote a PR that knowingly introduces regressions.
On Fri, Aug 31, 2018 at 12:09:16PM -0700, jrmarsha wrote:
There really shouldn't be unintended side effects. I'm not going to promote a PR that knowingly introduces regressions.
right now, boost::min_degree_ordering crashes on empty input graphs, or silently fails if NDEBUG is set.
with my fix it not only doesn't crash, but also produces a correct ordering. this is just extra evidence. i am not aware of any other side effects, but i didn't run min degree on all the other graphs yet.
it is clearly tricky. i did mention that sb actually will have to read the bucket sorter test case, and guess whether the stack corruption and the md crash is a bug or a feature.
Ah crud. I'm going to keep this out of the first two batches of merges. I need to sink time into PR.
@felix-salfelder thanks for adding the unit test.
Earlier, you wrote:
need to redesign my patch.. don't know how to do it yet.
the value index datatype may be too small to represent twice the range. so actually the longer head array may be outside of the addressable range (if too many indexes are used). otoh adding an extra bit will affect run time.
Is this still an open implementation issue for this patch? I.e., does it need some work before being merged? Bucket sort is not an overly complex algorithm in general: we could potentially make a new, simpler, generic implementation, depending on how tightly coupled the buckets are with how Boost.Graph is using it. (I have a little experience with these algorithms and a little spare time to work on open source projects at the moment.)
at the same time i'ts unclear to me, why the addressed bug does not affect algorithms such as min_degree_ordering. it does use
remove.
I guess the answer in the end was that it does affect it.
On Sat, Oct 13, 2018 at 08:49:58PM -0700, Jeremy W. Murphy wrote:
[..] need redesign [..]
Is this still an open implementation issue for this patch? I.e., does it need some work before being merged?
Thanks Jeremy for digging into this.
looking at it now.. trying to remember.
the patch embeds the head array into the next array to save several conditionals which would be all over the place. i thought there was a problem with integer arithmetic overflow. i just can't see it, perhaps it was in some other code.
somehow related, the patch introduces ugly assignments in the stack class such as
prev[new_head] = bucket_id + (head - next);
which could possibly be reformulated. does not seem essential.
note that prev is pointing to size_type (hard wired), which is unnecessary large in most applications. bucket_id can be an unsigned char. but this is not the overflow...
Bucket sort is not an overly complex algorithm in general: we could potentially make a new, simpler, generic implementation, depending on how tightly coupled the buckets are with how Boost.Graph is using it. (I have a little experience with these algorithms and a little spare time to work on open source projects at the moment.)
not sure. "generic", "simple" and "fast" are harder to achieve simultaneously. especially, the current interface is quite useful.
i expect that a new implementation will lead to similar choices and tradeoffs. i'd suggest to change the current code, if you can think of a better way. YMMV.
please, if you have time, consider testing this more, including the shorter integer types. some simple randomised testing would have easily revealed the original bug.
I guess the answer in the end was that it does affect it [graph::bmd].
a corner case, but yes.
I just ran the unit test and it returned "broken", which was unexpected. Is that what it's returning for you?
Hmmm, maybe just something to do with the way I compiled it manually, because when I run it as part of the Boost.Graph test suite, it passes. Kinda weird, though.
On Sun, Oct 14, 2018 at 03:02:09AM -0700, Jeremy W. Murphy wrote:
Hmmm, maybe just something to do with the way I compiled it manually, because when I run it as part of the Boost.Graph test suite, it passes. Kinda weird, though.
I only have this. never tried to use the test suite.
test$ g++ bucket_sorter_2u.cpp # uses system headers (old) test$ ./a.out | grep -e good -e broken broken test$ g++ bucket_sorter_2u.cpp -I../include test$ ./a.out | grep -e good -e broken good
I ticked the "Allow edits from maintainers" box. please feel free.
OK, yeah, I must have picked up the system headers by accident when I compiled it manually.
I ticked the "Allow edits from maintainers" box. please feel free.
I'm not a maintainer. :)
@felix-salfelder Could you rebase your PR? More CI tests are in devel, as well as other fixes.
This PR is against master, and should be against develop.
@felix-salfelder What is your status on this?
On Thu, Nov 01, 2018 at 07:36:12AM -0700, jrmarsha wrote:
@felix-salfelder What is your status on this?
tried to follow your instructions (rebase etc.). thought it worked.
(short on time.) what else can I do?
It really is just rebasing on boostorg/develop .
On Mon, Nov 05, 2018 at 01:53:14PM -0800, jrmarsha wrote:
It really is just rebasing on boostorg/develop .
rebased to f7dfafb5af0ca. was it the right one?
thanks!
use the devel tag, not a hash. It'll be good git practice (and lets me be lazy so I can buy a display cable).
On Mon, Nov 05, 2018 at 01:58:39PM -0800, jrmarsha wrote:
use the
develtag, not a hash.
all tags i can see are of the form
[..] boost-1.65.0 boost-1.65.1 boost-1.66.0 boost-1.67.0 boost-1.68.0
no 'devel' here, except for the branch pointing to f7dfafb5af0 on my end (?)
thanks
Well that's isn't right. I'll need you to walk me through how to re-create your setup. How are to making/pulling the graph repo, and how are you rebasing it? All of your explicit steps. You might want to add a gist of a sanitized bash history for me.
On Mon, Nov 05, 2018 at 02:05:12PM -0800, jrmarsha wrote:
Well that's isn't right. I'll need you to walk me through how to re-create your setup. How are to making/pulling the graph repo, and how are you rebasing it? All of your explicit steps. You might want to add a gist of a sanitized bash history for me.
i cloned from https://github.com/boostorg/graph
$ git clone https://github.com/boostorg/graph
this is now my "origin". then added my own fork (after clicking the fork button on the github page)
$ git remote add myownfork ssh://git@github/etcpp
in that repo i found a "devel" branch on which i have rebased the four commits.
$ git log -1 origin/develop |head -n1
commit f7dfafb5af0ca1fd0dd119377bc0acf78ed5a1fc
$ git checkout my_stuff
$ git rebase -i origin/develop
[.. select 4 commits ]
$ git push myownfork +my_stuff:bucket_sorter
hth felix
Not the way I would have gone about it, but it does seem to have done the job. Looks in mostly good shape. I'll give it a more thorough look over after the CI's complete.
I would have merged from boostorg/develop into your github repo. Nice web interface and all :P