phobos
Added std.range.orElse which provides a default value for empty ranges
As mentioned on #5153, if an API method could potentially return an "undefined" response (aka null), there are a couple of strategies:
| Method | Uses in Phobos |
|---|---|
| return `Nullable` | no precedent in Phobos (?) |
| use `enforce` (aka exceptions) | generally disliked, because exceptions allocate (and everybody wants to be `@nogc` nowadays) |
| add an overload with a seed | `std.algorithm.iteration.maxElement` |
As the third way seems to be the current strategy (e.g. fold, reduce, ...), it creates a lot of "redundant" code. Here I propose to add a seed(<range>, <defaultValue>) method that can be combined with any range method to solve this problem. The gist is very simple:
```d
import std.range : iota;
assert(0.iota(10).seed(42).front == 0);
assert(0.iota.seed(42).front == 42);
```
The simplest way to add such an orElse method would be to combine `choose` and `only` like this:
```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : CommonType, Unqual;

auto orElse(Range, RangeElementType = ElementType!Range)(Range r, RangeElementType fallback)
if (isInputRange!(Unqual!Range) &&
    !is(CommonType!(ElementType!Range, RangeElementType) == void))
{
    import std.range : choose, only;
    return choose(!r.empty, r, only(fallback));
}
```
However this has a couple of problems:
- no custom type - e.g. `static if (!hasDefaultValue!Range) enforce(!r.empty)`
- `choose` doesn't work in CTFE (#14660)
- not fully `@safe` (it uses `union` and `void[]`)
- doesn't work with ranges of `const` elements
- doesn't have all `opSlice` overloads (e.g. infinite ranges)
- no `toString` for user convenience as that's generic
- would require a custom `only` that allows assignment, otherwise `hasAssignableElements` and `hasSwappableElements` won't work
While most of these points are solvable with e.g. template mixins, I am not sure about all of them and hence this PR contains an implementation of an OrElseRange.
Destroy this idea!
edit: renamed from seed to orElse
Please note that this comment is to continue a discussion here.
You're confusing a seed with a fallback. Seed is a term used when referring to reduction functions, e.g. max, sum, fold, reduce. The seed is the initial value of an accumulator. Technically, you can seed a reduction function by unconditionally prepending to the sequence of inputs - and this is important - but not by prepending only when the input is empty, as this implementation does.
Furthermore, though technically walkBack could be expressed as a reduction HOF, and this seed function would technically achieve your goal if the prepending was unconditional, and prepending to the input isn't a totally unreasonable approach to seeding a reduction function, it's important to understand that it's not an apt way to perceive what's going on.
A fallback is a default value returned when an operation received an input that would otherwise cause it to fail, as opposed to throwing an exception or (e.g. when the assert in walkBack is omitted) silently resulting in undefined behavior. Probably the best analogy is the `associativeArray.get(key, fallback)` method. Of course, the fallback is returned when no such key exists in the array. The fallback argument is not a seed in this case, and could not possibly be implemented as one.
My mach library implements a function similar to walkBack, accepting a fallback argument. It also defines a walkindex function, similar in purpose to walkLength and walkBack, which takes a range and an index. As you'd expect, it traverses the range until it hits that index, and it returns the element there. It optionally accepts a fallback. When the range is shorter than the given index, the fallback is returned rather than a fatal error being produced. This fallback also is not a seed, and could not possibly be implemented as one.
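For concreteness, the behavior described above could be sketched like this in D (the name `walkIndex` and its signature are illustrative assumptions, not the mach implementation):

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

// Hypothetical sketch: traverse the range up to `index` and return the
// element there, or `fallback` when the range is shorter than the index.
ElementType!Range walkIndex(Range)(Range r, size_t index, ElementType!Range fallback)
if (isInputRange!Range)
{
    foreach (i; 0 .. index)
    {
        if (r.empty) return fallback;
        r.popFront();
    }
    return r.empty ? fallback : r.front;
}

unittest
{
    import std.range : iota;
    assert(iota(0, 10).walkIndex(3, -1) == 3);  // element at index 3
    assert(iota(0, 2).walkIndex(5, -1) == -1);  // range too short: fallback
}
```

Note how the fallback is only consulted when the traversal runs out of elements; it never participates in the iteration itself, which is exactly why it cannot be expressed as a seed.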
However - this isn't necessarily an argument against including a well-implemented function serving this purpose. (Though I don't think seed is an appropriate name for what it does) Rather, this is specifically an argument against using a function such as this one rather than simply having an optional fallback argument for walkBack.
The gist is very simple:
```d
import std.range : iota;
assert(0.iota(10).seed(42).front == 0);
assert(0.iota.seed(42).front == 42);
```
This cries to me to call it "fallback" or "orElse" because that is what it does. "seed" doesn't tell me anything - all I can think of is random number generators.
This cries to me to call it "fallback" or "orElse" because that is what it does.
I really like the name orElse!
(renamed and moved to std.range)
I really like the name orElse!
Shamelessly stolen from Scala Option ;) Now imagine if we make Nullable follow range interface this all can tick together very nicely.
Phobos-idiomatic D code is going to be a language for the initiated. We need to think about rite of Passage
Phobos-idiomatic D code is going to be a language for the initiated. We need to think about rite of Passage
Something like that: a candidate should repent for all accidents with Go. Then he should bring the bones of three Scala engineers and read Stroustrup's The C++ Programming Language all night without rest. Then we will examine this person about Phobos internals
One more thing: orElse should also accept a full range that is returned if the first range is empty.
From the NG:
Yep. Additions to std.range like orElse will make idiomatic Phobos code slower than C++ and Scala. It is not clear when you do a benchmark for a single implementation, but idiomatic combinations of D Ranges will become slow.
To bring this a bit into context: mir-algorithm uses C++-like iterators to avoid this - an example:
```d
IotaIterator!int iota;
RetroIterator!(IotaIterator!int) retro;
++iota;
--retro;
assert(*retro == *iota);
```
http://docs.algorithm.dlang.io/latest/mir_ndslice_iterator.html
- The main `orElse` performance problem is that D does not have virtual function calls for structs like C++ does. With virtual calls we can avoid `if` conditions.
- `chain`'s performance problem is an `if` condition too. It can be solved by introducing a generic analog of `opApply`. It will affect basic Dlang iteration idioms like `chain(...).until`, `chain(...).array` and `chain(...).filter`.
- `bitwise`'s problem is that its implementation for the random access range and input range APIs is the same. Two separate implementations would be faster, a few times smaller and clearer (plus fewer bugs). Recently `bitpack` (for few-bit integers) was added into mir-algorithm. It was very easy to add and to test. Both `bitpack` and `bitwise` require only two methods to implement: `opIndex` and `opIndexAssign`. That is all, 31 LOC and 42 LOC respectively. A type with `opIndex` is called a field; on top of a field a random access iterator (in terms of C++) can be created using the `FieldIterator` wrapper. On top of any random access iterator (including pointers) a `Slice` type can be created that implements all random access range primitives, including multidimensional features. This approach simplifies random access range implementations several times over. For already implemented features please see the ndslice scheme. Please note that the input range API for `bitwise` is buggy because random number generators should have a special `this(this)` implementation. As an input range alternative to `bitwise` for RNGs, `Bernoulli2Variable` can be used.
The main `orElse` performance problem is that D does not have virtual function calls for structs like C++ does. With virtual calls we can avoid `if` conditions.
I find this claim surprising. Replacing a virtual call having 2-7 possible landing sites with a switch or cascade of if/else statements is a well known and venerable optimization, which to my knowledge has been first documented by a few seminal papers in the 90s, notably:
- "Profile-Guided Receiver Class Prediction" by Grove et al (http://www.research.ibm.com/people/d/dgrove/papers/oopsla95.pdf)
- "Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis" by Dean et al (http://web.cs.ucla.edu/~palsberg/tba/papers/dean-grove-chambers-ecoop95.pdf)
The optimization, called "guarded devirtualization" or "type casing", is commonly taught in graduate classes in compiler design, and it essentially replaces the code that you claim is fast with the code that you claim is slow. The win comes from replacing indirect calls (not inlinable and difficult to speculate) with branches and regular calls (easy to inline and speculate).
Of course, this technique, although considered canon, is of venerable age and may have become stale due to recent advances in CPU technology. However the entire HHVM team has used it in recent times systematically in their JIT PHP compiler implemented in C++. Replacing virtual calls with typecased switch statements has always been fruitful, and can be seen everywhere in the code. (I also found an easy to digest recent independent benchmark: http://dimitri-christodoulou.blogspot.com/2011/09/case-for-replacing-polymorphism-with.html.)
For chain and orElse in particular, we're looking at a branch where speculation may fail only once or (respectively) never. Replacing these with indirect calls and obtaining a performance improvement would be a very remarkable result. I'd be in your debt if you could produce such an experiment.
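To illustrate the type-casing technique being discussed, here is a minimal D sketch (the `Shape`/`Square`/`Rect` names are hypothetical, chosen purely for the example):

```d
// Guarded devirtualization / type casing: replace an indirect interface
// call with a type test plus direct (inlinable) calls.
interface Shape { int area(); }

final class Square : Shape
{
    int s;
    this(int s) { this.s = s; }
    int area() { return s * s; }
}

final class Rect : Shape
{
    int w, h;
    this(int w, int h) { this.w = w; this.h = h; }
    int area() { return w * h; }
}

int areaVirtual(Shape sh) { return sh.area(); } // one indirect call

int areaTypecased(Shape sh)
{
    if (auto sq = cast(Square) sh) return sq.area(); // direct call, easy to inline
    if (auto r = cast(Rect) sh) return r.area();     // direct call
    return sh.area();                                // fallback: indirect call
}

unittest
{
    assert(areaVirtual(new Square(3)) == 9);
    assert(areaTypecased(new Rect(2, 5)) == 10);
}
```

The branches in `areaTypecased` are cheap to predict, and each direct call site can be inlined, which is the core of the argument above.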
Regarding bitwise, any improvements would be welcome! I recall you submitted some and/or promised more. Please advise. Thanks!
The main orElse performance problem is that D does not have virtual function calls for structs like C++ does. With virtual calls we can avoid if conditions. I find this claim surprising. Replacing a virtual call having 2-7 possible landing sites with a switch or cascade of if/else statements is a well known and venerable optimization, which to my knowledge has been first documented by a few seminal papers in the 90s, notably:
"Profile-Guided Receiver Class Prediction" by Grove et al (http://www.research.ibm.com/people/d/dgrove/papers/oopsla95.pdf) "Optimization of Object-Oriented Programs Using Static Class Hierarchy Analysis" by Dean et al (http://web.cs.ucla.edu/~palsberg/tba/papers/dean-grove-chambers-ecoop95.pdf) The optimization called "guarded devirtualization" or "type casing" is commonly taught in graduate classes in compiler design and replaces essentially code that you claim is fast, with code that you claim is slow. The win comes from replacing indirect calls (not inlinable and difficult to speculate) with branches and regular calls (easy to inline and speculate).
Of course, this technique, although considered canon, is of venerable age and may have become stale due to recent advances in CPU technology. However the entire HHVM team has used it in recent times systematically in their JIT PHP compiler implemented in C++. Replacing virtual calls with typecased switch statements has always been fruitful, and can be seen everywhere in the code. (I also found an easy to digest recent independent benchmark: http://dimitri-christodoulou.blogspot.com/2011/09/case-for-replacing-polymorphism-with.html.)
For simple benchmarks orElse may be faster than a vtable. But it adds if conditions while vtables just replace pointers. For example, a complex pipeline based on chain, choose, chooseAmong and orElse will become slower with each pipeline addition. On the other hand, vtables plus a special iteration pattern for chain allow generating fast and tiny code for any kind of function pipeline.
The advantage of devirtualisation is that it is done by the optimiser after the virtual tables have been constructed.
For chain and orElse in particular, we're looking at a branch where speculation may fail only once or (respectively) never. Replacing these with indirect calls and obtaining a performance improvement would be a very remarkable result. I'd be in your debt if you could produce such an experiment.
The stack function was added to mir-algorithm today. It is a multidimensional chain analog for ndslice.
```d
import mir.ndslice.allocation : slice;
import mir.ndslice.topology : iota;

// 0, 1, 2
// 3, 4, 5
auto a = iota(2, 3);
// 0, 1
// 2, 3
auto b = iota(2, 2);
// 0, 1, 2, 3, 4
auto c = iota(1, 5);

// 0, 1, 2, | 0, 1
// 3, 4, 5, | 2, 3
// ---------------
// 0, 1, 2, 3, 4
// construction phase
auto s = stack(stack!1(a, b), c);
// allocation phase
auto d = s.slice;
assert(d == [
    [0, 1, 2, 0, 1],
    [3, 4, 5, 2, 3],
    [0, 1, 2, 3, 4],
]);
```
To work with stack a special until function was added. Its idea is similar to the opApply concept. If a predicate always returns false then until just iterates all elements in a stack. stack + until does not require additional conditions to iterate all elements, compared with chain + the Range API.
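The internal-iteration idea can be sketched as follows (a hypothetical two-array chain with the element type fixed to int for brevity; this is not mir's implementation):

```d
// opApply-style internal iteration: each sub-range runs in its own tight
// loop, so there is no per-element "which sub-range am I in?" branch as
// with the external range API of chain.
struct Chain2
{
    int[] a, b;

    int opApply(scope int delegate(ref int) dg)
    {
        foreach (ref e; a) if (auto r = dg(e)) return r;
        foreach (ref e; b) if (auto r = dg(e)) return r;
        return 0;
    }
}

unittest
{
    int sum;
    foreach (e; Chain2([1, 2], [3]))
        sum += e;
    assert(sum == 6);
}
```

Because the producer drives the loop, the cost of choosing a sub-range is paid once per sub-range rather than once per element.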
The simplest performance optimisations that can be added to Phobos are array, copy and put specialisations for chain.
Regarding bitwise, any improvements would be welcome! I recall you submitted some and/or promised more. Please advise. Thanks!
I think that we should move forward with a single random access range declaration (ndslice) and have separate implementations for random access ranges and non-random access ranges. I propose to freeze Phobos additions and only add improvements and bugfixes. My other thoughts about Phobos you already know :-)
For simple benchmarks orElse may be faster than a vtable. But it adds if conditions while vtables just replace pointers. For example, a complex pipeline based on chain, choose, chooseAmong and orElse will become slower with each pipeline addition. On the other hand, vtables plus a special iteration pattern for chain allow generating fast and tiny code for any kind of function pipeline.
That doesn't sound right. Composition with virtual functions (Java-style) or indirect calls (Haskell style) adds one extra indirect call per composition, which are notoriously difficult to optimize.
Do you have any evidence to support the claim that you can compose freely with virtuals and have them all gone by the time the code is run?
To work with stack special until function was added. Its idea is similar to opApply concept. If a predicate always returns false then until just iterates all elements in a stack. stack + until does not require additional conditions to iterate all elements comparing with chain + Range API.
But that's a different design than what you discussed previously.
The simplest performance optimisations that can be added to Phobos are array, copy and put specialisations for chain.
Yah, range methods that specialize certain generic functions would be good to add. I recall you did have a pull request to that effect, what came of it?
That doesn't sound right. Composition with virtual functions (Java-style) or indirect calls (Haskell style) adds one extra indirect call per composition, which are notoriously difficult to optimize.
I mean D interface style. choose will just return one of its arguments for example, so it will add nothing.
Do you have any evidence to support the claim that you can compose freely with virtuals and have them all gone by the time the code is run?
Nope, we can not remove all virtual calls without an advanced JIT, but we can reduce a pipeline of virtual calls to a single one in an expression like `choose(cond1, a, b).orElse(c).chooseAmong(cond2, d, e, f)`. The result will be just one of a, b, c, d, e, or f. But all ranges would have to be interfaces/classes (in D).
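A minimal sketch of that idea (the `IntRange` interface and `ArrayRange` class are hypothetical illustrations, not an existing API):

```d
// When ranges are classes behind a common interface, a choose-like
// function can return one of its arguments directly: the pipeline
// collapses to a single object instead of nested wrapper structs,
// at the cost of one indirect call per range primitive.
interface IntRange
{
    bool empty();
    int front();
    void popFront();
}

final class ArrayRange : IntRange
{
    int[] data;
    this(int[] d) { data = d; }
    bool empty() { return data.length == 0; }
    int front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

IntRange chooseRef(bool cond, IntRange a, IntRange b)
{
    return cond ? a : b; // no wrapper type, no extra per-element branch
}

unittest
{
    auto r = chooseRef(true, new ArrayRange([1, 2]), new ArrayRange([3]));
    assert(r.front == 1);
}
```

Chaining several such combinators still yields one of the original objects, so only a single vtable dispatch remains regardless of pipeline depth.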
I recall you did have a pull request to that effect, what came of it?
The PR was for a map specialisation. I implemented our ideas for ndslice iterators and fields.
To work with stack a special until function was added. Its idea is similar to the opApply concept. If a predicate always returns false then until just iterates all elements in a stack. stack + until does not require additional conditions to iterate all elements, compared with chain + the Range API.
But that's a different design than what you discussed previously.
Citation: chain performance problem is if condition too. It can be solved by introducing a generic analog of opApply
@9il yah the relative tradeoffs of internal and external iteration are well trodden ground. A good read: http://gafter.blogspot.com/2007/07/internal-versus-external-iterators.html (and of course the GoF book).
@wilzbach phobos uses external iteration, but you'd do good here to improve some cases by having orElse implement each like this:
```d
void each(alias fun)()
{
    if (useElse) fun(other);
    else source.each!fun;
}
```
This composes properly and makes for a simple pass through everything. Thx!
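A self-contained sketch of that suggestion (the `OrElse` field names are illustrative, not the PR's actual code; `foreach` stands in for `source.each!fun` to keep the example free of name clashes):

```d
// Sketch of an orElse range with an internal-iteration `each` member:
// either the single fallback value is visited, or the wrapped range
// is iterated in a tight loop with no per-element branch.
struct OrElse(Range, T)
{
    Range source;
    T other;
    bool useElse; // true when the source range was empty

    void each(alias fun)()
    {
        if (useElse) fun(other);
        else foreach (ref e; source) fun(e);
    }
}

unittest
{
    int[] sink;
    auto r = OrElse!(int[], int)([], 42, true);
    r.each!(x => sink ~= x);
    assert(sink == [42]);
}
```

Generic consumers that prefer `each` over manual `empty`/`front`/`popFront` then pay for the fallback check exactly once, not on every element.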
ping @wilzbach
I wonder whether or not a coalesce function would be more useful.
@trikko what would be the semantics of coalesce? thx!
Just like in SQL. coalesce(a, b, c, d, ...) selects the first non-null. I would select a.front if a is a range, or only(a).front if not.
(we can force the first argument to be a range, to static check types of other args)
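A possible sketch of such a coalesce (restricted to variadic ranges of the same element type; purely an illustration of the proposed semantics):

```d
import std.range.primitives : empty, front;

// SQL-style coalesce over ranges: return the front of the first
// non-empty range; fail if all of them are empty.
auto coalesce(Ranges...)(Ranges ranges)
if (Ranges.length >= 1)
{
    foreach (r; ranges)              // unrolled at compile time over the tuple
        if (!r.empty) return r.front;
    assert(0, "all ranges were empty");
}

unittest
{
    int[] a = [];
    int[] b = [7, 8];
    assert(coalesce(a, b) == 7); // b is the first non-empty range
}
```

Since the loop over the argument tuple is unrolled at compile time, each call site degenerates to a short cascade of emptiness checks.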
Isn't this rather like chain?
@klickverbot it's a conditional chain
More like chain(…).front with automatic application of only I suppose, yes. (I was referring to the more general coalesce; trying to figure out where the design sweet spot in terms of simplicity versus functionality lies.)
Isn't coalesce just an orElse with more than two args?
I'm certainly no expert in D, but is there too much wrong with the following implementation of orElse?:
```d
auto orElse(Range, T)(Range r, T defaultValue)
if (isInputRange!Range && is(T : ElementType!Range))
{
    return r.empty ? r.chain([defaultValue]) : r.chain((T[]).init);
}
```
But you lose a few things so maybe that's not a viable alternative. From the unittests you defined, the following fails:
- slicing doesn't work on the returned orElse range, so your tests on infinite ranges fail with `Error: no [] operator overload for type Result`
- your `hasSwappableElements`, `hasLvalueElements`, and `hasAssignableElements` tests fail
- the `@nogc` tests fail
So chain doesn't seem to give you a range that's similar to what you put in. The docs say "If all input ranges offer random access and length, Chain offers them as well."
Maybe an alternative is to have orElse implemented with chain, and "fix" chain to offer opSlice if all input ranges have slicing, etc? Also in the implementation of chain I see an attempt to handle hasSwappableElements but also a comment that says @@@BUG@@@ (not sure what the bug is though)
PS: first post here for me in D's github, so I claim the right to ignorance on D's internals and open source directions in case the above is just way too silly ;)
Edit:
Wait, or isn't this just: `choose(r.empty, [defaultValue], r)`?
Thanks for your pull request, @wilzbach!
Bugzilla references
Your PR doesn't reference any Bugzilla issue.
If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
```
dub fetch digger
dub run digger -- build "master + phobos#5154"
```