[Feature] Sequence.first
I really like the fluent sequence api that wren has. But I feel like a method to extract a single item from a sequence is missing.
Possible implementation
var lst = [6, 7, 8, 9]
lst.first // 6
lst.skip(1).first // 7
lst.first {|n| n > 7 } // 8
lst.first {|n| n > 10 } // null
lst.skip(4).first // null
first {
for (element in this) {
return element
}
return null
}
first(f) {
for (element in this) {
if (f.call(element)) {
return element
}
}
return null
}
Do people think this would be a good addition?
I've often thought about it, the interesting question is the lack of symmetry with last, wouldn't it feel weird without?
Yeah. While it's much less common, last would be nice too.
The implementation is a bit more tricky though. For a sequence that can be indexed like List, you would want to prefer that to get the value more efficiently.
I would say no to that addition. We already have sequenceable[index] and
sequenceable.where(unaryPredicate)[index] with index 0 and -1 to achieve
the same results.
Also if first was added, what about second, third...
Also that naming implies ordering which is not implied by the proposed
implementation (which would produce "invalid" results for maps).
Ah okay, the sequence[index] meets what I want, just to get a single item from a sequence. Is it just not added yet then? I'm not sure exactly what I'm looking for, but I couldnt see any PRs.
I just tested in wren_cli its not in 0.4.0.
Sequences aren't in general indexable (for example Range isn't) so you have to convert to a list first:
var lst = [6, 7, 8, 9]
var first = lst.where { |e| e > 6 }.toList[0]
System.print(first) // 7
The trouble with this is that it's not very efficient. You have to iterate through the entire sequence to find out which elements satisfy the predicate, add those to a new list and then extract the first one.
Fot this reason, I would be in favor of your proposal as you only need to iterate the sequence until you find an element which satisfies the predicate and then return that directly.
Out of curiosity, what is your use case?
The real general case would be to use sequence.iteratorValue(sequence.iterate(null)).
Unless I'm mistaken, there is no guarantee that it will output a predicable output (in particular since maps can/is nordered). So it defeat the point in the name choice.
Also if
firstwas added, what aboutsecond,third... Also that naming implies ordering which is not implied by the proposed implementation (which would produce "invalid" results for maps).
Although you have a point about 'first' implying ordering, in my experience languages which have similar fluent APIs to Wren (C#, Kotlin etc.) do still use the word 'first' in the sense that it returns the first element in the iteration which satisfies the predicate. How useful this is will depend on the nature of the sequence though I agree that it may be useless for maps whose iteration order is undefined unless you simply want to extract some random key/value pair which satisfies the predicate.
Also those languages do generally have a 'last' method as well though I can't think of one which has second, third etc. Although you'd need to iterate through the whole sequence to find the last element it would still be more efficient than 'where' because you wouldn't need to do .toList[-1] to extract it.
Possible implementations for Wren:
last {
var final = null
for (element in this) {
final = element
}
return final
}
last(a, f) {
var final = null
for (element in this) {
if (f.call(element)) {
final = element
}
}
return final
}
In C++, it's called front/back instead of first/last.
@PureFox48 These implementations are O(N). O(1) can be achieved in most case with a reverse().first(), since most reverse() could be implemented lazily. It would also help to guarantee that we don't enter an infinite loop for non terminated Sequences.
Yes, I wondered about that - I have a reverse iterator in my own modules, though we don't have one as yet in the core library.
front/back would be reasonable alternatives to first/last in the overloads which don't have a predicate (more or less the C++ usage) but I'm not sure they sound quite right where there is a predicate.
first(unaryPredicate) really sounds like find(unaryPredicate) returning a value instead of an iterator. So maybe it should be something like findValue(unaryPredicate) instead?
I'd avoided suggesting find remembering that you have a PR (#898) open for Sequence.find to return an iterator.
But we could perhaps use findFirst and findLast to return the first and last values.
first(predicate) is useless, it's just where(predicate).first
You could implement it like that but it would produce an intermediate WhereSequence which @Deijin27's suggested implementation does not do.
On the other hand it would add less code to core and front/back would be fine.
I don't mind what it's called, it's the functionality that matters.
A second or third wouldn't be necessary, as you can combine it with skip(1) skip(2) for this rare case.
I've had a think about my use case, and general use cases in any language First I'll give some common ones
// if a matching item is found, handle it, otherwise return
// handling it in the loop often leads to pyramid code, so it's common to have
// the 'failures' return, and the 'successful' path be the last thing in the method
// if you need a loop, even the where feels unnecessary when it could be replaced with an if
// the example further down is probably a better case to convey why you might
// only want or expect a single matching item
// but it's also reasonable that one might want any singlular Frog
var foundFrog = null
for (frog in amphibians.where {|amphibian| amphibian is Frog }) {
foundFrog = frog
break
}
if (foundFrog == null) return
handleFrog(firstFrog)
var firstFrog = amphibians.first {|amphibian| amphibian is Frog }
if (foundFrog == null) return
handleFrog(firstFrog)
// you have a list which could be empty, you want
// a single item, in this example to be a default "selected" item
// the indexer throws an exception if the list is empty,
// so you have less simple code
var selectedAmphibian = null
if (amphibians.count > 0) {
selectedAmphibian = amphibians[0]
}
var selectedAmphibian = amphibians.first
// find an item, where a property is generally unique to an item
// but it's a property on that item, rather than being enforced via
// a map. Or maybe it's multiple items that are unique as a whole.
// Consider an collection with firstName and surname
var people = [
Person.new("Kermit", "TheFrog"),
Person.new("Froggy", "McFrogFace")
]
var kermitTheFrog = null
for (person in people.where {|x| x.firstName == "Kermit" && x.surname == "TheFrog"}) {
kermitTheFrog = person
break
}
var kermitTheFrog = people.first {|x| x.firstName == "Kermit" && x.surname == "TheFrog" }
A specific case I have in wren is my xml library. In the actual implementation of various methods, I use the approach of returning the first value in a loop.
#doc = "The first and only node in this document that is an XElement. null if there is no XElement in the document."
root {
for (node in nodes) {
if (node is XElement) {
return node
}
}
return null
}
And from a user's point of view, an xml element has child nodes that are elements or comments, and thus the elements property is always a WhereSequence to filter out the comments. When processing you often have similar cases to the kermit example, since xml is commonly used for serialization of such data.
// example from readme. Finding a the color of a specific fish in the xml document
var colorOfFishCalledPearl = doc
.root
.elements("danio")
.where {|e| e.attribute("name").value == "pearl" }
.toList[0]
.attribute("color")
.value
I am considering adding various methods that take a function argument, but that wouldn't be necessary wren had the first method; that combined with the already existing where and map would be perfect, and my library would nicely slot into wren's infrastructure.
The real question is why do you need to handle only the first entry. I mean when you are dealing with a list of item, either:
- one treat the full list in place.
- one extract an entry, to treat exhaust the list.
- one choose an entry from it, so the first one become some default. And because peeking the first entry is not stable, since it depends of what the container implementation is, it is not a predictable source, unless it is indexed by any form of unique key.
@mhermier It exists in many APIs for a reason. Sometimes, you just want an item in a sequence and you pick the first to come.
For me, instead it should be name more something like "peekValue" (that throw) with a compagnon "peekValueOr" (that has a default value), since it removes the notion of position in the naming.
Well, a Sequence, by definition, has an order: the order in which you can iterate over the elements. So the "first" is well defined: it's the first element in the sequence.
It has an intrinsic order which is necessary for traversal, not a stable order in the sense that inserting objects in some order guarantee the traversal order in every Sequence species.
I think the examples I gave show how its useful, and they are examples with ordered sequences. Lists or filtered lists. I dont think I've ever used it on the equivalent of a Map in other programming languages, its just there by coincidence.
Skip is comparable to this. Since the Map's enumeration order is not consistent, skip is not useful for it.
It is true, the API is somehow broken/inconsistent in some aspects. It doesn't take into account some Sequences behaviors like infinite size, ordering and probably some others I don't think of right now.
When most programmers see the word peek they probably think about obtaining the first element of some list structure (typically a stack or queue) without removing it. So I think it would be a good word to use here rather than first or front to avoid any ordering connotation.
I see no reason to use peekValue (what else can we be peeking?) or peekValueOr - we can just use peek || default for that.
If we also adopt @jube's idea of using where to do any filtering, there's no need to have a separate overload which takes a predicate. So peek is only going to add a few lines to core and would be a worthwhile addition IMO.
Another advantage of peek is that we needn't bother with adding last/back to maintain symmetry. These can only be implemented efficiently in Wren if we have a reverse iterator which would have to be done on the C side to cover non-indexed sequences properly. I doubt whether it's worth the effort as, in practice, most sequences are indexed and people would just use for (i in seq.count-1..0) or simply reverse them with seq[-1..0] and iterate through that. The last element itself would, of course, be seq[-1].
I proposed peekValue because I thought it could returns an iterator. But I agree peek make sense.
About peekOr I insist on it. We need to be able to distinguish between a peek that failed and potential valid falsy value that could have been returned by the Sequence. It should probably be discussed in another issue, but I really think we need undefined and operator ?: or alike, to mimic null optional. It would simplify a bunch of problem like these.
Well the only way that peek can fail is if the sequence is empty in which case it will return null.
We could check explicitly for that with (peek == null) ? default : peek but I'll grant you that the first element could exist and have a null value and so this isn't a perfect solution.
There are doubtless other methods in core where returning null is ambiguous but, if we start having Or methods for all of them, that may add considerable bloat.
As you suggest, a better solution would be to return an undefined value which can be checked for to resolve any ambiguity - I believe there already is one which is used internally but is not exposed publicly so we could perhaps use that. But that's obviously quite a far reaching change so I think it would be best to open a new issue for that with your preliminary thoughts.
Incidentally, a simple way to deal with undefined values would be to have a convention whereby you returned some obscure control code which would never be used in practice these days to denote a failure. A possible candidate would be NAK ('Negative Acknowledgement') which is "\x15" and, if one goes way back, was actually known as ERR at one time.
A more OO solution which has just occurred to me would be to add a small singleton class to core:
class Undefined {
// Returns the singleton. If it hasn't been created, creates it first.
static value { __instance == null ? __instance = Undefined.new_() : __instance }
// Private constructor.
construct new_() {}
}
You could then return Undefined.value instead of null to signify an error and test for it with returnValue is Undefined. So instead of peekOr we could do (peek is Undefined) ? default : peek.
undefined exist in the VM. It is not exposed to the public, and serve as a tombstone value for the map implementation. It is only a matter of political decision to expose it.
Well, if there's no good reason why it can't be exposed, then I think that would be the best solution.
undefined would need to become a keyword and so the change would not be backwards compatible but I think that would be a price worth paying.
Technically and analogous to null, undefined would be the only instance of an Undefined class in core and (along with null and false) would be regarded as 'falsey'.
I see what you're getting at with there needing to be a way to distinguish between a not found and a found null value.
I just wanted to say if you were doing the singleton value approach, you just use the class itself.
class Undefined {}
peek {
for (value in this) {
return value
}
return Undefined
}