Ignore return values of generators?
(REALLY great work on this package @lionel-, it's astonishingly good!)
I would expect this code:
library(coro)
gen <- generator(function() {
yield(1)
2
})
collect(gen())
to result in list(1), but it results in list(1, 2). The current behavior appears to be intentional, if test-generator.R is any indication.
It's causing problems for me in at least two usage patterns.
Early exit
library(coro)
g <- generator(function(items) {
if (length(items) < 3) {
# Do nothing
return()
}
for (i in items) {
yield(i)
}
})
collect(g(c(1,2)))
I want list(), but instead I get list(NULL). Currently I'm working around this by never returning early.
Side-effecty code after generation
library(coro)
yielded_values <- c()
g <- generator(function(items) {
for (i in items) {
yield(i)
}
yielded_values <<- c(yielded_values, items)
})
collect(g(c(1,2)))
I want list(1, 2), but instead I get list(1, 2, yielded_values). Currently I'm working around this by including if (FALSE) yield(NULL) as the last line of the generator function--that seems to prod the state machine into having the behavior I want.
hmm I don't remember exactly but I may have been confused by the meaning of {value: "something", done: true} returned from JS generators. I may have thought this was an inclusive boundary for the generator iterator, but testing in a for loop I now see it's an exclusive boundary, despite the value being returned to the caller.
Your examples are convincing and coro should be consistent with existing design in other languages to avoid surprises. So I think we should fix this.
@dfalbel do you think changing this is likely to cause issues for mlverse packages?
(Note that the early exit case seems like a bug in coro because returned NULL values should be transformed to the exhausted sentinel IIRC.)
I don't think this will cause issues with the mlverse usage of coro. We're not using generator, but only iterator functions built directly.
For the early returns, is this expected to only affect generators? I'm assuming we can still collect NULLs from eg:
iter = function() {
i <<- 0
function() {
i <<- i + 1
if (i == 1) {
return(1)
} else if (i == 2) {
return(NULL)
}
coro::exhausted()
}
}
coro::collect(iter())
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> NULL
@dfalbel Correct, this would only affect generators.
There's the question of what to do with returned values. Do we simply throw away the information and return the exhaustion sentinel instead? Ideally we'd return the value along with the exhaustion signal.
To do the latter I'll have to change the way exhaustion is currently signalled with a sentinel value (a special symbol created with exhausted()). We'd need to switch to the JS approach of a structured value with value and done fields. Packages and users are supposed to treat the exhaustion sentinel as an opaque value, so that shouldn't cause (too much) incompatibility.
We can start with throwing away the return value though.
Error handling is a tricky case. If we ignore return values this returns exhausted():
my_gen <- generator(function() {
tryCatch(
1 + "",
error = function(cnd) "handled"
)
})
So it looks like we need to allow yield() inside error handlers. This is also how it works in JS:
function* errorGenerator() {
try {
yield 'ok';
throw new Error('foo');
} catch (err) {
yield err.message;
}
}
This is quite a breaking change.