Indefinite iterable support
While it is currently possible to create objects that support the foreach statement by implementing .len and [], some objects that would typically be iterated through may not have a definite value to return for .len or [].
Having the following as standard interfaces:
module iter { Type1 };
interface Iterator {
fn bool next(Type1* out);
}
interface Iterable {
fn Iterator{Type1} new_iterator(Allocator a);
fn Iterator{Type1} tnew_iterator();
}
macro Iterator.@foreach(&self; @block(elem)) {
Type1 elem;
while(self.next(&elem)) {
@block(elem);
}
}
macro Iterable.@foreach(&self; @block(elem)) {
@pool() {
Iterator iter = self.tnew_iterator();
Type1 elem;
while(iter.next(&elem)) {
@block(elem);
}
};
}
This would open the door to "generator" types, for example a sequence of all prime numbers up to a specific value:
// new and free not shown for simplicity
struct PrimeIterator (Iterator{uint}) {
// initialized to 2
uint current;
uint limit;
List { uint }* seen;
}
fn bool PrimeIterator.next(&self, uint* out) @dynamic {
while(self.current < self.limit) {
bool is_prime = true;
foreach(i, v : self.seen) {
if(self.current % v != 0) { continue; }
is_prime = false;
break;
}
if(!is_prime) {
self.current += 2; // Always go to next odd number
continue;
}
*out = self.current;
if(self.current == 2) { self.current += 1; } // special case to go from 2 to 3
else { self.current += 2; } // can iterate only odds after 3
return true;
}
return false;
}
It would also ensure that types like linked list could iterate their nodes in a more performant way, since they would be able to step node to node, rather than needing to do a full [n] lookup through all n nodes.
See rust std::iter https://doc.rust-lang.org/std/iter/ for a similar feature
They use optionals instead of Bool + copy, and have
iter() which iterates over references,
iter_mut() which iterates over mutable references,
into_iter() which iterates over values
We could drop iter_mut() since rust mutability/immutability doesn't really directly translate to c3, maybe something along the lines of this is more suitable. Though the iter_ref functions are giving me some trouble with compilation:
module iter { Type1 };
interface Iterator {
fn Type1? next();
}
interface Iterable {
fn Iterator{Type1} new_iter(Allocator a);
fn Iterator{Type1} tnew_iter();
// Compiler error: Generic resolution of this type has become
// deeply nested, it was aborted after reaching 32 recursions
fn Iterator{Type1*} new_iter_ref(Allocator a);
fn Iterator{Type1*} tnew_iter_ref();
}
you can already use an optional and something like
SomeIterator irer = /*...*/;
while (try value = iter.next())
{
/*do something*/
}
where iter.next is fn SomeValue? SomeIterator.next(&self)
but an interface of
Iterator { Type }
that provides the .next function might be useful
Yep, I was implementing based off the C# IEnumerable/IEnumerator which uses a bool return, but I do think that the optional route is probably better.
Having the std collection types implement a shared interface for generating iterators would enable C# linq like functions that act as lazy collection evaluators:
module iter::where { Type1 };
import iter::iterator;
alias PredicateFn = fn bool(Type1);
struct WhereIterator (Iterator{Type1}) {
Iterator{Type1}* parent;
PredicateFn predicate;
}
fn Iterator{Type1} Iterator{Type1}.where(&self, PredicateFn predicate) {
WhereIterator* out = allocator::new(tmem, WhereIterator);
out.parent = self;
out.predicate = predicate;
return out;
}
fn Type1? WhereIterator.next(&self) @dynamic {
while(true) {
Type1? v = self.parent.next();
if(catch e = v) { return e?; }
if(!self.predicate(v)) continue;
return v;
}
}
module iter::select { Type1, Type2 };
import iter::iterator;
alias SelectFn = fn Type2(Type1);
struct SelectIterator (Iterator{Type2}) {
Iterator{Type1}* parent;
SelectFn selector;
}
// Probably want an explicit allocator version of this as well
fn Iterator{Type2} Iterator{Type1}.select(&self, SelectFn selector) {
SelectIterator* out = allocator::new(tmem, SelectIterator);
out.parent = self;
out.selector = selector;
return out;
}
fn Type2? SelectIterator.next(&self) @dynamic {
Type1? value = self.parent.next();
if(catch excuse = value) { return excuse?; }
return self.selector(value);
}
Definitely just found a bug in relation to the above implementation:
I've got this code:
import std::io;
import array_vec;
import iter;
import iter::iterator;
import iter::select;
import iter::where;
// should print 12 14 16 18 20 22 24 26 28 30 32 34 36 38
fn void main() {
// Custom Dynamic sized array with Iterable interface
ArrayVec{usz}* test = array_vec::tnew{usz}(20);
for(usz i = 0; i < 20; i++) {
test.push(i);
}
// Compilation works with these, errors with them commented out:
// Error: The 'Iterator' interface has no method 'select', did you spell it correctly?
// Error: The 'Iterator' interface has no method 'where', did you spell it correctly?
PredicateFn{usz} pred = fn (x) => x > 10;
SelectFn{usz, usz} selector = fn (x) => x * 2;
@pool() {
test.tnew_iter()
.select(fn (x) => x * 2)
.where(fn (x) => x > 10)
.@foreach(; usz e) {
io::printf("%d ", e);
};
};
}
I assume this is happening because the concrete version of the generic module isn't being pulled in unless there is an non-implicit "request" for the type. Though, this also doesn't work:
.select{usz, usz}(fn (x) => x * 2)
.where{usz}(fn (x) => x > 10)
Another limitation, specifically affecting iterators which transform the type of the output, is that you cannot have multiple instances of the same generic method, even if they return a different type:
// The same method is generated by multiple instances of
// 'iter::select': 'iter::select{ulong, ulong}' and
// 'iter::select{ulong, char}'. You need to use `@if` to restrict it
// to one of them, or move it out of the generic module entirely.
alias Selector1 = SelectFn{usz, usz};
alias Selector2 = SelectFn{usz, char};
There's maybe a way around this with macros https://c3-lang.org/faq/#patterns. But it would be pretty messy, and I don't think there's even a way to reflect on generic type parameters so returning anything other than Iterator{any} might be impossible.
I think we would need actual function overloading, at least for calls with different generic type parameters, before something like SelectIterator can be actually fleshed out, unfortunately.
Corollary to this issue would be generator function syntax similar to rust std::iter::iter! and C# enumerable functions
Something like:
fn Iterator{usz} range(usz stop, usz start = 0, usz skip = 1) @generator {
for(usz i = 0; i < stop; i+= skip) {
yield i;
}
// specify excuse directly or automatically use NO_MORE_ELEMENTS
// yield std::iter::NO_MORE_ELEMENTS?;
}
// which would processed to something along the lines of:
struct __RangeIterator(Iterator{usz}) {
// parameters
usz __stop;
usz __start;
usz __skip;
usz __i;
}
fn Iterator{usz} range(usz stop, usz start = 0, usz skip = 1) {
return allocator::new(tmem, __RangeIterator);
}
usz? __RangeIterator.next(&self) @dynamic {
if(self.__i < stop) {
usz __i = self.__i;
self.__i += self.__skip;
return __i;
}
return std::iter::NO_MORE_ELEMENTS?;
}
This is definitely a much more involved feature since it requires translating all of the control flow statements into equivalent stateful ones for the next function.
at that point you are just doing stackless coroutines but yeah I have a lot of experience with the C# counterpart to this (I have written way too much C# lol)
I do about 50/50 of my programming in C and C# (the pleasures of a full embedded stack with bare metal firmware and desktop software), and I can definitely feel the pain when losing some of the nice features that C# has.
Obviously c3 isn't suited for the vast majority of what is available in C#, but I think any language can benefit from "little" features like this
I'm not a C++ person so I was unfamiliar with the stackless coroutine implementation, but yes, the function syntax described above would essentially be the generator<T> type. In my opinion, first-class support for some kind of stackless suspend/resume, even if more generic than an iterator specific version would be very welcome. If you look at something like async.h, you can see how to get anything like async/await you have to abuse language constructs with defines or mess with jumping to naked function pointers. Pretty nasty stuff. Having there be a "safe" way to implement something like that, even if no such functionality is provided by std, would be a huge boon, imo.
Rust does a similar thing under the hood, providing the async/await syntax to automatically compile a function into a stackless state machine, as well as the Future and IntoFuture traits, but you either need to roll your own implementation or pull in an external crate to actually do any type of asynchronous execution.
Note that you can construct a stateful iterator that keeps the current value, and as such is performant.
Here is the implementation for linked list:
struct LinkedListArrayView
{
LinkedList* list;
Node* current_node;
usz current_index;
}
fn usz LinkedListArrayView.len(&self) @operator(len) => self.list.size;
<*
@require index < self.list.size
*>
fn Type LinkedListArrayView.get(&self, usz index) @operator([])
{
return *self.get_ref(index);
}
<*
@require index < self.list.size
*>
fn Type* LinkedListArrayView.get_ref(&self, usz index) @operator(&[])
{
if (index == self.list.size - 1)
{
self.current_node = self.list._last;
self.current_index = index;
}
while (self.current_index != index)
{
switch
{
case index < self.current_index: // reverse iteration
self.current_node = self.current_node.prev;
self.current_index--;
case index > self.current_index:
self.current_node = self.current_node.next;
self.current_index++;
}
}
return &self.current_node.value;
}
As you see, it will use a "current" cursor to walk forward (or backwards) int he optimized case to the next index. That means foreach (which will only walk one step at a time) will be sufficiently fast. It is like an iterator with next and prev with an "index" function on top of that. This doesn't quite solve the indefinite case though.
What prevents me from allowing foreach (i : iterator) is that I find it useful for foreach to mean "iterate over a bounded list" and "while (try i = foo.next())" being "iterate over a list/set without known bounds". This gives an extra hint to the reader.