encore icon indicating copy to clipboard operation
encore copied to clipboard

For-Comprehension

Open ElieOaks opened this issue 5 years ago • 0 comments

For - Comprehension A new kind of for-comprehension is being implemented into Encore that replaces the old for-loop. For-comprehension can: -- act as a function call :

var x = for y <- someList do
              y
            end

-- iterate over itterable objects, not just Arrays and Ranges:

for x <- linkedList do
        --body
end

-- iterate over multiple data structures in one call:

for x <- [1, 2, 3], y <- ["a", "b", "c"] do
        -- body
end

-- be passed as an argument

foo( for y <- someList do
            y
        end)

The new for-comprehension does not allow: -- Different kinds of data structures in header

for x <- linkedList, y <- ["a", "b", "c"] do
        -- body
end

-- Range to return a collection. It will compile, but x will be of type unit. Therefor ranges only desugar into foreach.

var x = for y <- [0 .. 10] do
              y
            end

-- A return in the body of the loop to return something outside of it. The bellow example will not return 1.

fun foo() : int
  for x <- [1, 2, 3]
    return x
  end
end

Rather it is equivalent to:

fun foo() : int
  var returnCollection  for x <- [1, 2, 3]
                                        return x
                                   end
 return returnCollection
end

Which of course would break the typing rules, as it returns an array, and not an int. Under the hood: The for-comprehension is pure syntactic sugar, and is desugared into combination of calls to the methods map, flatMap, foreach, and if the loop contains break, into maybeForeach. These methods can be found in some of the collections in the standard libraries, such as Array and LinkedList. the methods map and flatMap are used in combination, when a return collection is expected. Otherwise the method foreach is used, as it's loop has only side effects.

var x = for y <- someList, z <- someListTwo do
              y + z
            end

is desugaraed into: var x = someListTwo.flatMap(someList.map(y))


for y <- [0 .. 10] do
         print(y)
end

is desugaraed into: [0 .. 10].foreach(print(y))


for y <- [0 .. 10] do
         print(y)
         if y > 9 then
            break
         end
end

is desugaraed into:

 [0 .. 10].maybeForeach(
  print(y)
  if y > 9 then
     return Nothing
  end
  Just(())
  )


Added compilation steps The desugaring of for-comprehension depends upon typing information, and is therefore a new step of compilation called TypedDesugarer, and occurs before the optimizations. As the desusaring process breaks down the For-Expr and builds up a sub-AST of new Expr, the AST needs to be re-type checked, and re-capture checked. Therefore the compilation includes Typed Desugaring, as well as Re - Type checking, and Re - Capture checking. When modular type checking, and capure checking becomes available, these compilation steps can hopefully be removed.

Problem with re - type checking Type checking twice is not possible due to the type checking process of Match clauses. When Match is type checked, it changes the Match-clause so that it violates its own typing rules and cannot be type checked again without throwing an Error. A hack has been created so that programs with for-comprehension can compile and run, but is currently incompatible with the match - type checker. This fork can be merges only when the match type checking problem has been solved.

Arrays Unlike classes such LinkedList in the standard Library, there is a list of functions, rather than methods that can be applied to arrays. Therefore for-comprehensions that are over arrays are desugared into function calls map, flatMap, foreach, and if the loop contains break, into maybeForeach, rather than into method calls by the same name.

Other changes This PR has also had to incorporate the PR: Ability to compile pony runtime with gcc 7.x #865. As the latest ubuntu, which this was developed on, runs on gcc 7. It also desugars ranges into a class RRange, that can be found in the module-file Data. This was in order to have the ability to add methods. As range is a protected word in the compiler, the class has to be named something else, hence the double R:s. As ranges are only used together with for-loops / for-comprehensions, this change is not expected to affect anything else.

ElieOaks avatar May 08 '19 07:05 ElieOaks