nickel
nickel copied to clipboard
Lists and list comprehensions
Currently, Nickel does not have lists nor arrays. We propose to add lists and a set of tools for manipulating them to the language, namely list operations and list comprehensions. The set of operations must be complete enough to allow users to avoid general recursion except for very few particular cases. This proposition, including syntax, is not final but rather a concrete basis for sparking discussion.
Definition
A list is an immutable sequence of comma-separated expressions. Each element is stored inside a thunk, such that it is only evaluated if needed, and at most once. Elements can have different types.
Example: ["a", 2, "/a/b/c", false, {a={b=2;};}]
Operations
| Operation | Syntax | Description | Example |
|---|---|---|---|
| concatenation | l ++ l' |
Concatenate two lists | ["a", 2] ++ ["3", {a=1;}] = ["a", 2, "3", {a=1;}] |
| access | elemAt l n |
Access the nth element of l | elemAt ["a", 2, "3] 1 = 2 |
| map | map f l |
Apply f to each element of l | map (fun x => x + 1) [1,2,3] = [2,3,4] |
| filter | filter f l |
Return a list consisting of the elements of l for which the function f returns true | filter (fun x => isNum x) [1, "2", 3, {a=4;}] = [1, 3] |
| flatten | flatten f l |
Concatenate a list of lists | flatten [[1, 2], [3, 4]] = [1, 2, 3, 4] |
| fold left | foldl f x l |
Iteratively apply (strictly) the binary operator f on elements of l, from left to right, starting with x | foldl (fun x y => x + y) 0 [1, 2, 3] = 6 |
| membership | elem x l |
Return true if x is an element of l, and false otherwise | elem 3 [1, 2, 3] = true |
| all | all p l |
Return true if the predicate p is true for all elements of l, and false otherwise | all (fun x => x % 2 == 0) [1, 2, 3] = false |
| any | any p l |
Return true if the predicate p is true for at least one element of l, and false otherwise | any (fun x => x % 2 == 0) [1, 2, 3] = true |
Comprehensions
A syntax for easing the construction of lists. We basically propose Haskell list comprehensions.
Examples
[i*i | i <- [1,2,3]] = [1,4,9]
[[i,j] | i <- [1,2], j <- ["a","b"]] = [[1,"a"], [1,"b"], [2,"a"], [2,"b"]]
[i | i <- [1,2,3], i % 2 == 1] = [1,3]
Syntax
A list comprehension is composed of a main expression, on the left of |, with free variables which will be bound to various values in order to generate the elements of the final list. The right of | is a list of comma separated qualifiers, which are either:
- A binding of a free variable (or a pattern involving free variables) of the main expression to values in a list
- A boolean predicate for filtering which elements we want to keep (a guard)
- A local let binding
compr ::= [ e | qual1, ..., qualn]
qual ::= pat <- e | let bindings | e
e ::= any Nickel expression
pat may be taken to just be a variable for now, and can be extended later to be a pattern when we add pattern matching/destructuring. The last e in qual corresponds to guards.
Semantics
[e | true] = [e]
[e | q] = [e | q, true]
[e | let binding, Q] = let binding in [e | Q]
[e | b, Q] = if b then [e | Q] else []
[e | p <- l, Q ] = let bind = fun p => [e | Q] in flatten (map bind l). This will have to be updated if p can be a more general pattern
As a minor syntactic bikeshedding, I find the python list comprehension syntax (basically replace | and , by for and <- by in) nicer to use and probably more natural to most people
Personally I’m not a big fan of list comprehensions, as they can be expressed fully with simple higher-order functions. They complicate the language, and thus the work of implementors.
If we decide that list comprehensions is what people want, then they should be fairly low-priority (list syntax can be implemented without them obviously).
functional pipes for the win!
[1, 2, 3] |> std.array.filter (fun x => x==2) |> std.array.map (fun x => x+1)
# [3]
let
array_sum = std.array.fold_left (fun acc val => acc + val) 0
in
[1, 2, 3] |> array_sum
# 6
let
array_avg =
let
array_sum = std.array.fold_left (fun acc val => acc + val) 0
in
fun arr => (array_sum arr) / (std.array.length arr)
in
[1, 2, 3] |> array_avg
# 2
see also: array functions in nickel stdlib
Revisiting this issue, I agree that pipes provide reasonable ergonomics. List comprehension can be still useful for iterating from multiple sources, like [ x + y + z for x in l1, y in l2, z in l3] which is a tad uglier with pipes and zips, but it also cost new syntax, several ways of doing the same thing, etc. Unless someone is siding for list comprehensions strongly, I think I'm gonna close this issue soon, for the time being.
iterating from multiple sources, like
[ x + y + z for x in l1, y in l2, z in l3]
let array_zip =
fun lst2 =>
let len = std.array.length (std.array.at 0 lst2) in # -- TODO assert equal length of all lst
std.array.generate (fun idx => std.array.map (fun lst => std.array.at idx lst) lst2) len
in
let
lstlst = [ [1,2,3], [4,5,6], [7,8,9] ]
in
lstlst
|> array_zip # -- [ [ 1, 4, 7 ], [ 2, 5, 8 ], [ 3, 6, 9 ] ]
|> std.array.map (fun lst => std.array.fold_left (+) 0 lst) # -- [ 12, 15, 18 ]
edit: zip != cartesian product
Sure, my point wasn't that this is not possible with higher order functions (since list comprehensions are just syntactic sugar), just that this is slightly less convenient and readable :slightly_smiling_face:
i was just curious how array.zip could look in nickel : )
part of what makes list comprehensions pretty is the value binding,
which could be expressed with an apply function. similar to javascript
btoa.apply(null, ["1"]) == btoa("1")
[ l1, l2, l3 ] # -- [ [1,2,3], [4,5,6], [7,8,9] ]
|> array_zip # -- [ [ 1, 4, 7 ], [ 2, 5, 8 ], [ 3, 6, 9 ] ]
|> std.array.map (fun lst => apply (fun x y z => x + y + z) lst) # -- [ 12, 15, 18 ]
edit: refactor ...
let xx = [ 1, 2, 3 ] in
let yy = [ 4, 5, 6 ] in
let zz = [ 7, 8, 9 ] in
let
array_comp = fun fcb lstlst => # -- aka "list comprehension"
lstlst
|> array_zip
|> std.array.map (fun lst => apply fcb lst)
in
[ xx, yy, zz ] |> array_comp (fun x y z => x + y + z)
almost there
[ x + y + z for x in xx, y in yy, z in zz ]
edit: zip != cartesian product
We need probably some observation time on what people need array for in practice. That being said, a comprehension syntax is often a great addition, as chaining concatMap is not particularly syntactically present, even with forward application. It often improves readability even of simple maps. It's just a good swiss-army knife thing. We shouldn't underplay its value. And since we already have array literal, and the comprehension syntax merely extends this syntax, it's not very costly.
I wouldn't close the issue, just recognise that we don't know exactly what the outcome of it will be.
I'm a bit confused by the intended semantics of the discussed [ x + y + z for x in l1, y in l2, z in l3] example. If I read the original post correctly, this should work on the cartesian product l1×l2×l3, yet the non-comprehension alternative discussed here seems to use a simple zip instead. Which one is it then?
I think the non-comprehension alternative is indeed not correct. The original issue is the source of truth, and the natural semantics for multiple list comprehension is the cartesian product (interpreting them as nested comprehension), e.g. in Python or in Haskell.
How far would we get without comprehensions by having cartesian product helper functions in std? I do agree that using higher order functions is preferable to the addition of a complex language feature, however this does not mean users should reinvent the wheel all the time for simple tasks. There should be powerful helper function instead who do the heavy lifting.
aah, sorry, zip != cartesian product
zip in python, haskell, nickel
zip in python:
list(zip([1, 2], [3, 4]))
# [(1, 3), (2, 4)]
zip in haskell:
zip [1, 2] [3, 4]
# [(1, 3), (2, 4)]
zip in nickel:
let array_zip =
fun lstlst =>
# TODO assert equal length of all lst
let len = std.array.length (std.array.at 0 lstlst) in
std.array.generate (
fun idx => std.array.map (
fun lst => std.array.at idx lst
) lstlst
) len
in
let
lstlst = [ [1,2], [3,4] ]
in
lstlst |> array_zip
# [[1,3],[2,4]]
cartesian product in python, haskell, nickel
cartesian product in python:
[ (x, y) for x in [1, 2] for y in [3, 4]]
# [(1, 3), (1, 4), (2, 3), (2, 4)]
import itertools
list(itertools.product([1, 2], [3, 4]))
# [(1, 3), (1, 4), (2, 3), (2, 4)]
def f():
for x in [1, 2]:
for y in [3, 4]:
yield (x, y)
list(f())
# [(1, 3), (1, 4), (2, 3), (2, 4)]
cartesian product in haskell:
[ (x,y) | x <- [1,2], y <- [3,4]]
# [(1, 3), (1, 4), (2, 3), (2, 4)]
sequence [[1,2],[3,4]]
# [[1, 3], [1, 4], [2, 3], [2, 4]]
import Control.Applicative
((,) <$> [1,2] <*> [3,4])
# [(1, 3), (1, 4), (2, 3), (2, 4)]
cartesian product in nickel:
# cartesian product of 2 arrays
let array_cart2 =
fun lstlst =>
let lst1 = std.array.at 0 lstlst in
let lst2 = std.array.at 1 lstlst in
# TODO assert equal length of all lst
#let len1 = std.array.length lst1 in
std.array.flatten (
std.array.map (fun x1 => (
std.array.map (fun x2 => (
[x1, x2]
)) lst2
)) lst1
)
in
let
lstlst = [ [1,2], [3,4] ]
in
lstlst |> array_cart2
# [[1,3],[1,4],[2,3],[2,4]]