Changing the signature of the write- commands
ls -l | mario map 're.split("\s+", x)' write-json
This gives a list of arrays (an array of arrays would seem nicer, but ok)
...
]
[
"drwxr-xr-x",
"6",
"username",
"group",
"4096",
"Jan",
"18",
"2019",
"filename"
]
[
"drwxr-xr-x",
...
ls -l | mario map 're.split("\s+", x)' write-csv-tuples
This splits each item with commas and groups the items in a line with double newlines... Huh?
...
d,r,w,x,r,-,x,r,-,x
6
u,s,e,r,n,a,m,e
g,r,o,u,p
4,0,9,6
J,a,n
1,8
2,0,1,9
f,i,l,e,n,a,m,e
d,r,w,x,r,-,x,r,-,x
...
Ubuntu 18.04 / Python 3.8.0 in a virtualenv without system site packages.
ls -l | mario map 're.split("\s+", x) ! [x]' reduce operator.add write-csv-tuples
ok, so this works. I'm not clear why, though. Hm.
Thanks for the thought-provoking question! I'll try to answer the question and add some commentary.
On the question
Input data
Suppose we start with
$ mario map 'str.split ! tuple' <<EOF
-rw-r----- 1 tmp tmp 1.6K Dec 16 22:22 example.py
-rw-r----- 1 tmp tmp 150 Dec 17 13:08 foo.py
-rw-r----- 1 tmp tmp 427 Dec 17 11:49 bar.py
EOF
('-rw-r-----', '1', 'tmp', 'tmp', '1.6K', 'Dec', '16', '22:22', 'example.py')
('-rw-r-----', '1', 'tmp', 'tmp', '150', 'Dec', '17', '13:08', 'foo.py')
('-rw-r-----', '1', 'tmp', 'tmp', '427', 'Dec', '17', '11:49', 'bar.py')
So the signature is approximately
map 'str.split ! tuple' : Iterable[str] -> Iterable[Sequence[str]]
We can think of these commands as having the signature Iterable -> Iterable, including write-json. For instance, in the following, write-json is being passed an Iterable[int] and returning an Iterable[str]. (In actual fact they are handling AsyncIterables, but we can gloss over that here.)
mario eval 1 write-json
1
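One way to picture that signature is as a plain Python generator that serializes each incoming item on its own. This is only a mental model: `write_json_each` is a hypothetical name, not mario's actual implementation, and it ignores the async machinery.

```python
import json
from typing import Any, Iterable, Iterator


def write_json_each(items: Iterable[Any]) -> Iterator[str]:
    """Hypothetical model of a per-item writer: one JSON document per input item."""
    for item in items:
        yield json.dumps(item, indent=4)


# A single int in, a single JSON string out (mirrors `mario eval 1 write-json`).
print(list(write_json_each([1])))

# Three rows in, three separate JSON arrays out; nothing wraps them together.
rows = [("a", "1"), ("b", "2"), ("c", "3")]
for document in write_json_each(rows):
    print(document)
```

Under this model, a stream of rows yields one array per row, which matches the output reported in the question.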
How to get the desired result
We can get the results you wanted with
mario map 'str.split ! tuple ' apply list write-csv-tuples <<EOF
-rw-r----- 1 tmp tmp 1.6K Dec 16 22:22 example.py
-rw-r----- 1 tmp tmp 150 Dec 17 13:08 foo.py
-rw-r----- 1 tmp tmp 427 Dec 17 11:49 bar.py
EOF
-rw-r-----,1,tmp,tmp,1.6K,Dec,16,22:22,example.py
-rw-r-----,1,tmp,tmp,150,Dec,17,13:08,foo.py
-rw-r-----,1,tmp,tmp,427,Dec,17,11:49,bar.py
The same technique works for JSON:
mario map 'str.split ! tuple ' apply list write-json <<EOF
-rw-r----- 1 tmp tmp 1.6K Dec 16 22:22 example.py
-rw-r----- 1 tmp tmp 150 Dec 17 13:08 foo.py
-rw-r----- 1 tmp tmp 427 Dec 17 11:49 bar.py
EOF
[
[
"-rw-r-----",
"1",
"tmp",
"tmp",
"1.6K",
"Dec",
"16",
"22:22",
"example.py"
],
[
"-rw-r-----",
"1",
"tmp",
"tmp",
"150",
"Dec",
"17",
"13:08",
"foo.py"
],
[
"-rw-r-----",
"1",
"tmp",
"tmp",
"427",
"Dec",
"17",
"11:49",
"bar.py"
]
]
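Why `apply list` helps, and why the `! [x]` plus `reduce operator.add` variant from the question also works, can be sketched in plain Python. Both turn the stream of rows into a stream containing exactly one item, the list of all rows, so a per-item writer emits a single document. This assumes that `apply` hands the whole stream to the function as one value and that the writers serialize each item separately; the variable names are illustrative only.

```python
import functools
import json
import operator

lines = [
    "-rw-r----- 1 tmp tmp 1.6K Dec 16 22:22 example.py",
    "-rw-r----- 1 tmp tmp 150 Dec 17 13:08 foo.py",
]

# Model of `map 'str.split ! tuple'`: one tuple per input line.
rows = [tuple(line.split()) for line in lines]

# Model of `apply list`: the whole stream becomes one item.
slurped = [list(rows)]

# Model of `map '... ! [x]' reduce operator.add`: wrap each row in a
# one-element list, then concatenate those lists into a single list of rows.
reduced = [functools.reduce(operator.add, ([row] for row in rows))]

# Both streams hold the same single item, so a per-item writer
# produces one array of arrays instead of one array per row.
print(slurped == reduced)
print(json.dumps(slurped[0]))
```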
Explanation of output
That output is consistent with the signature
write-json : Iterable[JSONValue] -> Iterable[str]
which, for the ls -l example, is specialized to
write-json : Iterable[Sequence[str]] -> Iterable[JSONArray]
where JSONArray is a subtype of str:
mario map 'str.split ! tuple' write-json <<EOF
-rw-r----- 1 tmp tmp 1.6K Dec 16 22:22 example.py
-rw-r----- 1 tmp tmp 150 Dec 17 13:08 foo.py
-rw-r----- 1 tmp tmp 427 Dec 17 11:49 bar.py
EOF
[
"-rw-r-----",
"1",
"tmp",
"tmp",
"1.6K",
"Dec",
"16",
"22:22",
"example.py"
]
[
"-rw-r-----",
"1",
"tmp",
"tmp",
"150",
"Dec",
"17",
"13:08",
"foo.py"
]
[
"-rw-r-----",
"1",
"tmp",
"tmp",
"427",
"Dec",
"17",
"11:49",
"bar.py"
]
The CSV case works the same way:
Row = Sequence[str]
write-csv-tuples : Iterable[Iterable[Row]] -> Iterable[CSVFile]
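where CSVFile is a subtype of str. Because a str is itself a Sequence[str], each tuple is read as a whole file whose rows are its fields and whose cells are single characters: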
mario map 'str.split ! tuple' write-csv-tuples <<EOF
-rw-r----- 1 tmp tmp 1.6K Dec 16 22:22 example.py
-rw-r----- 1 tmp tmp 150 Dec 17 13:08 foo.py
-rw-r----- 1 tmp tmp 427 Dec 17 11:49 bar.py
EOF
-,r,w,-,r,-,-,-,-,-
1
t,m,p
t,m,p
1,.,6,K
D,e,c
1,6
2,2,:,2,2
e,x,a,m,p,l,e,.,p,y
-,r,w,-,r,-,-,-,-,-
1
t,m,p
t,m,p
1,5,0
D,e,c
1,7
1,3,:,0,8
f,o,o,.,p,y
-,r,w,-,r,-,-,-,-,-
1
t,m,p
t,m,p
4,2,7
D,e,c
1,7
1,1,:,4,9
b,a,r,.,p,y
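The character-by-character output follows from how Python's `csv` module treats strings: a `str` is an iterable of one-character strings, so when a single row tuple is interpreted as a whole table, each field becomes a row and each character becomes a cell. The sketch below reproduces the effect with the standard library only; `to_csv` is a hypothetical helper, not mario's code.

```python
import csv
import io
from typing import Iterable, Sequence


def to_csv(table: Iterable[Sequence[str]]) -> str:
    """Hypothetical helper: serialize one table (an iterable of rows) to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


row = ("-rw-r-----", "1", "tmp", "150")

# Passing a single row where a table is expected: each string is
# treated as a row, and iterating a string yields its characters.
print(to_csv(row))

# Passing a list of rows gives the intended one-line-per-row output.
print(to_csv([row]))
```

Note that the first call is well-typed, since a tuple of strings does satisfy `Iterable[Sequence[str]]`, which is why nothing flags the mistake.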
Commentary
I think the logic is consistent between the two. But as you suggest, this makes the raw write- commands a bit inconvenient. Maybe the serialization commands should slurp the input internally, so they're more like an apply command than a map command. That would be:
# Each input item is a row.
new-write-csv-tuples : Iterable[Sequence[Cell]] -> Iterable[CSVRow[Cell]]
# Each input item is an array entry.
new-write-json : Iterable[Sequence[Cell]] -> Iterable[JSONArray[Cell]]
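In plain Python, the proposed semantics might look roughly like the following. The function names are hypothetical stand-ins for the `new-write-` commands above, and the JSON variant is only one possible reading (it collects every input item into a single array); nothing here reflects an existing implementation.

```python
import csv
import io
import json
from typing import Any, Iterable, Iterator, Sequence


def new_write_csv_tuples(rows: Iterable[Sequence[Any]]) -> Iterator[str]:
    """Hypothetical: each input item is a row; yield one CSV line per row."""
    for row in rows:
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerow(row)
        yield buffer.getvalue()


def new_write_json(items: Iterable[Any]) -> Iterator[str]:
    """Hypothetical: each input item is an array entry; yield one JSON array."""
    yield json.dumps(list(items))


rows = [("-rw-r-----", "1", "foo.py"), ("-rw-r-----", "1", "bar.py")]

# No `apply list` step is needed: the writers consume the stream themselves.
print("".join(new_write_csv_tuples(rows)), end="")
print(next(new_write_json(rows)))
```

One trade-off visible even in this sketch: the CSV writer can stream row by row, while the JSON writer has to consume the whole input before it can emit anything.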
Open questions
- The current system is (I think) self-consistent but a bit awkward. Should it be changed/extended?
- This new signature would need to be added for all the serialization formats.
- The read- forms would also need to be considered.
- What should happen to values that aren't iterable?
- Should these be new commands, or should the semantics of the original functions be changed? If they will be changed, the change should happen before a 1.0 release.