Changing the signature of the write- commands

Open Russell-Jones-OxPhys opened this issue 5 years ago • 2 comments

ls -l | mario map 're.split("\s+", x)' write-json

This prints a sequence of separate arrays, one per input line (a single array of arrays would seem nicer, but OK)

...
]
[
    "drwxr-xr-x",
    "6",
    "username",
    "group",
    "4096",
    "Jan",
    "18",
    "2019",
    "filename"
]
[
    "drwxr-xr-x",
...

ls -l | mario map 're.split("\s+", x)' write-csv-tuples

This splits each item into comma-separated characters and separates each line's group of items with a blank line... Huh?

...

d,r,w,x,r,-,x,r,-,x
6
u,s,e,r,n,a,m,e
g,r,o,u,p
4,0,9,6
J,a,n
1,8
2,0,1,9
f,i,l,e,n,a,m,e

d,r,w,x,r,-,x,r,-,x
...

Ubuntu 18.04 / Python 3.8.0 in a virtualenv without system site packages.

Russell-Jones-OxPhys, Dec 17 '19 17:12

ls -l | mario map 're.split("\s+", x) ! [x]' reduce operator.add write-csv-tuples

OK, so this works. I'm not clear why, though. Hm.

Russell-Jones-OxPhys, Dec 17 '19 17:12

Thanks for the thought-provoking question! I'll try to answer the question and add some commentary.

On the question

Input data

Suppose we start with

$ mario map 'str.split ! tuple' <<EOF                 
-rw-r-----  1 tmp  tmp  1.6K Dec 16 22:22 example.py
-rw-r-----  1 tmp  tmp   150 Dec 17 13:08 foo.py
-rw-r-----  1 tmp  tmp   427 Dec 17 11:49 bar.py
EOF
('-rw-r-----', '1', 'tmp', 'tmp', '1.6K', 'Dec', '16', '22:22', 'example.py')
('-rw-r-----', '1', 'tmp', 'tmp', '150', 'Dec', '17', '13:08', 'foo.py')
('-rw-r-----', '1', 'tmp', 'tmp', '427', 'Dec', '17', '11:49', 'bar.py')

So the signature is approximately

map 'str.split ! tuple' : Iterable[str] -> Iterable[Sequence[str]]

We can think of these commands as having the signature Iterable -> Iterable, including write-json. For instance, in the following, write-json is being passed an Iterable[int] and returning an Iterable[str]. (In actual fact they are handling AsyncIterables, but we can gloss over that here.)

mario eval 1 write-json
1
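To make the Iterable -> Iterable picture concrete, here is a minimal plain-Python model of the two commands. It is only a sketch: the names map_command and write_json_model are hypothetical, it ignores the async machinery, and it is not mario's actual implementation.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def map_command(function: Callable[[Any], Any], items: Iterable[Any]) -> Iterator[Any]:
    """Hypothetical model of `map`: apply the function to each item independently."""
    for item in items:
        yield function(item)


def write_json_model(items: Iterable[Any]) -> Iterator[str]:
    """Hypothetical model of `write-json`: serialize each item to its own JSON document."""
    for item in items:
        yield json.dumps(item, indent=4)


lines = ["a b c", "d e f"]
rows = map_command(lambda line: tuple(line.split()), lines)
documents = list(write_json_model(rows))

# One JSON document comes out for every item that went in.
print(len(documents))
```

Under this model the output stream always has as many items as the input stream, which is why one array is printed per input line.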

How to get the desired result

We can get the results you wanted with

mario map 'str.split ! tuple ' apply list write-csv-tuples <<EOF
-rw-r-----  1 tmp  tmp  1.6K Dec 16 22:22 example.py
-rw-r-----  1 tmp  tmp   150 Dec 17 13:08 foo.py
-rw-r-----  1 tmp  tmp   427 Dec 17 11:49 bar.py
EOF
-rw-r-----,1,tmp,tmp,1.6K,Dec,16,22:22,example.py
-rw-r-----,1,tmp,tmp,150,Dec,17,13:08,foo.py
-rw-r-----,1,tmp,tmp,427,Dec,17,11:49,bar.py

The same technique works for json:

mario map 'str.split ! tuple ' apply list write-json <<EOF                
-rw-r-----  1 tmp  tmp  1.6K Dec 16 22:22 example.py
-rw-r-----  1 tmp  tmp   150 Dec 17 13:08 foo.py
-rw-r-----  1 tmp  tmp   427 Dec 17 11:49 bar.py
EOF
[
    [
        "-rw-r-----",
        "1",
        "tmp",
        "tmp",
        "1.6K",
        "Dec",
        "16",
        "22:22",
        "example.py"
    ],
    [
        "-rw-r-----",
        "1",
        "tmp",
        "tmp",
        "150",
        "Dec",
        "17",
        "13:08",
        "foo.py"
    ],
    [
        "-rw-r-----",
        "1",
        "tmp",
        "tmp",
        "427",
        "Dec",
        "17",
        "11:49",
        "bar.py"
    ]
]
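The effect of apply list can be modeled in a few lines of plain Python. This sketch assumes that apply hands the whole stream to the function and emits the result as a single item, which matches the output above; apply_command is a hypothetical name, not mario's code. It also shows why the `! [x]` plus `reduce operator.add` form from the earlier comment gives the same result: wrapping each row in a list and concatenating the lists builds the same single list of rows.

```python
import functools
import operator
from typing import Any, Callable, Iterable, Iterator


def apply_command(function: Callable[[Iterable[Any]], Any], items: Iterable[Any]) -> Iterator[Any]:
    """Hypothetical model of `apply`: call the function once on the whole stream."""
    yield function(items)


rows = [("-rw-r-----", "1", "foo.py"), ("-rw-r-----", "1", "bar.py")]

# `apply list`: the stream now holds a single item, a list containing every row.
collected = list(apply_command(list, rows))

# `map '... ! [x]'` then `reduce operator.add`: wrap each row, then concatenate.
wrapped = [[row] for row in rows]
reduced = functools.reduce(operator.add, wrapped)

print(len(collected))
print(collected[0] == reduced)
```

Either way, the serialization command downstream receives one item that contains all the rows, rather than one item per row.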

Explanation of output

That output is consistent with the signature

write-json : Iterable[JSONValue] -> Iterable[str]

which, for the ls -l example, specializes to

write-json : Iterable[Sequence[str]] -> Iterable[JSONArray]

where JSONArray is a subtype of str:

mario map 'str.split ! tuple' write-json  <<EOF      
-rw-r-----  1 tmp  tmp  1.6K Dec 16 22:22 example.py
-rw-r-----  1 tmp  tmp   150 Dec 17 13:08 foo.py
-rw-r-----  1 tmp  tmp   427 Dec 17 11:49 bar.py
EOF
[
    "-rw-r-----",
    "1",
    "tmp",
    "tmp",
    "1.6K",
    "Dec",
    "16",
    "22:22",
    "example.py"
]
[
    "-rw-r-----",
    "1",
    "tmp",
    "tmp",
    "150",
    "Dec",
    "17",
    "13:08",
    "foo.py"
]
[
    "-rw-r-----",
    "1",
    "tmp",
    "tmp",
    "427",
    "Dec",
    "17",
    "11:49",
    "bar.py"
]

The CSV case works the same way:

Row = Sequence[str]
write-csv-tuples : Iterable[Iterable[Row]] -> Iterable[CSVFile]

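where CSVFile is a subtype of str. Each input item is therefore expected to be a whole table, that is, an iterable of rows. In the example below each item is a single tuple of strings, so every string in the tuple is treated as a row and every character as a cell.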

mario map 'str.split ! tuple' write-csv-tuples  <<EOF   
-rw-r-----  1 tmp  tmp  1.6K Dec 16 22:22 example.py
-rw-r-----  1 tmp  tmp   150 Dec 17 13:08 foo.py
-rw-r-----  1 tmp  tmp   427 Dec 17 11:49 bar.py
EOF
-,r,w,-,r,-,-,-,-,-
1
t,m,p
t,m,p
1,.,6,K
D,e,c
1,6
2,2,:,2,2
e,x,a,m,p,l,e,.,p,y

-,r,w,-,r,-,-,-,-,-
1
t,m,p
t,m,p
1,5,0
D,e,c
1,7
1,3,:,0,8
f,o,o,.,p,y

-,r,w,-,r,-,-,-,-,-
1
t,m,p
t,m,p
4,2,7
D,e,c
1,7
1,1,:,4,9
b,a,r,.,p,y
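The character-by-character output can be reproduced with Python's standard csv module. The sketch below is a plain-Python model under the signature given above, not mario's actual implementation; write_csv_model is a hypothetical name.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def write_csv_model(items: Iterable[Iterable[Sequence[str]]]) -> Iterator[str]:
    """Hypothetical model of `write-csv-tuples`: each item is a whole table of rows."""
    for table in items:
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(table)
        yield buffer.getvalue()


# One item per input line: each string in the tuple is taken as a row,
# and iterating over a string yields its characters as cells.
per_line = list(write_csv_model([("tmp", "150", "foo.py")]))
print(per_line[0])

# One item holding all rows: each tuple becomes one CSV row.
whole_table = list(write_csv_model([[("tmp", "150", "foo.py"), ("tmp", "427", "bar.py")]]))
print(whole_table[0])
```

The first call prints t,m,p then 1,5,0 then f,o,o,.,p,y on separate lines, mirroring the output above; the second prints the two rows as ordinary CSV lines.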

Commentary

I think the logic is consistent between the two commands. But as you suggest, it makes the raw write- commands a bit inconvenient. Maybe the serialization commands should slurp the input internally, so that they behave more like an apply command than a map command. The signatures would then be

# Each input item is a row.
new-write-csv-tuples : Iterable[Sequence[Cell]] -> Iterable[CSVRow[Cell]]

# Each input item is an array entry.
new-write-json : Iterable[Sequence[Cell]] -> Iterable[JSONArray[Cell]]
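As a rough illustration of what the proposed commands might do, here is a plain-Python sketch. It is one possible reading of the signatures above, not a design decision: the function names are hypothetical, the CSV variant emits one newline-terminated line per input row, and the JSON variant slurps every item into a single array.

```python
import csv
import io
import json
from typing import Any, Iterable, Iterator, Sequence


def new_write_csv_tuples(rows: Iterable[Sequence[Any]]) -> Iterator[str]:
    """Hypothetical: each input item is one row; emit one CSV line per row."""
    for row in rows:
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerow(row)
        yield buffer.getvalue()


def new_write_json(items: Iterable[Any]) -> Iterator[str]:
    """Hypothetical: each input item is an array entry; emit a single JSON array."""
    yield json.dumps(list(items), indent=4)


rows = [("tmp", "150", "foo.py"), ("tmp", "427", "bar.py")]
print("".join(new_write_csv_tuples(rows)), end="")
print(next(new_write_json(rows)))
```

With these semantics the original `map ... write-csv-tuples` and `map ... write-json` pipelines would produce the table and the array of arrays without an intermediate apply list.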

Open questions

  • The current system is (I think) self-consistent but a bit awkward. Should it be changed/extended?
  • This new signature would need to be added for all the serialization formats.
  • The read- forms would also need to be considered.
  • What should happen to values that aren't iterable?
  • Should these be new commands, or should the semantics of the original functions be changed? If they will be changed, the change should happen before a 1.0 release.

python-mario-bot, Dec 18 '19 00:12