
Adding forward pipe operator `|>`

Open dadhi opened this issue 3 years ago • 22 comments

https://github.com/dotnet/csharplang/issues/74#issuecomment-390397707

dadhi avatar Jul 06 '20 08:07 dadhi

Console.ReadLine()
|> File.ReadAllBytes()
|> SHA1.Create().ComputeHash()
|> BitConverter.ToString()
|> Console.WriteLine("SHA1: {0}");

So that's supposed to mean this?

Console.WriteLine("SHA1: {0}", 
  BitConverter.ToString(
    SHA1.Create().ComputeHash(
      File.ReadAllBytes(
        Console.ReadLine()))));

I'm not sure how valuable this is... but if I were going to do this operator, I think I'd use a syntax like this, using __ or _ or # to represent the previous item in the pipeline:

Console.ReadLine 
|> File.ReadAllBytes 
|> SHA1.Create().ComputeHash 
|> BitConverter.ToString 
|> Console.WriteLine("SHA1: {0}", __);

Edit: I see YaakovDavis had the same idea.

qwertie avatar Jul 07 '20 13:07 qwertie

By the way, when you used Do() in your comment, I suppose you meant a method like this?

public static R Do<T, R>(this T obj, Func<T, R> action) => action(obj);

It's funny because I just added that exact method to Loyc.Essentials in the release yesterday.
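For illustration, here is a minimal plain C# sketch of how such a Do method chains calls. The PipeExtensions and PipeDemo names and the Sha1Hex helper are made up for this example, and it hashes a string instead of a file so that it stays self-contained.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PipeExtensions
{
    // The Do method quoted above: passes obj to action and returns the result.
    public static R Do<T, R>(this T obj, Func<T, R> action) => action(obj);
}

public static class PipeDemo
{
    // Hypothetical helper: text -> UTF-8 bytes -> SHA-1 hash -> hex string.
    public static string Sha1Hex(string text) =>
        text
            .Do(s => Encoding.UTF8.GetBytes(s))
            .Do(bytes => { using (var sha = SHA1.Create()) return sha.ComputeHash(bytes); })
            .Do(hash => BitConverter.ToString(hash));
}
```

Each step goes through a delegate call, which is the overhead a compile-time macro would avoid.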

qwertie avatar Jul 07 '20 13:07 qwertie

By the way, when you used Do() in your comment, I suppose you meant a method like this?

Yeah I have such Do too.

Over the years I have been thinking more and more in the pipeline / fluent flow style, so I am gradually extending the number of pipe operations in my own libs https://github.com/dadhi/ImTools/blob/master/src/ImTools/ImTools.cs#L61

The reason is similar to why you introduced the ::myVar operator: to proceed with your flow of thought without breaking into an imperative assignment in the middle of the line.

The most important drawback of using the extension method (besides the awkwardness with multi-argument methods) is performance. It is too easy to point at the cost of the lambda and dismiss the feature.

So using the macro for this is ideal!

I think I'd use a syntax like this, using __ or _ or # to represent the previous item in the pipeline

Yes, for instance using both # and ::


  GetUserId()::id |> GetUserDetails(#, extended: true) |> Display(id, #.Name);

Update

Using with null-conditional ?.

Another example where piping really shines is the combination with ?.


items.FirstOrDefault(x => x.Amount > 0)
    ?.To(it =>  someDict.GetValueOrDefault(it))
    ?.To(x => Process(x));

and in a hypothetical piping syntax


items.FirstOrDefault(x => x.Amount > 0)
    ?> someDict.GetValueOrDefault(#)
    ?> Process(#);

Btw, this is also my most common use of piping: you have a LINQ method chain and then need to switch to some other context, piping the result along.

Last but not least the await escape piping

At my previous job there were literally a hundred extension methods (written by other people, so I am not biased ;) ) to pipe the awaited result into normal LINQ or other non-async methods, like the following:

  ...
  return await userRepo.GetAllByIds(ids).ToListAsync();

Imagine all the other possible combinations with LINQ and custom utility helpers. I tried to refactor them into more orthogonal ToAsync(...) overloads and get rid of most of the ad hoc extensions, but the result was a bit ugly, longer, and less performant.
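To make the kind of helper concrete, here is a minimal sketch of one such "await escape" extension in plain C#. The TaskPipeExtensions class is hypothetical, and it assumes the repository method returns a Task<IEnumerable<T>>, which the snippet above does not actually say.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TaskPipeExtensions
{
    // Hypothetical helper: awaits the task, then materializes the sequence as a list.
    public static async Task<List<T>> ToListAsync<T>(this Task<IEnumerable<T>> source) =>
        (await source).ToList();
}
```

Without the helper the same thing has to be written as (await userRepo.GetAllByIds(ids)).ToList(), with the parentheses wrapping back to the start of the expression, and every other LINQ method needs its own wrapper of this kind.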

So with a piping operator, e.g. |@ and, say, ?@

return await userRepo.GetAllByIds(ids) |@ ToListAsync(#);

Maybe it is good now to take a break and think about naming.

Update 2 - no need for the |@

I've just realized that you could have a special @:: operator that binds the awaited result to a name, and then pipe as usual:

return userRepo.GetAllByIds(ids)@::users |> ToListAsync(users);

or even

return userRepo.GetAllByIds(ids)@ |> ToListAsync(#);

Note the absence of await: Y()@::x would be expanded to var x = await Y();. That makes @:: (and @ combined with #) a separate, orthogonal feature, which is often better.

dadhi avatar Jul 08 '20 10:07 dadhi

@qwertie ...and while we are at it, your glorious postfix :: operator gives me an idea of how to tackle the monadic "do-notation" in a more ergonomic way than C# LINQ and Scala do.

Thoughts unfold :)

seq {
   ReadFile(f)::lines \
   Log($"{f.Name} has {lines.Count} lines") \
   par { lines.Batch(100) |> CountWords(#) } |> Sum(#)::wordCount
   Log($"{f.Name} has {wordCount} words") \
   Return(lines, wordCount)
}::program;

program.Debug(TestInterpreter);
program.Run(AsyncInterpreter) |> WriteLine(#);

The inspiration for this example: https://github.com/dadhi/SharpIO/blob/e1a7f900de409dae4d841e4abe79925991ea2a53/FreeIO/FreeIO.cs#L42

dadhi avatar Jul 08 '20 12:07 dadhi

I guess you mean To as a synonym of Do and I guess par means "parallel" but I don't understand the intended semantics of @::, or par { x }, or Return, or the \ operator, or what Debug(TestInterpreter) might mean.

I'm willing to add at least syntax for a |> operator, given its level of popularity among Roslyn geeks, and maybe even ?|> where A ?|> B(...#...) means A::tmp == null ? null : B(...tmp...), but what precedence should it have? Fun fact, in LES3 this is already defined as the lowest-precedence operator (#87) ... but there is no ?> or ?|> operator so its precedence will be the same as >.
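As a rough plain C# equivalent of those ?|> semantics, the following sketch evaluates the left side once and skips the right side when it is null. The PipeIfNotNull name is made up for illustration, and restricting it to reference types is a simplifying assumption.

```csharp
using System;

public static class NullPipeExtensions
{
    // Hypothetical helper mirroring A ?|> B(#): source is evaluated once;
    // the result is null if source is null, otherwise next(source).
    public static R PipeIfNotNull<T, R>(this T source, Func<T, R> next)
        where T : class
        where R : class
        => source == null ? null : next(source);
}
```

Unlike a macro expansion, this goes through a delegate and cannot express B(...#...) with the placeholder in an arbitrary argument position without a lambda.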

qwertie avatar Jul 09 '20 19:07 qwertie

?|> where A ?|> B(...#...) means A::tmp == null ? null : B(...tmp...)

Yes, that's what I mean.

For now please ignore the rest - sorry for the bloat :<

From this noise I would pick only the ::@ (the name is arbitrary; something shorter is better) as the next candidate. I will create a separate issue. But basically it means

if (A()::@a != null) B(a);

converted to

var a = await A();
if (a != null) B(a);

dadhi avatar Jul 09 '20 19:07 dadhi

I don't think it's worth a new operator when you can just write (await A())::a != null... though personally I never had to use async extensively.

qwertie avatar Jul 10 '20 16:07 qwertie

async and await are used extensively in (micro)service, API, and LOB apps, so the lines there get long, like this:

await _powerUsersMongoRepo.GetAllUnderpoweredUsersByIds(blahhhhhhhh).ToList();

Here is the error: I cannot simply call ToList(), but by then I am already far past the start of the line.

So maybe we don't need the whole @::var but only a lower-level operator, say @., which may or may not compose with ::

_powerUsersMongoRepo.GetAllUnderpoweredUsersByIds(blahhhhhhhh)@.ToList();

dadhi avatar Jul 10 '20 16:07 dadhi

How about this. Define the following macro:

define operator.($x, awaited) { (await($x)); }

(The parentheses are required around $x because the await operator does not exist outside async functions, but await foo is equivalent to await(foo). The extra parens are also required because, er, it looks like there's an obscure bug related to await being treated as an ordinary function, but the extra parens fix it.)

And then you can write

_powerUsersMongoRepo.GetAllUnderpoweredUsersByIds(blahhhhhhhh).awaited.ToList();

qwertie avatar Jul 11 '20 17:07 qwertie

How about this. Define the following macro...

Will try it out! But that's opening a lot of possibilities. I presume I can use other operators as well.

dadhi avatar Jul 11 '20 18:07 dadhi

Yes, you can. And, fun fact, define is unaware of operators... in EC#, operator/ is just a function name like any other. In Loyc trees, operators and functions and constructs are all basically the same thing so my pattern matching code works on all of them equally well.

I made another example to show you... and was horrified that it didn't work:

define operator/($x * $y, $z) { MulDiv($x, $y, $z); }
var x = a * 7 / c; // warning: 1 macro(s) saw the input and declined to process it

"Is there some major bug in the define macro?" I worried. No, in the debugger I discovered that everything is working exactly as intended. The problem is the parser's C# heritage: it supports pointer syntax. So $x * $y was being parsed as declaring a variable called $y of type $x*. Sigh. But this version of the macro works:

// Input
define operator/(($x * $y), $z) { MulDiv($x, $y, $z); }
var x = a * 7 / c;

// Output
var x = MulDiv(a, 7, c);

Add the [Passive] attribute to suppress irrelevant "macro saw the input and declined to process it" warnings:

[Passive]
define operator/(($x * $y), $z) { MulDiv($x, $y, $z); }
var x = 1.0 / y;

qwertie avatar Jul 14 '20 18:07 qwertie

define operator.($x, awaited) { (await($x)); }

Here is more background and discussion on this:

  • https://github.com/dotnet/csharplang/issues/1117
  • https://github.com/dotnet/csharplang/issues/35

dadhi avatar Jul 30 '20 15:07 dadhi

While I added |> and ?|> operators in 2.8.3, I didn't actually implement a macro to transform them. And there isn't a mechanism yet to write "smart" macros (i.e. macros backed by arbitrary logic) the "proper" way. I mean, you could stick a [LexicalMacro] attribute on a compileTime function, but then you can't easily register the macro function with LeMP (the old way is to generate an assembly in a separate step, and register it with LeMP via command-line argument).

You can kind of hack an implementation as follows:

#ecs;

compileTime {
    static int tmpCounter = 0;
    static LNode PipeOperator(LNode source, LNode target) {
        int count = 0;
        // Look for usage of # in target (right side), e.g. 
        //   new Random() |> Console.WriteLine(#.Next(6) + #.Next(6));
        // produces output like
        //   Console.WriteLine((new Random()::src1).Next(6) + src1.Next(6));
        // if # appears only once, the temporary variable is not needed
        var target' = target.ReplaceRecursive(n => {
        	if (n.IsIdNamed("#")) {
        		count++;
        		return source;
        	}
        	return null;
        });
        if (count == 0)
        	return quote($target($source));
        if (count == 1)
        	return target';
        count = 0;
        LNode tmp = LNode.Id("src" + ++tmpCounter);
        return target.ReplaceRecursive(n => {
        	if (n.IsIdNamed("#")) {
        		if (count++ == 0)
        			return quote($source::$tmp);
        		else
        			return tmp;
        	}
        	return null;
        });
    }
}

define operator|>($a, $b) { precompute(PipeOperator(quote($a), quote($b))); }

void examples()
{
	Console.ReadLine 
	|> File.ReadAllBytes 
	|> SHA1.Create().ComputeHash 
	|> BitConverter.ToString 
	|> Console.WriteLine("SHA1: {0}", #);
	
	new Random() |> Console.WriteLine(#.Next(6) + #.Next(6));
}

This does work:

// Generated from Untitled.ecs by LeMP 2.8.3.0.
void examples()
{

	Console.WriteLine("SHA1: {0}", BitConverter.ToString(SHA1.Create().ComputeHash(File.ReadAllBytes(Console.ReadLine))));

	var src1 = new Random();
	Console.WriteLine(src1.Next(6) + src1.Next(6));
}

But PipeOperator() has no access to the LeMP context object, plus, having to pass the syntax tree through quote has the side effect of forgetting original source code locations and comments/newlines. So yeah, I should improve the macro-writing experience.

qwertie avatar Nov 17 '20 07:11 qwertie

@qwertie Anyway, this is cool and something I can start using (even if only to learn more about the complex LeMP stuff).

It also means that the tool is mature enough to provide a custom solution (even if not an ideal one) for a custom need.

Simplifying macro writing should help people with fun and crazy (useful) ideas.

dadhi avatar Nov 17 '20 07:11 dadhi

In the latest commit, you can define macros in C# code using a new macro macro. Exciting times!

["Change first argument to HELLO if it's not an identifier", Passive]
macro StupidDemoMacro($(arg0 && !arg0.IsId), $(..rest))
{
    return node.WithArgChanged(0, quote(HELLO));
}

StupidDemoMacro(1 + 1, 2 + 2);
StupidDemoMacro(goodbye, 2 + 2);

// Generated from Untitled.ecs by LeMP 2.9.0.0.
StupidDemoMacro(HELLO, 2 + 2);
StupidDemoMacro(goodbye, 2 + 2);

Note: C# 9 has a new with operator... perhaps I should also add support for a when operator for this sort of expression, so you could write $(arg0 when !arg0.IsId). There is a when quasi-operator in C# 8 switch expressions, but it is used in a weird non-expression context.

qwertie avatar Dec 09 '20 02:12 qwertie

@qwertie Cool, man! Will be checking it out.

dadhi avatar Dec 09 '20 05:12 dadhi

Okay, I still haven't actually released v2.9, but I will soon, and compileTime {} is now able to map error locations back from the plain C# code to the original Enhanced C# code. So compile-time errors will point to the right place instead of the compileTime block itself.

The same algorithm could be used to provide red squiggly underlines in Visual Studio, mapping errors from the ".out.cs" file back to the ".ecs" file, but doing so would basically involve writing a brand new Visual Studio extension and when it comes to doing that, I don't really know where to begin. But of course, if we can map errors back to the original ecs code, perhaps it is also possible to map IntelliSense back to the original ecs code... if I could figure out how to use the VS APIs properly, it seems like it should be possible, when you type "." or "(" in an ecs file, to run LeMP and then ask Roslyn for code completions, thus providing support for (slow) IntelliSense in ecs files. But of course, I don't have anyone to tell me how to use those darn APIs. And of course I've wanted to support VS Code, but that will involve a completely different set of APIs.

qwertie avatar Dec 25 '20 00:12 qwertie

and compileTime {} is now able to map error locations back from the plain C# code to the original Enhanced C# code

This is a big step.

Did you think about writing the Language Service extension for VS Code?

dadhi avatar Dec 25 '20 06:12 dadhi

Let me put it this way: I'm not even sure what those words mean. I would like to support it, it's just a matter of figuring out all those APIs. Not only the APIs for writing a language service but also the APIs for consuming the existing C# language service - assuming it's even practical for one language service to use a different one. It's especially hard for me because I don't even know how VS Code behaves in C# projects as a user. I mean, VS Code is folder-based, you open a folder and start working. But I'm used to the Visual Studio model of opening a solution, and I've never used VS Code for C# and don't really understand how it behaves. So I'd have to figure that out too. Btw I have two projects in each folder in this repo, a .NET 4.5/4.7 version and a .NET Standard 2.0 version... no idea how VS Code will act in the face of such madness.

Right now I'm just planning to investigate how to write printers more efficiently before the official 2.9 release. In the meantime, here's a prerelease. Merry Xmas.

If you'd like to discuss more things unrelated to a |> operator, please create a new discussion topic. BTW v2.9 won't have built-in support for |> but I know I'll be prototyping the feature with the macro macro 🙂 - which reminds me, I generated a thing with ALL unicode characters on a single page, which is useful enough that it's remarkable no one seems to have done it before - or rather Google can't find anyone else's version of this. So that's where I grabbed the 🙂 from.

qwertie avatar Dec 25 '20 16:12 qwertie

If you'd like to discuss more things unrelated to a |> operator, please create a new discussion topic.

Sorry for the flood, will use the discussions.

dadhi avatar Dec 25 '20 18:12 dadhi

It's not you, it's me - I should have just started a discussion and @atted you.

qwertie avatar Dec 25 '20 23:12 qwertie

Okay, so, in total I've added four operators, |> ?|> |=> ?|=>, plus two synonyms: ?> as a synonym of ?|> and ?=> as a synonym of ?|=>.

I'm preparing a release today. It won't have semantics for these operators, but I decided to try implementing them quickly with some macros. Here's what I came up with (I see GitHub is having some trouble with the syntax highlighting ... they're picking up on the fact it's not the original C#!)

#ecs;
using System.Linq;

define operator|=>($A, $B) { $B = $A; }
define operator?=>($A, $B) { $A::temp# == null ? null : $B = temp#; }
macro operator|>($A, $B)
{
	int counter = B.DescendantsAndSelf().Count(n => n.IsIdNamed("#"));
	if (counter == 0)
		return quote($B($A));
	else if (counter == 1)
		return B.ReplaceRecursive(n => n.IsIdNamed("#") ? A : null);
	else {
		LNode temp = LNode.Id("temp" + #context.IncrementTempCounter());
		LNode B' = B.ReplaceRecursive(n => n.IsIdNamed("#") ? temp : null);
		return quote(#runSequence(var $temp = $A, $B'));
	}
}
macro operator?>($A, $B)
{
	int counter = B.DescendantsAndSelf().Count(n => n.IsIdNamed("#"));
	LNode temp = LNode.Id("temp" + #context.IncrementTempCounter());
	if (counter == 0) {
		return quote($A::$temp == null ? null : $B($temp));
	} else {
		LNode B' = B.ReplaceRecursive(n => n.IsIdNamed("#") ? temp : null);
		return quote($A::$temp == null ? null : $B');
	}
}

void Example()
{
	DoThing(x) |> DoOtherThing(#, #) ?> IfPreviousThingWasn'tNull ?=> Result;
}

Generated result:

void Example()
{
	var temp12 = DoThing(x);
	var temp11 = DoOtherThing(temp12, temp12);
	var temp10 = temp11 == null ? null : IfPreviousThingWasn_apostNull(temp11);
	temp10 == null ? null : Result = temp10; // not sure why but this concats to previous line
}

Unfortunately, MS C# rejects the last line. Looks like something special will have to be done with the ?> and ?=> macros to avoid that, but I'm not sure what the solution is. If you were wondering why temp-variable numbering starts at 10, it's trying to blindly avoid name collisions with user-selected names, as temp2 is a fairly common name but temp10 is not.
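One shape the output could take, sketched in plain C# as an assumption and not as what LeMP does: C# does not accept a conditional expression as a statement, so when ?=> appears in statement position the null check could be emitted as an if statement. The class, field, and method names here are hypothetical.

```csharp
public static class NullAssignDemo
{
    public static string Result = "unchanged";

    // Statement-position equivalent of: temp ?=> Result
    public static void AssignIfNotNull(string temp)
    {
        if (temp != null)
            Result = temp;
    }
}
```

This form produces no value, so it would only fit when the operator is used as a statement and not inside a larger expression.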

qwertie avatar Jan 12 '21 23:01 qwertie