cppfront [SUGGESTION] Make receiving and returning functions from a function easier and safer.

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code? Yes - It will eliminate the security vulnerabilities that arise from defining the arguments incorrectly via the cpp1 syntax - the same things that are eliminated by using in, out, inout, copy, forward and move in regular cpp2 function definitions instead of using the cpp1 syntax.

Will your feature suggestion automate or eliminate X% of current C++ guidance literature? It will make it unnecessary to learn about std::function (or at least unnecessary to learn about the old syntax for function signature, if keeping std::function around is desirable for some reason)

Describe alternatives you've considered. I've recently written a functional (i.e. a function that receives functions and possibly data, and returns a composition of those functions and data.)

The current syntax, as far as I'm aware, is quite confusing: the argument types and the return type have to be spelled as a single type, e.g. std::function<return_type(arg1_type, arg2_type)>. This is inconsistent with the cpp2 syntax and is not intuitive. It requires knowing both std::function and the (hopefully to be deprecated) cpp1 function signature syntax.

There are three alternatives I thought of. I'll give an example of a functional that receives two functions: the first receives an int and returns an int, the second receives an int and returns a string. The functional returns a function that receives an int, forwards it to the first function, and applies the second function to the result:

Using function type the same way we define a function type in cpp2:

f: (g: (int)->int, h: (int)->std::string)->(int)->std::string =  
  :(v)->(int)->std::string = h$(g$(v));

Using parentheses to make it a bit easier to parse:

  f: (g: ((int)->int), h: ((int)->std::string))->((int)->std::string) = 
    :(v)->((int)->std::string) = h$(g$(v));

or

  f: (g: <(int)->int>, h: <(int)->std::string>)-><(int)->std::string> = 
    :(v)-><(int)->std::string> = h$(g$(v));

Using std::function, but with the cpp2 syntax:

  f: (g: std::function<(int)->int>, h: std::function<(int)->std::string>)->std::function<(int)->std::string> = 
    return :(v)->std::function<(int)->std::string> = h$(g$(v));

The current syntax is quite confusing for people who aren't familiar with cpp1, and would unnecessarily require them to learn the old syntax along with the new:

f: (g: std::function<int(int)>, h: std::function<std::string(int)>)->std::function<std::string(int)> =  
return :(v)->std::function<std::string(int)> = h$(g$(v));
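For comparison, here is a rough Cpp1 sketch of the same functional, written by hand rather than taken from cppfront output. The name compose_int_to_string is made up for illustration. It shows the std::function<R(Args)> spelling that the suggestion wants to avoid exposing to Cpp2 users.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical name, not from the thread.
// Takes g: int -> int and h: int -> std::string, and returns v -> h(g(v)).
std::function<std::string(int)> compose_int_to_string(
    std::function<int(int)> g,
    std::function<std::string(int)> h)
{
    // Capture both callables by value so the returned closure owns them.
    return [g = std::move(g), h = std::move(h)](int v) { return h(g(v)); };
}
```

Calling compose_int_to_string with a doubling lambda and a std::to_string wrapper gives a callable that maps 21 to "42".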

This is even more necessary if you consider the benefits of having in, out, inout, copy, forward and move defined for us in the function definitions that we receive and return (left out as default in for brevity in the above examples) - which is one of the main motivators for cpp2 in the first place.

Sep 18 '24 15:09 feature-engineer

Ok, so I was playing around a bit and figured we are already quite close to what is proposed here in Cpp2. Take a look at this example:

namespace cpp2::impl {
template<typename T, typename Signature>
concept function_like = std::is_convertible_v<T, std::function<Signature>>;
}

signature: type == (_:int) -> int;

f: (g: _ is cpp2::impl::function_like<signature>, h: int) -> int = g(h);

main: () = std::cout << f(:(x: int)->int = x*2, 2) << '\n';

Here we use the C++ concept cpp2::impl::function_like to match functions with the given signature without depending on a specific type of closure (it can be a lambda, a std::function, etc.). For a real implementation I would expect something more elaborate that doesn't depend on std::function. We can then match on that in our higher-order function.

From that, some syntactic sugar could be added so that this compiles (it would lower to the concept above): f: (g: _ is (_:int) -> int, h: int) -> int = g(h);

I see two issues with this:

It gives you the impression that doing _ is size_t might work, which is not true. Here we are trying to match something which is "function-like" with a concrete signature, but which is not a concrete type.
You can't separately declare the signature and then match it, as we do in the first example, as that would be ambiguous with a concept.

@filipsajdak do you think it would be possible for is to match something akin to the concept above? e.g.: f is (_:int) -> int. It might be a useful feature in general.

Sep 20 '24 17:09 DyXel

I think so. Give me a second to check it.

Sep 20 '24 17:09 filipsajdak

OK, I have checked that — a prototype solution: https://godbolt.org/z/Yj34afb3M (with no bad implicit casts, e.g., double to int).

That requires two changes:

parsing of is: currently x is (something) treats something as a value, and in this case it needs to parse the whole signature of the callable,
after the signature is parsed, it needs to be rewritten to the concept function_like<Callable, ReturnType, Args...>. That means from:

f: (g: _ is (_:int) -> int, h: int) -> int;

To:

auto f(auto g, int h) -> int
  requires function_like<CPP2_TYPEOF(g), int, int>
;

Side note. According to the standard, the above could be rewritten to:

auto f(function_like<int, int> auto g, int h) -> int;

Unfortunately, some compilers are not good at parsing these.

Sep 20 '24 19:09 filipsajdak

Nice! Looks like the way to go to me. Thanks for the investigation Filip!

Sep 20 '24 19:09 DyXel

My solution does not handle generic callables (e.g., generic lambdas or generic functions). I will also consider solving these cases.

Sep 20 '24 19:09 filipsajdak

I made a change that also accepts generic callables: https://godbolt.org/z/v7vraxPfh

Unfortunately, when at least one argument is generic, we lose control of the implicit cast of other arguments.

Sep 20 '24 22:09 filipsajdak

I corrected the prototype: https://godbolt.org/z/nrcGqvoKG

Also, if we have all signatures parsed in cpp2 we can add additional checks for defined types - to avoid implicit casts.

E.g.:

fun([](auto a, brace_initializable_to<int> auto b) { // this blocks implicit cast of second argument
    return "<" + std::to_string(a) + ", " + std::to_string(b) + ">";
});

Sep 21 '24 08:09 filipsajdak

What are the benefits of using f: _ is (_:int)->int over f: (_:int)->int, or even better (IMO) f: (int)->int? The latter seems more natural and succinct to me. It makes passing function signatures as easy as defining them, without learning any new syntax.

Sep 22 '24 05:09 feature-engineer

@feature-engineer The is syntax serves a different purpose: checking if something can be used as an argument. The main difference in the context of a function argument is that when you use the _ is XYZ syntax, you don't specify the expected type. You select the concept XYZ that the type needs to fulfill to be accepted as an argument. So, you create a generic function with a requires clause that limits possible arguments to XYZ.

So, when XYZ is a function signature, we can introduce a check that creates a requires clause allowing any function that fulfills it to be used as an argument. That function might take different argument types, but the arguments must be safely convertible to the needed types, and the same goes for the return type.

So, my experiment was not about defining the function but about requires clauses for its function arguments. The same feature could be used to inspect the function in an inspect or an if, e.g.:

// Warning: non-existing syntax
if fun is (_:int) -> int {
  // use fun here
}

or

// Warning: non-existing syntax
x := inspect fun as std::string => {
  is (_ : int) -> int = "takes int returns int";
  is (_ : std::ostream) -> _ = "you can print to its argument";
  is _ = "unknown function";
}
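Until such syntax exists, a similar dispatch can be approximated in C++ with if constexpr and the standard invocability traits. The following is only a sketch under that assumption. The function name describe is made up, and std::ostream& is used for the second case because a stream has to be passed by non-const reference to be written to.

```cpp
#include <ostream>
#include <string>
#include <type_traits>

// Hypothetical helper mirroring the three inspect alternatives above.
template <typename Fun>
std::string describe(const Fun&) {
    if constexpr (std::is_invocable_r_v<int, const Fun&, int>) {
        return "takes int returns int";
    } else if constexpr (std::is_invocable_v<const Fun&, std::ostream&>) {
        return "you can print to its argument";
    } else {
        return "unknown function";
    }
}
```

One caveat: an unconstrained generic lambda whose body does not compile for the probed argument type can produce a hard error instead of falling through to the next branch.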

Sep 22 '24 07:09 filipsajdak

What are the benefits of using f: _ is (_:int)->int over f: (_:int)->int, or even better (IMO) f: (int)->int? The latter seems more natural and succinct to me. It makes passing function signatures as easy as defining them, without learning any new syntax.

I would love to be able to just write (int) -> int, but at least today you need the _:. I think this is to match the regular function signature parsing, and I don't know if it's necessary to keep it that way to have a context-free grammar.

Sep 22 '24 10:09 DyXel

@filipsajdak So the benefit is that it's more performant (and less restrictive in that it allows implicit argument conversion) than using std::function if the passed function is e.g. a free function?

If not, (in int, inout size_t)->bool could just be translated into std::function<bool(const int&, size_t&)> and be treated as a type instead. std::function can already take any callable. If we don't want to be too strict about the signature and want to enable implicit conversion, couldn't we specify that in the signature type instead, e.g. have (in x is Integer)->int be translated to the appropriate std::function?
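A small C++ sketch of the type such a translation would produce, assuming in maps to a const reference and inout to a mutable reference as suggested above (cppfront's actual lowering of in may differ, for example by passing small types by value). The names counter_callback and add_if_positive are invented for the example.

```cpp
#include <cstddef>
#include <functional>

// Hand-written counterpart of the proposed (in int, inout size_t)->bool.
using counter_callback = std::function<bool(const int&, std::size_t&)>;

// Hypothetical callable matching that signature: adds positive values to
// total (the inout parameter) and reports whether it did so.
bool add_if_positive(const int& value, std::size_t& total) {
    if (value <= 0) {
        return false;
    }
    total += static_cast<std::size_t>(value);
    return true;
}
```

Because the second parameter is a mutable reference, changes made by the callable are visible to the caller, which is the behaviour inout is meant to express.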

Sep 22 '24 10:09 feature-engineer

Sep 22 '24 11:09 feature-engineer

I am not too sure, I haven't looked at the lexing and parsing code too closely, but it could be a possibility. That regex is a monster btw 🤣

Sep 22 '24 14:09 DyXel

After thinking about it some more, I think that the concepts approach and the transpilation of the function signature into std::function are not contradictory, but rather complementary.

std::function<(int, int)->int> is useful when we know the actual function only at runtime (e.g. a plugin system).

The concept approach is preferable when the invocable is known at compile time, because it enables inlining where appropriate and is a true zero-overhead abstraction, unlike std::function.

So I think both of these should be supported.

I suggest the following syntax (if possible):

combine: (f: _ is (int, int)->int, g: _ is (int)->std::string)->std::string when using functions that are known at compile time, and

combine: (f: std::function<(int, int)->int>, g: std::function<(int)->std::string>)->std::string for the more general case, when we want to support cases where the function might only be known at runtime.

Using std::function instead of the more succinct (...)->... syntax for this case would alert the user to the fact that this incurs some overhead and is to be used only when necessary.

@filipsajdak @DyXel What do you think?

Sep 26 '24 10:09 feature-engineer

That's what I was thinking of in the first place. The problem is that today "inline function signatures" are not supported. Indeed, you should be able to write std::function<(int, int) -> int>, or at least std::function<(_: int, _: int) -> int>. The is support would be for generic programming cases where you don't want to depend on std::function, but want something that is callable with the given signature.

Sep 26 '24 12:09 DyXel