Anonymous struct destructuring

Open C34A opened this issue 4 years ago • 6 comments

There have been a number of proposals for the syntax to be used with anonymous structs, in particular for decomposing them into multiple variables. This issue seeks to compile all of these.

General Usage

This syntax seems to be mostly agreed upon for the usage of anonymous structs:

func struct {int x, int y} returns_int_struct() {
    return { 1, 2 };
}
// call and access member
io::printf("%d\n", returns_int_struct().x);

// without field names
func struct { double, double } returns_doubles() {
    return { 3.0, 4.0 };
}
// I am less sure about the syntax here exactly
struct { double, double } localStruct = returns_doubles();
io::printf("%f %f", localStruct.0, localStruct.1);

// Anonymous -> named struct:

struct Foo {
    int a;
    int b;
}
// ...

// Cast to existing struct type (requires structural equality)
Foo renamed = returns_int_struct();
renamed.some_method();

// The reverse is also possible
func double vec_length(struct { double x, double y } vec) {
    return sqrt(vec.x * vec.x + vec.y * vec.y);
}

struct Vec2 {
    double a;
    double b;
}

Vec2 myVec = Vec2({1.0, 1.0});
// Vec2 is structurally equivalent to vec_length's parameter, so
// this is allowed.
double len = vec_length(myVec);
// I think this is also valid:
double len2 = vec_length( { (double)(3.0), (double)(4.0) } );
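For comparison, plain C has no structural equality between struct types, so the conversion this proposal would perform implicitly has to be written out by hand. The sketch below only illustrates that idea and is not C3 code; `Point`, `length_squared`, `point_from_vec2` and `example_structural_conversion` are hypothetical names, and the squared length is used to avoid depending on the math library.

```c
#include <assert.h>

/* Hypothetical stand-in for the anonymous parameter type
 * struct { double x, double y }. */
typedef struct { double x; double y; } Point;

/* Same layout as Point, but different field names. */
typedef struct { double a; double b; } Vec2;

double length_squared(Point p) {
    return p.x * p.x + p.y * p.y;
}

/* C treats Point and Vec2 as unrelated types, so the conversion
 * is spelled out member by member. */
Point point_from_vec2(Vec2 v) {
    return (Point){ v.a, v.b };
}

void example_structural_conversion(void) {
    Vec2 my_vec = { 3.0, 4.0 };
    assert(length_squared(point_from_vec2(my_vec)) == 25.0);
}
```

Under the proposal, the call to `point_from_vec2` would disappear because the two types are structurally identical.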

Struct Decomposition

struct

// decompose returned struct into int a and int b:
struct { int a, int b } = returns_int_struct();
io::printf("%d %d\n", a, b);

This seems suboptimal: it does not make clear that a and b are declared in the surrounding scope, and instead seems to imply that they still live in some kind of data structure.
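Whatever surface syntax is chosen, a destructuring declaration presumably amounts to binding one temporary and copying its members into ordinary locals. A minimal sketch of that expansion in plain C follows, assuming a named stand-in type because C has no anonymous struct returns; `IntPair`, `returns_int_pair` and `sum_destructured` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical named stand-in for struct { int x, int y }. */
typedef struct { int x; int y; } IntPair;

IntPair returns_int_pair(void) {
    return (IntPair){ 1, 2 };
}

/* Roughly what
 *     struct { int a, int b } = returns_int_struct();
 * might expand to: one temporary, then one plain local per member,
 * all declared in the enclosing scope. */
int sum_destructured(void) {
    IntPair tmp = returns_int_pair();
    int a = tmp.x;
    int b = tmp.y;
    assert(a == 1 && b == 2);
    return a + b;
}
```

After the expansion, `a` and `b` are independent variables with no remaining link to the struct, which is the point the `struct` prefix tends to obscure.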

comma separated declarations

// same function as previous
int a, int b = returns_int_struct();
// or perhaps
int a, int b = ... returns_int_struct();
// now we can use a and b
io::printf("%d %d\n", a, b);

This still seems ambiguous: the code could simply be declaring a and setting b to the return value of returns_int_struct().

brackets

int {a, b} = returns_int_struct();

This seems like an excellent solution, but it is unclear how it could work with anonymous structs of mixed types:

func struct { int, double } my_func() { /* ... */ }
// What's happening here? Are we casting the double to an int?
int { a, x } = my_func();
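To make the concern concrete, here is a small C sketch of what a single leading type could mean for a mixed struct: the double member would be converted to int and lose its fractional part. The type `IntDouble`, the values 7 and 2.75, and the helper names are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-in for struct { int, double }. */
typedef struct { int first; double second; } IntDouble;

IntDouble my_func(void) {
    return (IntDouble){ 7, 2.75 };
}

/* One type applied to every member: the double is truncated. */
int second_as_int(void) {
    IntDouble tmp = my_func();
    int x = (int)tmp.second;
    return x;
}

/* One type per member: the value is preserved. */
double second_as_double(void) {
    IntDouble tmp = my_func();
    double x = tmp.second;
    return x;
}

void example_mixed_types(void) {
    assert(second_as_int() == 2);
    assert(second_as_double() == 2.75);
}
```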

This problem suggests using a dedicated keyword for destructuring:

var { int a, double x } = my_func();
// or perhaps
destruct { int a, double x } = my_func();

An alternative solution is to simply remove the prefix:

{ int a, double x } = my_func();

This seems like the clearest syntax. The only issue I see is that it could still be read as composing a data structure containing a and x. A similar alternative that may address this is to use parentheses instead of brackets:

( int a, double x ) = my_func();

This solution is perhaps less "pretty" though, as the decomposition of a struct no longer mirrors its creation:

func struct {int, int, int} myFunc() {
  return { 1, 2, 3 }; // {brackets} used here
}
// ...
( int a, int b, int c ) = myFunc(); // (parentheses) used here

C34A, Aug 13 '21 19:08

Anonymous structs as arguments are tracked in #193

lerno, Aug 13 '21 19:08

A somewhat better way to use anonymous structs would be to simply define the struct as a normal struct away from the function, and then add a special property to that struct when it is used for that function.

From #193

func void set_coordinates(struct { int i; int j; } coord) { ... }

Could possibly be:

struct Vec2 { int x, int y }
// A
func void set_coordinates((Vec2) vec2) { ... }
// B
func void set_coordinates(autocast Vec2 vec2) { ... }
// C
func void set_coordinates((Vec2)(vec2)) { ... }

Similar would then apply to returns:

// This example uses variant C
func (Vec2) getCoordinates();

lerno, Aug 13 '21 19:08

For this case

{ int a, double x } = my_func();

Ambiguity can be resolved with lookahead, finding the first , or }.
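One possible reading of that lookahead, sketched in C rather than taken from the c3c parser: starting at the opening brace, scan for the first `,`, `}` or `;` at nesting depth zero. A `,` or `}` suggests a destructuring pattern, while a `;` suggests an ordinary block. The function name `starts_destructuring` is hypothetical, and the sketch assumes the input contains no comments, no string literals, and no `int a, b;` style multi-declarations.

```c
#include <assert.h>
#include <stdbool.h>

/* `src` must point at the '{' that opens the statement. */
bool starts_destructuring(const char *src) {
    int depth = 0;            /* nesting inside (), [] and inner {} */
    bool seen_token = false;  /* any non-blank text after the '{' */

    if (*src != '{') return false;
    for (const char *p = src + 1; *p != '\0'; p++) {
        char c = *p;
        if (c == '(' || c == '[' || c == '{') {
            depth++;
            seen_token = true;
            continue;
        }
        if (c == ')' || c == ']') { depth--; continue; }
        if (c == '}') {
            /* `{ int a }` counts as a pattern, `{}` does not. */
            if (depth == 0) return seen_token;
            depth--;
            continue;
        }
        if (depth == 0 && c == ',') return true;
        if (depth == 0 && c == ';') return false;
        if (c != ' ' && c != '\t' && c != '\n') seen_token = true;
    }
    return false;  /* ran out of input without a decision */
}

void example_lookahead(void) {
    assert(starts_destructuring("{ int a, double x } = my_func();"));
    assert(!starts_destructuring("{ int a = 10; foo(a, 1); }"));
}
```

A real parser would work on tokens rather than characters and would likely also confirm that an `=` follows the closing brace.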

It is worth mentioning that the most problematic use is in if statements, where an arbitrary number of expressions and declarations may precede the final expression. As in C, they are evaluated in order:

if (int a = 10, int b = a + 1, a * foo() > 0) { ... }
// Same as
{
   int a = 10;
   int b = a + 1;
   if (a * foo() > 0) { ... }
}

So we can write the proposals:

// A
if (int i = 10, { int b, double x } = my_func(), b > 0) { ... }
// B
if (int i = 10, ( int b, double x ) = my_func(), b > 0) { ... }
// C
if (int i = 10, struct { int b, double x } = my_func(), b > 0) { ... }
// D
if (int i = 10, var { int b, double x } = my_func(), b > 0) { ... }
// E
if (int i = 10, int { b, x } = my_func(), b > 0) { ... }
// F this is clearly ambiguous
if (int i = 10, int b, double x = ...my_func(), b > 0) { ... }

Note that ambiguity is not the end-all. For example, it's possible to resolve the ambiguity by introducing () or similar. It's also possible to simply disallow destructuring inside a conditional's expr-decl chain. Finally, it's even possible to use the known return type of my_func() to determine which extra declarations in the chain belong to the destructuring. But these are tools to wield with care, as contextual semantics are very hard for a human to read as well.

lerno, Aug 14 '21 00:08

I think ( ) is a pretty good way to resolve most ambiguities. Given that the , operator is a bit different in C3 (it's only used in conditionals), it should be safe to use, even with a delimiter as weak as (.

Remember that this has to be possible as well: (x, y) = someFunc(). Note that (x) = someFunc() would be ambiguous, meaning that skipping elements can't simply drop the rest. _ is a common character for skipping elements, but I have resisted adding it until now, and introducing it here would have repercussions across the language. An easier way would be to simply place a type without a declaration: (x, double) = someFunc(). This has the interesting effect of actually verifying that you know what you're skipping.
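The "verify what you skip" idea can be approximated in C11 with a compile-time check. The sketch below uses hypothetical names (`IntDouble`, `some_func`, `first_only`) and declares `x` locally for brevity, whereas the proposed `(x, double) = someFunc()` form assigns to an existing variable. It takes only the first member and asserts at compile time that the skipped one really is a double.

```c
#include <assert.h>

/* Hypothetical stand-in for the struct returned by someFunc(). */
typedef struct { int first; double second; } IntDouble;

IntDouble some_func(void) {
    return (IntDouble){ 5, 0.5 };
}

int first_only(void) {
    IntDouble tmp = some_func();
    /* Stands in for the bare `double` in (x, double): compilation
     * fails if the skipped member ever changes type. */
    static_assert(_Generic(tmp.second, double: 1, default: 0),
                  "skipped member is expected to be a double");
    int x = tmp.first;
    return x;
}

void example_skip(void) {
    assert(first_only() == 5);
}
```

A `(x, ...)` style skip, by contrast, would carry no such check.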

lerno, Aug 14 '21 00:08

Resolving #193, which addresses whether structs are written inline or defined separately, is required before this enhancement is possible.

lerno, Aug 14 '21 00:08

Maybe this is better for dropping values.

(x, ...) = someFunc();

lerno, Jan 18 '22 21:01

I am considering not including this feature unless there is a strong argument in favour.

lerno, Jun 13 '23 12:06

I'll close this. Not that it isn't an interesting feature (it is) but because it doesn't have enough uses. One must consider things like the added complexity for tools to handle these things as well.

lerno, Oct 01 '23 21:10