wren icon indicating copy to clipboard operation
wren copied to clipboard

[RFC] Introduce Data Classes

Open PureFox48 opened this issue 4 years ago • 22 comments

One thing I find rather tedious at present is that defining a simple 'data' class requires a lot of boiler-plate. For example:

class C {
    construct new(x, y) {
        _x = x
        _y = y
    }

    x { _x }
    y { _y }

    toString { "(%(_x), %(_y))" }
}

Unless I'm going to use it a lot, this discourages me (and I suspect others) from creating such classes and I think, oh, just use a list or map instead.

What if the above boiler-plate could be replaced by one line?

class C(x, y)

It's then a whole new ball game and one wouldn't be put off from creating such simple classes even if they were only going to be used a time or two in a particular script. This would make your script more self-documenting.

A number of points on how I envisage this working:

  1. Data classes (as I've called them for want of a better name) could contain not just two but any number of fields of any type.

  2. As far as the compiler is concerned, it would simply replace the latter with the former (as if it were a text-based macro) which I think should be possible with a single-pass compiler.

  3. As far as the user is concerned, all they'd need to know is that you create an object of such a class with new followed by the field names as parameters (ignoring initial underscores), there's a read-only property of the same name for each field and there's a default string implementation for the class as a whole.

  4. Although you could make the fields read/write, personally I don't think this would be a good idea. As well as being more complicated, you'd have no way of checking that the fields were being reset to sensible values.

  5. There would be no way of adding other instance/static methods to the data class which would inherit implicitly from Object. If you wanted to add more methods or inherit from a different class, you'd just have to write the class 'normally' as you do now.

  6. Data classes would still be reference types and would behave accordingly. So you wouldn't be able to use them (at least directly) as map keys.

  7. Whilst it's clear a toString method is required to override Object's default implementation, to maintain flexibility I think this needs to be as simple as possible. As it's not a list or a map, I suggest that parentheses are used to surround the fields rather than square brackets or curly braces. If, say, you then want to precede it with the class name you can simply write:

var c = C.new(1, 2)
System.print("C%(c)")
  1. I don't know if there would be any significant parsing or other difficulties with having two syntaxes for defining classes. If there were, you could use a different name for data classes such as struct or tuple. However, this would mean introducing a new reserved word and, as these words may have different implications to different people, I think we should stick with class if at all possible, as that - after all - is what it still would be.

  2. The problem of destructuring multiple values returned by methods/functions into separate variables has been touched upon in recent issues. This would no longer be much of a problem if data classes were introduced. Instead of:

var someFunc = Fn.new {
    // blah
    return [a, b, c, d]
}

var s = someFunc.call()
var a = s[0]
var b = s[1]
var c = s[2]
var d = s[3]
// do something with a, b, c, d

one could just write:

class S(a, b, c, d)

var someFunc = Fn.new {
    // blah
    return S.new(a, b, c, d)
}

var s = someFunc.call()
// do something with s.a, s.b, s.c, s.d
  1. This, of course, is pure sugar and is not a new idea. Similar constructs are found in several other OO languages. As well as for creating simple classes 'on the fly', they are typically used for other simple stuff such as points, ARGB colors and the like.

So what do you all think. A useful addition or an unnecessary refinement for a simple language such as Wren?

PureFox48 avatar Jan 12 '21 11:01 PureFox48

I didn't read the whole post, but something like that, and even more sophisticated, can be done using the meta module. In the past I'd written a NamedTuple.create() method, which creates a class similar to Python's collections.namedtuple().

This is simple as string manipulation and Meta.compile().

Edit: A very simple example:

import "meta" for Meta

class NamedTuple {
  static create(name, fields) { Meta.compile("class %(name) {
  construct new(%(fields.join(", "))) {
    %(fields.map {|field| "_%(field) = %(field)" }.join("\n"))
  }
  %(fields.map {|field| "%(field) { _%(field) }" }.join("\n"))
  toString { \"%(name)(%(fields.map {|field| "\%(_%(field))" }.join(", ")))\" }
}
return %(name)").call() }
}

S = NamedTuple.create("S", ["x", "y"])
s = S.new(1, 2)
System.print(s) // S(1, 2)
System.print("(%(s.x, s.y))") // (1, 2)

ChayimFriedman2 avatar Jan 12 '21 11:01 ChayimFriedman2

This is an interesting proposal. I was also frustrated with the required boilerplate, which was why i suggested the ideas in here: https://github.com/wren-lang/wren/issues/849

I don't know if I'm comfortable with a whole extra class syntax, compared to a more recognisable approach.

avivbeeri avatar Jan 12 '21 11:01 avivbeeri

@ChayimFriedman2

A difficulty I find with the Meta module is that there's no documentation and I'm still not entirely clear how it works - or, even, if it does work in some cases - from studying the examples.

PureFox48 avatar Jan 12 '21 11:01 PureFox48

I edited the original post and added an example. The meta module is quite clear IMHO; I know there aren't docs, but you should be able to understand what it does just from looking at its Wren code, and at most C code.

ChayimFriedman2 avatar Jan 12 '21 11:01 ChayimFriedman2

@avivbeeri you were faster than me to find your request XD

I don't like that syntax, it can be miss-interpreted with templates. I also don't really like the fact that toString is auto-generated and not over-writable. I prefer something like the solution from #849 with a keyword.

mhermier avatar Jan 12 '21 11:01 mhermier

The biggest advantage of using Meta is that it's customizable.

Want another .toString? Do it. Maybe you want to allow inheritance? Roll your own. Setters? It's possible. Anything we didn't consider? You can.

For example, in my NamedTuple, I had two constructors: one named new() with positional arguments, and one named fromMap() that takes a map and sets the fields by name. I even had an overload for create() which allowed specifying defaults (the default was null, obviously). Such simple yet powerful. Of course, the idea was taken from Python (they use named arguments there).

ChayimFriedman2 avatar Jan 12 '21 11:01 ChayimFriedman2

@ChayimFriedman2

Well, clearly I need to put more effort into studying how Meta works but I'm not sure the 'ordinary' user is going to be comfortable with writing stuff like that even if it is more flexible than my simple proposal.

PureFox48 avatar Jan 12 '21 11:01 PureFox48

The average user doesn't need to be conformable with writing such thing. He just need to be conformable using a library that does such thing. In Python this is built in, but I really don't think this is necessary.

ChayimFriedman2 avatar Jan 12 '21 11:01 ChayimFriedman2

Even if not pretty in some corner, the meta solution is better than that new syntax proposal to me.

mhermier avatar Jan 12 '21 11:01 mhermier

Not speaking you can also create a variant to control read an write arguments.

mhermier avatar Jan 12 '21 11:01 mhermier

@avivbeeri

Yes, I remembered #849 and, whilst I had sympathy with the general thrust of saving boiler-plate, I had reservations about auto-generating property setters as you couldn't then check that incoming values were sensible.

Incidentally and FWIW, I thought that the only syntax which would sit comfortably with what we already have for a combined getter/setter was this:

pos = _pos

PureFox48 avatar Jan 12 '21 11:01 PureFox48

I don't really want to say it because it opens a huge can of worm, we can also allow POD with something like:

struct C { a, b
  c
  d
}

mhermier avatar Jan 12 '21 12:01 mhermier

That's one reason why I shied away from suggesting struct for data classes.

It means different things in different languages. Structs in C/C++ are not the same as they are in C# for example.

PureFox48 avatar Jan 12 '21 12:01 PureFox48

Well, if there's anything useful to have come out of this discussion so far, it's that the Meta module needs to be documented and that there also needs to be a library for the average user to do useful things with it or maybe add some more methods to the Meta class itself.

PureFox48 avatar Jan 12 '21 12:01 PureFox48

@ChayimFriedman2

It seems that I had in fact figured out previously how to use Meta.compile to create a class dynamically - possibly without realizing the full power of it - as I wrote this snippet for the Rosetta Code site back in August 2020:

import "meta" for Meta
 
var genericClass = Fn.new { |cname, fname|
    var s1 = "class %(cname) {\n"
    var s2 = "construct new(%(fname)){\n_%(fname) = %(fname)\n}\n"
    var s3 = "%(fname) { _%(fname) }\n"
    var s4 = "}\nreturn %(cname)\n"
    return Meta.compile(s1 + s2 + s3 + s4).call() // returns the Class object
}
 
var CFoo = genericClass.call("Foo", "bar")
var foo = CFoo.new(10)
System.print([foo.bar, foo.type]) // [10, Foo]

I feel sure I had some issues - possibly with Meta.eval - though unfortunately I can't remember now what they were :(

PureFox48 avatar Jan 12 '21 17:01 PureFox48

I toyed around with the meta module for converting maps to objects. But the problem I encountered was passing complex objects from object to string and vice versa hence I requested #940 and maybe a repr() protocol would be needed to recreate the object at eval time.

import "meta" for Meta

var MapToObject = Fn.new {| map |
    var value = null
    var unique = (map.keys.join() + map.values.join()).bytes.take(7).join()
    var classname = "NewObjectFromMap_%(unique)"
    var out = "
        return Fiber.new {
    "
    out = out + "class " + classname + "{"
    for (key in map.keys) {
        value = map[key]

        if (value is String) {
            value = "\"%(value)\""
        }

        if (value is Num) {
            value = "Num.fromString(\"%(value)\")"
        }

        out = out + "\nstatic %(key) { " + value + "}"
    }
    out = out + "\n}"
    out = out + "\n" + "return " + classname
    out = out + "
        }.try()
    "
    return Meta.compile(out).call()
}

var obj = MapToObject.call({ "hello": "wren" })
var obj2 = MapToObject.call({"one" : 2})
var obj3 = MapToObject.call({"one" : 2})

System.print(obj.hello)
System.print(obj2.one)
System.print(obj3.one)

/*
bin/wren_cli scratch.wren

return Fiber.new {
    class NewObjectFromMap_104101108108111119114 {
        static hello { "wren" }
    }
    return NewObjectFromMap_104101108108111119114
}.try()


return Fiber.new {
    class NewObjectFromMap_11111010150 {
        static one { Num.fromString("2") }
    }
    return NewObjectFromMap_11111010150
}.try()


return Fiber.new {
    class NewObjectFromMap_11111010150 {
        static one { Num.fromString("2") }
    }
    return NewObjectFromMap_11111010150
}.try()

wren
2
2
*/

clsource avatar Mar 28 '21 02:03 clsource

(Ab)using variable with capitals, removed unique classname generation since it was not contributing but let it as a variable in case you want to reintroduce it. (If needed instead of a hash I can transform it so unique id is a counter if you prefer)

var MapToObject = Fn.new {|map|
    var classname = "NewObjectFromMap"
    var keys = map.keys
    
    var   out = keys.reduce("") {|str, key| str + "var %(classname)__%(key)\n" }
    out = out + "return Fn.new {|map|\n"
    out = out + keys.reduce("") {|str, key| str + "\t%(classname)__%(key) = map[\"%(key)\"]\n" }
    out = out + "\tclass %(classname) {\n"
    out = out + keys.reduce("") {|str, key| str + "\t\tstatic %(key) { %(classname)__%(key) }\n" }
    out = out + "\t}\n"
    out = out + "\treturn %(classname)\n"
    out = out + "}"

    return Meta.compile(out).call().call(map)
}

Need to be checked since it might rapidly consume the static variable space.

mhermier avatar Mar 28 '21 13:03 mhermier

A retake on previous one, with less pressure on the static variable space using a shadow map, and static variable caching.

class MapToObject {
    static call(map) {
        if (__uniqueid == null) __uniqueid = 0
        __uniqueid = __uniqueid + 1
        
        var classname = "NewObjectFromMap_%(__uniqueid)"
        var keys = map.keys
    
        var   out = "var %(classname)_shadow = {}\n"
        out = out + "return Fn.new {|map|\n"

        // Copy the map in the shadow
        out = out + "\tfor (key in map.keys) %(classname)_shadow[key] = map[key]\n"
    
        out = out + "\tclass %(classname) {\n"
        out = out + keys.reduce("") {|str, key| str + "\t\tstatic %(key) { __%(key) != null ? __%(key) : __%(key) = %(classname)_shadow[\"%(key)\"] }\n" }
        out = out + "\t}\n"
        out = out + "\treturn %(classname)\n"
        out = out + "}"

        return Meta.compile(out).call().call(map)
    }
}

Had to reintroduce uniqueid, because this time for some obscure reasons the compiler refused to accept the code with it ...

mhermier avatar Mar 28 '21 14:03 mhermier

I find this useful. Sadly it depends on the Meta module thus it could not be implemented in the core Object class Object.fromMap()

Wonder if this could be added to the core or at least the optional modules

clsource avatar Mar 28 '21 19:03 clsource

This can't really live into core, this is really a Meta/Reflection thing: given a description output a class. Where description could also be expanded to have functions to be called by symbols provided.

Another interesting feature to have would be to have automagically generate view objects that solve the following scenario: Given an object, create an object that behave like the first one with a restricted methods set (for security, constness, hidding internals...).

mhermier avatar Mar 29 '21 06:03 mhermier

Yes I think Reflection is useful :) That way Wren could have private methods I guess.

var reflector = Reflection.new('Foo')
var foo = reflector.instance()
var hello = reflector.method('hello')
var params = [1, 2, 3]
hello.call(foo, params)

clsource avatar Mar 29 '21 16:03 clsource

I feel sure I had some issues - possibly with Meta.eval - though unfortunately I can't remember now what they were :(

After two years, I finally remembered what those issues were!

Meta.eval and friends either don't work or don't work properly with non-class code if called from within a function/method or between modules. They only work at top level from the same module.

If we use Meta.compile rather than Meta.eval we can see that the problem is visibility of local variables:

/* meta_test.wren */

import "meta" for Meta

var f = Fn.new {
  var a = 2
  var b = 3
  var source = """
    var c = a + b
    System.print(c)
  """
  Meta.compile(source).call()    
}

f.call()

/* Output:
[./meta_test line 1] Error at 'a': Variable is used but not defined.
[./meta_test line 1] Error at 'b': Variable is used but not defined.
Null does not implement 'call()'.
[./meta_test line 12] in new(_) block argument
[./meta_test line 15] in (script)
*/

Even when classes are defined inside a function/method they can't see local variables though otherwise work fine.

I suspect this will take a lot of fixing if, indeed, it can be fixed at all given the nature of our compiler and scope rules.

Currently, there is no comment about this in the docs so I wonder whether we should say something to warn people that this is a known issue?

PureFox48 avatar Mar 05 '23 12:03 PureFox48