To UFCS, Do a Novel Approach, or Stay Pure OOP

Open ozra opened this issue 9 years ago • 34 comments

UFCS (Uniform Function Call Syntax)

I'm eager to do this, but it requires lengthy discussion and then quite some work, turning some things inside and out and up and down in the compiler, so I think it's best to get a lot of input on this before (likely) implementing.

Briefly, UFCS lets you use methods as functions - syntactically - and vice versa.

type Foo
   some-val = 0  'get 'set  -- add getter and setter methods automatically

   my-method(v) ->
      @some-val = v  -- assign value to member var directly
end

my-routine(obj Foo, v) ->
   obj.some-val = v  -- set value via Foo's 'some-val=' setter method.

foo = Foo()
foo.my-method 47
my-method foo, 47
my-routine foo, 47
foo.my-routine 47

See, they all do the same thing: mutate some-val on an instance of Foo to the passed value. The free func does it via a setter method on Foo, because the semantic distinction is that only member funcs (methods) can reach private state in the value instance; there are no exceptions to this in Onyx. Fret not! This is optimized away at machine code generation, and both end up with the exact same direct-access machine code. Blazing!
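For comparison, here is a rough sketch of the same four calls in Python, which is not Onyx and only supports half of the idea: a method can be called function-style through its class, but a free function cannot be called method-style. The names mirror the Onyx example above; everything else is an assumption made for illustration.

```python
class Foo:
    def __init__(self):
        self._some_val = 0  # "private" state by convention only

    @property
    def some_val(self):
        return self._some_val

    @some_val.setter
    def some_val(self, v):
        self._some_val = v

    def my_method(self, v):
        self._some_val = v  # touches the member variable directly


def my_routine(obj, v):
    obj.some_val = v  # goes through the setter, like the Onyx free func


foo = Foo()
foo.my_method(47)        # method-style call of a method
Foo.my_method(foo, 48)   # function-style call of the same method
my_routine(foo, 49)      # function-style call of a free function

# The fourth form has no Python equivalent: a free function is not found
# by attribute lookup on the instance.
try:
    foo.my_routine(50)
    method_style_worked = True
except AttributeError:
    method_style_worked = False
```

The missing fourth form is exactly what full UFCS would add.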

Problems

The lookup order is the matter of concern. An instant, "being nice to legacy" idea is:

  1. method/free func first
    • method-style call? Look for best matching method, if none, look for best matching func.
    • func-style call? Vice versa

However, this means that different callables may be chosen by the compiler depending on syntax chosen. This has a bad smell to it. A stricter typed func will be unused if there's even just a slackly typed ("broadly defined") method when doing a method-style call. And vice versa.

So, the better option is:

  1. Look up methods and functions interleaved and only use overloading prioritization rules to choose which is the best candidate. If two callables have the same signature match priority, then the method is preferred for a method-style call and vice versa. Their style of implementation is completely ignored, other than in that last check, since the only difference is internal to them (access private state, or access state through methods), and as such an implementation detail beyond the interface.
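A minimal sketch of that interleaved lookup, written in Python rather than Onyx and not taken from the compiler: every name here (`Candidate`, `match_score`, `resolve`, the `fug`/`bar` candidates) is hypothetical, and "specificity" is approximated by distance along Python's method resolution order, which is only a stand-in for real overload prioritization rules.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    kind: str                      # "method" or "func"
    param_types: Tuple[type, ...]  # receiver first; `object` means untyped
    impl: Callable[..., Any]


def match_score(cand: Candidate, args: Sequence[Any]) -> Optional[int]:
    """Lower is more specific; None means the candidate does not apply."""
    if len(cand.param_types) != len(args):
        return None
    score = 0
    for declared, value in zip(cand.param_types, args):
        if not isinstance(value, declared):
            return None
        # Distance along the MRO: 0 for an exact type, more for broader ones.
        score += type(value).__mro__.index(declared)
    return score


def resolve(cands, name, args, call_style):
    """Pick the best candidate; call style is used only to break ties."""
    best_key, best = None, None
    for cand in cands:
        if cand.name != name:
            continue
        score = match_score(cand, args)
        if score is None:
            continue
        key = (score, 0 if cand.kind == call_style else 1)
        if best_key is None or key < best_key:
            best_key, best = key, cand
    if best is None:
        raise LookupError(f"no callable {name!r} matches the arguments")
    return best


class Foo:
    pass


# Hypothetical candidate set: methods and free funcs live in one pool.
CANDIDATES = [
    Candidate("fug", "method", (Foo, object), lambda me, a: ("method", a)),
    Candidate("fug", "func", (Foo, bool), lambda me, a: ("func", a)),
    Candidate("bar", "method", (Foo, object), lambda me, a: ("method", a)),
    Candidate("bar", "func", (Foo, object), lambda me, a: ("func", a)),
]
```

With this pool, `resolve(CANDIDATES, "fug", (Foo(), True), "method")` selects the free func because its `bool` parameter is the tighter match, while the two `bar` candidates tie on specificity and are separated only by the call style.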

There's only one thing then to consider, and that is that it doesn't break Crystal stdlib compatibility. Have to check that out. But ignoring that for a second, consider these through the "no compromise, the right thing to do (TM)" lens.

Further questions

How does instance construction sugar, functor and lambda call sugar come into the picture?

  • Should 5.Int be legal? (which would transform as such: "5.Int => Int(5) => Int.new(5)" [and in the end in machine code, probably just 5 as part of a store-instruction])
  • Should lambdas and functors in scope be taken into account!?

These questions can't just be ignored.

Some Pros

  • Implement funcs (well, procedures really) instead of methods to further separate logic from state, good for future changes of code involving more concurrency.
  • An optimal pattern, it seems without deeper analysis, would be to implement the minimum number of methods on a type to work with state, and define other logic as free funcs. There is no compiler enforcement of any kind. Implement everything as methods or almost everything (sans data accessors) as functions - your choice in the end.
  • One single func def can cover several types horizontally, as long as the methods used match those available in type, if not strictly typing "receiver" parameter (first param).
  • More ways of expressing code, whichever looks clearer in a given situation.
  • No more guessing if a certain piece of functionality is a method or func - just go on instinct (this is more of a problem in, say Python, though).

Other cons (than "the" problem)

  • Some current method names might clash with language keywords - they won't be possible to call function style, but only method-style. This creates a small inconsistency, but a drawback that might be not too tough to swallow.

Prior Art

Currently it can be found in D and Rust. More?

As already mentioned, there are different approaches to resolving. Here are some papers and articles worth considering, all concerning the discussion of incorporating it in C++ (which must consider some of the same things as Onyx: legacy):

  • https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax
  • Bjarne Stroustrup (C++) - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4174.pdf
  • Herb Sutter (C++) (with some banal tooling reasoning) - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4165.pdf
  • Bjarne Stroustrup & Herb Sutter follow-up (C++) http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4474.pdf

What are your thoughts on this?

ozra avatar Feb 21 '16 19:02 ozra

See my feature request for the |> syntax. It provides an alternative to UFCS that bypasses the problems you mention.

Also, I don't think methods should be callable as free functions; only the other way around.

> One single func def can cover several types horizontally, as long as the methods used match those available in type, if not strictly typing "receiver" parameter (first param).

This is the best justification for UFCS I've ever read.

stugol avatar Feb 22 '16 01:02 stugol

The idea of implementing one-way only "half UFCS", or "OMFCS" (Optionally Method-notationized Function Call Syntax, haha, B-) ) is a viable option which would solve some problems, if it can't be implemented full on. Good Plan B idea!

I've coded quite a bit with forward pipes in different scenarios, but they are far from as generic as UFCS, and "less pretty" I'd contest. Iff it would be concluded that UFCS is a practical impossibility for one reason or another, forward-piping should definitely be implemented though, so I'd consider it a Plan C.

ozra avatar Feb 24 '16 19:02 ozra

I really like the pipe notation though. And I definitely think that calling a method like a free function is a recipe for disaster. And also entirely pointless. I can't think of a single circumstance in which I'd want to do that.

I'd like to see plans B and C implemented, and plan A dropped entirely.

stugol avatar Feb 24 '16 19:02 stugol

Alright, interesting thoughts! I'd love to see/hear even more elaboration on your thoughts on hows and whys, and specifically examples of potential culprits. It would be really helpful, since this is a huge concept to integrate, both at the conceptual vapor level and in implementation.

ozra avatar Feb 24 '16 23:02 ozra

Calling a free function like a method makes it look like a method. If you use pipes, it's obvious to the reader that this is a free function.

If I see a method I don't recognise, I go check the documentation for that type. This won't help me if the method in question isn't really a method.

Plus, the spaces around the |> symbol help to reinforce visually that the function doesn't "belong" to the value.

stugol avatar Feb 24 '16 23:02 stugol

Well, calling compiler tools from plugins in editors will give exact locations for defs/whatever of a specific token, so you'd just "go to definition", baam. No matter how it was defined.

ozra avatar Feb 25 '16 13:02 ozra

The pipe-related discussions should go in #34.

ozra avatar Feb 25 '16 15:02 ozra

Well I don't know about you, but my editor doesn't have an Onyx plugin. It doesn't even have a Crystal plugin that can look stuff up for me like that. I get syntax highlighting and nothing else.

stugol avatar Feb 25 '16 18:02 stugol

No, of course it was said a bit with the monocle aimed at the horizon. Point being, one can't design a language primarily with such a mindset, must have some patience and plan for the holistic perspective.

What editor are you using btw?

ozra avatar Feb 25 '16 19:02 ozra

Atom

stugol avatar Feb 25 '16 19:02 stugol

Ah, ok. I gave up on that one, crashed on me all the time and no control over typefaces :-/ Switched to sublime which is basically the same, just much faster and never crashed on me (despite being open weeks on end).

Atom is inspired by the same ed as Sublime, so converting the highlighter from Sublime shouldn't be too hard, I would think.

ozra avatar Feb 25 '16 19:02 ozra

I vaguely recall that Sublime wouldn't let me put the directory panel on the right. That's a bit of a deal-breaker.

stugol avatar Feb 25 '16 21:02 stugol

Ah, yeah I think it's more limited in some aspects like that. I only open panels for seconds with shortcuts when I need to do something, since everything is faster via keyboard, so I don't care.

In any event, it shouldn't be too hard to convert highlighter to Atom.

ozra avatar Feb 25 '16 22:02 ozra

I've given UFCS some more thought, and come to the conclusion that it might be a good idea after all.

Consider the following C++:

template<typename T> class Array { ... };
template<> class Array<int> {   // full specialization for int
  int sum();
};

Arrays of int will have a sum method; whilst other arrays will not. This cannot be done in Crystal at present:

class Array(T)
   def sum() where T.is_a? Int        # error: no such syntax
   end
end

But with UFCS:

def sum(a : Array(Int))
   ...
end

So I think at least one-directional UFCS should be implemented. In fact, writing free functions should be encouraged, rather than using inheritance and such to achieve a similar effect. For example, writing a sum(a : Array(Number)) function instead of subclassing Array.

But the point here is that the free functions are used as member functions. They're defined outside the class, but are conceptually member functions of one or more classes. Writing free functions that have no specific conceptual "owner" should be discouraged (e.g. puts).

stugol avatar Feb 28 '16 21:02 stugol

The big kahuna question is (apart from true UFCS, both ways) whether a line should be drawn as to what generality to match (would be kinda weird though).

sum(x List<Int>) -> ...
sum(x List<*>) -> ...
sum(x) -> ...

[1, 2, 3].sum  -- #1 is exact
[1.0, 2.2, 3.14].sum  -- #2 is most specific candidate
["foo", "bar", "baz"].sum  -- only match 'sum'#3
["foo", "bar", "baz"].puts

This is just a simple example, but all above would be legal if no artificial limitations are added.

Selection priority order must be well defined.

type Foo
   fug(...a) -> say a.first

fug(me Foo, a Bool) -> say a

x = Foo()
x.fug true

Also a simple example: according to the above example the reasonable candidate would be the free func - it's more specifically typed.

ozra avatar Feb 29 '16 21:02 ozra

Hm. On the other hand, it could be argued that a true member function should take precedence over a free function when called as a member function. I can see the potential for bugs here.

We could take inspiration from .NET, and implement extension methods instead. Basically, a free function must be marked with some kind of attribute, whereupon it may be used like a member function of its first argument. Advantages:

  • Free functions that are not intended for use as member functions will not be eligible.
  • Free functions will not unintentionally shadow member functions.
  • Member functions will not be used as free functions.

Possible syntaxes:

def speak(*thing) -> puts thing          -- #1 (a use for unary * at last!)
def speak(&thing) -> puts thing          -- #2
def speak(of thing)() -> puts thing      -- #3

"woo".speak                              -- usage

Extension methods would have exactly the same resolution priority as true member functions.
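To make the three advantages concrete, here is a small Python sketch of opt-in extension methods. It is not Onyx and not how .NET implements them: the `extension` decorator and the `IntList`/`StrList` types are hypothetical names invented for this example, and attaching the function to the class at definition time is just one simple way to emulate the marker.

```python
def extension(*types):
    """Hypothetical marker: expose a free function as a method on `types`."""
    def mark(func):
        for t in types:
            if hasattr(t, func.__name__):
                # Refuse to shadow a true member function.
                raise AttributeError(
                    f"{t.__name__} already has a member named {func.__name__!r}"
                )
            setattr(t, func.__name__, func)
        return func  # still usable as a plain free function
    return mark


class IntList(list):
    pass


class StrList(list):
    pass


@extension(IntList)
def total(items):
    return sum(items)


def helper(items):
    # Unmarked free function: not eligible for method-style calls.
    return len(items)


values = IntList([1, 2, 3])
method_style = values.total()   # marked, so method-style works
function_style = total(values)  # and function-style still works
```

Only `IntList` gains `total`; `StrList` does not, and `helper` stays a plain function, which mirrors the "marked functions only" rule proposed above.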

stugol avatar Mar 01 '16 02:03 stugol

FWIW, Google Go uses this concept also.

stugol avatar Mar 01 '16 02:03 stugol

Yes, what's apparent is:

  • The empirical intel, currently, comes from languages that use it (D and Rust), and they have weak ass inference.
  • Discussions contemplating using it regard C++, which also has weak ass inference.
  • Having strong inference makes for a much more complex playground.

These half-way alternatives are definitely worth looking into.

I'll work on other things and let this bubble in the back of my head meanwhile.

ozra avatar Mar 01 '16 09:03 ozra

Ideas for more approaches.

Speaking of the devil, the .. suggested for cascading. How about using this operator for the pipe-operator functionality requested in #34, calling functions as if methods: staying close to method-call notation and still making a distinction.

my-s(x) -> x.to-s
my-dbl(x) -> x * 2

say 47..my-dbl..my-s

Hmm, does look a bit strange though.

ozra avatar Mar 23 '16 19:03 ozra

No, don't like it.

I still think we should go with the idea of extension methods.

stugol avatar Mar 24 '16 10:03 stugol

We'll have to keep bombarding this subject. I'll read through it all again when time permits to get a bird's-eye view.

ozra avatar Mar 24 '16 21:03 ozra

I forgot to mention one thing, regarding selective extension methods. I was reminded because I stumbled upon a macro by the developer of Temel. I show the non-macro principle, since Onyx still lacks the template/macro-syntax:

-- free func
speak(me Int|Str|F64, a) -> say "got: {me} of {me.class} with {a}"

-- "methodized"
type Any: speak(a) -> speak self, a

-- use it
speak "Steak", "sauce"
"Burger".speak "sauce"
47.speak "world peace"
3.12.speak "seeming lack of 0.02"

-- will barf as expected: remove restriction on `me` for all-types support
-- [1,2].speak

=>

got: Steak of String with sauce
got: Burger of String with sauce
got: 47 of Int32 with world peace
got: 3.12 of Float64 with seeming lack of 0.02

How to "macroify" it is pretty obvious. (You'd use a soft-lambda for the macro, pick out the params and the body and re-compose...)

ozra avatar Mar 27 '16 01:03 ozra

I don't see the purpose of this post. What am I missing?

stugol avatar Mar 27 '16 16:03 stugol

I should type less on issues when I'm half asleep, hehe.

It simply means that "extension methods" can very simply be implemented with a simple macro.

Well... it also means it might be a viable thing to put into the language itself, since it will map to common constructs directly from sugar "the best way (TM)".

I have a few ideas expanding slightly on it, but I'll let them continue to simmer for a while.

ozra avatar Mar 29 '16 00:03 ozra

Incidentally, extension methods are like C++ "partial template specialisation", but better. Reason being, you can add a method to a generic class, but have the method only apply to those specialisations that you want:

type List[T]:
   ...

-- note the multiple parens and/or the ampersand here
sum(&list : List[Int] | List[Float] | List[Real])() ->
   reduce 0, (acc, n)~> acc + n

values = List{1, 2, 3}
say values.sum                 -- "6"

say List{true, false}.sum      -- error, no such method

stugol avatar Mar 29 '16 11:03 stugol

Yes, that's exactly how it would work out :-). I do think I'd prefer a pragma rather than annotating first param though, or some other means of marking the function, rather than arg1. Just a loose thought at this stage, mind you ('ufcs just being a naïve suggestion for the example):

sum(list List[Int] | List[Float] | List[Real]) -> 'ufcs
   reduce 0, \ %1 + %2  -- just getting a feel for the slant, ignore that :-) 

ozra avatar Mar 29 '16 17:03 ozra

The annotation wouldn't look right if the function is defined on a single line:

sum(list List[Int] | List[Float] | List[Real]) -> 'ufcs reduce 0, \ %1 + %2

Maybe replace the arrow with --> or -+>?

stugol avatar Mar 29 '16 17:03 stugol

Yes, those are valid options too.

Pragmas apply either to the containing construct, when placed first inside it (as in the example), or to the following construct when there is no empty line in between, so for one line it could have been written as:

'ufcs
sum(list List[Int] | List[Float] | List[Real]) -> reduce 0, \ %1 + %2

But that kind of defeats the terseness of a one-liner ;-)

Let's pump out ideas so we can weigh them all against each other.

A couple of months back there was a multitude of arrow-variations for pure func, pure method, etc. etc. and in the end it didn't look good, so I only kept the "returns-nil exclamation mark", but a lot has happened since then, so it's up for re-evaluation.

ozra avatar Mar 29 '16 18:03 ozra

Having a syntax for pure makes a lot of sense. It would refuse to compile if any non-pure methods were called from it, I assume.

Suggestions for pure specifier: o->, -o>, ->> Unicode: ●->, ⚬->, -⚬>, -⩺, (⇴ preferred)

Suggestions for extension specifier: -->, -+> (--> preferred) Unicode: , , , (⤑ or ⥅ preferred)

Could have a special arrow (, , -/>) for brittle tuple functions instead of a special literal syntax for the brittle tuple, also. This would mean you could return a brittle array, or brittle hash.

stugol avatar Mar 29 '16 18:03 stugol

Cleaning up issues a bit. UFCS will not be implemented. It's so easy to define functions and methods in Onyx, and given the type inference and generics, it's a no-brainer to make mirroring variants - where specifically wanted - instead.

ozra avatar Sep 21 '16 21:09 ozra