
Orthogonal concepts

Open shelby3 opened this issue 8 years ago • 192 comments

I am trying to nail down in my mind the fundamental orthogonal semantic concepts of our proposed programming language.

| Concept | Description |
| --- | --- |
| data | Unified sum, product, recursive, and record data types. No early-bound operations other than construction and access. Record types may optionally be mutable. |
| typeclass | Delayed binding of operations to types at the use-site. Contrast with OOP (subclassing), which binds operations at construction of objects. For polymorphic types, typeclass objects delay binding operations until construction of the typeclass object (instead of prematurely binding operations at the construction of an instance of the implementing type); whereas my posited and proposed solution to the Expression Problem employing unions in theory further delays binding to the use-site for polymorphic types. Note that delaying is a positive attribute in this context, because it increases degrees-of-freedom. |
| module | Encapsulating data with compile-time (statically) bound operations and access control, enabling more sophisticated types. |
| monadic effect system | https://github.com/keean/zenscript/issues/4#issuecomment-248569044 |
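
A minimal sketch of the data/typeclass split above, in TypeScript (all names here are hypothetical illustrations, not ZenScript syntax): data is a plain record with no attached operations, and the typeclass is a dictionary of operations bound explicitly at the use-site.

```typescript
// A typeclass modeled as a dictionary of operations over some type A.
interface Show<A> {
  show(a: A): string;
}

// Plain data with no attached operations (the `data` concept above).
type Point = { x: number; y: number };

// An implementation (instance) defined separately from the data type.
const showPoint: Show<Point> = {
  show: (p) => `(${p.x}, ${p.y})`,
};

// The operation is bound where it is used, via an explicit dictionary.
function describe<A>(dict: Show<A>, a: A): string {
  return "value: " + dict.show(a);
}

console.log(describe(showPoint, { x: 1, y: 2 })); // "value: (1, 2)"
```

The point of the sketch is that `Point` was constructed with no knowledge of `Show`; the binding happens only at the `describe` call site.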

Thus I am wondering if module is the most apt name for the concept? I think of a module as a unit of code that is imported separately. We still need to import data and typeclass, so is this done by files? Isn't each file then a module? If so, what is the apt name for the concept of module above? I am thinking the above concept for module is really a class (without the subclassing inheritance anti-pattern). Can't modules extend other modules?

Edit: note also OCaml functors and modules.

Edit#2: on typeclasses, I replied to @keean:

Optional implicit parameters (which Scala has) provide a way of unifying runtime interfaces (passed by value) and monomorphisable interfaces (passed implicitly by type).

I like that summary (in the context of what I had written). We should make sure we frame that and use it in our documentation.

Edit#3: @shelby3 wrote:

I think it is important to remember that we want to keep implementation of an interface (i.e. typeclass) orthogonal to instantiation of data, otherwise we end up with the early binding of subclassing. For example, in subclassing both a rectangle and a square would be subclasses of an interface which provides width and height. Yet a square only has one dimension, not two. Any interfaces which need to operate on the dimensions of rectangle and square need to be customized for each, which is what typeclasses do. We will still have subtyping, such as when an interface extends another interface, then we can subsume to the supertype (which is why we need read-only references, and for supersuming then write-only references). Yet references will not have the type of an interface if we don't provide typeclass objects, so then the only case of subtyping will be the subset relationships created by conjunctions and disjunctions of data types.
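
The rectangle/square point can be sketched in TypeScript (hypothetical names, not ZenScript): each data type keeps only the fields it needs, and the per-type instance customizes the operation, rather than forcing square to inherit a two-dimension interface.

```typescript
// Each data type carries only its own fields.
type Rectangle = { width: number; height: number };
type Square = { side: number }; // one dimension, as the text argues

// The typeclass: an operation bound per-type, not at construction.
interface Area<A> {
  area(a: A): number;
}

// Customized implementations, the role typeclass instances play above.
const rectArea: Area<Rectangle> = { area: (r) => r.width * r.height };
const squareArea: Area<Square> = { area: (s) => s.side * s.side };

// Generic code that works for any type with an Area instance.
function totalArea<A>(dict: Area<A>, xs: A[]): number {
  return xs.reduce((acc, a) => acc + dict.area(a), 0);
}

console.log(totalArea(rectArea, [{ width: 2, height: 3 }])); // 6
console.log(totalArea(squareArea, [{ side: 4 }])); // 16
```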

shelby3 avatar Sep 25 '16 21:09 shelby3

Yes, I think that is correct. We basically split objects into three orthogonal concepts: data are effectively object properties, typeclasses are interfaces, and modules control data hiding (private data etc.).

Note: Elsewhere I proposed that we can combine the three into a single syntax and generic concept, but let's leave that to one side for now.

keean avatar Sep 25 '16 21:09 keean

Did you see my edits pointing out that 'module' has two meanings? Should we use another keyword than module?

shelby3 avatar Sep 25 '16 21:09 shelby3

Extension and functors are something ML supports on modules; at the moment I don't think we should. I like the ability to have multiple modules in one file, but effectively modules are able to be separately compiled and linked later. In the simplest form they are like namespaces where you cannot access their private data from outside. A module has to export things to be visible outside, and import external functions and data in order for them to be visible inside.
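
This export/import view of a module can be sketched with a closure in TypeScript (a hypothetical illustration of the semantics, not a compilation strategy): private state is unreachable from outside, and only the explicitly exported operations are visible.

```typescript
// A "module" as a function returning its exports; everything not
// returned stays private, like the namespace description above.
function makeCounter() {
  let count = 0; // private data: not accessible from outside

  return {
    // exported operations
    increment(): number { return ++count; },
    current(): number { return count; },
  };
}

const counter = makeCounter();
counter.increment();
counter.increment();
console.log(counter.current()); // 2
// counter.count is not visible: the private data is encapsulated.
```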

keean avatar Sep 25 '16 21:09 keean

Note I added the idea for a monadic effect system. I guess it isn't fully orthogonal in some sense, as its implementation presumably relies on the other two or three concepts, but the systemic model for handling impurity is an orthogonal concept.

shelby3 avatar Sep 26 '16 03:09 shelby3

@shelby3 wrote:

Concept Description
typeclass Delayed binding of operations to types at the use-site. Contrast with OOP (subclassing), which binds operations at construction of objects. For polymorphic types, typeclass objects delay binding operations until construction of the typeclass object (instead of prematurely binding operations at the construction of an instance of the implementing type); whereas my posited and proposed solution to the Expression Problem employing unions in theory further delays binding to the use-site for polymorphic types. Note that delaying is a positive attribute in this context, because it increases degrees-of-freedom.

I want to raise this conceptualization of solving the Expression Problem to a matrix, so that we can see how this plays out on more levels.

Background

Dynamic polymorphism is instituted whenever we bind the interface to an object and refer to the type of that object as the interface. Whether that happens at data type instantiation (subclassing) or at typeclass object instantiation (aka Rust trait objects), we must then incur dynamic dispatch, because when holding a reference of said interface type there is no static (compile-time) knowledge of the types of data that can be referenced through that interface type; instead the runtime dynamic dispatch handles the dynamic polymorphism. In subclassing the interface is a supertype; in typeclass objects, the interface is the typeclass bound (constraint) on the data types that can be assigned to a reference of said type.

So the lesson is that once we bind an interface at compile-time before the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function. More importantly, we lose the axis of the Expression Problem that adds new interfaces (operations) to the data types referenced via an interface type: we can add new data types but can't add new operations.

Subclassing binds the interface very early, at data type instantiation. Typeclass objects bind later, at typeclass object instantiation. And typeclass bounds bind even later, at the function call site.
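
The latter two binding times can be contrasted in a TypeScript sketch (hypothetical names; the pair-of-value-and-dictionary encoding of a typeclass object is an assumption of the sketch): a typeclass bound keeps the concrete type statically known at the call site, while a typeclass object erases it at construction.

```typescript
interface Show<A> { show(a: A): string; }

const showNum: Show<number> = { show: (n) => n.toFixed(1) };

// Typeclass bound: the dictionary binds at the call site; the concrete
// type A is still statically known, so this could be monomorphized.
function viaBound<A>(dict: Show<A>, a: A): string {
  return dict.show(a);
}

// Typeclass object (trait object): the dictionary binds at object
// construction; the concrete type is erased behind the pair.
type ShowObj = { show(): string };
function makeShowObj<A>(dict: Show<A>, a: A): ShowObj {
  return { show: () => dict.show(a) };
}

console.log(viaBound(showNum, 2)); // "2.0"
const obj: ShowObj = makeShowObj(showNum, 2); // type of 2 now erased
console.log(obj.show()); // "2.0"
```

After `makeShowObj` returns, no caller can recover that a `number` is inside, which is the erasure the surrounding discussion turns on.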

So ideally we would always delay to typeclass bounds, but this conflicts with the polymorphism of adding data types orthogonally when dealing with heterogeneous union types. Thus I pointed out that by retaining the union of data types instead of prematurely subsuming them to the typeclass bound of a typeclass object, we can delay the binding to the function call site. Thus we can continue to add new orthogonal operations to the union type, unlike for typeclass objects. The trade-off is that adding a new type to a union type requires invariant collections (such as a List); whereas typeclass objects will interoperate fine with arrays, because the type of the array doesn't need to change when adding new data types to an existing typeclass object type. I explained this in more detail in the prior linked discussion. Readers can refer to the prior explanation, so I don't need to repeat it here.
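
The union alternative can be sketched with a TypeScript discriminated union (hypothetical types, used only to illustrate the trade-off): because the collection keeps the union of concrete data types rather than subsuming to an interface, a brand-new operation can still be defined at any call site.

```typescript
// The union of concrete data types is retained, not erased.
type Shape =
  | { tag: "circle"; radius: number }
  | { tag: "rect"; width: number; height: number };

const shapes: Shape[] = [
  { tag: "circle", radius: 1 },
  { tag: "rect", width: 2, height: 3 },
];

// A new operation added ex post facto, without touching the data types;
// a typeclass object would have fixed its interface at construction.
function area(s: Shape): number {
  switch (s.tag) {
    case "circle": return Math.PI * s.radius * s.radius;
    case "rect": return s.width * s.height;
  }
}

console.log(shapes.map(area).map((a) => Math.round(a))); // [ 3, 6 ]
```

Conversely, adding a new variant to `Shape` changes the type of `shapes`, which is the invariant-collection cost described above.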

Matrix Conceptualization

So what we can see is there is a tension with trade-offs between the competing choices of carrying around the specific data types and subsuming to the interface. So both are needed for different situations.

Recently @skaller pointed out another axis of this tension: the case where we don't want to discard the data types, because we want to leave them open to new operations, but we want to dictate that a specific implementation is used for a specific typeclass wherever it might be needed. Apparently OCaml's functors accomplish this by being first-class; one can pass these along with a module and they will morph the module with the specific constraints desired, while leaving the rest of the module unspecialized. Thus, through the chaining of functors, one can get the same effective result as early binding for one typeclass while remaining open to extension for other typeclasses.

So really what we need is to be able to attach a typeclass object to a data type instance and pass them around as a pair everywhere a data type is expected. In other words, dynamic patching of the dictionary for the dynamic dispatch, wherein if that data type is then applied to a trait object at a call site or trait object assignment, the prior binding is not overwritten and any additional dynamic dispatch bindings are augmented.

So then we no longer view data types and typeclass objects separately. A data type always has a dynamic dictionary attached to it, yet it is augmented at runtime, not with compile-time knowledge. The compiler can only guarantee the required typeclass operations will be available at runtime; it can't guarantee which implementation was selected, because at runtime, when it writes the implementation to the dynamic dictionary of the typeclass object, it can't know whether a preexisting one already exists.
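
A TypeScript sketch of this augment-but-never-overwrite dictionary (entirely hypothetical machinery, written to illustrate the proposal rather than any existing implementation):

```typescript
// A value paired with a runtime-augmentable dictionary of operations.
type Dict = Record<string, (...args: any[]) => any>;
type Tagged<A> = { value: A; dict: Dict };

// Add implementations only where no prior binding exists, so a caller's
// earlier preferential binding is never overwritten.
function augment<A>(t: Tagged<A>, impls: Dict): Tagged<A> {
  for (const [name, fn] of Object.entries(impls)) {
    if (!(name in t.dict)) t.dict[name] = fn;
  }
  return t;
}

const n: Tagged<number> = { value: 3, dict: {} };
// The caller sets a preferential implementation (descending compare)...
augment(n, { compare: (a: number, b: number) => b - a });
// ...and a later call site's default ascending compare does NOT replace it.
augment(n, { compare: (a: number, b: number) => a - b });
console.log(n.dict.compare(1, 2)); // 1  (the descending compare won)
```

The compiler's job in this scheme would only be to prove that every operation a call site needs has been written into the dictionary by the time it is called, not which implementation won.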

So we have lifted dynamic polymorphism to a 2nd order. We have dynamism of dynamic dispatch.

The matrix is that we have two axes of data type and interface, and a third axis of implementations that conjoin the two. This third axis can't be tracked at compile-time without dependent typing, but the compiler only needs to ensure that the required interfaces are present in order to ensure soundness. So this works.

What this enables is, for example, @skaller's example of wanting to set the ordering of comparison operations while leaving the data type open to extension with other typeclasses. @keean's proposed solution also achieves it, but it lacks the atomicity of retaining this with the data type. For example, @keean's solution is not generic in the type of algorithm that can be applied ex post facto, and thus it fails another axis of the tension of solving the Expression Problem.

What we will see generally is that higher order dynamism is required to solve higher order tensions in the Expression Problem.

shelby3 avatar Oct 05 '16 10:10 shelby3

So the lesson is that once we bind an interface at compile-time before the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function.

This is not quite right. We can still mono-morphise even with compile-time interface binding; in fact, for the traditional approach this is a requirement. The point is we can statically know both the interface and the exact type passed, and therefore choose and inline the exact function. Rust does this in all cases except when dynamic polymorphism is involved.

So we can inline and monomorphise the functions when the interface is statically bound and the type is statically known. If either is false (so dynamic binding to the interface, or dynamic typing) we cannot monomorphise in languages like Rust.

keean avatar Oct 05 '16 12:10 keean

So really what we need is to be able to attach a typeclass object to a data type instance and pass them around as a pair everywhere a data type is expected.

This is exactly what adding functions to an object's prototype does in JavaScript.

keean avatar Oct 05 '16 12:10 keean

@keean wrote:

So really what we need is to be able to attach a typeclass object to a data type instance and pass them around as a pair everywhere a data type is expected.

This is exactly what adding functions to an object's prototype does in JavaScript.

No it is not the same. The JavaScript prototype impacts all instances. I am referring to impacting only instances which are assigned to the trait bound.

@keean I have decided that it is nearly impossible to try to explain these issues like this. I will need to make detailed code examples and write a white paper. You don't seem to understand me.

shelby3 avatar Oct 05 '16 12:10 shelby3

No it is not the same. The prototype impacts all instances. I am referring to impacting only instances which are assigned to the trait bound.

So you want the implementation used to depend on the value of the object, not just its type.

This is monkey-patching, adding a method directly to the object itself.
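
The distinction under dispute can be shown in a few lines of TypeScript (hypothetical `Widget` class, used only to contrast the two mechanisms): patching the prototype affects every instance, while monkey-patching attaches a method to one object alone.

```typescript
class Widget {
  name: string;
  constructor(name: string) { this.name = name; }
}

const a: any = new Widget("a");
const b: any = new Widget("b");

// Monkey-patch: only `a` gains the method.
a.greet = () => "hello from a";
console.log(typeof a.greet); // "function"
console.log(typeof b.greet); // "undefined"

// Prototype patch: every instance gains the method.
(Widget.prototype as any).shared = () => "shared";
console.log(typeof b.shared); // "function"
```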

keean avatar Oct 05 '16 12:10 keean

@shelby3 wrote:

So the lesson is that once we bind an interface at compile-time before the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function.

This is not quite right.

It is right. I just haven't been able to communicate my points to you.

shelby3 avatar Oct 05 '16 12:10 shelby3

You have written it badly then, because

once we bind an interface at compile-time before the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function.

This is exactly the opposite of what is true.

We must bind the interface statically in order to be able to monomorphise and we must statically know the type at the call site as well.

keean avatar Oct 05 '16 12:10 keean

@keean wrote:

The point is we can statically know both the interface and the exact type passed

If you had understood me, you would understand that I already explained that we don't know the exact type passed; that was entirely the point: binding early erases the data types from further compile-time knowledge, and you only have the trait bound remaining. That you didn't even get that point from what I wrote seems really bizarre.

shelby3 avatar Oct 05 '16 12:10 shelby3

If you had understood me, you would understand that I already explained that you don't know the exact type passed; that was entirely the point. That you didn't even get that point from what I wrote seems really bizarre.

We always know the exact type passed, that is the whole point of parametric polymorphism. The only time we do not is if there is what Rust calls a trait-object or what Haskell calls an existential-type involved.

keean avatar Oct 05 '16 12:10 keean

@keean wrote:

The only time we do not is if there is what Rust calls a trait-object

Bingo. Re-read my long comment again.

shelby3 avatar Oct 05 '16 12:10 shelby3

So in a traditional system like Rust, if there is a trait object we can never monomorphise, it doesn't matter how early or late the type-class is bound.

keean avatar Oct 05 '16 12:10 keean

@keean wrote:

once we bind an interface at compile-time _before_ the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function.

This is exactly the opposite of what is true.

The word 'before' does not mean 'at'. It means 'before'. I am referring to subsuming to the trait bound before the call site, such as a trait object. But then I generalize it to partial trait objects and 2nd order dynamism.

shelby3 avatar Oct 05 '16 13:10 shelby3

@keean wrote:

So in a traditional system like Rust, if there is a trait object we can never monomorphise, it doesn't matter how early or late the type-class is bound.

In a system where I have generalized it to partial trait bounds and 2nd-order dynamism, we still can't monomorphise, but we gain the ability to add operations ex post facto.

Your reading comprehension of my long comment was really not adequate. Was it really that badly explained, or is your mind ossified and not prepared for a generalization?

shelby3 avatar Oct 05 '16 13:10 shelby3

The word 'before' does not mean 'at'. It means 'before'. I am referring to subsuming to the trait bound before the call site, such as a trait object. But then I generalize it to partial trait objects and 2nd order dynamism.

But if you have dynamic polymorphism you cannot mono-morphise. It doesn't matter whether the interface is bound before or at the call site.

keean avatar Oct 05 '16 13:10 keean

@keean wrote:

But if you have dynamic polymorphism you cannot mono-morphise. It doesn't matter whether the interface is bound before or at the call site.

Precisely. Early binding of the interface requires dynamic polymorphism in order to remain open to adding new data types under the interface bound, e.g. for collections.

Also, you are only considering one of the tradeoffs. You forgot the other one, which my long comment states and I have reiterated:

In a system where I have generalized it to partial trait bounds and 2nd-order dynamism, we still can't monomorphise, but we gain the ability to add operations ex post facto.

shelby3 avatar Oct 05 '16 13:10 shelby3

But that is not what this sentence says:

So the lesson is that once we bind an interface at compile-time before the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function.

This says we cannot mono-morphise if we bind the interface at compile time, when there is dynamic polymorphism. But we cannot mono-morphise at all with dynamic polymorphism, so the logic is wrong.

keean avatar Oct 05 '16 13:10 keean

@keean wrote:

But that is not what this sentence says:

So the lesson is that once we bind an interface at compile-time before the function call site, we forsake static polymorphism and the opportunity for monomorphizing the function.

This says we cannot mono-morphise if we bind the interface at compile time, when there is dynamic polymorphism.

Correct.

But we cannot mono-morphise at all with dynamic polymorphism,

Correct.

so the logic is wrong.

What is wrong? You apparently just didn't understand that the goal is not to achieve monomorphization but rather to solve the Expression Problem and keep the extension open to new trait bounds even though we've partially bound to some typeclasses:

@shelby3 wrote:

Precisely. Early binding of the interface requires dynamic polymorphism in order to remain open to adding new data types under the interface bound, e.g. for collections.

shelby3 avatar Oct 05 '16 13:10 shelby3

The point is that we always pass a data type along with a dictionary, but that dictionary can be incomplete and we don't track at compile-time what may already be in that dictionary. When we are ready to bind some trait bounds at a call site, then we write some code which will add these implementations to the dictionary at runtime (ensuring they will be available as required for compile-time soundness), but will not overwrite any that already existed in the dictionary.

The point of this is that our caller can set preferential implementations (such as which direction of ordering for lessThan operation), yet the code that is called doesn't have to prematurely specialize on any trait bound signatures. Premature specialization is the antithesis of combinatorial composition as per @skaller's excellent point.
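
The lessThan example can be made concrete in TypeScript (hypothetical `Ord` dictionary, purely illustrative): the called code stays generic over the ordering, and each caller supplies its preferential implementation without the callee specializing prematurely.

```typescript
// A comparison typeclass as a dictionary.
type Ord<A> = { lessThan(a: A, b: A): boolean };

// Generic library code: knows only that *some* Ord will be supplied,
// so it is not prematurely specialized to any ordering.
function sortBy<A>(ord: Ord<A>, xs: A[]): A[] {
  return [...xs].sort((a, b) => (ord.lessThan(a, b) ? -1 : 1));
}

// Two call sites pick opposite orderings for the same data type.
const ascending: Ord<number> = { lessThan: (a, b) => a < b };
const descending: Ord<number> = { lessThan: (a, b) => a > b };

console.log(sortBy(ascending, [3, 1, 2])); // [ 1, 2, 3 ]
console.log(sortBy(descending, [3, 1, 2])); // [ 3, 2, 1 ]
```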

shelby3 avatar Oct 05 '16 13:10 shelby3

I think we aren't actually attaching the dictionaries to the data objects. Rather the data objects are tagged (instanceof) and the dictionaries are passed in as arguments to each function. Perhaps we should model it with a monad?
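
That variant, with tagged data and dictionaries passed as arguments rather than attached to objects, might look like this in TypeScript (hypothetical tags and helpers, sketching the idea rather than a settled design):

```typescript
// Data carries only a runtime tag (playing the role of instanceof).
type Tagged = { tag: string };
type Circle = Tagged & { tag: "circle"; radius: number };
type Rect = Tagged & { tag: "rect"; width: number; height: number };

// Per-type implementations travel as a function argument,
// not attached to the data objects themselves.
type AreaDicts = Record<string, (x: any) => number>;

// Dispatch selects the implementation by the data's tag.
function area(dicts: AreaDicts, x: Tagged): number {
  return dicts[x.tag](x);
}

const dicts: AreaDicts = {
  circle: (c: Circle) => Math.PI * c.radius ** 2,
  rect: (r: Rect) => r.width * r.height,
};

console.log(area(dicts, { tag: "rect", width: 2, height: 3 } as Rect)); // 6
```

Threading `dicts` through every call is the plumbing that a monadic formulation, as floated above, could hide.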

shelby3 avatar Oct 05 '16 13:10 shelby3

When we are ready to bind some trait bounds at a call site, then we write some code which will add these implementations to the dictionary at runtime.

This is the same as asking if the implementations exist from the call-site. In order to add them, they must first exist in the source code, so the compiler can statically check they exist.

We can go further, we can ask, from the call site, get a list of all types for which there is an implementation of this function. Now we could put this in the type signature so that the linker knows statically what types it is safe to send to that function.

keean avatar Oct 05 '16 13:10 keean

Wow you really aren't getting it.

When we are ready to bind some trait bounds at a call site, then we write some code which will add these implementations to the dictionary at runtime.

This is the same as asking if the implementations exist from the call-site.

Read again:

When we are ready to bind some trait bounds at a call site, then we write some code which will add these implementations to the dictionary at runtime (ensuring they will be available as required for compile-time soundness), but will not overwrite any that already existed in the dictionary.

The point of this is that our caller can set preferential implementations (such as which direction of ordering for lessThan operation), yet the code that is called doesn't have to prematurely specialize on any trait bound signatures. Premature specialization is the antithesis of combinatorial composition as per @skaller's excellent point.

shelby3 avatar Oct 05 '16 13:10 shelby3

There is still something odd about the way your description comes across to me, which might be my misunderstanding...

If I have a function that calls methods on something:

f(x) =>
    x.call_widget()

We can say with 100% certainty that whatever type is passed to f, whether via dynamic or static polymorphism, must implement call_widget. Do you agree?

keean avatar Oct 05 '16 13:10 keean

@keean wrote:

We can say with 100% certainty that whatever type is passed to f, whether via dynamic or static polymorphism, must implement call_widget. Do you agree?

Agreed. And the mistake in your thinking is that you are conflating not knowing whether a preexisting implementation will be replaced at runtime with knowing at compile-time that all required implementations will be placed into the dictionary as required. But I am repeating myself.

I understand the concept is a bit confusing at first for someone not expecting it. I do tend to come out of nowhere with mind-twisting new things.

shelby3 avatar Oct 05 '16 13:10 shelby3

And the mistake in your thinking is that you are conflating not knowing whether a preexisting implementation will be replaced at runtime with knowing at compile-time that all required implementations will be placed into the dictionary as required. But I am repeating myself.

That is irrelevant, I am talking about at the call-site.

So when call_widget is called on x there must be an implementation for it available. Do you agree?

keean avatar Oct 05 '16 13:10 keean

So then when we have:

f(x) =>
    x.call_widget()

We can clearly see no code in f is inserting any implementations into anything. Sorry to have to take it this slowly, but I am feeling particularly stupid today. Are we agreed?

keean avatar Oct 05 '16 13:10 keean

I thought you were inferring the trait bound on x. Yes you need a trait bound there. So what is your point?

shelby3 avatar Oct 05 '16 13:10 shelby3