Syntax summary

Open shelby3 opened this issue 8 years ago • 1191 comments

I will maintain in this OP a summary of current proposed syntax as I understand it to be. Note this is not authoritative, subject to change, and it may be inaccurate. Please make comments to discuss.

: Type is always optional.

  1. Sum, Recursive, Record, and Product parametrized data type with optional Record member names and optional Record (e.g. Cons) and Product (e.g. MetaData) names:

    data List<T> = MetaData(Nil | Cons{head: T, tail: List<T>}, {meta: Meta<T>})

  2. Typeclass interface and implementation:

    typeclass Name<A<B>, ...>   // or 'pluggable'?; optional <B> for higher-kinds¹?
      method(B): A<B>           // TODO: add default arguments, named arguments?
    
    List<A> implements Name
      method(x) => ...
    
  3. References:

    let x:Type = ...      // final assignment, not re-assignable (same as const in ES5/6)
    var x:Type = ...      // (same as let in ES5/6)
    
  4. Functions:

    Type parameters do not require declaration <A,B>.

    someCallback(x:Type y:Type(:Type) => x + y)     //            also  ():Type => x + y
    var f = x:Type y:Type(:Type) => x + y           // not named, also  ():Type => x + y
    f = x:Type y:Type(:Type) => x + y               // not named, also  ():Type => x + y
    let f(x:Type, y:Type):Type => x + y             // named f,   also f():Type => x + y
    let parametrized(x: A, y: B):A|B => x || y
    let parametrizedWhere(x: A, y: B):A|B where ... => x || y
    

    Note that iterator types can be specified for the return value to return a lazy list as a generalized way of implementing generators. The optional (:Type) is necessitated by generator functions. Note the (x: Type y: Type): Type => x + y form is unavailable.

  5. Assignment-as-expression:

    if ((x = 0))      // force extra parenthesis where expected type is Boolean
    

† Not yet included in the syntax, as it would be a premature optimization. May or may not be added.

¹ https://github.com/keean/zenscript/issues/10

shelby3 avatar Sep 23 '16 06:09 shelby3

I am not sure I like pluggable for a type-class. If it's going to have that many letters either interface or typeclass would be better.

interface List<A>

I am not sure we want to use | for both sum types and union types.

I prefer having 'implementation' before the type-class; having the type class first for implements seems inconsistent to me. I also prefer to treat all type class parameters equally. The first is not special, so why give it special syntax?

implement List<A>

I am not sure why you put types in the function call syntax? I don't think you need or want them; you only want types in function definitions.

I don't like that the method syntax is different from the function definition syntax. I think we should have a unified record/struct syntax. If we have:

data List<A> = List(
    append : (l1 : A, l2 : A) : A
)

data MyList
let mylist : List<MyList> = List(
    append : (l1 : MyList, l2 : MyList) : MyList =>
        ... definition ...
)

A record above is like a type-class but you can pass it as a first class value.

If we can 'promote' this to implicit use, we can have a single unified definition syntax. Maybe:

let list3 = mylist.append(list1, list2) // use explicitly
use mylist // make mylist implicit
let list6 = append(list4, list5) // use implicitly
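The record-as-type-class idea above can be approximated in Python as a minimal sketch. The names `ListOps` and `my_list_ops` are hypothetical, tuples stand in for `MyList`, and the manual binding at the end is only a rough stand-in for the proposed `use`, which would make the record implicit rather than copy one member into scope.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Tuple, TypeVar

A = TypeVar("A")


@dataclass(frozen=True)
class ListOps(Generic[A]):
    """Record of operations, playing the role of a type-class instance."""
    append: Callable[[A, A], A]


# Hypothetical instance: tuples stand in for the thread's MyList type.
my_list_ops: ListOps[Tuple[int, ...]] = ListOps(append=lambda l1, l2: l1 + l2)

# Explicit use: the record is an ordinary first-class value.
list3 = my_list_ops.append((1, 2), (3,))

# Rough stand-in for 'use mylist': bring the member into scope by hand.
append = my_list_ops.append
list6 = append((4,), (5, 6))
```

Because the record is a plain value, it can also be passed to any function that needs an `append`, which is what distinguishes it from a globally resolved type-class instance.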

keean avatar Sep 23 '16 07:09 keean

@keean wrote:

I am not sure I like pluggable for a type-class. If it's going to have that many letters either interface or typeclass would be better.

Can't be interface because it would be confused with the way interface works in many other OOP languages. To me as a n00b, typeclass means class, so more misunderstandings. pluggable has some meaning to a n00b such as myself. Sorry I am not an academic and they are only something like 0.01 - 0.1% of the population.

Q: "What is a pluggable API?" A: "It means that you can replace the implementation."

I personally can tolerate typeclass.

I am not sure we want to use | for both sum types and union types.

Why not? Sum types are an "or" relationship. Unions are an "or" relationship.

I prefer having 'implementation' before the type-class; having the type class first for implements seems inconsistent to me.

Inconsistent with what? implementation Thing Nil or implementation Nil Thing are not sentences and it is not clear which one is which. Nil implements Thing is a sentence and very clear which is the typeclass.

I am not sure why you put types in the function call syntax?

Afaik, I didn't. What are you referring to?

shelby3 avatar Sep 23 '16 08:09 shelby3

Ah I see:

someCallback(x:Type,y:Type => x + y)

This is ambiguous... is it calling someCallback with 'x' as the first parameter and y => x + y as the second? This would seem less ambiguous:

someCallback((x:Type,y:Type) => x + y)
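To make the ambiguity concrete, here is a small sketch of how a call's argument text would be split at top-level commas. The helper `split_top_level` is hypothetical and is not part of any ZenScript parser; it only tracks parenthesis depth.

```python
def split_top_level(args):
    """Split argument text at commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in args:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current argument.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


unparenthesized = split_top_level("x:Type,y:Type => x + y")
parenthesized = split_top_level("(x:Type,y:Type) => x + y")
```

The unparenthesized form splits into two arguments, `x:Type` and `y:Type => x + y`, while the parenthesized form stays a single argument, which is the reading intended for the callback.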

keean avatar Sep 23 '16 08:09 keean

@keean wrote:

This is ambiguous... is it calling someCallback with x as the first parameter and y => x + y as the second?

Good catch. I missed that one. It indeed conflicts with comma delimited groups in general, not just function calls. I will remove after sleep.

You didn't point out that problem to me when I suggested it. Remember I was trying to make the inline syntax shorter, to avoid the _ + __ shorthand problems.

Edit: there is another option (again :Type are optional):

someCallback(x:Type y:Type => x + y)

But that is still NFG! Because it is LL(inf) because without the leading ( it must backtrack from the space, unless we require Type to be a single token in that context (i.e. use type if need to define complex type as one token).

shelby3 avatar Sep 23 '16 08:09 shelby3

Personally I would rather have a single syntax for function definitions. If that is (with Type optional):

let f = (x:Type, y:Type) : Type => x + y

Then passing to a callback would be:

someCallback((x:Type, y:Type) : Type => x + y)

and then things are consistent. I think keeping things short is important, but I think consistency is even more important.

keean avatar Sep 23 '16 09:09 keean

@keean the only point was to have an optional shorthand syntax (instead of the inconsistent semantics of _ + _ or the obfuscating _ + __) for inline functions and to get rid of those gaudy juxtaposed parentheses someCallback((....

Thus we don't need : Type for the shorthand syntax, so I propose the following optional shorthand syntax which eliminates the LL(inf) problem as well:

someCallback(x y => x + y)

Which is shorter than and removes the garish symbol soup (( from:

someCallback((x,y) => x + y)

That being generally useful shorthand, enables your request for an optional syntax in the special case of single argument (which I was against because it was only for that one special case):

someCallback(x => x + x)

Instead of:

someCallback((x) => x + x)

However, it isn't that much shorter and the reduction in symbol soup isn't so drastic, so I am wondering if it is violating the guiding principle that I promoted?

Short inline functions might be frequent? If yes, then I vote for having the shorthand alternative since it would be the only way to write a more concise and less garish inline function in general for a frequent use case. Otherwise I vote against.

shelby3 avatar Sep 23 '16 19:09 shelby3

Are we optimising too soon? I have implemented the basic function parser for the standard syntax, is that good enough for now? I think maybe we should try writing some programs before coming back to optimise the notation. I would suggest sticking to "only one way to do things" for now, because that means there is only one parser for function definitions, which will keep the implementation simpler for now. What do you think?

keean avatar Sep 23 '16 19:09 keean

Thanks for reminding me about when I reminded you about when you reminded others to not do premature optimization.

I agree with not including the shorthand for now. Then we can later decide if we really benefit from it. I'll leave it in the syntax summary with a footnote.

shelby3 avatar Sep 23 '16 19:09 shelby3

The compiler can now take a string like this

id = (x) => x
id(42)

and compile it to:

id=function (x){return x;};id(42);

Next thing to sort is the block indenting, and then it should be able to compile multi-line function definitions and application.
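For readers who want to see the shape of that translation, the following is a toy sketch only. It is not the actual compiler (which is built on a parser-combinator library); it uses one regular expression, handles single-line definitions only, and the names `FUNC_DEF`, `compile_line` and `compile_source` are made up for illustration.

```python
import re

# Hypothetical pattern for a single-line binding: name = (params) => body
FUNC_DEF = re.compile(r"^\s*(\w+)\s*=\s*\(([^)]*)\)\s*=>\s*(.+?)\s*$")


def compile_line(line):
    """Translate one source line into a JavaScript statement."""
    match = FUNC_DEF.match(line)
    if match:
        name, params, body = match.groups()
        params = ",".join(p.strip() for p in params.split(",") if p.strip())
        return f"{name}=function ({params}){{return {body};}};"
    # Anything else is passed through as an expression statement.
    return line.strip() + ";"


def compile_source(source):
    """Translate every non-blank line and concatenate the results."""
    return "".join(compile_line(l) for l in source.splitlines() if l.strip())


js = compile_source("id = (x) => x\nid(42)")
```

With the two-line input above, `js` is the same string shown in the comment. Multi-line bodies would need the block-indentation handling mentioned as the next step.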

keean avatar Sep 23 '16 19:09 keean

I think we should have a provisional section, so we can split the syntax into currently implemented, and under consideration.

keean avatar Sep 23 '16 19:09 keean

@keean wrote:

I think we should have a provisional section, so we can split the syntax into currently implemented, and under consideration.

I'll do so if the † instances become numerous enough to justify duplication.

shelby3 avatar Sep 23 '16 19:09 shelby3

Link to discussion about unification of structural lexical scope syntax.

shelby3 avatar Sep 23 '16 20:09 shelby3

Are we sure having keywords let and var is the right way to go? If we have keywords for these we might want to have a keyword for functions? I quite like Rust's fn for introducing function definitions?

keean avatar Sep 23 '16 20:09 keean

@keean wrote:

Are we sure having keywords let and var is the right way to go? If we have keywords for these we might want to have a keyword for functions? I quite like Rust's fn for introducing function definitions?

Instead I have proposed unified functions around let and var.

What would be the alternative to not having let and var? I can't think of one that makes any sense. How would you differentiate re-assignment from initialization? Remember we already decided we can't prefix the type for reference initialization, because types are optionally inferred.

shelby3 avatar Sep 23 '16 21:09 shelby3

@keean wrote:

I think structs/objects would probably start with an upper case letter.

Agreed.

My suggestions on types of name tokens for the lexer to produce:

  • data and pluggable (or typeclass?): [A-Z]+[a-z][a-zA-Z]*

    (start uppercase, at least one lowercase, only alphabet)

  • named function references: [a-z][a-zA-Z]*

  • non-function and unnamed function references: [_a-z]+

  • type parameters: [A-Z]+

The exclusivity for type parameters for all uppercase is so they don't have to be declared with <A,B...>.

Edit: the distinction between named function and non-function references will be useful, because unnamed function references should be rarer. However, I was incorrectly thinking that it wouldn't make any sense to give function naming to unnamed function references (which have re-assignable references): the name would indicate that the reference is for a function, yet I wrongly assumed the reference could be reassigned a non-function type (but reference types can never change after initial assignment). So I think it would be safe to change the above to:

  • named and unnamed function references: [a-z][a-zA-Z]*
  • non-function references: [_a-z]+

The other advantage of that is the lexer can tell the parser to expect a function, which is more efficient and provides two-factor error checking.

Note the compiler must check that the inferred type of the reference matches the function versus non-function token for the name.

shelby3 avatar Sep 24 '16 04:09 shelby3

(Aside: Very few languages have a clean lexer and often you end up with lexer state depending on compiler state (string literals are a classic example). One of the advantages of parser combinators like Parsec is that you can write lexer-less parsers, and that cleans up the spaghetti of having the lexer depend on the state of the parse. )

  • If we do not introduce type-variables, we need to have different cases for type variables and types.
  • I don't like camel case :-( and prefer values and functions to_be_named_like_this.
  • There are not enough cases, as I would like to have something different for variables, types, and type-classes...

Conclusion, nothing is going to be perfect.

My favourite would be:

datatypes and typeclasses : [A-Z][a-zA-Z_0-9][']+
functions and variables : [a-z][a-zA-Z_0-9][']+

This would have both type variables and value variables lower case.

I like the mathematical notation of having a 'prime' variable:

let x' = x + y

keean avatar Sep 24 '16 07:09 keean

Comment about function syntax. Edited the OP to reflect this change.

@keean note where is already documented for functions in the OP.

shelby3 avatar Sep 24 '16 15:09 shelby3

@keean wrote:

  • If we do not introduce type-variables, we need to have different cases for type variables and types.

Agreed.

For readers, by "If we do not introduce type-variables" you mean if we do not prefix <A, B, ...> in front of functions. I am not proposing to remove that when it is a suffix of a type name.

  • There are not enough cases, as I would like to have something different for variables, types, and type-classes...

You have a point, but it is not an unequivocal one. We can require typeclasses begin with a lowercase i or uppercase I followed by a mandatory uppercase letter. If we choose the lowercase variant, we can disallow this pattern for function names.

It not only helps to read the code without syntax highlighting (and even 'with', if you don't want rainbow coloring of everything), it also speeds up the parser (because the lexer provides context).

Very few languages have a clean lexer and often you end up with lexer state depending on compiler state (string literals are a classic example)

If the string literal delimiters are context-free w.r.t. the rest of the grammar, then the lexer can solely determine what is inside a string literal and thus not parse those characters as other grammar tokens (aka terminals). Edit: the proposed paired delimiters will resolve this issue.

I believe if the grammar is context-free (or at least context-free w.r.t. name categories) this will reduce conflation of lexer and parser state machines. That is why I suggested that we must check the grammar in a tool such as SLK, so that we can be sure it has the desirable properties we need for maximum future optimization. I am hoping we can also target JIT and that ZenScript becomes the replacement for JavaScript as the language of the world. Perhaps the type checker for our system will be simpler than subclassing and thus performant. Even Google now realizes that sound typing is necessary to solve performance and other issues.

One of the advantages of parser combinators like Parsec is that you can write lexer-less parsers, and that cleans up the spaghetti of having the lexer depend on the state of the parse.

I still need to come up to speed on the library you are using to know what I think about tradeoffs. Obviously I am in favor of sound principles of software engineering, but I really can't comment about the details yet due to lack of sufficient understanding. I will just say I am happy you are working on implementing and I am hoping to catch up and also look at other aspects you may or may not be thinking about.

I don't like camel case :-( and prefer values and functions to_be_named_like_this.

The _ is verbose (also symbol soup) and I try to avoid it wherever I can. I try to use non-function references that are single letters or words. But function references very often can't be single words. Also calling methods with . gets symbol soup noisy when there are also _ symbols in there. I do understand that camel case for values (references) is similar to the camel case that is in type names and only difference being the proposed upper vs. lowercase first letter (and then further overloaded by the i variant of the proposal above for distinguishing typeclasses); but this is irrelevant because function names do not appear in typing declarations (unless we opt for nominal typing of functions which I am not sure what that would mean).

Note I had a logic error in my prior comment, in that single word function and non-function names were indistinguishable in what I proposed. But that doesn't destroy the utility of the cases where function names are camel case.

datatypes and typeclasses : [A-Z][a-zA-Z_0-9][']+

I want to make what I think should be a convincing rational point about proper names.

I don't like _ in type names. For me a type name should read like a proper name, headline or title where each word has its first letter capitalized. We don't put such punctuation in a title normally in English. Simulating spaces with _ is ugly symbol soup. It removes the elegance of a title. It is better to just keep the first-letter capitalization and smash together without the spaces. Instead you prefer to remove the first-letter capitalization and convert spaces to _, which is removing the first-letter capitalization attribute of a title which is the sole attribute that differentiates a proper name from other forms of English. Spaces are not the differentiating attribute of proper names. If you instead proposed to retain first-letter capitalization after each _, you would have a more consistent argument (but I would still argue that the _ is noise symbol soup redundancy since we have the camel case to distinguish words).

So I can objectively conclude your preference is not consistent to types as proper names, headlines, or titles, which is what they are.

<joke>You are British, so you should be more proper than me, lol.</joke> Although my last name is "Moore" and first name was a family name "Shelby" originating from north England meaning "willow manor". And I've got "Hartwick" (German), "Primo" (southern France/Italian) and "Deason" (diluted Cherokee native American) ancestry as well.

I like the mathematical notation of having a 'prime' variable:

let x' = x + y

I don't think I have an objection to this as a suffix only. ~Why not allow unicode subscript characters as well?~(Edit: we have array indices for this)

Edit: however one issue with camel case and no underscores is when an entire word which is an acronym is not delimited by the capitalization of the word which follows it, e.g. NLFilter (for NL as an acronym for newline). In that example, I might prefer to name it NL_Filter, i.e. the underscore only allowed when it follows and is followed by a capitalized letter.

shelby3 avatar Sep 24 '16 17:09 shelby3

@keean wrote:

datatypes and typeclasses : [A-Z][a-zA-Z_0-9][']+
functions and variables : [a-z][a-zA-Z_0-9][']+

You didn't differentiate from ALLCAPS type parameters above. Also your regular expression seems incorrect, as + means 1 or more. Perhaps you are employing a syntax that is peculiar to your Parsec library?
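The objection about `+` can be checked directly. In the sketch below, `AS_WRITTEN` is the datatype pattern exactly as quoted, and `ASSUMED_INTENT` is only a guess at what was meant (any number of body characters followed by optional primes); neither name comes from the thread.

```python
import re

# Pattern as quoted: exactly one body character, then one or more primes.
AS_WRITTEN = re.compile(r"[A-Z][a-zA-Z_0-9][']+")

# Assumed intent (a guess): any number of body characters, optional primes.
ASSUMED_INTENT = re.compile(r"[A-Z][a-zA-Z_0-9]*[']*")


def accepts(pattern, name):
    """Return True when the whole name matches the pattern."""
    return pattern.fullmatch(name) is not None
```

Under standard regular-expression semantics, `accepts(AS_WRITTEN, "List")` is false because no prime follows, while `accepts(AS_WRITTEN, "Li'")` is true. The assumed-intent pattern accepts both `List` and `List'`.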

Note that JavaScript allows $ in names, so if we want full interoperability then we need to allow it. Perhaps there are other ways we could work around and support interoperability with the $? Note JavaScript also supports some Unicode, but if we support that we are allowing ZenScript source code to resemble Dingbats art. Perhaps we should only allow $ and Unicode in names that have been declared as FFI?

So the ' will be emitted to JavaScript names as $prime same as for PureScript because it (nor the correct  ′ symbol) is not a valid character in identifier names? Or we could convert these to single and double x̿ (x̿) overline characters (or single and double vertical line above) characters which are valid for JavaScript identifiers names. Should we also offer the π, τ, , , (or more correctly gamma γ), 𝑒, and φ symbols or entire Greek alphabet αβγδεζηθικλμνξοπρςτυφχψω as identifier names since they are valid for JavaScript? Ditto double-struck alphanumerics 𝕒𝕓𝕔𝕕𝕖𝕗𝕘𝕙𝕚𝕛𝕜𝕝𝕞𝕟𝕠𝕡𝕢𝕣𝕤𝕥𝕦𝕧𝕨𝕩𝕪𝕫𝔸𝔹ℂ𝔻𝔼𝔽𝔾ℍ𝕀𝕁𝕂𝕃𝕄ℕ𝕆ℙℚℝ𝕊𝕋𝕌𝕍𝕎𝕏𝕐ℤ𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡, mathematical gothic 𝔞𝔟𝔠𝔡𝔢𝔣𝔤𝔥𝔦𝔧𝔨𝔩𝔪𝔫𝔬𝔭𝔮𝔯𝔰𝔱𝔲𝔳𝔴𝔵𝔶𝔷𝔄𝔅ℭ𝔇𝔈𝔉𝔊ℌℑ𝔍𝔎𝔏𝔐𝔑𝔒𝔓𝔔ℜ𝔖𝔗𝔘𝔙𝔚𝔛𝔜ℨ (also 𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟𝕬𝕭𝕮𝕯𝕰𝕱𝕲𝕳𝕴𝕵𝕶𝕷𝕸𝕹𝕺𝕻𝕼𝕽𝕾𝕿𝖀𝖁𝖂𝖃𝖄𝖅), and mathematical script 𝓪𝓫𝓬𝓭𝓮𝓯𝓰𝓱𝓲𝓳𝓴𝓵𝓶𝓷𝓸𝓹𝓺𝓻𝓼𝓽𝓾𝓿𝔀𝔁𝔂𝔃𝓐𝓑𝓒𝓓𝓔𝓕𝓖𝓗𝓘𝓙𝓚𝓛𝓜𝓝𝓞𝓟𝓠𝓡𝓢𝓣𝓤𝓥𝓦𝓧𝓨𝓩 (also 𝒶𝒷𝒸𝒹ℯ𝒻ℊ𝒽𝒾𝒿𝓀𝓁𝓂𝓃ℴ𝓅𝓆𝓇𝓈𝓉𝓊𝓋𝓌𝓍𝓎𝓏𝒜ℬ𝒞𝒟ℰℱ𝒢ℋℐ𝒥𝒦ℒℳ𝒩𝒪𝒫𝒬ℛ𝒮𝒯𝒰𝒱𝒲𝒳𝒴𝒵)?
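A minimal sketch of the `$prime` option, assuming each trailing `'` is rewritten independently. The function name `mangle_primes` and the simplified ASCII-only identifier check are illustrative assumptions; real JavaScript identifiers also allow many Unicode characters.

```python
import re

# Simplified, ASCII-only approximation of a JavaScript identifier.
ASCII_JS_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def mangle_primes(name):
    """Rewrite each prime as '$prime' so the name is usable in JavaScript."""
    return name.replace("'", "$prime")


def is_ascii_js_identifier(name):
    """Check the name against the simplified identifier pattern above."""
    return ASCII_JS_IDENTIFIER.fullmatch(name) is not None
```

For example, `mangle_primes("x'")` gives `x$prime`, which passes the identifier check, whereas the unmangled `x'` does not. One consequence of this scheme is that a source name already containing `$prime` could collide with a mangled one, which is another argument for restricting `$` to FFI names.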

Here is what I arrive at now in compromise:

  • type parameter: [A-Z][A-Z0-9]*
  • data: (?:(?:(?:[A-H]|[J-Z])[A-Z]*)|I)[a-z][a-zA-Z0-9]*[']*
  • typeclass:I[A-Z]+[a-z][a-zA-Z0-9]*[']*
  • function references: [a-z_$][a-zA-Z_0-9$]*[']*
  • non-function references: [a-z_$][a-z_0-9$]*[']*

I like the leading I on typeclasses, so we capture the notion they are interfaces without conflating the keyword with the incompatible semantics of interface in other programming languages.

Edit: no need to allow uppercase in non-function references. Who on God's earth is using camel case for variable (i.e. non-function) reference names? :laughing:
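The five patterns above can be tried out with a short classifier. This is only a sketch: the `CATEGORIES` table copies the regular expressions verbatim, while the `categories` helper and the choice to return every matching label are illustrative assumptions rather than a description of the real lexer.

```python
import re

# Patterns copied verbatim from the list above.
CATEGORIES = {
    "type parameter": r"[A-Z][A-Z0-9]*",
    "data": r"(?:(?:(?:[A-H]|[J-Z])[A-Z]*)|I)[a-z][a-zA-Z0-9]*[']*",
    "typeclass": r"I[A-Z]+[a-z][a-zA-Z0-9]*[']*",
    "function reference": r"[a-z_$][a-zA-Z_0-9$]*[']*",
    "non-function reference": r"[a-z_$][a-z_0-9$]*[']*",
}


def categories(name):
    """Return every category whose pattern matches the whole name."""
    return [label for label, pattern in CATEGORIES.items()
            if re.fullmatch(pattern, name)]
```

Names such as `T`, `List`, `IFunctor` and `someCallback` each land in exactly one category. An all-lowercase name such as `append` matches both reference categories, so the lexer alone cannot tell a function from a non-function there and the type checker still has to decide.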

shelby3 avatar Sep 24 '16 17:09 shelby3

@keean wrote:

Please no all caps, it's like shouting in my text editor :-(

:eyes:

Type parameters will nearly always be a single letter. We both must compromise to what is rational. I have compromised above forsaking required camel case on functions. I also compromised (well more like I fell in love once we eliminated need for subclassing syntax) and accepted Haskell's data unification of Sum, Product, Record, and Recursive types.

I don't want the noise of declaring <A, B...> on functions. That is egregiously more DNRY noisy, than any choice between uppercase and lowercase single letters. Function declarations are too cluttered.

Also the lowercase letter choice for type parameters is not idiomatic and it has no visual contrast in the x: a arguments. You can't compare to Haskell, because Haskell puts the function type declaration on a separate line. Sorry the lower case type names don't work once we merge typing into the same line.

Also type parameters are types, thus they should not be lowercase. That would be inconsistent with our uppercase first-letter on all types.

The lowercase type parameters of Haskell (combined with lack of <>) still cause me to not be able to read Haskell code quickly. It took me many attempts at learning Haskell where I failed, because of differences like that from mainstream languages such as Java and C++.

If you are making a Haskell language, I don't think it will be popular. I am here to make a popular language, thus I will resist you on this issue.

One of my necessary roles here is to provide the non-Haskell perspective.

Let's do something very cool and eliminate the need to declare <A, B...>. We need advantages to our language in order to attract love and attention. Programmers love DNRY.

ML prefixes type variables with a character (happens to be 'a but could be anything)

:anguished:

I absolutely hate that. First time I saw that, I was totally confused. And I hate Rust's lifetime annotations littering the code with noise. I don't like Haskell and ML syntax. Not only am I lacking familiarity (not second-nature) with their syntax, but I dislike much of the syntax (and even some of the concepts) of those academic languages for logical reasons which I have explained in prior comments. I realize their target market is the 0.01 - 0.1% of the population that are academics (and what ever subset of that which are programmers). If you want to bring in most of the syntax and the obtuseness from those languages, then I think we have different understanding of what the mainstream wants.

I am not a verbal thinker. I always score higher on IQ tests that are measuring visual mathematical skills, rather than verbal skills. My I/O engine is weaker than my conceptual thought engine (I think this is why I get fatigued with discussions because my I/O engine can't keep up with my thoughts). My reading comprehension of English is 99th percentile, but my articulation and vocabulary are in the high 80s or low 90s. So apparently I dislike complex linguistic computation. I seem to struggle more with sequencing or the flattening out what I "see" in multi-dimensions into a sequential understanding. My math and conceptual engine is higher (more rare) than 99th percentile, but not genius.

So someone with more highly developed linguistic computation than myself, would probably find my desire for linguistic structure to be arbitrary and unnecessary. I've been working on my weakness, but I do find it takes energy away from my thought engine, which is where I feel more happy and efficient.

Also I note you want to get rid of the type parameter list, you do realise sometimes you need to explicitly give the parameters if they cannot be inferred like:

let x = f<Int>(some_list)

Please differentiate between function declaration and function call.

I had written about that 3 days ago:

It is much less noisy and often these will be inferred at that function call site, so we won't often be doing f<Int,Int,Int>(0, 1) so the explicit correspondence to <A, B, C> [on function declaration] probably isn't needed for aiding understanding.


@keean wrote:

This also makes me realise another ambiguity in our syntax, it is hard to tell the difference between assigning the result of a function and assigning the function itself as a value. In the above you might have to pass to the end of the line to see if there is a '=>' at the end. This requires unlimited backtracking which you said you wanted to avoid.

Please catch up with recent corrections to the syntax.

shelby3 avatar Sep 24 '16 18:09 shelby3

@shelby3 wrote:

I absolutely hate that. First time I saw that, I was totally confused. And I hate Rust's lifetime annotations littering the code with noise.

So I am totally with you on the above. The problem is without introducing the type variables, how do we distinguish between types and variables, for example:

let x(a : A, b : B) : C

Are they single letter types, or type variables?

We often want to re-use type variables like 'A' a lot consider:

let f(x : A) =>
    let g(y : A) =>
        // is 'y' the same type as 'x' ?

The problem with making type variables all uppercase is it does not distinguish type names. Do we insist that all type names have more than one letter?

keean avatar Sep 24 '16 18:09 keean

@keean wrote:

Are they single letter types, or type variables?

Type variable per the regular expressions I proposed.

:thought_balloon: I see you are preferring "type variables" to the term "type parameters". I suppose this is to distinguish from function parameter (arguments).

Do we insist that all type names have more than one letter?

Yes. However...

I see now our conflict in preferences. I am thinking type names should be informational; single letter proper names are extremely rare and not self documenting, so I thought it was okay to just not allow them. You are apparently thinking of supporting math notation in code. Which is evident by your data R example and your suggestion to allow ' at end of all names.

Mainstream programmers typically don't (or rarely) want to do math notation in code.

In my proposal they can still get math notation with data R' instead of data R.

:bulb:

I think there is another solution which would give you single-letter data, and keep my desire to eliminate the <A, B...> declaration noise. When a data type is intended, then for the first mention of the single-letter, put x: data.R. If the first mention is a product (tuple) or record constructor in code, then data.R(...) or data.R{...}.

:hammer:

And when there is a single-letter data name in conflict with a type parameter in scope, then I think we should have a compiler warning that must be turned off with a compiler flag. The warning should tell the programmer to use the data. prefix if that is what is intended (which turns off the warning), else use the compiler flag or remove the conflict from scope. Or alternatively, we could not allow single-letter data names in scope, unless a compiler flag is turned on.

Would that solve the problem for you? I don't think the single-letter data will be used by most or often, so those who need it can pay this slight verbosity and special case cost, so that everyone else can enjoy brevity and simplicity more often.

shelby3 avatar Sep 24 '16 19:09 shelby3

@shelby3 wrote:

Also I note you want to get rid of the type parameter list, you do realise sometimes you need to explicitly give the parameters if they cannot be inferred like:

let x = f<Int>(some_list)

Please differentiate between function declaration and function call.

I had written about that 3 days ago:

It is much less noisy and often these will be inferred at that function call site, so we won't often be doing f<Int,Int,Int>(0, 1) so the explicit correspondence to <A, B, C> [on function declaration] probably isn't needed for aiding understanding.

There is a problem remaining. The order of the type parameters in the optional <...> list on function calls may be ambiguous? I think we can adopt the rule that it is the ~order in which they appear in the function declaration.~ (Edit: I propose instead alphabetical order so that the programmer has more flexibility to order them so that the implicit ones can be first on function call <…> annotations, and this will also defeat some refactoring bugs.)

:warning:

Edit: and that leads to a very obscure and probably very rare programmer error: if not all the type variables are specified in the argument list (i.e. some are only in the where clause), a change to the where clause that doesn't change the call site type may still change the order of the type variables. But all programming languages have some sort of rare obscure programmer errors.
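A sketch of the alphabetical-order rule, assuming type variables are recognised purely by the all-caps pattern proposed earlier in the thread. The `type_variables` helper and the example declarations are hypothetical.

```python
import re

# All-caps tokens, per the proposed type parameter pattern [A-Z][A-Z0-9]*.
TYPE_VARIABLE = re.compile(r"\b[A-Z][A-Z0-9]*\b")


def type_variables(declaration):
    """Collect the distinct type variables of a declaration, alphabetically."""
    return sorted(set(TYPE_VARIABLE.findall(declaration)))


before = type_variables("let f(y: B, x: List<A>): A|B where Subtype<B, C>")
after = type_variables("let f(y: B, x: List<A>): A|B where Subtype<C, B>")
```

Both calls yield `A`, `B`, `C`, so reordering the where clause does not change which position an explicit call-site annotation binds to. Renaming a type variable would still change the order, so the rule narrows the refactoring hazard rather than removing it.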

Edit#2: and note it should be quite odd and extremely rare that the programmer wants to constrain at the call site, a type variable that is not in the argument list or result value. Also the following function call is much more informational than f<Int,Int,Int>(0, 1):

:bulb:

f(x:Int, y:Int): Int

And allows us to specify only some constraints:

f(x:Int,y):Int

And it is more consistent with the syntax of function declaration.

Or if we prefer:

(Int)f((Int)x,y)

So maybe we can disallow <A,B...> on functions (declaration and call site) except for specifying type variables which don't appear in the argument list or result? Which should be almost never.

shelby3 avatar Sep 24 '16 19:09 shelby3

The number and function of the type parameters is not the same as the number and type of the arguments, some type parameters may only occur in the where clause. Consider:

f<A>(x : B) where Subtype<B, A>

Note, Rust would not allow this, as you have to introduce all type parameters, which makes them less useful.

Really we have to have the type parameters if we want to have parametric types (that is types that are monomorphisable). If we are happy to give up monomorphisation we can have universally quantified types instead, and then there is no need to have type parameters at all.

In some regards I would prefer universally quantified types from a purely type system perspective, but it is much easier to implement monomorphisation with parametric types.

If you really want to get rid of the type parameters, then let's switch to universal quantification.

keean avatar Sep 24 '16 20:09 keean

I would rather say minimum two letters the second of which must be lower case for datatypes, and all caps for type variables.

Also we can use universal quantification to get rid of type parameters (although it does change what types are valid in the type system).

keean avatar Sep 24 '16 20:09 keean

I would suggest lexical scoping for type variables, so in my example above the A would be the same for both.

I think this satisfies the principle of least surprise.

keean avatar Sep 24 '16 20:09 keean

@keean wrote:

The number and function of the type parameters is not the same as the number and type of the arguments, some type parameters may only occur in the where clause. Consider:

f<A>(x : B) where Subtype<B, A>

Did you not read the comment of mine immediately before yours?

I also explained that exact issue and offered a solution.

shelby3 avatar Sep 24 '16 20:09 shelby3

Here's an interesting one, we need to write the type of a function, and we agree function definition should be an expression.

let x : (Int, Int) : Int = x y => ...

This should be possible too because I will want to pass functions to other functions:

let f(g : (Int, Int) : Int) : Int =>
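For comparison, this is how the same two situations look with Python's type hints: a value annotated with a function type, and a function that accepts such a value. The alias `BinaryIntOp` and the names `add` and `apply_to_pair` are made up for this sketch, and Python's `Callable` notation is only an analogy for the proposed `(Int, Int) : Int` syntax.

```python
from typing import Callable

# Function type taking two ints and returning an int.
BinaryIntOp = Callable[[int, int], int]

# A reference annotated with a function type, bound to an unnamed function.
add: BinaryIntOp = lambda x, y: x + y


def apply_to_pair(g: BinaryIntOp) -> int:
    """Accept a function as an argument and apply it to fixed inputs."""
    return g(2, 3)


result = apply_to_pair(add)
```

The point carried over from the comment is that the function type has to be writable on its own, independent of any definition, so that it can appear both in a reference annotation and in a parameter list.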

keean avatar Sep 24 '16 20:09 keean

@shelby3 wrote:

Did you not read the comment of mine immediately before yours?

I also explained that exact issue and offered a solution.

Type parameters are like function arguments but for types; you can't just have some rule for inferring them from the rest of the declaration.

If you don't want type-parameters for functions, can we move to universal quantification, which does not require them?

keean avatar Sep 24 '16 21:09 keean

@keean wrote:

If we are happy to give up monomorphisation we can have universally quantified types instead, and then there is no need to have type parameters at all.

No. We long ago realized that we can't do these higher-order features and have global inference. Impossible or at the minimum beyond our available resources, time, and brain power.

In some regards I would prefer universally quantified types from a purely type system perspective, but it is much easier to implement monomorphisation with parametric types.

I wouldn't prefer to have universally quantified types. We discussed this already. I don't want to repeat that discussion. I explained that all public APIs need types. I gave my reasons. No need to repeat here.

If you really want to get rid of the type parameters

I never proposed to get rid of type variables aka type parameters. Why are you introducing a tangent? Please, we need to stop wasting time on tangents that were already decided before.

shelby3 avatar Sep 24 '16 21:09 shelby3