
Loads of things (information overload ;)

Open ozra opened this issue 11 years ago • 5 comments

Your ideas about the language awoke my curiosity and prompted a few ideas of my own:

As you know, I'm intrigued by the SugarCpp project, syntactically. I've been in the web/nodejs business for a couple of years now, coding LiveScript. I've also attended a bunch of meetups with one of the founders of Erlang, one of the first external (outside of Ericsson LM AB) Erlang users, now a devoted Haskell coder, where a lot of focus has been on Haskell. And, since a few months back, I'm into C++ coding. It's basically like writing pseudo code, changing a few characters here and there, and you're good to go. Well, that's some background, which might hint at my perspectives.

I really like your ideas about two-way type inference (not having to template manually for trivial cases), UFCS - perhaps 'both ways' (using a method as a function), and the matching capabilities. The for..else is such a clever idea - so natural for so many (imperative) cases.
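As a rough illustration of why for..else maps cleanly onto C++ output, here is a minimal sketch. It assumes Python-like semantics (the else branch runs only when the loop finishes without a break); the source syntax in the comment and the name `first_negative_or_zero` are made up for the example and are not taken from the project.

```cpp
#include <cassert>
#include <vector>

// Hypothetical source in a for..else language (Python-like semantics assumed):
//   for x in xs { if x < 0 { break x } } else { 0 }
// One possible C++ lowering: "break value" becomes an early return,
// and the else branch is whatever follows the loop.
int first_negative_or_zero(const std::vector<int>& xs) {
    for (int x : xs) {
        if (x < 0) {
            return x;  // corresponds to "break x"
        }
    }
    return 0;  // corresponds to the "else" branch: the loop ran to completion
}

// Small self-check covering both paths.
void for_else_demo() {
    assert(first_negative_or_zero({3, -7, 5}) == -7);  // left the loop via "break"
    assert(first_negative_or_zero({3, 5}) == 0);       // loop completed, "else" ran
}
```

Without the construct, the same logic usually needs a separate "found" flag that is checked after the loop.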

Significant ws + voluntary curlies

  • Would you consider significant whitespace? I really do think it makes for much clearer, more readable code, and when it's needed, voluntary block braces would really be heaven-sent. If someone wants to dabble with the language and get started, braces can be used throughout if wanted, like in C/C++/D/E/JS/Java, well - the bracy ones.
  • Would you consider, in that perspective, "newline indented to same level as expression start line" = expression termination? That is, as soon as a new line arrives that is not indented, the expression is closed, unless of course it's still inside an open parenthesis. C++ terminates with semicolons. Rust, Pascal etc. separate with semicolons. I must say that unfortunately I do not find the Rust model intuitive - utilizing an empty statement in order to return void! Why clutter up all the source just to be able to not put an empty statement last in the block? I think it would even be better to specifically use return, or always return the last expression and specifically write void at the end instead of ;. There are many other parts of Rust I like a lot, just not the shallow, on-the-surface part of semis and braces; in my eyes they obscure.

C++11 as "assembly" / "object code" language

  • Also, I believe strongly in using C++11 as intermediary language. The strength of a language for high-performance systems coding - in my eyes - is not whether it compiles directly to CPU-specific assembly, through LLVM IR or whatever. No. It's simply that it is more productive, safer, and gets the job done! An LLVM code producer could simply be added at a later stage when the language itself has gone through the experimental development phases, matured and stabilized. Developing the language with such an easily readable and recognizable target helps speed up the development process. And I don't think there is any case that can't be covered with C++ as target "object code", since it has all the low-level capabilities (which is what we want, with added ruggedness). For instance, open multi-methods could be aided with https://github.com/jll63/yomm11, and generalized delegates/closures can be implemented so that they compile to two assembly instructions per call. I've got a need for speed, and I don't want to give up clarity for it. It would also be much easier to make plugins for IDEs, because one can always use the C++ code behind the scenes for symbol lookups, search etc. to aid auto-completion. Any language that could replace C++ "raw syntax" for me, I would make plugins and tools around.
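To make the delegate remark concrete, here is a minimal sketch of one common way to build such a delegate in C++11-style code: a context pointer plus a plain function pointer. The names `Delegate`, `bind` and `Counter` are invented for this example, and the "two instructions per call" figure is the post's claim, not something this sketch verifies; the actual instruction count depends on compiler, ABI and inlining.

```cpp
#include <cassert>

// A minimal delegate: one context pointer plus one plain function pointer.
// Invoking it loads the two fields and makes an indirect call.
template <typename Signature> class Delegate;

template <typename R, typename... Args>
class Delegate<R(Args...)> {
public:
    // Bind a member function of an existing object.
    // The object must outlive the delegate; nothing is copied or allocated.
    template <typename T, R (T::*Method)(Args...)>
    static Delegate bind(T* object) {
        Delegate d;
        d.context_ = object;
        // A captureless lambda converts to an ordinary function pointer.
        d.thunk_ = [](void* ctx, Args... args) -> R {
            return (static_cast<T*>(ctx)->*Method)(args...);
        };
        return d;
    }

    R operator()(Args... args) const {
        assert(thunk_ != nullptr);  // calling an unbound delegate is a bug
        return thunk_(context_, args...);
    }

private:
    void* context_ = nullptr;
    R (*thunk_)(void*, Args...) = nullptr;
};

// Example receiver type used only for illustration.
struct Counter {
    int total = 0;
    int add(int n) { total += n; return total; }
};
```

Usage would look like `auto d = Delegate<int(int)>::bind<Counter, &Counter::add>(&counter);` followed by `d(2)`. Because the member function is a template argument, the compiler can often inline the thunk body.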

What I personally would enjoy to see

  • compatibility with existing high performant libraries in the de facto fastest system language (C++)
    • This way you can use headers directly, you can benefit from link time optimizations, and even compile time cross unit optimizations based on inlining and templating from a huge existing base.
  • readability, clarity of intent (you write a line once, you look at it thousands of times)
  • ease of adoption (hence good "how to" guides for coders coming from different languages - with C++ this is dead simple when the output is C++; dabblers can easily use an online compiler to learn, for instance)
  • tools support (editors, building, meddling with the source in any way)
  • there are far more CPU targets (embedded etc.) reached through C++ than with LLVM alone.
  • with a bit of will, it could be rewritten to support Java, C# or even JavaScript with perhaps some limits in functionality..
  • when users know that it "basically is C++", but better, they'll adopt it more easily, because they can back out by simply generating C++ and continuing in that. This makes room for people to dare take the step faster. This is what we've seen in the coffeescript scene - javascript is basically just seen as "the object code of the web" (there's actually a project aiming at defining a subset of javascript so that it can be used specifically as "assembly" without breaking compat - but that's another story)

When it comes to syntax, I believe in keeping an open mind and, as scientifically as possible, identifying the absolutely clearest, most readable syntax one can come up with for the code that is most commonly written, to get the most stable results.

Now, this is all ideas and wishful thinking. In any event, I'm really happy to see your project! New languages are always cool! :-)

It saddens me that I don't have the time required to implement a language from scratch in my current life situation, but I'd be happy to collaborate and contribute on one, and I'd be happy to start using one in practical projects soon.

ozra avatar Dec 08 '14 22:12 ozra

Oh, and yes, Haskell has the notion of "layouts", you can code it in either significant whitespace or curlies!

ozra avatar Dec 08 '14 23:12 ozra

"but I'd be happy to collaborate and contribute on one,"

First and foremost, I'd welcome any collaborators. I realise 1 person making a language for 1 user is highly uneconomical (this is why I've gone with this hybrid 'rust' idea, hoping I can benefit from Rust tooling, even just their syntax highlighting appearing in editors).

  • I realise I'll have to clean up my source base; you'll find my code is an awkward mix of C-with-classes & C++11 at the minute. If you've got any suggestions for cleanup or want some docs, let me know.

I always bounce because I'm reluctant to commit to C++'s header complexity. Traditionally I always preferred rolling a simplified vector of my own vs std::, but I've gone with std:: now.

"Well, that's some background, which might hint to my perspectives."

My own background is game development, for which C++ rules. I know relatively less about web languages etc., but have always been fascinated by things like Lisp, Haskell and Erlang. It's a shame C++ took so long to get lambdas. This is basically "the language I wish I had..". Language choices without GC are so limited. Rust focusses too much on safety, although I can see it's probably superior for most users. Game development needs performance & productivity.

Runtime data structures are simple, reworked for minimal allocation/pointer chasing, and conditioned by offline tools. Tools don't need to be so tuned, but it is useful to move code back & forth... I've tried to explain this to the Rust community and they don't really get it... how one can have a need for C/C++-like performance AND productivity in the same package, moving between different extremes in different places. For performance-oriented code, I still sometimes like the purity of raw pointers over iterators. Most of my own programming time is spent debugging by writing visualisations of intermediate states.. there's a lot of code that doesn't need to be fast but does need to be in the same place.

"Would you consider significant whitespace?" / ""newline indented to same level as expression start line" = expression termination"

YES. At the very least I want to get the logic in there to give errors when braces & indentation don't match - after all the time one spends 'hunting the missing brace..'. Then it'll be a simple step to "brace insertion". I think the same parser engine should be able to handle both. For the minute, my brain freaks out a little without {..} but I conceptually prefer significant whitespace. I might start calling braces and template parameters abstract tokens, e.g. BLOCK_BEGIN/END and TYPARAM_BEGIN/END, for switching on these. (There's similar template ambiguity to consider and I've already gone down the Scala [T] route; I'll need to make that switchable for rust-user-appeal.)

One thing though I've come to realise is that { } ; and even C++'s extraneous if () all help to disambiguate <T> from lt/gt, but the hack I've tried there also declares that type-params must be on the same line..
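The "error when braces & indentation don't match" check could be sketched roughly as follows. This is not the project's parser: it is a line-based approximation with loudly stated assumptions (fixed spaces per indent level, no tabs, no braces inside strings or comments), and the function name `first_indent_mismatch` is invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the 1-based number of the first line whose indentation disagrees
// with the brace nesting depth, or 0 if every line agrees.
// Assumptions (not from the original post): `indent_width` spaces per level,
// no tabs, and braces never appear inside strings or comments.
std::size_t first_indent_mismatch(const std::vector<std::string>& lines,
                                  std::size_t indent_width = 4) {
    assert(indent_width > 0);
    std::size_t depth = 0;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::string& line = lines[i];
        const std::size_t first = line.find_first_not_of(' ');
        if (first == std::string::npos) continue;  // blank line: nothing to check

        // A line that starts with '}' closes a block, so it sits one level out.
        std::size_t expected = depth;
        if (line[first] == '}' && expected > 0) --expected;

        if (first != expected * indent_width) return i + 1;

        // Update the nesting depth from the braces on this line.
        for (char c : line) {
            if (c == '{') ++depth;
            else if (c == '}' && depth > 0) --depth;
        }
    }
    return 0;
}
```

Brace insertion is then the mirror image: instead of comparing the indentation against the depth, a BLOCK_BEGIN or BLOCK_END token is emitted whenever the indentation level changes.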

I have other priorities right now - but I am thinking about it.

"or always return last, and specifically write void in the end instead of ;."

I'll make sure I parse a return keyword - and my parser basically 'inserts void'; that's in the AST dump... writing 'void' might work accidentally already. Perhaps in "significant whitespace mode" a mandatory "return" would be a great compromise.

"C++11 as assembly language"

I'm in the middle on this - I definitely want a C backend; I'll say I like the simplicity of C vs C++ as a target too, but I can see from SugarCpp that you can leverage "the full power of C++11" and not have to replicate everything. I agree that C or C++ is a nice universal base.. I personally hope the C language lives on - replace C++ with alternative OOP & metaprogramming schemes. Or perhaps a C++ compiler could get an alternative modernised front end to the same AST & 'middle'.

"when users know that it "basically is C++", but better, they'll easier adopt it, because they can back out, by simply generating C++ and continue in that,"

I agree that a compile-to-readable-C++ mode, just like SugarCpp, would be a great thing to have - inspiring confidence for people (like myself) who are reluctant to try anything not C++.

I already worked through some complexity in generating LLVM 'phi nodes' (half broken, so I've shelved it) and I've tried to abstract the way the code generator passes values around: a 'CgValue' that can be a register, a literal/variable, or a reference. I'm adding slightly higher-level entities to the 'CodeGen' interface, i.e. "c-like for" is the looping "primitive" (make anything else by omitting parts), so it should be possible to slot in an alternate back end; I've got both 'get element(index)' and 'get element(name)' abstractions in there. There's still some residual mess where my AST node "::compile()" methods spit out LLVM text; it's very fiddly and error-prone, but I'm getting there cleaning it up.

r.e. universality of C vs LLVM - from my background in game development (similar to embedded) I've experienced many platforms with custom from early days and of course have seen how C was their ubiquitous choice. For example, when the xbox 360 first came out, they initially made their SIMD extensions accessible through C intrinsics, but not through C++ classes. Of course in 2014 people take a C++ compiler for granted.
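For readers unfamiliar with the idea, a value abstraction along the lines of 'CgValue' might look roughly like the sketch below. Only the name `CgValue` and the three kinds of value come from the comment above; the struct names, the fields, the use of C++17 `std::variant`, and the operand text format are all assumptions made for illustration, not the project's actual code.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-ins for the three kinds of value mentioned above.
struct Register  { int id; };            // a temporary, e.g. %3
struct Literal   { long long value; };   // an immediate constant
struct Reference { std::string name; };  // an addressable variable

using CgValue = std::variant<Register, Literal, Reference>;

// Render a value as an operand in LLVM-flavoured text.
// The exact output format here is illustrative only.
std::string operand_text(const CgValue& v) {
    struct Visitor {
        std::string operator()(const Register& r) const {
            assert(r.id >= 0);  // register ids are non-negative in this sketch
            return "%" + std::to_string(r.id);
        }
        std::string operator()(const Literal& l) const {
            return std::to_string(l.value);
        }
        std::string operator()(const Reference& r) const {
            return "@" + r.name;
        }
    };
    return std::visit(Visitor{}, v);
}
```

With the text rendering isolated in one place like this, a C or C++ back end would only need a different visitor, which is the kind of "slot in an alternate back end" flexibility the comment is aiming for.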

dobkeratops avatar Dec 08 '14 23:12 dobkeratops

"compatibility with existing high performant libraries in the de facto fastest system language (C++) This way you can use headers directly, you can benefit from link time optimizations, and even"

My name mangling might be buggy, but right now if you 'extern' it's just generating a C++ symbol, so hopefully it can link to C++ overloaded functions. Translating headers etc. is more complex, but it's on my roadmap: I want to self-host by translating my own source. I'm experimenting with a simplified "<T>" parser hack.

I can't really ditch C++ - I'm still way faster at writing C++ than any other language - years of familiarity and mature tools.

I'm not sure when I'll go through with all this: I'm hoping someone in the Rust community can, in parallel, write a raw C++ to Rust transpiler, then I can merely ask them to relax their rules and it'll give me the transpiler for my 'hybrid' language for free. (there are already various binding generators). I can also try to write in a subset of C++ that does translate to Rust.

I don't have "clang" inside this project yet;

I don't think I'll get much of this done soon - I have more work on basics to do, e.g. no module system yet, and also to get interest from the Rust community I think I need a "match/enum" implementation (despite the fact that what I have now is suitable for my needs.. the improved templates can do what boost variant tries to, more elegantly).

"compatibility with existing high performant libraries in the de facto fastest system language (C++) This way you can use headers directly, you can benefit from link time optimizations, and even"

Right, this is another reason generally why I'm not sticking with Rust: you can't represent interfaces to existing C++ libraries 1:1, since they don't have arbitrary overloading. They declare it a misfeature - I can see their logic, but I'm firmly in the camp that sees a "special parameter" as more of a misfeature (because it forces upfront choices and creates hell when you want to refactor). Rust basically only selects functions based on the first parameter (even for compile-time polymorphism), similar to Go; although they've recently added 'multi parameter traits', it still requires boilerplate. They also severely restrict where 'methods' can be added ("your type or your library") - you can't add helper methods to library types without deriving another interface first.

I've got 'methods' in this language for compatibility, but if C++ didn't exist I'd just have free functions, UFCS, arbitrary overloading (& optional gathering into a vtable) and be done with it.
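For reference, this is the C++ behaviour being contrasted here, as a minimal sketch: overload resolution looks at every parameter, and free functions can be added for a type from outside its library. The types `Circle` and `Square` and the functions `collide` and `area` are invented for the example.

```cpp
#include <cassert>
#include <string>

// Pretend these two types come from an existing library.
struct Circle { double r; };
struct Square { double side; };

// Overload resolution considers every parameter, not just the first.
std::string collide(const Circle&, const Circle&) { return "circle-circle"; }
std::string collide(const Circle&, const Square&) { return "circle-square"; }
std::string collide(const Square&, const Circle&) { return "square-circle"; }
std::string collide(const Square&, const Square&) { return "square-square"; }

// Helpers added to the "library" types from outside,
// with no wrapper type or derived interface required.
double area(const Circle& c) {
    assert(c.r >= 0.0);
    return 3.141592653589793 * c.r * c.r;
}
double area(const Square& s) {
    assert(s.side >= 0.0);
    return s.side * s.side;
}
```

An interface like `collide` is the kind that has no 1:1 equivalent when dispatch is keyed on a single special parameter.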

Having said that I would like to introduce a special "partial name mangle" e.g. struct Foo { fn "C" bar() } // will compile extern "C" Foo_bar(Foo*);

and I'll suggest that to the Rust community, they can only link with C functions directly and I think that would make extern "C" libraries more pleasant
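Expressed in today's C++, the output of the proposed "partial name mangle" might look roughly like the sketch below. The `Foo_bar(Foo*)` shape is taken from the comment above; the `value` field, the `int` return type and the function body are assumptions added only so the example compiles and does something.

```cpp
#include <cassert>

struct Foo {
    int value;  // hypothetical field, only here to give bar() something to do
};

// What the proposed `fn "C" bar()` inside `struct Foo` might expand to:
// an unmangled symbol named <Type>_<method> that takes the receiver as a pointer.
// The int return type is an assumption; the post does not specify one.
extern "C" int Foo_bar(Foo* self) {
    assert(self != nullptr);
    return self->value * 2;
}
```

Since the symbol is plain `Foo_bar`, any language with a C FFI can call it without knowing the C++ mangling scheme.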

dobkeratops avatar Dec 08 '14 23:12 dobkeratops

"and even compile time cross unit optimizations based on inlining and templating from a huge existing base."

I'll have to check up on details but I think the LLVM infrastructure does support whole program inlining. C++ templates of course require C++, so you're right, this is where compiling to C++ would be a win. Whilst I'm aiming for a similar feature set to C++ templates, I'm sure I'll have some omissions and subtle differences that defeat exact translation of existing C++ source.

dobkeratops avatar Dec 09 '14 00:12 dobkeratops

"i'll have to check up on details but I think the LLVM infrastructure does support whole program inlining."

Ah yes, it should. However, I've recompiled the whole LLVM suite over and over in different ways, downloaded patches, spent hours and finally gave up. So I haven't experienced it in practice, unfortunately. But still, as far as I've understood, it should be in place. I compile with clang/LLVM during dev, and then gcc for release atm.

ozra avatar Dec 29 '14 21:12 ozra