Haxe-in-Haxe notes

Open ncannasse opened this issue 7 years ago • 76 comments

The first goal is to have the compiler written in Haxe and generating the corresponding OCaml code, allowing for a more familiar syntax for potential contributors and the compiler team. In the longer term, this will allow the compiler to run on other platforms, although this is not something that will be possible, or that we should focus on, at first.

The following needs to be resolved to have a working Haxe-in-Haxe implementation.

Note to Haxe users : this is an exploration of the possibility, we are not sure yet we will have something working in the end.

OCaml Haxe target (Haxe-to-OCaml) (@ncannasse) :

  • [ ] OCaml generator for Haxe compiler: we shouldn't try to support the whole Haxe specification at first, in particular everything involving reflection / dynamic is not used by the compiler and thus not required.
  • [x] Class/structures representation: we should go with something like C, with a data structure representing the classes fields and methods taking the class as first argument.
  • [x] Ability to access externs, such as ExtLib PMap module, etc. This should work pretty well with abstracts or extern classes given the class representation choice.
  • [x] List support: not an easy one, we need to have some way to easily construct and pattern match ml immutable lists
  • [ ] More list support: :: as proper operator with type inference, correctly infer :: in pattern matching

Automated Haxe compiler OCaml-to-Haxe ML2HX (@nadako) :

  • [x] Extracting the OCaml typed AST
  • [ ] Generating the corresponding Haxe code
  • [ ] Lexing and parsing: I think they should be left as "externs" for now, and we can keep the original OCaml source
  • [ ] Making sure the corresponding Haxe code compiles (gets typed) with the ML target

Merge of both ML target and ML2HX :

  • [ ] Making sure the Haxe compiler code successfully generates to ML
  • [ ] Making sure the ML generated code compiles and runs
  • [ ] Performance/Styling : replace consecutive var assignment with lets in ML target, replace consecutive lets with same type by single var decl in ML2HX
  • [ ] Deal with Lexer and Parser at this point?
  • [ ] Refactoring: after porting, several steps of refactoring will be possible by reorganizing the modules and packages

ncannasse avatar Feb 10 '18 09:02 ncannasse

I have added :: operator and ability to construct and pattern match immutable lists, this can be followed on this branch https://github.com/HaxeFoundation/haxe/tree/genml

ncannasse avatar Feb 10 '18 11:02 ncannasse

Since Haxe currently supports C#, would F# provide a simpler path? http://web.archive.org/web/20080410181630/http://research.microsoft.com/fsharp/manual/ml-compat.aspx https://stackoverflow.com/questions/179492/f-changes-to-ocaml The task could then be split between modifying the C# target to support F#, and changing the compiler to F# code while keeping track of the changes made so that OCaml can be supported later. The Haxe C# DLL could be consumed, which would allow you to plug in hard functionality fairly early on and optimise and rewrite as needed. Anyway, it's just an idea; I have no idea if it fits the needs, but if someone was able to get a minimal F# version of the Haxe compiler working quickly it would prove the concept.

nanjizal avatar Feb 10 '18 12:02 nanjizal

I have compiled F# Mono graphics examples on Mac, so it's not just Windows, although I doubt it would provide the same speed as OCaml initially.

nanjizal avatar Feb 10 '18 13:02 nanjizal

I also wonder about using hxcpp, but I suppose libraries like https://github.com/GJDuck/libf don't provide the richness that OCaml has.

nanjizal avatar Feb 10 '18 13:02 nanjizal

@nanjizal I don't think F# produces code as fast as OCaml does, as it's built on top of DotNet

ncannasse avatar Feb 10 '18 13:02 ncannasse

http://fsharpnews.blogspot.co.uk/2012/09/performance-of-compiler-translated-from.html

nanjizal avatar Feb 10 '18 14:02 nanjizal

@nanjizal anyway I have little interest in targeting F#, and this requires extra profiling and extensive changes as the article explains

ncannasse avatar Feb 10 '18 14:02 ncannasse

@nanjizal let's not lose focus here. Porting the compiler from OCaml to Haxe is a big task already. Let's keep this post focused on this single aspect.

Using another target language is beyond the scope of this post, as the OP already said:

> ... to run on other platforms ... is not something ... that we should focus at first.

kevinresol avatar Feb 10 '18 14:02 kevinresol

Awesome, Keep up the good work guys! I think this will improve Haxe a lot!

markknol avatar Feb 10 '18 14:02 markknol

@ncannasse what about performance? I mean, if feature X is developed in Haxe->OCaml, will it be slower than if it were written in OCaml directly? Or is the difference negligible?

mikicho avatar Feb 10 '18 16:02 mikicho

It would be very interesting to see an example of what Haxe code targeting OCaml might look like, i.e. a snippet showing a small aspect of functionality. I am not even sure what support Haxe has for immutable code at the moment (there is now final, but I have not seen it used). I think it would be really helpful to flesh out some small examples, even if incomplete, rather than just describe them above, because once we can see how Haxe might be used it gives a better idea of the current limitations. It could also capture the input and interest of really able Haxe developers who don't currently use OCaml. I would also be keen for macros to be used sparingly; I think they are very powerful, but they raise the barrier to contribution and could make the code more difficult than the current OCaml codebase.

nanjizal avatar Feb 10 '18 19:02 nanjizal

@mikicho the idea is to have 0 loss of performance in the process, so we are looking for an almost 1-to-1 conversion

ncannasse avatar Feb 11 '18 08:02 ncannasse

So I started a little OCaml project for converting ml to hx here https://github.com/nadako/ml2hx. It's really rough but the foundation is there to improve upon. It works by invoking OCaml as a lib and processing the typed AST to generate Haxe code. It currently requires 4.06 and probably won't compile with earlier versions, because of compiler API incompatibilities, but oh well.

nadako avatar Feb 11 '18 18:02 nadako

I'm a little surprised this didn't get mentioned: https://github.com/elnabo/haxe_in_haxe ^^

Either way, this is very exciting news. I really hope something will come out of it ;)

back2dos avatar Feb 12 '18 08:02 back2dos

@nadako hi very interesting, obviously my first thought was to try the code on itself so I can understand the Ocaml code better:

$ ./main main.ml
File "main.ml", line 2, characters 5-13:
Error: Unbound module Asttypes

Should this work?

(By the way, could the makefile work without ocamlfind and use opam instead? It's not in Homebrew and not needed to make the Haxe compiler. I have installed ocamlfind here, but for other Mac users it might be better to avoid it.)

nanjizal avatar Feb 12 '18 08:02 nanjizal

@nanjizal Please discuss this awesome project in its repo. Guys, please stay focused on the subject; we want this issue to remain readable in the future :)

mikicho avatar Feb 12 '18 09:02 mikicho

Could someone shed some light on https://github.com/elnabo/haxe_in_haxe? Is it a coincidence that the creation of this repo somehow matches the creation of this issue?

I guess the idea is to first automatically transpile the OCaml codebase to Haxe using https://github.com/nadako/ml2hx and then fix / improve manually as we go?

That said, this is really great news! 🎉

fullofcaffeine avatar Feb 13 '18 16:02 fullofcaffeine

> Is it a coincidence that the creation of this repo somehow matches the creation of this issue?

It's not, it was published when discussions started about Haxe in Haxe :) I'm not sure about the fate of this repo though, since porting this amount of code by hand is not really an option.

> I guess the idea is to first automatically transpile the OCaml codebase to Haxe using https://github.com/nadako/ml2hx and then fix / improve manually as we go?

Yeah, that's the idea at the moment. Thanks to the fact Haxe is actually quite similar to ocaml (enums, pattern matching, var shadowing, everything as expression, etc.), it's pretty straightforward to generate readable-ish haxe code from ocaml.
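
To make the similarity concrete, here is a minimal sketch in Haxe with the rough OCaml equivalent in comments. The `Expr` type and the `eval` function are hypothetical illustrations and are not taken from the compiler sources or from ml2hx output.

```haxe
// Hypothetical expression type, mirroring an OCaml variant:
//   type expr = Const of int | Add of expr * expr | Neg of expr
enum Expr {
	Const(v:Int);
	Add(a:Expr, b:Expr);
	Neg(e:Expr);
}

class EvalDemo {
	// OCaml:
	//   let rec eval e = match e with
	//     | Const v -> v
	//     | Add (a, b) -> eval a + eval b
	//     | Neg e -> - (eval e)
	public static function eval(e:Expr):Int {
		// `switch` is an expression, like OCaml's `match`.
		return switch (e) {
			case Const(v): v;
			case Add(a, b): eval(a) + eval(b);
			case Neg(e): -eval(e); // the captured `e` shadows the argument
		};
	}

	static function main() {
		trace(eval(Add(Const(2), Neg(Const(5)))));
	}
}
```

Enums map onto variants, `switch` onto `match`, and the shadowed `e` in the last case behaves the same way in both languages, which is why a mostly mechanical translation seems plausible.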

nadako avatar Feb 13 '18 17:02 nadako

> Could someone shed some light on https://github.com/elnabo/haxe_in_haxe? Is it a coincidence that the creation of this repo somehow matches the creation of this issue?

I started to try and do a copy of the compiler in Haxe a month ago. And when the issue of writing Haxe in Haxe arose, @ibilon told me to make my code public.

I don't know how long I'll continue it, because it's likely other options will succeed first. But for the moment, it helps me understand the compiler a bit better.

elnabo avatar Feb 14 '18 08:02 elnabo

@elnabo well, it may turn out that for hxcpp this approach is better; we can't assume that OCaml/functional output will provide the best results. I can imagine that the automated initial port that nadako has started should more easily cover all the current edge cases. But you can never be sure where such projects lead, and I certainly urge you to continue; sometimes working from different angles also brings a better overall solution and new opportunities. Can you compile the JS target with your code?

nanjizal avatar Feb 14 '18 21:02 nanjizal

I tried to compile with hxnodejs but I get a weird error, Class<Sys> has no field stderr, so I must have misconfigured something.

But as I try to stay as close to the source as possible, nadako and I should end up with similar-looking code. Except I'm likely to have bugs hidden somewhere, as I think I currently have.

elnabo avatar Feb 14 '18 22:02 elnabo

Rant

You're crazy

I just read through this issue properly for the first time and think y'all are crazy. Porting the Haxe sources to Haxe only to then compile them back to OCaml is insanity.

Let's look at what's bad about OCaml:

  1. It's a pain in the ass to set up, especially if you're not on Linux.
  2. Its compilation time can be very slow.
  3. It does not support true parallelism.
  4. Its tooling in general is a bit shit, even if you are on Linux.

Which of these problems would be alleviated by having the Haxe sources in Haxe and compiling them to OCaml? Well, we would still be stuck with the same crappy toolkit, except that we now have another layer on top of it which makes most of the issues even worse:

  1. We still have to set up OCaml in order to compile Haxe.
  2. We still have a slow OCaml compile time, plus we now add the Haxe compile time.
  3. We could still not use parallelism because we go through OCaml.
  4. Alright, we could use some of Haxe's better tooling.

Furthermore, all that is assuming that we would even make it that far in the first place. I don't share your optimism that the code base could be converted to maintainable Haxe and be generated back to efficient OCaml. There's no way that you could generate idiomatic Haxe to efficient OCaml in general, so we would have to gimp ourselves by essentially writing OCaml-idiomatic code in Haxe.

Alternatives

So in summary, I think this entire idea of going through OCaml should be dropped. Which obviously raises the question of what we should do instead. Given that we want to write Haxe-in-Haxe, I can see two possibilities:

  1. Write a JVM-target and use that.
  2. Focus on hxcpp.

Given that 1. requires an additional step, it would seem that 2. is the best option. At this point, hxcpp is pretty battle-hardened. And guess what, it already has a generational GC, which is the reason why Nicolas clings to OCaml. I can't speak to the efficiency of it, but in my experience all it takes to get Hugh on a job is a benchmark showing a problem.

What's more, hxcpp supports parallelism. Even if its GC doesn't beat OCaml's (which I expect to be the case), I'd wager that the usage of multithreading would make up for that.

And yes I'm implying that we should rewrite Haxe instead of trying to awkwardly port it. I know people think it's not an option, but I disagree. We're currently sitting at just above 100kloc, which is not much. I can easily write you a parser + typer + optimizer + JS generator in one month of focused effort if I have a clear vision of what to go for.

I'd have to think a bit about how to port eval, but that can also be figured out. The analyzer would require some redesign that doesn't depend on functors. Porting generators requires some manpower, but there's nothing inherently difficult about it either.

IMO this is what we should go for with Haxe 5. There's a million questions still to be answered obviously, but I really want to get the point across that targeting OCaml is a huge strategic mistake in 2019.

Simn avatar Mar 18 '19 18:03 Simn

I'm not sure how much different the rewrite would look like from the "OCaml-idiomatic code", especially if we talk about multi-threading. I'm pretty sure it'll still be a bunch of recursive pattern matching functions and persistent data structures, unless "we" want to go for the OOP visitor pattern madness (I can see different array handling and some proper non-exception based control flow though:)). That's why I think that automated port is technically viable, even though with all the extra punctuation that Haxe will bring, our single-letter names will look even more messy :)

Personally I have no problems with OCaml nowadays, and its tooling is slowly getting better, but I could of course work on Haxe in Haxe. In some situations Haxe would actually be nicer (e.g. reification and/or macros for expr building).

Regarding the run-time, I was recently writing a compiler for AS3 in a functional style, similar to how Haxe is written, and I can say this requires a really, really good GC, because even V8 (node.js) chokes badly on a lot of persistent data object allocations. I had to add quite a few boilerplate checks to allocate less so that it doesn't die. HXCPP seemed to perform better, though I didn't do deeper benchmarking, and I feel like at this point "we" have a lot of experience to improve the GC :)

nadako avatar Mar 18 '19 20:03 nadako

I partly agree with the rant against OCaml and its "difficulties" to setup and use.

I think an interesting question to ask ourselves is why OCaml performs so well. I think it boils down to two things:

a) good code analysis / auto inlining / tail call optimization, resulting in functional code being entirely lifted into more imperative constructs
b) a very efficient generational GC that allows allocating a lot of small objects very fast

I think we could provide (a) as part of some of our native targets (HxCPP and/or HL).
(b) excludes Java and DotNet IMHO; we could have a specific GC for either HxCPP or HL, or even port OCaml's one there, but that's not a small amount of work.

ncannasse avatar Mar 18 '19 22:03 ncannasse

Regarding multithreading, I think it could work well for some very specific cases (pure map, some truly parallel stateless filters, etc.) but most of the rest of the compiler code has to remain single threaded for our own sanity.

ncannasse avatar Mar 18 '19 22:03 ncannasse

I think it would be a lot nicer to skip the OCaml layer if we could use something like HashLink or HxCPP instead. I really like how tiny HashLink is and how easy it is to compile. Reducing the need for OCaml tooling to just needing a C/C++ compiler would go a long way in making it easier to get set up for development.

zicklag avatar Mar 19 '19 01:03 zicklag

This is quite interesting on the lack of parallel OCaml; I expect you have read it, but there are many interesting aspects if you read all the comments. https://www.reddit.com/r/ocaml/comments/61pep4/ocaml_multicore_support/

nanjizal avatar Mar 19 '19 01:03 nanjizal

HL

The reason I didn't mention HL is because I'm not sure if this kind of performance is going to be your focus. At the moment, our mandelbrot benchmark is 40 times slower on HL than it is on hxcpp, Java and nodejs. This benchmark is very relevant to the performance discussion here because it creates a lot of small objects.

If we can address this, HL could be considered as an alternative. It would require quite a bit of energy to bring the GC to the required level, but I'm not saying it's impossible.

P.S.: Don't underestimate Java performance. A JVM target would be a tough contender, believe it or not.

Multi-threading

I'm quite sure that we could at least run large parts of our post-processing multi-threaded. Some things are obviously linear, like DCE and purity-inference, but there's also a lot of local mapping going on.

Technically, it should be possible to run our compiler passes multi-threaded as well. That's more challenging though and I'm inclined to agree that it's better for our sanity to not venture down that area.

Simn avatar Mar 19 '19 06:03 Simn

@Simn: yes, the mandelbrot benchmark is actually very GC oriented unless compiled with -D reduce_allocs. I think outside of GC you should get similar or better perfs in HL/C than in HxCPP. But yes, the HL GC is very basic compared to what would be necessary for Haxe-in-Haxe.

Regarding JVM: yes, I know the JIT is quite good, but I'm not sure how well the GC can cope with a functional approach. Maybe writing a small example which builds some kind of expr and does a lot of stupid "map" over it would be a good micro benchmark to compare different platforms?
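
A minimal sketch of what such a micro benchmark could look like, assuming a toy `Expr` type and a `map` that rebuilds every node. All names, the tree depth and the iteration count are illustrative choices, not something agreed on in this thread.

```haxe
// Toy expression tree; every node is a small heap-allocated enum value.
enum Expr {
	Const(v:Int);
	Add(a:Expr, b:Expr);
}

class MapBench {
	// Builds a complete binary tree with 2^depth leaves.
	public static function build(depth:Int):Expr {
		return depth == 0 ? Const(1) : Add(build(depth - 1), build(depth - 1));
	}

	// The "stupid map": allocates a fresh copy of the whole tree on each call.
	public static function map(e:Expr, f:Int->Int):Expr {
		return switch (e) {
			case Const(v): Const(f(v));
			case Add(a, b): Add(map(a, f), map(b, f));
		};
	}

	// Folds the tree so the result cannot be optimized away.
	public static function sum(e:Expr):Int {
		return switch (e) {
			case Const(v): v;
			case Add(a, b): sum(a) + sum(b);
		};
	}

	static function main() {
		var e = build(16);
		var t0 = haxe.Timer.stamp();
		for (i in 0...100)
			e = map(e, v -> v + 1);
		var dt = haxe.Timer.stamp() - t0;
		trace('sum=${sum(e)} time=${dt}s');
	}
}
```

Since the old tree becomes garbage after each pass, the run time is dominated by small-object allocation and collection, which is the property being compared across targets here.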

ncannasse avatar Mar 19 '19 11:03 ncannasse

> yes, the mandelbrot benchmark is actually very GC oriented, unless compiled with -D reduce_allocs, I think outside of GC you should get similar or better perfs in HL/C than in HxCPP.

Yes, it's only slightly slower than hxcpp with -D reduce_allocs. But that's a bit like saying that your parser tests work fine except for the failing parser tests.

One appealing thing about HL is its non-ridiculous compilation time. But overall this is all moot unless we get a GC that can compete. I suppose we would ultimately write this in a way that we could compile it to one or the other anyway, so the main question for now isn't really "what target should we use?" but "is there a target that might be/become viable?".

And I want that answer to be "yes". If none of the Haxe targets is suitable for running a compiler, we're back to being just a bunch of clowns who used to make flash games.

Simn avatar Mar 19 '19 15:03 Simn

That's a bit oversimplifying : a compiler is a quite specific piece of software which can highly benefit from some very particular runtime features (highly GC dependent, TCO, low memory overhead per "object") etc.

As I suggested the next step to evaluate our best target candidate would be to write some micro benchmarks evaluating these things, and see if some platform is already on-par with what OCaml provides.

ncannasse avatar Mar 19 '19 16:03 ncannasse

I agree. Fortunately, we have nadako working on his Actionscript compiler at the moment and we should get some nice performance metrics from that.

Simn avatar Mar 19 '19 16:03 Simn

I don't think that's the best benchmark. Unless @nadako widely uses OCaml-like lists the way we do in the Haxe compiler?

ncannasse avatar Mar 19 '19 16:03 ncannasse

That's what I meant regarding Haxe-idiomatic and OCaml-idiomatic. Cons lists are very untypical for Haxe and our Haxe-in-Haxe should certainly avoid them. It's not an option without a reliance on TCO anyway. They are nice in many situations, but not a hard necessity for compilers.

Simn avatar Mar 19 '19 16:03 Simn

I don't use cons-lists, but I do use a lot of enums and immutable structures (I even have a a.with(field=value) macro).

nadako avatar Mar 19 '19 17:03 nadako

Perhaps we could salvage https://github.com/elnabo/haxe_in_haxe for performance metrics?

Also C# might be an interesting target too, since you can explicitly hint things to be structs (not so good for the recursive enums, but things like Field/ClassField could be quite a bit cheaper). Also AOT is a lot more mature for .NET/mono, and since the compiler doesn't have much need to use all the bells and whistles from the framework, chances are not-so-huge standalone binaries might fall out in the end.

As a side question: has anyone with experience in these matters tried what happens if you try building Haxe via js_of_ocaml / BuckleScript / ReasonML?

back2dos avatar Mar 20 '19 09:03 back2dos

Not sure it's ideal for Haxe to follow C# .NET Core, even if there may be some interesting approaches to explore. OCaml-to-F#-conversions console F# with NET core - notice they target web and mobile net-lib Program.dll ??

nanjizal avatar Mar 21 '19 14:03 nanjizal

What's the current strategy of the haxe in haxe implementation?

djaonourside avatar Aug 02 '19 09:08 djaonourside

We extensively discussed Haxe-in-Haxe again today. We agree that we need more data before deciding whether OCaml should be the runtime, versus JVM, HL or HxCPP.

One solution I propose to learn about this is to port a "mini compiler" from Haxe to OCaml, and I think HScript Parser + Checker would be a good candidate as it's "vanilla" Haxe without many optimizations.

Once ported, I can provide several medium sized scripts from our games in order to benchmark this. Depending on the benchmark results, we can tell a few things:

  • how OCaml compares to existing targets in terms of speed
  • how "vanilla" Haxe would be expressed in OCaml syntax
  • how "pure" the OCaml output needs to be (no references, for instance) compared to what a potential Haxe-to-OCaml target could output

ncannasse avatar Apr 27 '20 11:04 ncannasse

I think the OCaml target is completely unnecessary for moving forward. It should go:

  1. Write ocaml ast -> haxe generator
  2. try it

We can worry about whether ocaml is going to be faster or not later. My guess is it will all depend on how Dynamic the code looks.

I would also add that a Lexer is a very finite piece of code and should be hand-optimized with haxe idioms rather than with ocaml conversion (Arrays vs list, loops vs recursion etc). There are several ways of writing a parser, but again, it is a finite piece of code and can be rewritten especially for haxe. Removing the ocaml dependence is the main point of this exercise. It is going to be easier than writing a new target.

hughsando avatar May 10 '20 14:05 hughsando

> I have added :: operator and ability to construct and pattern match immutable lists, this can be followed on this branch https://github.com/HaxeFoundation/haxe/tree/genml

I'm trying to understand and possibly make haxe_in_haxe run with Haxe 4.2.1, and I'm getting errors referring to the genml extension, which doesn't seem to have any documentation explaining what it's supposed to mean/do. Could someone shed some light here:

haxe "build.hxml"
src/compiler/Main.hx:67: characters 18-19 : Expected }
...
			case Flash:
				function loop (l:ImmutableList<{a:Float, b:String}>) {
					switch (l) {
						case []:
						case {a:v}::_ if (v > com.flash_version): //!!!!<<< here
						case {a:v, b:def}::l:
							var l:ImmutableList<{a:Float, b:String}> = l;
							context.Common.raw_define(com, "flash"+def);
							loop(l);
					}
				}
				loop(context.Common.flash_versions);
				context.Common.raw_define(com, "flash");
				com.package_rules = PMap.remove("flash", com.package_rules);
				add_std("flash");
				"swf";
...
haxe "build.hxml"
src/compiler/Main.hx:121: characters 62-63 : Missing ;
...
			case Cpp:
				context.Common.define_value(com, HxcppApiLevel, "332");
				add_std("cpp");
				if (context.Common.defined(com, Cppia)) {
					classes = core.Path.parse_path("cpp.cppia.HostClasses") :: classes; //!!!!<<< here
				}
				"cpp";

mingodad avatar Mar 03 '21 09:03 mingodad

If I remember correctly, the :: operator is used to manipulate lists the same way as in OCaml.

In the context of a switch it is used to separate the elements of the list. The last value is the rest of the list.

So in the first example, in {a:v}::_, {a:v} would be l[0] and the _ would be l.slice(1).

Otherwise it is an operator that creates a new list that starts with the value before :: and is followed by the values after it.

So in the second example, parse_path(...) :: classes would be similar to:

classes = classes.copy(); // due to immutability
classes.unshift(parse_path(...));

The :: operator was only defined in the genml branch and worked with the specific ImmutableList type. I think it never lived outside of that branch. Unless someone implements it in a 4.2 branch, the best bet would be to do the operation manually using enums, or to use Array.
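
A minimal sketch of the "manually using enums" route, which should compile with stock Haxe 4. The `IList` type and its `Nil`/`Cons` constructors are hypothetical stand-ins chosen for this example; they are not the definitions from the genml branch or from haxe_in_haxe.

```haxe
// Hypothetical stand-in for an immutable cons list.
enum IList<T> {
	Nil;
	Cons(head:T, tail:IList<T>);
}

class IListDemo {
	// Equivalent of `x :: l`: allocates one cell and shares the old list.
	public static inline function cons<T>(x:T, l:IList<T>):IList<T> {
		return Cons(x, l);
	}

	// Builds a list from an array, walking backwards so order is preserved.
	public static function fromArray<T>(a:Array<T>):IList<T> {
		var l:IList<T> = Nil;
		var i = a.length;
		while (i-- > 0)
			l = Cons(a[i], l);
		return l;
	}

	// Rough equivalent of matching `flag :: (value :: _)` inside a loop:
	// returns the element that follows `flag`, or null if there is none.
	public static function valueAfter(flag:String, l:IList<String>):Null<String> {
		return switch (l) {
			case Nil: null;
			case Cons(f, Cons(value, _)) if (f == flag): value;
			case Cons(_, rest): valueAfter(flag, rest);
		};
	}

	static function main() {
		var args = fromArray(["--next", "--cwd", "src", "-main"]);
		trace(valueAfter("--cwd", args)); // the value is "src"
	}
}
```

Nested patterns such as `Cons(f, Cons(value, _))` play the role of `a :: (b :: l)`, and a guard replaces matching on a string literal bound to a variable.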

elnabo avatar Mar 03 '21 10:03 elnabo

From the name (genml) I would say it is an experimental ml target. I would say comment out any reference to the target, but keep the AST changes. IIRC, the target was for presumed speed and is therefore an optimization that can be added once correctness is achieved.

hughsando avatar Mar 03 '21 11:03 hughsando

So if I understood it correctly this one will become (see below):

	public static function process_params(create:ImmutableList<String>->compiler.Server.Context, pl:ImmutableList<String>) {
		var each_params = new Ref<ImmutableList<String>>([]);
		function loop (acc:ImmutableList<String>, l:ImmutableList<String>) : Void {
			switch (l) {
				case []:
					var ctx = create(List.append(each_params.get(), List.rev(acc)));
					init(ctx);
					ctx.flush();
				case "--next"::l if (acc == Tl): // skip empty --next
					loop([], l);
				case "--next"::l:
					var ctx = create(List.append(each_params.get(), List.rev(acc)));
					ctx.has_next = true;
					init(ctx);
					ctx.flush();
					loop([], l);
				case "--each"::l:
					each_params.set(List.rev(acc));
					loop([], l);
...
	public static function process_params(create:ImmutableList<String>->compiler.Server.Context, pl:ImmutableList<String>) {
		var each_params = new Ref<ImmutableList<String>>([]);
		function loop (acc:ImmutableList<String>, l:ImmutableList<String>) : Void {
			switch (l) {
				case []:
					var ctx = create(List.append(each_params.get(), List.rev(acc)));
					init(ctx);
					ctx.flush();
				case "--next" if (acc == Tl): // skip empty --next
					l = l.slice(1);
					loop([], l);
				case "--next":
					l = l.slice(1);
					var ctx = create(List.append(each_params.get(), List.rev(acc)));
					ctx.has_next = true;
					init(ctx);
					ctx.flush();
					loop([], l);
				case "--each":
					l = l.slice(1);
					each_params.set(List.rev(acc));
					loop([], l);
...

mingodad avatar Mar 03 '21 15:03 mingodad

And these will become:

				case "--cwd"::(dir::l):
					// we need to change it immediately since it will affect hxml loading
					try {
						std.Sys.setCwd(dir);
					}
					catch (_:Dynamic) {
						throw new ocaml.Arg.Bad("Invalid directory: " + dir);
					}
					loop(acc, l);
				case "--connect"::(hp::l):
...
				case "--cwd":
					var dir = l.slice(1);
					l = l.slice(2);
					// we need to change it immediately since it will affect hxml loading
					try {
						std.Sys.setCwd(dir);
					}
					catch (_:Dynamic) {
						throw new ocaml.Arg.Bad("Invalid directory: " + dir);
					}
					loop(acc, l);
				case "--connect":
					var hp = l.slice(1);
					l = l.slice(2);
...

mingodad avatar Mar 03 '21 15:03 mingodad

Probably the second should become this:

				case "--cwd":
					var dir = l[1];
					l = l.slice(2);
					// we need to change it immediately since it will affect hxml loading
					try {
						std.Sys.setCwd(dir);
					}
					catch (_:Dynamic) {
						throw new ocaml.Arg.Bad("Invalid directory: " + dir);
					}
					loop(acc, l);
				case "--connect":
					var hp = l[1];
					l = l.slice(2);
...

mingodad avatar Mar 03 '21 15:03 mingodad

And now I can see that for it to possibly work I'll need to get the extra haxe.ds data structures like haxe.ds.ImmutableList from this branch https://github.com/HaxeFoundation/haxe/tree/genml and so on.

It seems that this thread/idea lost traction !

mingodad avatar Mar 03 '21 15:03 mingodad

I mean, you've been replying to a thread where the last activity was almost a year ago, and that last activity was someone questioning the approach that was taken here.

I still maintain that involving OCaml in this process is silly, and it doesn't even matter what kind of benchmarks we could construct to suggest otherwise.

Simn avatar Mar 03 '21 16:03 Simn

I somewhat agree with you. One thing that holds me back from investing more time and effort in learning/adopting Haxe is its implementation in OCaml, which I do not know (I made some attempts to learn it, but for several reasons I didn't go forward with it).

I like the overall idea behind Haxe a lot, but not being able to adapt/change it to my needs is a hard pill to swallow.

I saw this thread and thought that if Haxe could bootstrap itself and run on all targets, that would make it a lot easier for users to jump in, fix or add targets, and improve the compiler itself.

mingodad avatar Mar 03 '21 16:03 mingodad

Yes, forget OCaml as a target initially. Forget adding AST components. Forget changing the haxe compiler at all - these are just distractions. This is a project written in haxe, not ocaml. No new "switch" syntax - just use if/else - after all, you are generating code so you can be as verbose as you like. For the list matching, I think you want some kind of "Array View" class to refer to ranges within an immutable Array somewhere.

So

 ... (l:ListView<String>) : Void {
			switch (l) {
				case []:
				case "--next" if (acc == Tl): // skip empty --next
				case "--next":

becomes:

   if (l.empty())
      ...
   else if (l.startsWith("--next") && acc==Tl)
      ...
   else if ( l.startsWith("--next"))
    ...

and recursion would be something like:

function sum(l:ListView<Int>)
{
    if (l.empty()) return 0;
    if (l.size==1) return l[0];
    return l.head() + sum(l.afterHead());
}

where l.afterHead creates a new list view starting at the next element

No changes to the haxe compiler are required for this. You just need to start with a haxe-readable ocaml ast dump of some sort.
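A minimal sketch of what such a view could look like, assuming Haxe 4. The `ListView` class and its method names simply follow the pseudocode above; they are not an existing API, and `ListViewDemo` is only there to exercise it:

```haxe
// Hypothetical ListView: a read-only window over a shared backing Array.
class ListView<T> {
	final data:Array<T>; // backing array, never mutated through the view
	final start:Int; // index of the first visible element

	public var size(get, never):Int;

	public function new(data:Array<T>, start:Int = 0) {
		this.data = data;
		this.start = start;
	}

	function get_size():Int
		return data.length - start;

	public function empty():Bool
		return size == 0;

	// Only meaningful on a non-empty view.
	public function head():T
		return data[start];

	// O(1): shares the backing array instead of copying the tail.
	public function afterHead():ListView<T>
		return new ListView(data, start + 1);

	public function startsWith(value:T):Bool
		return !empty() && head() == value;
}

class ListViewDemo {
	public static function sum(l:ListView<Int>):Int {
		if (l.empty())
			return 0;
		return l.head() + sum(l.afterHead());
	}

	static function main() {
		trace(sum(new ListView([1, 2, 3]))); // 1 + 2 + 3 = 6
	}
}
```

Each `afterHead()` call allocates a small view object but never copies the array, which is the property that makes the recursive style affordable.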

hughsando avatar Mar 04 '21 06:03 hughsando

Also, there is another scripting language with similar ideas, lily, but with slightly different syntax, and its VM/compiler is written in C. I think that both communities would benefit from cooperating.

See this thread.

mingodad avatar Mar 04 '21 12:03 mingodad

What is the current status of supporting OCaml as a target language? I would like to (ab)use Haxe to define complex data structures (algebraic data types) and get those structures available in multiple languages. Haxe sounds perfect for this, but I also need that data to be accessible from OCaml. Think JSON, but with enums in it instead of just arrays/objects. That would be better than protobuf/thrift/..., which support only very simple data structures.
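To make the request concrete, here is a small sketch of the kind of definition meant. The `Shape` type, its constructors, and `ShapeDemo` are made up for illustration; on the OCaml side the natural counterpart would be a variant type with the same three cases:

```haxe
// Made-up example of an algebraic data type written as a Haxe enum.
enum Shape {
	Circle(radius:Float);
	Rect(width:Float, height:Float);
	Group(children:Array<Shape>); // recursive case
}

class ShapeDemo {
	// Pattern matching over the enum; the compiler checks exhaustiveness.
	public static function area(s:Shape):Float {
		return switch (s) {
			case Circle(r): Math.PI * r * r;
			case Rect(w, h): w * h;
			case Group(children):
				var total = 0.0;
				for (c in children)
					total += area(c);
				total;
		}
	}

	static function main() {
		var scene = Group([Rect(2, 3), Rect(1, 1)]);
		trace(area(scene)); // 2*3 + 1*1 = 7
	}
}
```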

aryx avatar Apr 04 '21 17:04 aryx

cc @ncannasse who wrote the first prototype. I would love to help if needed (I am an OCaml programmer).

aryx avatar Apr 04 '21 17:04 aryx

I think that it's a good idea to first have a Haxe in Haxe like the one I started here https://github.com/HaxeFoundation/code-cookbook/issues/160

mingodad avatar Apr 05 '21 07:04 mingodad

It would also be nice to have an EBNF grammar for Haxe before going forward; otherwise, having the code as the only specification will make it hard.

mingodad avatar Apr 05 '21 08:04 mingodad

I have this crude Haxe grammar that I expect to publish here https://github.com/mingodad/plgh. It can be visualized here https://www.bottlecaps.de/rr/ui. Help to improve it is welcome!

program ::= stmt*

stmt ::=
	var_decl
	| 'final'? 'class' ID ('<' ID '>')? extends* implements* '{' class_members '}'
	| 'final'? 'interface' ID extends* '{' interface_members '}'
	| 'enum' ID '{' enum_item_list '}'
	| function_decl ';'
	| structure_type
	| abstract_decl
	| '@' ':' anotation_id ('(' expr ')')?

type ::=
	'Void'
	| 'Bool'
	| 'Float'
	| 'Int'
	| 'Dynamic'
	| ID ('<' ID '>')?

anotation_id ::=
	'arrayAccess'
	| 'coreType'
	| 'enum'
	| 'forward'
	| 'from'
	| 'notNull'
	| 'op'
	| 'to'

var_decl ::= ('var' | 'final') ID (':' type)? '=' expr

extends ::= 'extends' ID
implements ::= 'implements' ID

class_members ::=
	var_decl
	| function_decl

interface_members ::=
	function_decl

enum_item_list ::= enum_item+

enum_item ::= ID ';'

function_def ::= function_decl block_stmt

function_decl ::= 'static'? 'inline'? ('public' | 'private')? function_id_decl

function_id_decl ::= 'function' ID '(' param_decl_list? ')' (':' type)?

param_decl_list ::= param_decl (',' param_decl)*

param_decl ::= '?'? ID ':' type ('=' literal_expr)?

structure_type ::= 'typedef' ID ('<' ID '>')? '=' struct_type_def

struct_type_def ::=
	'{'  ('>'  ID ',')? structure_type_member_list '}'
	| ID '&' '{'  structure_type_member_list '}'
	| ID '<' ID '>'
	| function_id_decl

structure_type_member_list ::= structure_type_member (',' structure_type_member)*

structure_type_member ::=  ('var' | 'final') '?'? ID ':' type

abstract_decl ::= 'enum'? 'abstract' ID '(' type ')' ('from' type 'to' type)? '{' stmt '}'

block_stmt ::=
	var_decl
	| if_stmt
	| switch_stmt
	| 'return' expr
	| for_stmt
	| debug_stmt

if_stmt ::= 'if' '(' expr ')' block_stmt ('else' block_stmt)?

for_stmt ::= 'for' '(' ID 'in' range_def ')' block_stmt

range_def ::= expr '...' expr

switch_stmt ::= 'switch' '(' expr ')' switch_body switch_default?

switch_body ::=
	'case' (literal_expr | ID) ':' block_stmt*

switch_default ::= 'default' block_stmt

debug_stmt ::= '$' function_call

literal_expr ::=
	BOOLEAN
	| INTEGER
	| FLOAT
	| STRING

expr ::=
	ID
	| literal_expr
	| function_call
	| 'new' type
	| anonymous_struct
	| literal_array
	| access_field
	| expr op_binary expr

op_binary ::=
	'='
	| '||'		//Logical or
	| '&&' 	//Logical and
	| '>=' 	//Comparison and equality
	| '>'
	| '<'
	| '<='
	| '=='
	| '!='
	| '&' | '|' | '^' //Bitwise and, or, xor
	| '<<' | '>>'	//Bitwise shifts
	| '+' | '-'	//Plus, minus
	| '%' | '*' | '/'	//Modulo, multiply, divide

function_call ::= ID '(' param_list? ')'

param_list ::= param (',' param)*

param ::= expr

anonymous_struct ::= '{'  anonymous_struct_member_list '}'

anonymous_struct_member_list ::= anonymous_struct_member (',' anonymous_struct_member)*

anonymous_struct_member ::=
	(ID | STRING) ':' expr
	| '?'? ID ':' type

literal_array ::= '[' expr_list? ']'

expr_list ::= expr (',' expr)*

access_field ::= ID ('.' ID)+

mingodad avatar Apr 05 '21 11:04 mingodad

I just extended the crude Haxe grammar and published it here https://github.com/mingodad/plgh/blob/main/haxe.ebnf. Again, any help to improve it is welcome!

mingodad avatar Apr 05 '21 17:04 mingodad

I think for Haxe 5 it would be wise to make it Haxe-in-Haxe. Based on just some generic internet benchmarks, C++ (g++) and C (gcc) are technically faster, although I don't know if this holds up for Haxe-compiled code (especially 🤢 Windows-compiled code, considering this uses gcc and g++). Between HL/C and C++ it seems it's a toss-up depending on the task (which emphasizes the importance of what you are doing), but C++ wins for more complex tasks (or tasks that take longer in general) and C wins for shorter tasks.

However, both of them seem to beat out OCaml (C I'm not too sure about, as there is no direct comparison), which for our purposes is all that matters. Personally, if I were rewriting/porting the compiler to Haxe, my target would be C++, as C++ is nicer to work with and is lax with its types (unlike a certain OTHER HAXE VM), and it also runs faster for more complex tasks.

Porting to Haxe should be done not just because of performance - although it's a nice benefit - but because the Haxe community wants to hack on the compiler. Currently, I would like to edit how the compiler handles Java/JVM jar interface importing. Haxe doesn't have default interfaces and makes you do them yourself (which imo is a bit silly, but that's another thing). I downloaded the Haxe repository, downloaded OCaml (which, by the way, was a pain to do on Windows), looked at the code and said "What the heck is this language, what does it mean?" OCaml, at least to a person who has only written in C#, JavaScript, Java, and Haxe, is very unreadable and has a steep learning curve. I would probably have been completely done with the Haxe4Java project by now if the compiler were written in Haxe. Rewriting the compiler in Haxe would make it easier to recruit new contributors who know what they're doing, because it would be written in the language that they want to help with.

TL;DR; C++ and C are faster than OCaml, C++ is generally faster, OCaml has a steep learning curve, and if haxe was rewritten in haxe the project could be maintained by more people and more easily.

TheDrawingCoder-Gamer avatar Sep 30 '21 12:09 TheDrawingCoder-Gamer

The compiler looks like it makes a big difference for C, and I assume the same applies to C++. So hxcpp will probably make a big difference, same with HL/C. OCaml may have consistent speed (which some would say is better than just being fast). This invalidates the first 2 paragraphs of my mini-essay a bit, but I think the 3rd point still stands. However, we don't have benchmarks for it. When I get home and finish my work I'll rewrite some of those programs in Haxe.

TheDrawingCoder-Gamer avatar Sep 30 '21 12:09 TheDrawingCoder-Gamer

You won't like hearing this, but Haxe is (like all compilers) complicated and people mostly just use OCaml as an excuse once they realize that.

... Haxe should still eventually move away from OCaml, but this "more contributors" argument has always been nonsense. The only way to prove me wrong is by making substantial contributions to haxelib. :|

Simn avatar Sep 30 '21 13:09 Simn

I don't think speed is a great issue. I'm sure that if some auto-port is done, it will be way slower initially and this will be about the memory usage patterns rather than raw language speed. It would then be a matter of profiling and optimizing, which can be an ongoing process. Ultimately, I think the speed gains would come from multi-threading. The point about the pain in setting up OCaml/Opam is a much better reason in my mind. That and, you know, we've made a language, it would be nice to use it.

hughsando avatar Sep 30 '21 14:09 hughsando

The OCaml installation is our hazing ritual.

Simn avatar Sep 30 '21 14:09 Simn

Hey, I half understand haxelib. I hacked onto it with silk and added commands to it (my favorite being the why command, which required writing new functions that iterate thru the libs installed). I'm sure if I tried I would be able to add it to the real haxelib. And just to be petty I may do that today lmao

TheDrawingCoder-Gamer avatar Sep 30 '21 14:09 TheDrawingCoder-Gamer

Do we know what the major use-cases are for hacking on the compiler?

My main use is adding new targets and other code generation on the back-end. Context.onGenerate works well for this.

Improving the performance of interp would make this use better. It might be nice to bridge the internal haxe.macro.Type api to HashLink so that generators written in Haxe could be compiled to HL C and incorporated in a custom compiler build.
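For readers who have not used it, here is a minimal sketch of the `Context.onGenerate` hook mentioned above. The class name `ListTypes` and the idea of just listing type names are made up for illustration; a real back-end would walk the fields and typed expressions of each type instead of tracing:

```haxe
// File: ListTypes.hx (hypothetical name)
import haxe.macro.Context;
import haxe.macro.Type;

class ListTypes {
	// Initialization macro; enable with: --macro ListTypes.use()
	public static function use():Void {
		// The callback receives every typed module-level type after typing
		// has finished, before the regular target generator runs.
		Context.onGenerate(function(types:Array<Type>) {
			for (t in types) {
				switch (t) {
					case TInst(c, _):
						var cls = c.get();
						trace("class " + cls.pack.concat([cls.name]).join("."));
					case TEnum(e, _):
						var en = e.get();
						trace("enum " + en.pack.concat([en.name]).join("."));
					default:
						// typedefs, abstracts, etc. are ignored in this sketch
				}
			}
		});
	}
}
```

Since this runs inside the macro interpreter, it is exactly the kind of code whose speed depends on interp performance.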

nickmain avatar Sep 30 '21 19:09 nickmain

Going to just leave this here for anyone feeling adventurous https://github.com/zshipko/ocaml-rs

nanjizal avatar Sep 30 '21 21:09 nanjizal

@hughsando would ocaml-hxcpp be possible in a similar manner?

nanjizal avatar Sep 30 '21 21:09 nanjizal

The main use case for me would be adding targets and adding new fully supported features (like default implementations). Macros technically work for this; however, if I hack on the compiler I can add new keywords.

TheDrawingCoder-Gamer avatar Sep 30 '21 22:09 TheDrawingCoder-Gamer

I could see many advantages for having Haxe in Haxe:

  • we have a nice language - use it
  • it would make Haxe more powerful since in order to make it performant and reliable, the language and the targets would improve.
  • much easier setup of a development environment
  • writing plug-ins for the compiler could be much easier (first test the idea in a macro, then turn the macro in a pre-compiled plug-in)
  • we could use the --interp target as a scripting language with full Haxe power (my favorite)
  • and maybe it would be more attractive to contribute for more people
  • ...

AdrianV avatar Oct 01 '21 14:10 AdrianV

I'm not sure if useful, but it would grant the ability to compile client side in a browser.

elnabo avatar Oct 01 '21 14:10 elnabo

making hscript be actually haxescript would be amazing

TheDrawingCoder-Gamer avatar Oct 01 '21 14:10 TheDrawingCoder-Gamer

hscript is already very powerful; if we wrote the compiler in haxe we could include it as a lib. recommendations: make a class that just calls interp/make a “haxe” library

TheDrawingCoder-Gamer avatar Oct 01 '21 14:10 TheDrawingCoder-Gamer

Recently I have been working with some transpiler concepts for academic purposes, and Haxe would be a good framework to test them in (instead of creating yet another transpiler). Maybe this concept is the same as genml (GenOCAML) from the beginning of this thread.

It is easier to show the concept with examples. For instance, GenSwift.hx could be an implementation for Swift target.

# handwritten GenOCAML in /haxe/src/generators/GenOCAML.ml

# convert handwritten GenSwift.hx to GenSwift.ml
haxe GenSwift.hx --ocaml GenSwift.ml
copy GenSwift.ml /haxe/src/generators/GenSwift.ml

# convert handwritten MyApp.hx to MyApp.swift
haxe MyApp.hx --swift MyApp.swift

The file GenSwift.hx could be implemented with something like this:

import Ast; // wrapper to AST of OCAML
import Globals;
import Type;
import Common;

class GenSwift {

  /*
 
    Anyone who wants to contribute to Haxe would need to learn reusable
    skills: Haxe and (in this example) Swift. The only specific skill would
    be to learn how the Ast module works. Instead of docs to explain the
    Ast module, maybe a simple generator with a few lines (not a real world
    programming language), could be a better documentation.

    Since Ast is a wrapper (to AST of OCAML), it could be replaced by another
    implementation of AST (in Haxe-in-Haxe, or as any other wrapper, in a non
    GC language like: C/C++ or Rust).

    To simplify the development of GenOCAML.ml, the Ast wrapper could replace
    some features (that would otherwise have to be implemented in
    GenOCAML.ml). Also, the structure of the Ast wrapper would affect both how
    the Ast is learned and how an eventual new parser is implemented.

   */  
   
}
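As a rough illustration of the "simple generator with a few lines" idea from the comment above, the following self-contained sketch emits Swift-like text from a made-up miniature AST. `Expr`, `Decl`, and `ToyGenSwift` are hypothetical stand-ins; they are not the proposed `Ast` wrapper (which does not exist yet) nor the compiler's real data structures:

```haxe
// Made-up miniature AST; NOT the compiler's real Ast module.
enum Expr {
	EInt(value:Int);
	EIdent(name:String);
	EBinop(op:String, left:Expr, right:Expr);
	ECall(name:String, args:Array<Expr>);
}

enum Decl {
	DFunc(name:String, params:Array<String>, body:Expr);
}

class ToyGenSwift {
	// Expressions are printed recursively; binary operations are always
	// parenthesized so operator precedence never has to be considered.
	public static function genExpr(e:Expr):String {
		return switch (e) {
			case EInt(v): Std.string(v);
			case EIdent(n): n;
			case EBinop(op, l, r): "(" + genExpr(l) + " " + op + " " + genExpr(r) + ")";
			case ECall(n, args): n + "(" + args.map(genExpr).join(", ") + ")";
		}
	}

	// Every parameter and the result are assumed to be Int to keep this short.
	public static function genDecl(d:Decl):String {
		return switch (d) {
			case DFunc(name, params, body):
				var ps = params.map(p -> "_ " + p + ": Int").join(", ");
				"func " + name + "(" + ps + ") -> Int { return " + genExpr(body) + " }";
		}
	}

	static function main() {
		var add = DFunc("add", ["a", "b"], EBinop("+", EIdent("a"), EIdent("b")));
		trace(genDecl(add));
		// func add(_ a: Int, _ b: Int) -> Int { return (a + b) }
	}
}
```

A generator of this size reads almost like documentation of the AST it consumes, which is the point being made about learning the `Ast` module.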

If this concept is correct, the Ast wrapper could replace these steps:

  • Automated Haxe compiler OCaml-to-Haxe ML2HX (@nadako)
  • Merge of both ML target and ML2HX

otuyama avatar Oct 11 '21 00:10 otuyama

Revisiting this, I do think that OCaml is probably a better language. After learning Haskell it is much more understandable - and for language parsing, more functional is usually better. Perhaps what we really want is more powerful access to the Haxe compiler. I know JS already has something like this, but a good way to write a target in Haxe would be amazing. I also noticed that Haxe plugins exist, but I am unsure of the power they provide. A Haxe-in-Haxe may be cool, as it would allow easy installation and building of new Haxe compilers via Haxe (which would surely be a boon for e.g. lix).

TheDrawingCoder-Gamer avatar Jul 07 '22 00:07 TheDrawingCoder-Gamer