hashlink
Feature/gc
https://benchs.haxe.org/formatter_noio/index.html#6;true;true;true;DataAndAverage;SMA;2020-04-18;;all
It seems this version of HL is powerful.
The current branch does not support CMake on Ubuntu.
@sonygod Yes, this is a work in progress. No guarantees that anything works for anyone yet. My setup is on OS X using the Makefile.
Looking forward to updates on this PR @Aurel300 ;)
Is this dead?
The benchmarks are really promising. Will this stream ever be finished?
Since this PR was posted with no actual description, I thought I would provide some background for anyone that stumbles upon this.
This PR is about improving memory management, specifically changing the conservative tracing garbage collector from "mark and don't sweep" to "immix-style mark region and sweep":
- 2017-12-07: "Notes on Garbage Collector" GitHub wiki page on original (and current) GC
- 2020-06-15: "Modern garbage collector for HashLink and its formal verification" Master's thesis paper
- 2020-06-23: "Modern GC for HashLink" YouTube video of MEng thesis presentation
- 2020-07-07: "The new HashLink garbage collector" subsequent blog post
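For anyone unfamiliar with the term, here is a rough sketch of the general "mark region" idea: memory is split into blocks made of fixed-size lines, marking flags every line a live object touches, and the sweep reclaims whole unmarked lines. The sizes and all names (`block_t`, `mark_object`, `sweep_block`) are made up for illustration and are not taken from HashLink or the thesis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes only; not taken from HashLink or the thesis. */
enum { LINE_SIZE = 128, LINES_PER_BLOCK = 8 };

/* Hypothetical block layout: raw storage plus one mark flag per line. */
typedef struct {
    unsigned char data[LINE_SIZE * LINES_PER_BLOCK];
    bool line_marked[LINES_PER_BLOCK];
} block_t;

/* Mark every line overlapped by a live object at [offset, offset + size). */
static void mark_object(block_t *b, size_t offset, size_t size) {
    assert(size > 0 && offset + size <= sizeof b->data);
    size_t first = offset / LINE_SIZE;
    size_t last = (offset + size - 1) / LINE_SIZE;
    for (size_t i = first; i <= last; i++) {
        b->line_marked[i] = true;
    }
}

/* Sweep: count the unmarked (reusable) lines, then clear all marks
 * so the next collection cycle starts from a clean state. */
static size_t sweep_block(block_t *b) {
    size_t free_lines = 0;
    for (size_t i = 0; i < LINES_PER_BLOCK; i++) {
        if (!b->line_marked[i]) {
            free_lines++;
        }
        b->line_marked[i] = false;
    }
    return free_lines;
}
```

The point of the sketch is only that reclamation happens at line granularity rather than per object; the real collector described in the thesis is considerably more involved.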
The changes in this PR are currently based on a fork with last common commit of 2020-01-16 (125 commits behind master) and 13 changes between 2020-03-30 and 2020-05-25.
The paper states that the prototype GC was developed separately, using only the HashLink GC API as a reference, to ease eventual integration back into HashLink.
The paper also mentions (section 3.6):
Although HashLink’s original GC was designed with some amount of extensibility in mind, we chose to not tie the new GC to this framework too closely, as we found it easier to make rapid changes with a “blank slate” in this area. Re-aligning the new GC with the original framework will be the subject of future work[...].
and (Appendix A):
At the time of writing, the new collector is only usable with 64-bit architectures, and was tested on Mac OS X 10.9.5 and Ubuntu 18.04 only. Compilation fails entirely on Windows, since the underlying OS calls assume a POSIX environment. As noted[...], re-aligning the codebase with the framework of the original HashLink project is subject to upcoming future work.
The paper also covers several potential improvements (some requiring changes to the GC API which this set of changes attempted to avoid).
The blog post mentions:
Some bugs still need to be fixed, and the codebase needs to be "re-aligned" to the existing HashLink codebase, so that the two collectors can be tested in parallel, by simply compiling HashLink with a different compile-time switch.
So presumably the "immix" derivative GC currently just replaces the original in the branch (instead of being added alongside it so that both can exist in parallel) and does not support some platforms and architectures (Windows, 32-bit x86, etc.) that the original GC supports.
I imagine such things would have to be rectified to get this merged into master.
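A minimal sketch of what that kind of compile-time selection could look like in C. The macro `HL_GC_IMMIX` and the `gc_kind`, `gc_active` and `gc_alloc` names are hypothetical and are not taken from the HashLink source; this only illustrates the "different compile-time switch" idea from the blog post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; HashLink's real GC API differs. */
typedef enum { GC_ORIGINAL, GC_IMMIX } gc_kind;

#ifdef HL_GC_IMMIX /* assumed flag, e.g. cc -DHL_GC_IMMIX ... */
static gc_kind gc_active(void) { return GC_IMMIX; }
#else
static gc_kind gc_active(void) { return GC_ORIGINAL; }
#endif

/* Shared entry point: callers do not need to know which collector
 * was compiled in. */
static void *gc_alloc(size_t size) {
    assert(size > 0);
    /* Placeholder: a real collector would allocate from its own heap. */
    return calloc(1, size);
}
```

With something along these lines, the default build would keep the original collector and the new one would only be compiled in when the flag is defined, so both could be tested side by side.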
I've read through a good chunk of that material. I was just hoping that this stream might end up completing, but based on your elaboration, it really sounds like it's dead.
A few of the benchmarks really show the potential of this approach, and it would remove some of the benefit hxcpp has over HashLink. HashLink is so amazing for development, but I may consider shipping on hxcpp due to the perf. That's such a long way off that it's academic atm.
Thanks @Uzume for the detailed explanation.
@onehundredfeet Well the fact that @ncannasse opened, then closed, then reopened and marked it as a draft all on 2020-04-27, provides a hint that as it stands this is unlikely to ever be accepted/merged. However, it also implies the concept is very interesting and important so a new GC based on this work is still likely in the future (provided someone does the work to get it integrated properly).
The fact that all the work by the author was done before the paper was published also seems to establish that the author has no immediate plans to actually work on getting this integrated (he got his degree and moved on; although it should be noted the author is still about as I have seen newer Haxe blog posts by him).
After 2022-05-25 (currently about a month away), there will have been no actual changes to the branch code for over two years (unless something is in the works very soon, which seems unlikely).
So I see this particular PR intentionally open but dead until a better one comes along because the concept is still valid and it is a somewhat functional prototype (as evidenced by the benchmarks that are still kept up-to-date mentioning "HashLink Immix" and "HashLink/C Immix").
@Uzume Thanks for the summary! I am indeed still around and do things with Haxe, although I have limited free time and my current focus is ammer. I don't want to completely abandon this, though I do remember towards the end of the project I was struggling not so much with re-aligning the APIs, but more with making some benchmarks work at all. The problematic case was, IIRC, the Dox benchmark. On the benchmarks website, you can see there is no runtime listed for Hashlink Immix. This is because (again, IIRC) the benchmark was crashing at runtime probably because of very heavy use of Dynamics. Debugging GC problems is really tricky and I didn't manage to make this work in the end. When I return to this I think one of the first things I need is a better way to trace and follow the GC's behaviour.
@Aurel300 Thanks for the update! I am definitely interested in any progress here.
@Aurel300 A good universal FFI like ammer is very useful.
However, even if there are bugs and this fails in a number of cases, it might be possible to get this merged if it was "aligned" as only one possible GC at compile time (and certainly not the default). Then others could more easily work on solving such issues and perhaps making it more robust and perhaps even one day the default.
I do find it somewhat amusing that such things are so hard to debug but are then often used to debug memory leak issues. I suppose it is a bit like debugging debugger development.
Just posting a quick note to mention I'm also super interested in any improvement to the GC. Hope to see this PR merged one day :)