proposal-ecmascript-sharedmem icon indicating copy to clipboard operation
proposal-ecmascript-sharedmem copied to clipboard

Formal memory model

Open jfbastien opened this issue 9 years ago • 13 comments
trafficstars

TL;DR: IMO getting formal memory model work started (not necessarily finished) should block moving the spec to Stage 3. I like SAB, but I don't trust English when we could have Maths.

A formal memory model akin to the work that was done for C++ would be useful in ensuring the SAB memory model is solid. Doing so found a bunch of issues for C++. This issue is related to #22.

The SAB model isn't quite like C++'s in that it's:

  1. Based on memory locations and not objects and their lifetime.
  2. Allows mixing accesses of different types sizes (but does require alignment).
    • But doesn't have type-based aliasing rules.
    • Doesn't have memcpy / memmove / memset (which are a trouble with the C11 spec).
  3. Allows mixing non-atomic accesses and atomic accesses (see this paper and #13).
    • This has extra issues w.r.t. non-lock-free operations, which would typically go through a lock shard provided by the VM.
  4. Only supports seq_cst for now (but I'd like the SAB model to also support acquire / release as in #15 to show it'll work).
  5. Doesn't support fence as in #25 since they aren't needed when only seq_cst is available.
  6. Specifies futex (C++ puts mutex in the library), and maybe micro-wait (see #87).
  7. Doesn't deal with signals and setjmp/longjmp at the moment, and doesn't have signal_fence.
  8. Has to lower to different types of hardware (x86, ARM, A64, MIPS, POWER are all likely targets).
  9. Tries to specify what happens when there are races (see #37, #48, #51, #71, and #82).
  10. There can be multiple SABs, which would be similar in C++ to having multiple disjoint "memories".
  11. There's another realm of JavaScript objects outside of the SABs, as well as an event loop and Web APIs, which could observe ordering of SAB operations indirectly.
  12. SAB will likely interact with WebAssembly's own atomics (detailed proposal), similar to intra-process shared memory but without C++11's volatile escape hatch.
    • This will be even more complicated if both don't have exactly the same memory model.

In that sense it's closer to what a hardware memory model looks like.

Specifically, I'd like something similar to Jade's thesis or Mark's thesis. Background history in these slides, it seems like SAT or SMT are ideally suited for this purpose.

Without this model the best case is that the English spec happens-to-work, but the worst case is that we move to Stage 3, browser ship without a flag, devs rely on things which we have to relax and browsers can't / won't break them. That would be unfortunate, but not the end of the world: witness Java, pre-C11 C, pthread, Linux, etc all having broken memory models and still working. Having a formal memory model is one of the rare cases in CS where Maths can be used to show tricky things work, I think it would be silly to ignore the last few years' advance in this field.

Having a formal model can also help figure out which optimizations are now invalid in implementations, e.g. 1 and 2, but I think this is just a nice side-effect of having a formal model in the first place.

jfbastien avatar Mar 24 '16 20:03 jfbastien

EDIT: Moved to #91.

littledan avatar Mar 26 '16 00:03 littledan

EDIT: Moved to #91.

jfbastien avatar Mar 26 '16 04:03 jfbastien

EDIT: Moved to #91.

littledan avatar Mar 26 '16 08:03 littledan

EDIT: Moved to #91.

jfbastien avatar Mar 26 '16 18:03 jfbastien

EDIT: Moved to #91.

lars-t-hansen avatar Mar 27 '16 15:03 lars-t-hansen

EDIT: Moved to #91.

lars-t-hansen avatar Mar 27 '16 15:03 lars-t-hansen

I'd like this bug to focus solely on the memory model. My intent is to list what's different from other work, and get the memory model work started. I'm happy to fork the stage 3 discussion to another bug, but we need a bug tracking memory model work and I believe this is the right place for it.

jfbastien avatar Mar 27 '16 16:03 jfbastien

@jfbastien, I will change the title back and move/copy the comments about the work blocking stage 3 to another bug, but I think in this case you set yourself up for hijack ;-)

lars-t-hansen avatar Mar 27 '16 16:03 lars-t-hansen

I only have the lightest of opinions on this matter at the moment, but others have raised the issue that an operational model may be a better fit than an axiomatic model, given how low-level our semantics are. Pichon-Pharabod and Sewell (POPL 2016) don't call their semantics "operational" but to my eyes it has that flavor. Gustavo Petri's thesis (INRIA, 2010) is explicitly operational.

lars-t-hansen avatar Jul 05 '16 15:07 lars-t-hansen

I only have the lightest of opinions on this matter at the moment, but others have raised the issue that an operational model may be a better fit than an axiomatic model, given how low-level our semantics are. Pichon-Pharabod and Sewell (POPL 2016) don't call their semantics "operational" but to my eyes it has that flavor. Gustavo Petri's thesis (INRIA, 2010) is explicitly operational.

Is there work towards this, or plans to advance such work? I think it's worth pursuing the proposed research regardless (I'll post a full proposal once we secure funding). It would nonetheless be good to have details of any complementary work.

jfbastien avatar Jul 07 '16 16:07 jfbastien

At this stage there is no work going on toward an operational formalization and there are no plans to kick off an effort in that direction independently of the research that you are proposing. I think it is highly worthwhile to pursue the current research plan.

lars-t-hansen avatar Jul 07 '16 17:07 lars-t-hansen

My attempt at a formal model: https://github.com/asajeffrey/ecmascript_sharedmem/blob/master/MEMMODEL_WIP.md

It's a lot like the C/C++11 and LLVM models. The main differences are:

  • Like the LLVM model, it is per-location rather than per-object, and there are no type requirements on accesses.
  • It tries to tie down the semantics of mixed-size access, in particular it has only has isolation on accesses of the same size, so you can spot tearing between accesses at different size.

The document is the result of conversations with James Riely, who's the co-author on my LICS 2016 paper on relaxed memory semantics.

asajeffrey avatar Sep 26 '16 17:09 asajeffrey

Work is continuing here so I'll leave that open, but this matter no longer blocks the spec.

lars-t-hansen avatar Feb 09 '17 07:02 lars-t-hansen