MBrace.Core

MBrace cross platform story

Open isaacabraham opened this issue 10 years ago • 12 comments

This is an issue to investigate what is required to get MBrace running x-plat. Do we look at Mono or wait for CoreCLR? What are the cost / benefits of both approaches? What are the deployment options etc. etc.

isaacabraham avatar Dec 01 '15 16:12 isaacabraham

Linked to #93.

isaacabraham avatar Dec 01 '15 16:12 isaacabraham

As I understand it we should in theory be able to run on Mono Linux servers if we can remove the dependency on the .NET service bus API and just go direct to REST (or if the .NET team have made that stuff cross-platform). Vagabond has a Mono problem on the client side (@eiriktsarpalis that shouldn't affect the server side should it?)

dsyme avatar Dec 01 '15 18:12 dsyme

Service Bus would have to be HTTP calls I believe - not only is it (I believe) not x-plat compatible but it's also still (again, AFAIK) one of the few Azure .NET SDKs that remains closed source.

An alternative would be to consider another messaging service in Azure - Service Bus is the obvious candidate, but other ones exist - Storage Queues could potentially be an option, whilst we could also look at something like Kafka - but this would require more work from an infrastructure / provisioning point of view.

isaacabraham avatar Dec 01 '15 20:12 isaacabraham
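
To make the "go direct to REST" option above concrete, here is a minimal sketch, in Python rather than .NET purely for brevity, of what talking to a Service Bus queue over plain HTTP might involve. It assumes the documented Shared Access Signature token format and the `/{queue}/messages` send endpoint; the namespace, queue, key name and key are hypothetical placeholders, and the details should be checked against the Service Bus REST documentation before relying on them.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse
import urllib.request


def make_sas_token(resource_uri, key_name, key, ttl_seconds=3600, now=None):
    """Build a Shared Access Signature token for a Service Bus resource.

    `now` is only there so the expiry can be pinned for testing.
    """
    expiry = int((time.time() if now is None else now) + ttl_seconds)
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    # The string to sign is the URL-encoded resource URI, a newline, and the expiry.
    to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))
    return (
        f"SharedAccessSignature sr={encoded_uri}"
        f"&sig={signature}&se={expiry}&skn={key_name}"
    )


def build_send_request(namespace, queue, body, token):
    """Prepare (but do not send) a POST that would enqueue one message."""
    url = f"https://{namespace}.servicebus.windows.net/{queue}/messages"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": token, "Content-Type": "application/octet-stream"},
    )


# Hypothetical values, for illustration only.
resource = "https://example-ns.servicebus.windows.net/jobs"
token = make_sas_token(resource, "ExampleKeyName", "not-a-real-key")
request = build_send_request("example-ns", "jobs", b"pickled work item", token)
# urllib.request.urlopen(request) would perform the actual send.
```

Nothing here depends on a platform-specific SDK, which is the point of the suggestion: only an HMAC and an HTTP client are needed on the worker.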

Here's a breakdown of all the issues that (may) inhibit the mbrace cross-platform story:

  1. Vagabond is known to not function well in mono. Even though continuous bug reports have somewhat improved the situation (runtime crashes are less common), some have still not been fixed (c.f. https://bugzilla.xamarin.com/show_bug.cgi?id=33208). It remains to be proven whether Vagabond works well in CoreCLR.
  2. FsPickler: works in mono, has not been tested in CoreCLR yet. It heavily relies on runtime code generation, not sure whether that will bode well in CoreCLR.
  3. AppDomains: not included in CoreCLR, used by MBrace.Runtime for user code isolation.
  4. Mono <--> CLR communication: many of the mscorlib types use field-based serialization, so serialized data is highly implementation sensitive. FsPickler does little to bridge any such discrepancies, so establishing communication between Mono and CLR processes might yield serialization errors.
  5. MBrace.Thespian does not work in mono. Possibly related to issues in Nessos.Thespian, which is no longer being maintained.
  6. Azure libraries that are windows-specific.

eiriktsarpalis avatar Dec 01 '15 21:12 eiriktsarpalis

That's a great list.

  1. Is it fair to assume Vagabond is only buggy in its mode of use in the MBrace client, not the workers?
  2. runtime code gen should be ok on CoreCLR
  3. what happens if we use processes for isolation on the worker (which I think we will have to)? You keep a pool of appdomains, correct?
  4. Urgh, field sensitivity could be painful. For CoreCLR I'd imagine it would be much less so
  6. We should definitely gradually wean ourselves off any non-x-plat Azure services.

dsyme avatar Dec 01 '15 23:12 dsyme

  1. That's a fair assumption, although Vagabond does not distinguish between client and server. There exists the theoretical possibility of runtime code generation at worker A, which would then have to be persisted and sent to worker B.

  3. I've been thinking about doing this at some point. For standalone worker deployments, it offers a greater degree of isolation and protects the main process from things like user stack overflows. Another possibility is investigating System.Runtime.Loader, already utilised by C# interactive. Not sure if that works with mono though.

eiriktsarpalis avatar Dec 01 '15 23:12 eiriktsarpalis

3 - I assume the fact that app domains are not in .NET Core indicates a move to processes as the unit of isolation. So I'd avoid System.Runtime.Loader if we want any real control.

dsyme avatar Dec 01 '15 23:12 dsyme
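
As an illustration of processes as the unit of isolation, the sketch below runs a piece of user code in a child process so that an abrupt failure there cannot take the host down. It is written in Python for brevity and is not how MBrace.Runtime works; `run_isolated` is a hypothetical helper, and a real worker would presumably keep a pool of child processes rather than spawning one per job.

```python
import subprocess
import sys


def run_isolated(source, timeout=10.0):
    """Run `source` in a fresh interpreter process.

    Returns (exit_code, stdout). A crash in the child only shows up
    as a non-zero exit code; the calling process keeps running.
    """
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


# Well-behaved user code: exits normally and its output is captured.
ok = run_isolated("print(2 + 2)")

# Misbehaving user code: terminates its own process abruptly,
# standing in for something like a stack overflow.
crashed = run_isolated("import os; os._exit(3)")
```

The trade-off against an in-process mechanism is cost: every job's inputs and results have to cross a process boundary, so they must be serializable.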

Just a thought, how about simplifying the story a bit: isolate clusters per platform. Realistically, what is the use case for a cluster that spans platforms? Even assuming there is a need to do platform-specific stuff, is it even possible/desirable to describe this sort of "task affinity"?

et1975 avatar Apr 04 '16 15:04 et1975

> Just a thought, how about simplifying the story a bit: isolate clusters per platform. Realistically, what is the use case for a cluster that spans platforms?

Depending on what is meant by "platform", I think there are a number of use cases for non-homogeneous MBrace-style distribution (if it can be made to work reliably - it's hard!). For example, a subset of machines may be GPU-enabled, and for some reason be running a different OS version, or different .NET version, or even a different OS. With .NET Core a possibility, MBrace is tantalizingly close to achieving code mobility across these form factors.

dsyme avatar Apr 05 '16 01:04 dsyme

Hi, I just came across MBrace while looking into .NET solutions to big data problems, and I'm curious to know if there is a definitive status regarding MBrace for Linux. As this issue is still "open", MBrace doesn't support Linux yet, right? - Thanks

DenisDollfus avatar Jan 18 '17 20:01 DenisDollfus

@DenisDollfus we are more or less blocked by the issues described above, yes.

eiriktsarpalis avatar Jan 18 '17 21:01 eiriktsarpalis

@DenisDollfus maybe not what you had in mind, but if you consider Apache Storm a platform, then there's a .NET solution that works on Linux (and Windows) today: https://prolucid.github.io/FsShelter
You can try it out in a container: https://hub.docker.com/r/prolucid/fsshelter-samples/

et1975 avatar Jan 18 '17 21:01 et1975