MBrace cross-platform story
This is an issue to investigate what is required to get MBrace running x-plat. Do we look at Mono or wait for CoreCLR? What are the costs and benefits of each approach? What are the deployment options?
Linked to #93.
As I understand it, we should in theory be able to run on Mono Linux servers if we can remove the dependency on the .NET Service Bus API and go direct to REST (or if the .NET team has made that stuff cross-platform). Vagabond has a Mono problem on the client side (@eiriktsarpalis, that shouldn't affect the server side, should it?).
Service Bus would have to be HTTP calls, I believe. Not only is it (I believe) not x-plat compatible, it is also (again, AFAIK) one of the only Azure .NET SDKs that remains closed source.
An alternative would be to consider another messaging service. Service Bus is the obvious candidate on Azure, but others exist: Storage Queues could potentially be an option, and we could also look at something like Kafka, although that would require more work from an infrastructure / provisioning point of view.
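To make the "go direct to REST" option concrete, here is a rough sketch of what sending a queue message over plain HTTP could look like, using only the Python standard library. It assumes SAS (shared access signature) authentication; the namespace, queue and key names are hypothetical placeholders, and the request is only built, never sent. Treat it as an illustration of the shape of the call, not as what MBrace would actually ship.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse
import urllib.request


def make_sas_token(resource_uri, key_name, key, ttl_seconds=3600, now=None):
    """Build a Service Bus SAS token: HMAC-SHA256 over '<encoded uri>\\n<expiry>'."""
    start = time.time() if now is None else now
    expiry = str(int(start + ttl_seconds))
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    string_to_sign = (encoded_uri + "\n" + expiry).encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest))
    return "SharedAccessSignature sr={}&sig={}&se={}&skn={}".format(
        encoded_uri, signature, expiry, key_name
    )


def build_send_request(namespace, queue, body, key_name, key):
    """Build (but do not send) a POST that enqueues `body` on a queue."""
    url = "https://{}.servicebus.windows.net/{}/messages".format(namespace, queue)
    headers = {
        "Authorization": make_sas_token(url, key_name, key),
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=body, method="POST", headers=headers)


# Hypothetical values, for illustration only.
request = build_send_request(
    "my-namespace", "mbrace-jobs", b"payload", "RootManageSharedAccessKey", "not-a-real-key"
)
```

Passing the result to `urllib.request.urlopen` would perform the actual send. Nothing here depends on a platform-specific SDK, which is the point of the exercise.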
Here's a breakdown of all the issues that (may) inhibit the mbrace cross-platform story:
- Vagabond is known not to function well on Mono. Although a steady stream of bug reports has improved the situation somewhat (runtime crashes are less common), some bugs have still not been fixed (cf. https://bugzilla.xamarin.com/show_bug.cgi?id=33208). It remains to be proven whether Vagabond works well on CoreCLR.
- FsPickler: works on Mono, has not been tested on CoreCLR yet. It relies heavily on runtime code generation; not sure how well that will work on CoreCLR.
- AppDomains: not included in CoreCLR, used by MBrace.Runtime for user code isolation.
- Mono <--> CLR communication: many of the mscorlib types use field-based serialization, so serialized data is highly sensitive to the runtime's implementation. FsPickler does little to bridge any such discrepancies, so communication between Mono and CLR processes might yield serialization errors.
- MBrace.Thespian does not work on Mono. Possibly related to issues in Nessos.Thespian, which is no longer being maintained.
- Azure libraries that are windows-specific.
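The Mono <--> CLR point in the list above can be illustrated with a toy example. The two classes below are hypothetical stand-ins for the same core library type as implemented by two runtimes: identical public API, different private fields. They are not the real mscorlib layouts, and the serializer is a deliberately naive field-based one, not FsPickler.

```python
class TimeSpanRuntimeA:
    """Hypothetical layout on runtime A: stores 100ns ticks."""

    def __init__(self, seconds):
        self._ticks = seconds * 10_000_000

    @property
    def seconds(self):
        return self._ticks // 10_000_000


class TimeSpanRuntimeB:
    """Hypothetical layout on runtime B: same API, stores milliseconds."""

    def __init__(self, seconds):
        self._millis = seconds * 1000

    @property
    def seconds(self):
        return self._millis // 1000


def field_serialize(obj):
    """Naive field-based serialization: dump the private fields as they are."""
    return dict(vars(obj))


def field_deserialize(cls, fields):
    """Rebuild an instance by writing fields directly, bypassing the constructor."""
    obj = cls.__new__(cls)
    obj.__dict__.update(fields)
    return obj


def round_trip(value, target_cls):
    """Serialize on one 'runtime', deserialize on another, then use the public API."""
    return field_deserialize(target_cls, field_serialize(value)).seconds


same_runtime = round_trip(TimeSpanRuntimeA(5), TimeSpanRuntimeA)  # layouts agree

try:
    round_trip(TimeSpanRuntimeA(5), TimeSpanRuntimeB)
    cross_runtime_failed = False
except AttributeError:
    # Runtime B looks for `_millis`, but the payload only carries `_ticks`.
    cross_runtime_failed = True
```

The payload only makes sense to a receiver with the same private layout, which is why a serializer that does not map between layouts would struggle in a mixed Mono/CLR cluster.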
That's a great list.
- Is it fair to assume Vagabond is only buggy in the way the MBrace client uses it, not the workers?
- runtime code gen should be ok on CoreCLR
- what happens if we use processes for isolation on the worker (which I think we will have to)? You keep a pool of appdomains, correct?
- Urgh, field sensitivity could be painful. For CoreCLR I'd imagine it would be much less so
- We should definitely gradually wean ourselves off any non-x-plat Azure services.
1 - That's a fair assumption, although Vagabond does not distinguish between client and server. There exists the theoretical possibility of runtime code generation at worker A, which would then have to be persisted and sent to worker B.
3 - I've been thinking about doing this at some point. For standalone worker deployments, it offers a greater degree of isolation and protects the main process from things like user stack overflows. Another possibility is investigating System.Runtime.Loader, already utilised by C# interactive. Not sure if that works with Mono though.
3 - I assume the fact that app domains are not in .NET Core indicates a move to processes as the unit of isolation. So I'd avoid System.Runtime.Loader if we want any real control.
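As a language-neutral illustration of processes as the unit of isolation, the sketch below runs a piece of "user code" in a child process and reports its exit code, so a hard crash in the child never takes the host down. `run_isolated` is a hypothetical helper written for this comment; it is not part of MBrace, and a real worker would need a pool, a proper channel for results, and resource limits.

```python
import subprocess
import sys


def run_isolated(code, timeout=30):
    """Run a snippet in a fresh interpreter process; return (exit_code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout


# Well-behaved user code: the host receives its output.
ok_code, ok_output = run_isolated("print(6 * 7)")

# Misbehaving user code that kills its own process abruptly.
# The host only observes a non-zero exit code and keeps running.
crash_code, _ = run_isolated("import os; os._exit(3)")
```

The trade-off compared with an AppDomain pool is startup cost and the need to serialize everything across the process boundary, in exchange for surviving failures (such as stack overflows) that no in-process mechanism can contain.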
Just a thought, how about simplifying the story a bit: isolate clusters per platform. Realistically, what is the use case for a cluster that spans platforms? Even assuming there is a need to do platform-specific stuff, is it even possible/desirable to describe this sort of "task affinity"?
> Just a thought, how about simplifying the story a bit: isolate clusters per platform. Realistically, what is the use case for a cluster that spans platforms?

Depending on what is meant by "platform", I think there are a number of use cases for non-homogeneous MBrace-style distribution (if it can be made to work reliably - it's hard!). For example, a subset of machines may be GPU-enabled, and for some reason be running a different OS version, or a different .NET version, or even a different OS. With .NET Core a possibility, MBrace is tantalizingly close to achieving code mobility across these form factors.
Hi, I just came across MBrace while looking into .NET solutions to big data problems, and I'm curious to know if there is a definitive status regarding MBrace for Linux. As this issue is still "open", MBrace doesn't support Linux yet, right? Thanks
@DenisDollfus we are more or less blocked by the issues described above, yes.
@DenisDollfus maybe not what you had in mind, but if you consider Apache Storm a platform, then there's a .NET solution that works on Linux (and Windows) today: https://prolucid.github.io/FsShelter. You can try it out in a container: https://hub.docker.com/r/prolucid/fsshelter-samples/