First-class on-premise Stream Provider
Dear .NET Team,
We run Orleans in on-premise Kubernetes clusters. Our clients insist on running our software (manufacturing AI) within the walls of their factories so that none of their intellectual property gets lost.
So far Orleans has worked very well on-premise for us, but we have been missing out on Persistent Streams, as they are only really available in a cloud deployment with Azure Storage Queues.
While there are some 3rd party implementations (Kafka, Redis, RabbitMQ), their code quality is lacking and they often miss the mark.
Would it be possible to put an on-premise Stream Provider for Orleans in the set of first-class libraries developed by the .NET team? Ideally something that is not a resource hog such as Kafka, more like Etcd.
PS. This has been previously highlighted in https://github.com/dotnet/orleans/issues/7271 but it was not pulled into the backlog.
I guess right now you could use MemoryStreams, but they are not really persistent...
Your guess is right @benjaminpetit
Please triage as feature request.
@turowicz We are not planning to implement this feature in Orleans at the moment. If you have an on-prem database available, a stream provider could be created for that database.
That's very unkubernetesy
If you have a target backend system in mind, you could build a provider for it and release it, or improve the other providers which you mentioned as lacking.
I wouldn't recommend etcd for this, since it was designed as an unpartitioned metadata store (low volumes of data, generally low update rate). NATS may be more appropriate.
@ReubenBond thanks for suggesting NATS, it looks like a good candidate.
RE etcd capacity, have you seen this? https://www.cncf.io/blog/2019/05/09/performance-optimization-of-etcd-in-web-scale-data-scenario/
RE etcd update rate - totally right
A NATS provider would be a great addition, I think
It looks like I have no choice but to implement it. How should I go about making sure it goes in the Contrib?
@ReubenBond do you know how NATS compares to Etcd in terms of key-value storage? I need something that can run on small ARM devices and basically be a layer over the filesystem (small JSON objects). Etcd has delivered on this promise, except that for some reason its caches grow exponentially at times.
> How should I go about making sure it goes in the Contrib?
Coordinate with the community in the #development channel in Discord.
> do you know how NATS compares to Etcd in terms of key-value storage?
It's not built for it - I was suggesting NATS for a stream provider, not KV storage. If the goal is to run on small devices, maybe these kinds of systems aren't ideal. They're mostly built for clusters of servers. Do you need something replicated/resilient? If not, maybe use SQLite for everything and store data locally.
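To make the SQLite suggestion concrete, here is a minimal sketch of a local key-value layer for small JSON objects, using only Python's standard `sqlite3` and `json` modules. The class name `JsonKeyValueStore`, the table name `kv`, and the example keys are all hypothetical and chosen for illustration; nothing here is replicated or resilient, which matches the "store data locally" case only.

```python
import json
import sqlite3


class JsonKeyValueStore:
    """Hypothetical local key-value store for small JSON objects, backed by SQLite."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the example self-contained; pass a file path to persist on disk.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self._db.commit()

    def put(self, key, obj):
        # INSERT OR REPLACE overwrites any existing row with the same primary key.
        self._db.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, json.dumps(obj)),
        )
        self._db.commit()

    def get(self, key, default=None):
        row = self._db.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else default

    def delete(self, key):
        self._db.execute("DELETE FROM kv WHERE key = ?", (key,))
        self._db.commit()


store = JsonKeyValueStore()
store.put("camera/1", {"fps": 30, "enabled": True})
store.put("camera/1", {"fps": 15, "enabled": True})  # overwrites the first value
print(store.get("camera/1"))
```

A single file-backed database like this avoids running a separate server process, which is the main appeal on small devices.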
Actually Etcd works pretty well for me as a single instance with low resources. I was wondering whether I could swap it for NATS, as it also has queues (similar to Azure Storage). I'm fine with spinning up NATS alongside Etcd, but it would save me CPUs if I didn't need two systems.
https://nats.io/blog/kv-cli/#:~:text=JetStream%20for%20Key%2DValue%20Store%20Tech%20Preview%20%7C%20NATS%20blog&text=Key%2DValue%20stores%20are%20specialised,main%20storage%20engine%20of%20Kubernetes.
That said, we also have massive clusters (100s of CPUs & tons of RAM), but our solution runs on a wide variety of hardware.
To clarify what that blog post says, etcd is used for storing metadata/specs/status, but not application state. Small, low velocity data. If it meets your needs, then by all means go for it.
@ReubenBond
I have contacted you guys on the Discord development channel
I've been reading about NATS today and wanted to highlight something I found in the docs about NATS Streaming (also known as "STAN"). It seems that NATS Streaming is effectively a legacy product, and that new streaming work with NATS should be done with NATS JetStream instead. JetStream is fully integrated into the primary NATS product and adds more robust behaviour (persistence, producer/consumer buffering etc.) on top of the base messaging capabilities, i.e. pub/sub, queues and request/reply.
https://docs.nats.io/nats-concepts/jetstream/streams
@oising I am well aware. This will be a JetStream integration with Retention Policy = WorkQueuePolicy
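For readers unfamiliar with work-queue retention, the semantics can be modelled roughly as: a message stays in the stream until a consumer acknowledges it, and is then deleted. The sketch below is a plain in-memory Python model of that behaviour only; it does not talk to NATS, and the class `WorkQueueStream` and its method names are hypothetical, not part of any NATS client API.

```python
from collections import OrderedDict


class WorkQueueStream:
    """In-memory model of work-queue retention (hypothetical, not a NATS client)."""

    def __init__(self):
        self._messages = OrderedDict()  # sequence number -> payload, kept until acked
        self._in_flight = set()         # sequence numbers delivered but not yet acked
        self._next_seq = 1

    def publish(self, payload):
        seq = self._next_seq
        self._next_seq += 1
        self._messages[seq] = payload
        return seq

    def fetch(self):
        # Deliver the oldest message that is not currently in flight.
        for seq, payload in self._messages.items():
            if seq not in self._in_flight:
                self._in_flight.add(seq)
                return seq, payload
        return None

    def ack(self, seq):
        # Acknowledging a message removes it from the stream for good.
        self._in_flight.discard(seq)
        self._messages.pop(seq, None)

    def nak(self, seq):
        # A negative acknowledgement makes the message available for redelivery.
        self._in_flight.discard(seq)

    def __len__(self):
        return len(self._messages)


stream = WorkQueueStream()
stream.publish("event-1")
seq, payload = stream.fetch()
stream.ack(seq)
print(len(stream))  # the acked message is no longer retained
```

This is the property that makes the policy resemble Azure Storage Queues: each message is processed once and does not accumulate after it has been handled.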
We've moved this issue to the Backlog. This means that it is not going to be worked on for the coming release. We review items in the backlog at the end of each milestone/release and depending on the team's priority we may reconsider this issue for the following milestone.
I have halted work on this because the .NET library for NATS does not support async/await, so it is not really feasible to use within the Orleans model.
https://github.com/Surveily/Orleans.Streaming.NATS
There are other unofficial - and active - libraries that are async enabled.
https://link.medium.com/FxlT6t7J2qb (pubsub only, but alloc free and highly optimized)
And:
https://github.com/danielwertheim/mynatsclient
@oising we will be circling back to this issue very soon. Good find, thanks!
@turowicz -- having played with https://github.com/Cysharp/AlterNats for a bit, I can say it's absolutely great. They seem to have full support for rpc, pubsub and messaging (commands) -- highly optimized, supports sharded nats servers and uses modern dotnet idiomatic API design.
Oops, they don't have Jetstream support (yet) - this would be essential for a stream provider, of course.
https://github.com/Cysharp/AlterNats/issues/5
Here is an on-prem provider I wrote that uses stateful grains as queues:
https://github.com/Surveily/Orleans.Streaming.Grains