redis-om-dotnet
Can this library be used for distributed processes?
hi,
From the documentation, it's not very clear to me, so I thought I'd get it clarified. Is it possible to use this library from two independent processes? Using the same example given in the docs, one process loads the customer collection and a second, independent process (one or more) simply uses the collection in its processing. Is this possible?
Another question I had: when does data inserted into a collection get deleted?
Thanks!
Hi @askids,
Not associated with this repo, but maybe I can answer somewhat. From Redis's perspective, this client is just that, a client, and Redis can accept multiple clients at the same time. So if you have multiple processes (clients) making calls to redis-server, each one will use the Multiplexer and reach out to the server, no problem. We stick the Provider in our container and resolve it from whichever process needs to query or work with Redis.
As for persistence, the data stays stored until you delete it or set an expiration time, the same as with traditional Redis.
@VagyokC4 thanks. But my question is: can two clients read the same collection, and are they getting the same copy of the data?
From the example, when we use provider.RedisCollection<Employee>() and multiple clients are connected, are they reading the same data? Typically, we provide a key name to store/retrieve data. Here we are pointing at a RedisCollection, so which key name/id will be used to retrieve the data?
```csharp
var provider = new RedisConnectionProvider("redis://localhost:6379");
var connection = provider.Connection;
var employees = provider.RedisCollection<Employee>();

var employee1 = new Employee { Name = "Bob", Age = 32, Sales = 100000, Department = "Partner Sales" };
var employee2 = new Employee { Name = "Alice", Age = 45, Sales = 200000, Department = "EMEA Sales" };

var idp1 = await connection.SetAsync(employee1);
var idp2 = await employees.InsertAsync(employee2);
```
Also, since I am using this OM library, I would expect it to provide options for setting the expiry as well as deleting the collection. At least from the samples, I could not find anything.
- The RedisCollection materializes from Redis, so yes, they are all reading from the same database. You can `FindById` to get an item by id, or you can just filter based on indexed attributes (see the docs).
- Redis OM .NET doesn't have any way to set the expiration time of something that's been set or inserted, but you can follow up any set/insert immediately with an `EXPIRE` to set the expiry. It's a feature we can look into adding.
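For example, something along these lines, treating the value returned by `InsertAsync` as the generated key and using a plain StackExchange.Redis connection for the follow-up `EXPIRE`; consider it a sketch rather than exact API usage:

```csharp
using Redis.OM;
using StackExchange.Redis;

var provider = new RedisConnectionProvider("redis://localhost:6379");
var employees = provider.RedisCollection<Employee>(); // Employee model from the snippet above

// Insert with Redis OM; the return value is treated here as the generated key.
var key = await employees.InsertAsync(
    new Employee { Name = "Alice", Age = 45, Sales = 200000, Department = "EMEA Sales" });

// Any other process pointed at the same server sees the same record,
// e.g. via FindById or an indexed query, as described above.

// Follow up immediately with an EXPIRE, here via StackExchange.Redis.
var muxer = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
await muxer.GetDatabase().KeyExpireAsync(key, TimeSpan.FromMinutes(30));
```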
@slorello89 thanks for the details. I think it's important to provide a feature to set expiry at the collection level.
Given that there is no default TTL and the collection items are added individually, the expiry time should ideally be the same for all items within the collection. Setting an individual expiry for each item after it's added is not ideal. Also, without an expiry time, the data will sit there forever using unnecessary memory. This is particularly important for Redis Enterprise customers, as they pay for a license per shard, which provides some X GB of usable memory, so developers using this feature unknowingly can fill up the usable memory. Or, the other option that could be provided is to clean up the collection after the processing is done. But to support cleanup, I think the library would need to keep track of all ids that were created for the collection. I think the former option of setting expiry at the collection level may be more appropriate, as that can be set behind the scenes when the data is inserted into Redis.
Keep in mind this library is just a wrapper for Redis, so unless Redis supports the feature, this library won't support the feature. Also, there is a paradigm shift in the way you can now think of Redis. Before redis-search, there were ways to do things and hoops you had to jump through to write an app that used Redis's key/value structure. Now that we have redis-search, we can think about Redis with a different mindset. It's kind of like how the introduction of async/await allowed you to write cleaner code; that's what redis-search did for Redis.
So for your example, you insert your records with an expiration time, and once all items have expired, the "collection" is gone, as Redis doesn't store empty collections; in the meantime you simply query the collection for the records you want to work with.
With that said, I do like the idea of an overload for InsertAsync that takes a TimeSpan and sets the expiration all in one go, like the StackExchange.Redis library provides:

```csharp
heartbeat.Insert(model, TimeSpan.FromSeconds(15));
```
But if we do it at the insert level, wouldn't that become expiry at the individual record level? I was thinking of something at the collection level:

```csharp
var collection = provider.RedisCollection<Person>(TimeSpan.FromSeconds(15));
```

Internally, whenever Insert is called, the OM library would set the absolute expiry time based on what was set at the collection level. That would allow all records to expire at the same time.
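Roughly, what I have in mind would behave like this hypothetical wrapper, which is not something the library provides today; it assumes InsertAsync returns the generated key (as in the snippet above) and leans on StackExchange.Redis for the actual EXPIRE:

```csharp
using System;
using System.Threading.Tasks;
using Redis.OM.Searching;
using StackExchange.Redis;

// Hypothetical sketch only: a wrapper that gives every inserted item the same
// absolute expiry, so the whole "collection" disappears at one moment.
public class ExpiringCollection<T> where T : notnull
{
    private readonly IRedisCollection<T> _collection;
    private readonly IDatabase _db;              // plain StackExchange.Redis handle
    private readonly DateTime _expiresAtUtc;     // shared absolute deadline

    public ExpiringCollection(IRedisCollection<T> collection, IDatabase db, TimeSpan lifetime)
    {
        _collection = collection;
        _db = db;
        _expiresAtUtc = DateTime.UtcNow + lifetime;
    }

    public async Task<string> InsertAsync(T item)
    {
        // Assumption: InsertAsync returns the generated Redis key.
        var key = await _collection.InsertAsync(item);
        var remaining = _expiresAtUtc - DateTime.UtcNow;
        if (remaining > TimeSpan.Zero)
            await _db.KeyExpireAsync(key, remaining);   // same deadline for every item
        return key;
    }
}
```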
I don't think a collection-level TTL is something we're going to support; it's pretty un-Redis-like. I can certainly see adding something along the lines of what @VagyokC4 is talking about, with an overloaded Insert; that's how string sets work after all, and even though there is no expire option for JSON/Hashes, it's something that should be simple enough to add with pipelining.
But purely in terms of data consistency, collection-level expiry makes more sense. You are no longer dealing with individual records when you are processing the data as a collection, so records disappearing in the middle of processing would give inconsistent results.
That's not how Redis works... the "collection" that you see is "imaginary" in the sense that it only exists as long as records exist containing that namespace. As soon as the last record using the "key" is gone, so is the "collection", as you call it. It's just a grouping of keys broken up by the separator. At its heart, Redis is a key/value dictionary, and the only collection that truly exists is the collection of keys for each dictionary entry.
I think you have a couple of options here to get the functionality you are looking for:

1. You can set an absolute expiration time by dynamically calculating your TimeSpan from a fixed point in time, and then they will all expire at the same time; or
2. You can fall back to the StackExchange.Redis library and store all your values under a single hash key, and set an expiration on that key. Then you can update that one key as required, and all values in the hash will disappear once the key has expired.
I think option two is the closest to what you are describing, but keep in mind that you then lose what this library provides: the redis-search ability. In some instances the search is more important; maybe in yours, the uniform expiration is more important.
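A minimal sketch of option two with plain StackExchange.Redis (the key and field names are made up; the point is that a single EXPIRE covers every field in the hash):

```csharp
using StackExchange.Redis;

var muxer = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = muxer.GetDatabase();

// Every record is a field in one hash, so one key carries the whole batch.
await db.HashSetAsync("employees:batch:1", new HashEntry[]
{
    new("employee:1", "{\"Name\":\"Bob\",\"Age\":32}"),
    new("employee:2", "{\"Name\":\"Alice\",\"Age\":45}"),
});

// One expiration on the hash key removes every field at the same moment.
await db.KeyExpireAsync("employees:batch:1", TimeSpan.FromMinutes(30));
```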
@slorello89 For Redis OM Spring, I followed the recipe from Spring Data Redis (https://docs.spring.io/spring-data/redis/docs/current/reference/html/#redis.repositories.expirations), which is to have a `@TimeToLive` annotation that you can place at the class level to give objects a default TTL upon creation, or at the property/method level to use it for specific instance TTL setting.
> That's not how Redis works... the "collection" that you see is "imaginary" in the sense that it only exists as long as records exist containing that namespace. As soon as the last record using the "key" is gone, so is the "collection", as you call it. It's just a grouping of keys broken up by the separator. At its heart, Redis is a key/value dictionary, and the only collection that truly exists is the collection of keys for each dictionary entry.
I get that. But even though it's an imaginary collection, we will be trying to add data into the collection and process the data as a whole, so the consistency of the data in the collection being processed matters. If I were dealing with individual keys, I would know that each key will expire based on what I set individually.
> You can set an absolute expiration time by dynamically calculating your TimeSpan from a fixed point in time, and then they will all expire at the same time
Right. When we define the collection, we would provide an optional input for the expiry time, which would be internally converted to an absolute expiry time. Then whenever we call the library method to insert new entries into the collection, the OM library would set that absolute expiry time on each item, so all items in the collection have the same expiry time. Falling back to the StackExchange client is not an ideal option, as I would need to set it individually for every item, which takes away the benefit of using an abstraction library.
> @slorello89 For Redis OM Spring, I followed the recipe from Spring Data Redis (https://docs.spring.io/spring-data/redis/docs/current/reference/html/#redis.repositories.expirations), which is to have a `@TimeToLive` annotation that you can place at the class level to give objects a default TTL upon creation, or at the property/method level to use it for specific instance TTL setting.
@bsbodden I do like this idea. I can see this option being added to the `DocumentAttribute` to set a default TTL for each document, in addition to the override allowing the TTL to be set all in one go.
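Purely as an illustration of the idea (neither the attribute option nor the TimeSpan overload exists today, so both are shown as comments against the real attributes):

```csharp
using Redis.OM.Modeling;

// Hypothetical: a class-level default TTL option on the Document attribute,
// e.g. [Document(Ttl = 900)] -- no such option exists in the library here.
[Document]
public class Heartbeat
{
    [RedisIdField] public string? Id { get; set; }
    [Indexed] public string? Source { get; set; }
}

// Hypothetical per-call override, as suggested earlier in the thread:
// await collection.InsertAsync(heartbeat, TimeSpan.FromSeconds(15));
```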
@askids What does this mean exactly?
> So the consistency of the data in the collection being processed matters.
Your use case is vague, so I'm not really following. If the idea is to process all records and remove them once processed, then that can be done without a TTL, and with this library. In your scenario, what happens if you set a TTL and something (environmental or outside your control) prohibits you from processing in time? Those records are gone, and unprocessed. Is that okay for your use case?
If the priority is to process all records and save space once done, then for what I think you're describing, I would have a class with an indexed Boolean property to differentiate processed from unprocessed records. When processing, I would search for unprocessed records, maybe sorted oldest to newest, and once processed, I would set the Boolean property accordingly and save the record. Then, if I were concerned about space, I would expire the key after an appropriate amount of time, and if I were concerned about concurrency and locking, I would use something like RedLock to lock my processing code while I am processing the unprocessed records.
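A rough sketch of that pattern (the model, the `Handle` step, and the exact collection calls are illustrative assumptions, not exact API usage):

```csharp
using System.Linq;
using System.Threading.Tasks;
using Redis.OM;
using Redis.OM.Modeling;

[Document(StorageType = StorageType.Json)]
public class WorkItem
{
    [RedisIdField] public string? Id { get; set; }
    [Indexed] public bool Processed { get; set; }
    [Indexed] public long CreatedAtUnix { get; set; }   // timestamp, if you want oldest-first ordering
    public string? Payload { get; set; }
}

public static class Processor
{
    public static async Task ProcessPendingAsync(RedisConnectionProvider provider)
    {
        var items = provider.RedisCollection<WorkItem>();

        // Query the indexed Boolean to get only unprocessed records.
        var pending = items.Where(i => i.Processed == false).ToList();

        foreach (var item in pending)
        {
            Handle(item);                    // placeholder for the real work
            item.Processed = true;
            await items.UpdateAsync(item);   // assumed to persist the changed flag
        }
    }

    private static void Handle(WorkItem item) { /* ... */ }
}
```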
Closing, as this was addressed by #166.