thinky icon indicating copy to clipboard operation
thinky copied to clipboard

Range feeds return new instances of the doc

Open Morgul opened this issue 8 years ago • 4 comments

I have what might be an interesting case. I'm attempting to use thinky to implement an MMO. So, there's that. :sweat_smile:

The problem is, I want to use a range feed to watch for entity creation / updates / destruction, where an entity is essentially:

Entity = thinky.createModel('entity', {
    id: String,
    behavior: String,
    ownerID: String,
    entityManagerID: String,
    state: {}
});

So, I simply subscribed to the feed as such:

Entity.changes()
    .then(function(feed)
    {
        feed.each(function(error, entity)
        {
            // Do stuff here!
        });
    });

To my surprise (and subsequent horror), I found that the model instance returned is a newly created model instance. This means a few things for me, near as I can tell:

  1. Every update creates a new model instance (typically, N * 60 updates a second, where N is the number of entities in my system, shooting for 5,000)
  2. I can't store the model instance as a way of avoiding the overhead from subscribing every entity to a point feed.

I think that 1 is a bigger deal than 2, but both are a bit frustrating. I guess the real question is, am I just going way off the map here? It seems like you could get yourself in trouble real quickly if you ever actually tried to store or subscribe to events on the model instances returned from a range feed...

Morgul avatar Aug 06 '15 04:08 Morgul

So this is a bit more tricky than what it looks like.

Creating new documents means that the documents will be GC as soon as they are not used anymore. If we keep a reference of all of them and then update it, it may blow the memory of your machine.

The current behavior was mostly because at the begining, changefeeds were available on a full table. In which case you don't want to keep a in memory copy of your table :)

However there are now range queries like User.between(10, 12, {index: "score"}).changes() in which case it can make sense to keep track of all these documents. Things get more tricky when it's an ordered range since there's also the position of documents that trigger changes.

I don't think we can figure out a good default, so the best thing I can come up with would be to add an option that would keep a reference of all changes and update the documents in place. That's doable, a bit tricky since primary keys can be weird values but still doable.

neumino avatar Aug 09 '15 02:08 neumino

This might be a good candidate for an optional dependency: node-weak. If we store the cached documents as weak references, they should still get garbage-collected as usual, and we can check the cached objects with weak.isDead(). We could also use the callback functionality to be notified when a document gets GC'ed, but I don't think that'll be necessary.

Of course, it could also be argued that something like this should be external to thinky; in that case, this seems like a good place for a hook, so external code can override how documents are fetched. (the default implementation would be the current behavior of creating a new document each time, but this could be overridden using e.g. node-weak)

whitelynx avatar Aug 10 '15 03:08 whitelynx

Creating new documents means that the documents will be GC as soon as they are not used anymore. If we keep a reference of all of them and then update it, it may blow the memory of your machine.

That is what I assumed, tbh, but the documentation should make this clear, as I had to go code spelunking to figure out this was the case, and then wrote a test case just to make sure I wasn't missing something.

However there are now range queries like User.between(10, 12, {index: "score"}).changes() in which case it can make sense to keep track of all these documents. Things get more tricky when it's an ordered range since there's also the position of documents that trigger changes.

Yeah. For my case, Entity.filter({zone: myManagedZone, manager: myManagerID}).changes() is likely to be the craziest I get. You are right about ordered ranges, but afaik, thinky doesn't support those yet?

I don't think we can figure out a good default, so the best thing I can come up with would be to add an option that would keep a reference of all changes and update the documents in place.

That's likely to meet my need, or the hook that @whitelynx suggested. Either one, really. But before we get these features, could we get some minor documentation explaining the memory considerations of the current implementation?

Morgul avatar Aug 10 '15 15:08 Morgul

I realized today that a better fit for the weak reference caching implementation I mentioned above would be ES6's WeakMap type. That would take care of pretty much all of the needed cleanup, and we could easily integrate it, provided a suitable hook is added to thinky.

whitelynx avatar Aug 10 '15 23:08 whitelynx