NexusData

why it's "not yet optimized for large data sets"

Open malhal opened this issue 9 years ago • 4 comments

First, thanks for this project, it's awesome! I was trying to figure out why you said the above in the Limitations list. From what I can tell, the ManagedObjects registered from fetches are never unregistered, and their corresponding entries in the entityCache are never removed, so the memory is never freed for objects that are queried but not deleted. I would imagine that Core Data on the Mac uses reference counting to decide when it is suitable to release a ManagedObject from the cache. It likely checks whether any of the objects have a reference count of 1 and, if so, removes them from the context's object array, resulting in a release. I had a quick look, and it seems Java doesn't have any mechanism to detect whether an object is referenced by another, which makes this quite a challenging problem to solve elegantly.

Since the feature is likely a long way off, as a workaround in the meantime would you consider adding a temporary method that both unregisters the object and removes it from the entityCache? Perhaps make unregisterObject public; then ManagedObject's setManagedObjectContext could check whether the context is null and, if so, get the cacheNode and remove it. My reasoning is that although we don't have a way to track when the right time is to free an object, having a method to do it would at least be a good start.

malhal avatar Feb 26 '15 13:02 malhal

Thanks for your suggestions. Your reasoning is correct and is the main reason why I stated this limitation.

One potential approach in Java is to consider storing the ManagedObjects using a WeakReference with a ReferenceQueue that will also clear the cacheNode. I haven't thought out the complete details yet and whether this will actually work, but it's something to consider.

Failing that, your proposal of providing a manual mechanism to unregister objects and reclaim their memory is reasonable.
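
Both ideas above (weak references drained through a ReferenceQueue, plus a manual unregister call) can be sketched in plain Java. This is only an illustration under assumptions: `WeakObjectCache`, `register`, `unregister` and `expungeStale` are hypothetical names and are not part of NexusData, and the real entityCache and cacheNode structures may look quite different.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a context's object registry; not NexusData code.
final class WeakObjectCache<K, V> {

    // A weak reference that remembers its key, so the map entry can be
    // removed once the referent has been garbage collected.
    private static final class Entry<K, V> extends WeakReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> entries = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    // Registers an object without keeping it alive.
    void register(K key, V value) {
        expungeStale();
        entries.put(key, new Entry<>(key, value, queue));
    }

    // Returns the cached object, or null if it was never registered,
    // was unregistered, or has already been collected.
    V get(K key) {
        expungeStale();
        Entry<K, V> entry = entries.get(key);
        return entry == null ? null : entry.get();
    }

    // Manual fallback: drop the entry right away.
    boolean unregister(K key) {
        expungeStale();
        return entries.remove(key) != null;
    }

    int size() {
        expungeStale();
        return entries.size();
    }

    // Drains references whose referents were collected and removes
    // their map entries.
    @SuppressWarnings("unchecked")
    private void expungeStale() {
        Entry<K, V> stale;
        while ((stale = (Entry<K, V>) queue.poll()) != null) {
            // Remove only if the map still holds this exact reference;
            // the key may have been re-registered in the meantime.
            entries.remove(stale.key, stale);
        }
    }
}
```

When entries actually disappear depends on the garbage collector, so the timing is not deterministic. A real context would probably also need to hold strong references to objects with unsaved changes so they are not collected before a save; that detail is left out of the sketch.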

dkharrat avatar Feb 27 '15 05:02 dkharrat

The effort you put into this project is amazing. I like Core Data, even though many Core Data features are missing from your implementation. But the performance issues are serious. Fetching a record by id, like:

id == "tt0040175-10475670"

takes 0.2 sec. That is way too much. If I have to repeat this 1000 times, it takes minutes. In Core Data it is done in a few seconds. I guess the predicate string is cached there, so it is not tokenized again on every new fetch. It's a small issue, but in a big project it matters.
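
The caching guessed at above could look roughly like the following: memoize the parsed form of each predicate string so that repeated fetches skip tokenizing. This is a sketch under assumptions. `ParsedPredicateCache` is a hypothetical name, the parser is passed in as a plain function rather than tied to any NexusData class, and the thread does not confirm that parsing is where the 0.2 sec goes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper; P is whatever type the parser produces.
final class ParsedPredicateCache<P> {

    private final Map<String, P> cache = new ConcurrentHashMap<>();
    private final Function<String, P> parser;

    ParsedPredicateCache(Function<String, P> parser) {
        this.parser = parser;
    }

    // Parses the expression on first use and reuses the result afterwards.
    P get(String expression) {
        return cache.computeIfAbsent(expression, parser);
    }

    int size() {
        return cache.size();
    }
}
```

A cache like this only helps when the same predicate text recurs. A lookup by a different id each time produces a different string on every call, so it would need a parameterized predicate (one parsed template with the id supplied separately) to benefit.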

j4nos avatar Dec 16 '16 09:12 j4nos

Doing 1000 queries will be slow regardless. Instead, use an id IN array predicate so one query fetches all objects.

malhal avatar Dec 16 '16 10:12 malhal

I agree that the performance is unacceptable for large datasets, but I believe there are many opportunities for optimization that I haven't pursued yet. Most of the time I spent on this project went toward building the foundation for Core Data-like functionality. Also, as you mentioned, a lot of features are still missing, so development focused on implementing the must-have functionality and I haven't had time for optimization.

Pull requests are welcome :)

dkharrat avatar Dec 17 '16 02:12 dkharrat