Parse-SDK-iOS-OSX icon indicating copy to clipboard operation
Parse-SDK-iOS-OSX copied to clipboard

fix: CollectDirtyChildren causing app to hang when saving object with many pointers

Open shlusiak opened this issue 4 years ago • 28 comments

Please see https://github.com/parse-community/Parse-SDK-Android/pull/1058 for the Android PR and https://github.com/parse-community/Parse-SDK-Android/issues/1006#issuecomment-682448919 for the analysis of the issue.

The collectDirtyChildren forgets seen siblings when traversing, which causes extra traversals if those siblings point to the same objects and causes lots of garbage to be created in memory. It also has the potential of adding the same dirty object twice to the dirtyChildren collection, which may or may not be a set.

Imagine the simplified setup of an Object A with an attribute which is an Array containing two pointers to object B. When traversing A, the array will be looped over. For each object found, the seen set would be duplicated and B would be added and B would be traversed. When backing out and iterating over the next attribute in that array, the original seen variable would be used and B would be traversed again and all of it's child pointers.

Removing the copy of the seen set may improve performance significantly.

Closes: https://github.com/parse-community/Parse-SDK-iOS-OSX/issues/1588 Possibly closes: https://github.com/parse-community/Parse-SDK-iOS-OSX/issues/982

shlusiak avatar Jan 05 '21 07:01 shlusiak

This may be related to #1588. My suggestion would be to port the test cases I listed first (in this PR), then if that utility method is broken, add the fix (it’s in the comment also). I say this because saving child objects relies on batch saves and looks like the batching decision for large arrays is messed up.

If the Android SDK is using the same algorithm for the batch calculation, it will probably need to be fixed also.

cbaker6 avatar Jan 05 '21 13:01 cbaker6

Eyeballing android, it seems to have may the batch count issue also https://github.com/parse-community/Parse-SDK-Android/blob/a5cfa1f9c6654281c76367ad8b3252e2499bb821/parse/src/main/java/com/parse/Lists.java#L67

cbaker6 avatar Jan 05 '21 18:01 cbaker6

Can you rebase this with the master branch to run the CI?

Also, it sounds like you have a possible testcase to verify your bug fix, can you add that also?

cbaker6 avatar Jan 06 '21 19:01 cbaker6

Can you rebase this with the master branch to run the CI?

Also, it sounds like you have a possible testcase to verify your bug fix, can you add that also?

Rebased to master. I don't have a test case written and my knowledge of ObjC is limited and don't think I'd be able to write one. I'd also probably not write a test case to catch this bug (which is an infinite loop given the right schema), but a test case to verify that the fix doesn't break the intended behavior. It would be unfortunate if there were no such test cases yet, but I have not checked.

Does my explaination of the flaw in the algorithm make sense though?

shlusiak avatar Jan 07 '21 03:01 shlusiak

Does my explaination of the flaw in the algorithm make sense though?

The explanation makes sense, but I need to think more if the case you presented should work and not introduce other issues. The problems listed in #982 look like circular dependencies to me that the algorithm couldn't detect. In those examples, it doesn't seem like they setup the schema in a way conducive to Parse. What have you seen after your change? Are you using the change in production?

Do you have any sample code you can show that demonstrates the issue?

cbaker6 avatar Jan 07 '21 04:01 cbaker6

@dplewis @mtrezza @TomWFox what do you all think? This was also brought up in the Android repo. Sounds like it can be a big problem. The questions I have is “what should the algorithm do in this scenario?”. “Does the change fix the problem without introducing other problems?”. I left some other comments above also...

cbaker6 avatar Jan 07 '21 04:01 cbaker6

Yes, we are using this fix in production now as it has fixed the issue we were seeing of the CPU spinning when trying to save a dirty object that contains an array of pointers to other objects, which point to quite a lot of other objects.

I at first also thought about circular dependencies that this algorithm wouldn't detect and that led me down to this code here, where objects would be visited multiple times. Now of course we have circular dependencies everywhere, but the algorithm is 1) detecting the only problematic case just time (two new objects pointing to each other) and 2) is supposed to stop when an object has already been seen. Now this second part, where objects are traversed multiple times smells like an infinite loop, but I couldn't proof that and believe the algorithm is actually meant to terminate eventually.

In our case it has been working fine without issues but a recent addition of another level of pointers to our schema made the whole thing spin forever. Again, I cannot proof that this is stuck in an infinite loop or only for 200+ years.

We have the same fix in production in Android since a few month and have not seen any issues related to this, but our saving usually only saves a single dirty object in one operation.

shlusiak avatar Jan 07 '21 05:01 shlusiak

Forgive my limited ObjC / Swift / iOS skills, but a simple test case would be:

let parent = PFObject(className: "parent")
        let obj = PFObject(className: "object")
        let ch1 = PFObject(className: "child")
        let ch2 = PFObject(className: "child")
        ch1["parent"] = parent
        ch2["parent"] = parent
        
        obj["values"] = [
            ch1,
            ch2
        ]
        
        try! obj.save()

No circular dependencies. This call should succeed in both cases, with or without my fix, but without my fix parent and all it's children will be visited twice, with my fix only once. I don't know how you'd cement that into a test case as the result is exactly the same, just one is more efficient.

shlusiak avatar Jan 07 '21 05:01 shlusiak

No circular dependencies. This call should succeed in both cases, with or without my fix, but without my fix parent and all it's children will be visited twice, with my fix only once. I don't know how you'd cement that into a test case as the result is exactly the same, just one is more efficient.

in this particular framework, every PFObject is a class (reference type) and since you didn’t save the parent first it might have trouble hashing to a unique value (It doesn’t have an objectId yet) causing the parent not to be recognized as the same object. If you really wanted to do this it seems you should save the parent first (Gets an objectId), then make the children point to the parent after the parent is saved

cbaker6 avatar Jan 07 '21 05:01 cbaker6

Two objects attempting to point to the same object that is unsaved and therefore not unique can cause a problem

cbaker6 avatar Jan 07 '21 05:01 cbaker6

If PFObjects were structs and equatable then there probably wouldn’t be an issue with this, but they are not here

cbaker6 avatar Jan 07 '21 05:01 cbaker6

This would have disastrous consequences already as you'd end up with multiple dirty instances if that was the case. Also I'm pretty sure that because PFObject is a class and not a struct, two pointers to the parent will be identical (reference) and the hash will be identical and the object would only be added to the set once.

But this is not the point I am trying to make, my algorithm change will not change what happens when two objects point to the same unsaved object.

shlusiak avatar Jan 07 '21 05:01 shlusiak

Classes don’t hash to the same value unless you make them conform to hashable/equatable or do some bitwise comparison. Since PFObject subclasses NSObject, you may be getting some help, but that’s not something I would personally depend on. Parse considers an object “Identfiable” when it has an objectId, in your case, none of them do yet but are pointing to the same object. You are claiming they should be identifiable because they are referencing the same local object, but these objects are being encoded and sent to the server with no objectId. Two objects can look exactly the same in encoded form, does that make them the same object? No way to tell, this can either be resolved as a circular dependency or treated as separate ParseObjects...

Your coded example looks like a corner case circular dependency when using the current design of the SDK and it seems to be the case with the infinite loop you are getting. I gave a couple of examples of how to offset this, at least when it comes to saving to the server. I don’t know much about how save to local storage work as I assume they get some localId before saving.

My questions above about will this change fix the problem without introducing others still remain. If the algorithm is indeed missing this corner case or can be improved to mitigate it, it should.

cbaker6 avatar Jan 07 '21 13:01 cbaker6

In my example the parent object is the same instance. This should have the same hash because it is literally the same instance, not another object of the same class qith the same values. The two child classes are different and will get different objectIds. Also there is no circle and this is no edge case. Feel free to assume that all objects are already persisted and have objectIds and that there is no local storage. The algorithm is simply inefficient and with larger data sets is likely to quickly churn on CPU and may appear to never finish even if none of the objects are dirty.

shlusiak avatar Jan 08 '21 11:01 shlusiak

Feel free to assume that all objects are already persisted and have objectIds and that there is no local storage.

This was not the example you presented, it’s completely different. You presented saving 3 “new” objects, that means no objectIds. Your latest comment says assume objectIds... if the children have Parse Ponters to “p”, then their shouldn’t be a traversal issue. I’m assuming in your new scenario you have all keys included, having the complete object of p pointed to which is why p’s keys are even being looked at from the children...

At any rate, in order to understand the possible improvements or issues your change may introduce you should post the updated code depicting the scenario that’s causing the issue as closely as possible.

cbaker6 avatar Jan 08 '21 12:01 cbaker6

I fail to see why my example is not sufficient enough to outline what is happening in the algorithm and what I believe is a flaw. Whether objects are new or not is not relevant as they will be traversed regardless, to find circular dependencies of new objects. Neither does my example contain a circular dependency, to keep it simple.

Can we agree that the algorithm would be expected to:

  • Traverse all objects and find no cycle
  • come up with all objects being dirty / new
  • persist parent first
  • persist the two ch objects, once parent has an objectId
  • persist the obj instance last once the ch objects have an objectId

The algorithm does all that and there is no issue. Except that it will traverse the parent object twice. And that is should not do that since that might really blow out the search tree.

My fix is a way to stop the algorithm from visiting nodes it has already seen. Now this will ideally not change the end result of what the algorithm is producing.

Whether my fix is negatively impacting other aspects of this algorithm other than outlined above would be up for debate and I'd like to focus on that rather than batching issues outside the scope of the affected function.

My simplified test case is of course not suitable to find regressions my fix could introduce but I was hoping we could get back to discuss my proposed changes and whether we can agree on them having positive or negative effect on the algorithm.

shlusiak avatar Jan 09 '21 05:01 shlusiak

Is there any update on this? We have the occasional issue when saving complex objects, and I suspect it's related.

eeallen1 avatar May 26 '21 00:05 eeallen1

This issue has been automatically marked as stale because it has not had recent activity. If you believe it should stay open, please let us know! As always, we encourage contributions, check out the Contributing Guide

stale[bot] avatar Jul 13 '21 02:07 stale[bot]

This project is stale. :(

shlusiak avatar Jul 13 '21 02:07 shlusiak

This issue has been automatically marked as stale because it has not had recent activity. If you believe it should stay open, please let us know! As always, we encourage contributions, check out the Contributing Guide

stale[bot] avatar Oct 02 '21 02:10 stale[bot]

Thanks for opening this pull request!

I will reformat the title to use the proper commit message syntax.

Rebased onto master, as this issue was never fixed. Is there any progress on this or has this project become stale?

shlusiak avatar Jun 26 '23 04:06 shlusiak

@shlusiak Could you try to write up a test case? A test case is usually required to sustainably fix a bug. Maybe someone from @parse-community/ios-sdk could help. You could look at existing test cases, choose and that is most similar to what you're trying to test, then copy and modify it. Also maybe take a look at the related issue in the Android SDK if there is any test case you can use here.

mtrezza avatar Jun 26 '23 12:06 mtrezza

Welcome to Codecov :tada:

Once merged to your default branch, Codecov will compare your coverage reports and display the results in this comment.

Thanks for integrating Codecov - We've got you covered :open_umbrella:

codecov[bot] avatar Jun 26 '23 12:06 codecov[bot]

@stephanboner Would you be interested in joining the iOS SDK team there for occasional reviews?

mtrezza avatar Jun 26 '23 12:06 mtrezza

@stephanboner Would you be interested in joining the iOS SDK team there for occasional reviews?

Hey @mtrezza, thanks for asking! Unfortunately I'm not working as a software developer anymore and therefore I no longer have the required infrastructure (a Mac with Xcode). Thank you though!

stephanboner avatar Jun 27 '23 12:06 stephanboner

@stephanboner Thanks for replying, wish you all the best in your new career!

mtrezza avatar Jun 27 '23 13:06 mtrezza