Loading UserProfile, MetaFile and Locations optimization

Open ippes opened this issue 11 years ago • 1 comments

In H2H, three types of data stored in the network can currently be updated: the user profile, the meta files (which hold information about chunks etc.), and the locations map. All other data is never changed and only occasionally removed. Processes that work with this mutable data and require the newest version are forced to download it from the network every single time, which is unnecessary if nothing has changed. It would make sense to cache the loaded data. To be sure the cached copy is still the newest, we could fetch only the metadata of the corresponding object from the network instead of loading the whole object. Common solutions use hashes for this purpose. We could use the version keys (the fourth dimension in the key range of TomP2P). When an object is modified, a new version key is generated. TomP2P tracks these versions and the corresponding version keys, so we could just compare the keys.

ippes avatar Apr 05 '14 22:04 ippes

I noticed long delays while testing sharing use cases (two users share a folder with write or read permission, move files/folders in, out, or internally, add/delete files/folders, and update files). I identified the UserProfileManager as the main bottleneck: most of the mentioned processes get the UserProfile from the network and put it back several times. Besides the network delays, encryption and decryption take the most time and computing power. Therefore I refactored the get UserProfile procedure:

  • A process requesting the latest UserProfile contacts the UserProfileManager.
  • The UserProfileManager performs a get digest (list of version keys).
  • The UserProfileManager compares the version key of the digest with the cached UserProfile's version key.
  • If the version keys are the same, the cached UserProfile is returned to the requesting process.
  • If not, a normal get (with decryption) is performed and the result is cached.

The benefit is that we avoid the loading and decryption when nothing has changed. A further improvement is that the put UserProfile procedure also caches the UserProfile. When I ran some share use case tests, I noticed that they executed 2 to 3 times faster! We are on the right track ;)
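
For illustration, the get and put procedures described above could be sketched roughly as follows. This is not the actual Hive2Hive code: `ProfileStore`, `InMemoryStore` and `CachingProfileManager` are hypothetical names, the store is an in-memory stand-in for the network rather than the TomP2P API, and `encrypt`/`decrypt` are placeholders that only convert between strings and bytes. The sketch also ignores the case where the profile changes between the digest and the full get.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the network; NOT the Hive2Hive or TomP2P API.
interface ProfileStore {
    long latestVersionKey();     // cheap "get digest": only the version key
    byte[] getEncrypted();       // expensive full get of the encrypted profile
    long put(byte[] encrypted);  // stores the data, returns the new version key
}

// In-memory store that counts full gets so the cache effect is observable.
class InMemoryStore implements ProfileStore {
    private byte[] data = new byte[0];
    private long versionKey = 0;
    int fullGets = 0;

    public long latestVersionKey() { return versionKey; }

    public byte[] getEncrypted() {
        fullGets++;
        return data.clone();
    }

    public long put(byte[] encrypted) {
        data = encrypted.clone();
        versionKey++;            // every modification yields a new version key
        return versionKey;
    }
}

class CachingProfileManager {
    private final ProfileStore store;
    private String cachedProfile;        // null until something is cached
    private long cachedVersionKey = -1;

    CachingProfileManager(ProfileStore store) { this.store = store; }

    String getUserProfile() {
        long latest = store.latestVersionKey();
        if (cachedProfile != null && latest == cachedVersionKey) {
            return cachedProfile;        // unchanged: skip download and decryption
        }
        cachedProfile = decrypt(store.getEncrypted());
        cachedVersionKey = latest;
        return cachedProfile;
    }

    void putUserProfile(String profile) {
        // The put also refreshes the cache, so a following get is a cache hit.
        cachedVersionKey = store.put(encrypt(profile));
        cachedProfile = profile;
    }

    // Placeholders for the real (costly) encryption and decryption.
    private static byte[] encrypt(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    private static String decrypt(byte[] b) { return new String(b, StandardCharsets.UTF_8); }
}
```

With two managers sharing one store, a get after a manager's own put is served from its cache, while the other manager performs exactly one full get and then serves repeated requests from its cache until the version key changes again.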

ippes avatar Apr 20 '14 17:04 ippes