pm-discussify Perform performance tests on the data-model

Due to a lack of time to develop a more sophisticated CRDT, we will be using a simple OR-Set CRDT for the discussion (comments), where each entry has the following structure:

{
  "id": "hash(authorDId + date + text)",
  "parentId": "xxx",  // optional
  "date": "2018-06-26T21:19:27+00:00",
  "contentHash": "QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG",
  "contentSize": 432432,
}
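To make the structure above concrete, here is a minimal TypeScript sketch of the entry, the id computation, and an OR-Set holding the entries. It is only an illustration: `CommentEntry`, `computeEntryId` and `CommentOrSet` are hypothetical names, and SHA-256 over the plain concatenation is an assumption, since the hash function has not been decided here.

```typescript
import { createHash, randomUUID } from "node:crypto";

// Index entry as described above; field names follow the structure in this issue.
interface CommentEntry {
  id: string;
  parentId?: string;   // optional, set for replies
  date: string;        // ISO 8601 timestamp
  contentHash: string; // hash of the IPFS file holding the full comment
  contentSize: number; // declared size of that file, in bytes
}

// Assumption: SHA-256 over the concatenated fields; the issue only says "hash(...)".
function computeEntryId(authorDid: string, date: string, text: string): string {
  return createHash("sha256").update(authorDid + date + text).digest("hex");
}

// Minimal observed-remove set: every add carries a unique tag, and a remove
// tombstones only the tags this replica has already observed.
class CommentOrSet {
  private adds = new Map<string, { entry: CommentEntry; tags: Set<string> }>();
  private tombstones = new Set<string>();

  add(entry: CommentEntry, tag: string = randomUUID()): void {
    const slot = this.adds.get(entry.id) ?? { entry, tags: new Set<string>() };
    slot.tags.add(tag);
    this.adds.set(entry.id, slot);
  }

  remove(id: string): void {
    const slot = this.adds.get(id);
    if (!slot) return;
    slot.tags.forEach((tag) => this.tombstones.add(tag));
  }

  // Merging is a union of both the add tags and the tombstones.
  merge(other: CommentOrSet): void {
    other.adds.forEach(({ entry, tags }) => {
      tags.forEach((tag) => this.add(entry, tag));
    });
    other.tombstones.forEach((tag) => this.tombstones.add(tag));
  }

  // An entry is visible while at least one of its tags is not tombstoned.
  values(): CommentEntry[] {
    return Array.from(this.adds.values())
      .filter(({ tags }) => Array.from(tags).some((t) => !this.tombstones.has(t)))
      .map(({ entry }) => entry);
  }
}
```

With this shape, an add that happens concurrently with a remove wins after merging, because the new tag was never observed by the removing replica.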

The contentHash points to an IPFS file that contains the same information as above, plus the following additional fields:

author (did, name, photo, etc)
comment itself
public key used to sign
signature

The contentSize caps the number of bytes we fetch, so that bad peers cannot point us to huge files. Moreover, comments that have a high contentSize are not fetched without user consent.
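A rough sketch of how the contentSize cap could be enforced while reading a comment file. This is not tied to any IPFS API: `readCapped` and `needsConsent` are hypothetical helpers, the stream is modelled as a plain `AsyncIterable<Uint8Array>`, and the consent threshold is an assumed value since no number has been agreed on. The sketch rejects the file as soon as more bytes arrive than were declared; silently truncating at contentSize would be another option.

```typescript
// Assumed threshold above which the user must opt in; not specified in this issue.
const CONSENT_THRESHOLD_BYTES = 64 * 1024;

function needsConsent(contentSize: number): boolean {
  return contentSize > CONSENT_THRESHOLD_BYTES;
}

// Reads chunks until the stream ends, failing once the declared size is exceeded.
async function readCapped(
  chunks: AsyncIterable<Uint8Array>,
  contentSize: number,
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let received = 0;

  for await (const chunk of chunks) {
    received += chunk.byteLength;
    if (received > contentSize) {
      throw new Error(`peer sent more than the declared ${contentSize} bytes`);
    }
    parts.push(chunk);
  }

  // Concatenate the accepted chunks into a single buffer.
  const out = new Uint8Array(received);
  let offset = 0;
  parts.forEach((part) => {
    out.set(part, offset);
    offset += part.byteLength;
  });
  return out;
}
```

A caller would check `needsConsent(entry.contentSize)` first and only call `readCapped` once the user agrees or the entry is below the threshold.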

Advantages:

Simple to implement
Scales well to hundreds, possibly thousands of comments
Knowledge of the whole tree beforehand
Easy to perform pagination and sorting
Simple to implement a security model on top of it
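
To illustrate the tree, sorting and pagination points: since every replica holds all entries, the tree can be rebuilt locally from `parentId` without fetching any comment content. The sketch below is just one way to do it; `buildTree` and `page` are hypothetical names, replies are sorted oldest first, and entries whose parent is unknown are treated as top-level comments (both are assumptions).

```typescript
// Only the fields needed for ordering; see the full entry structure above.
interface CommentEntry {
  id: string;
  parentId?: string;
  date: string; // ISO 8601 timestamp
}

interface CommentNode {
  entry: CommentEntry;
  replies: CommentNode[];
}

// Sorts a list of nodes by date, oldest first, then does the same for their replies.
function sortDeep(list: CommentNode[]): void {
  list.sort((a, b) => Date.parse(a.entry.date) - Date.parse(b.entry.date));
  list.forEach((node) => sortDeep(node.replies));
}

// Builds the comment tree from the flat set of entries.
function buildTree(entries: CommentEntry[]): CommentNode[] {
  const nodes = new Map<string, CommentNode>();
  entries.forEach((entry) => nodes.set(entry.id, { entry, replies: [] }));

  const roots: CommentNode[] = [];
  nodes.forEach((node) => {
    const parent = node.entry.parentId ? nodes.get(node.entry.parentId) : undefined;
    // Assumption: a reply whose parent is missing is shown as a top-level comment.
    (parent ? parent.replies : roots).push(node);
  });

  sortDeep(roots);
  return roots;
}

// Zero-based pagination over any already sorted list.
function page<T>(items: T[], pageNumber: number, pageSize: number): T[] {
  const start = pageNumber * pageSize;
  return items.slice(start, start + pageSize);
}
```

Only the comments on the visible page then need their contentHash resolved, which keeps the extra round trips proportional to the page size.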

Disadvantages:

Does not scale indefinitely
Requires additional round trips to fetch the comments

I will be running some tests with the above data model to see how it performs. Security checks will be discussed in a separate issue.

Jun 26 '18 23:06 satazor

//cc @pgte

Jun 26 '18 23:06 satazor

In the same spirit as the research work you did for Identity, it would be great if the discussify team captured all the great tools for organizing data that were presented during the Lisbon Hack Week, as a way to take inspiration, identify shortcomings and find possible collaborations.

Jun 27 '18 08:06 daviddias

@diasdavid are you referring to https://www.notion.so/ and the like? If so, I've created https://github.com/ipfs-shipyard/kipster/issues/1

Jun 27 '18 08:06 satazor

Notion, Jupyter Notebooks and all the others Juan presented, with a focus on the data model.

Jun 27 '18 09:06 daviddias

Maybe this is not critical for the PoC. Let's move this to the backlog as something to address after we have an initial delivery.

Jul 02 '18 14:07 marcooliveira
