estuary icon indicating copy to clipboard operation
estuary copied to clipboard

Add /collections/{coluuid}/content/{contentid} [delete] endpoint

Open gmelodie opened this issue 2 years ago • 14 comments

This PR adds the endpoint to remove a content from a collection, also refactors some of the code so it's more reusable.

TODO:

  • [x] Create GetContent function
    • [x] Create contents package
  • [x] Can one content be twice in a collection? if so we need the path specified (YES)
    • [x] Can we have the same content on the same collection and on the same path? (NO)
      • [x] Specify path on delete endpoint

Fixes #421

Verification:

+ APIKEY=EST599c28b4-fd89-4b44-b579-74af0997a5ecARY
+ path1=/my/path/1
+ path2=/my/path/2
+ curl -X POST -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/create -d { "name": "A new collection", "description": "A new collection test" } -s

+ coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366
+ echo coluuid: 5b73a91a-0dfc-4f6b-94d1-aad8c9881366
coluuid: 5b73a91a-0dfc-4f6b-94d1-aad8c9881366
+ curl -X POST -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/content/add?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366&dir=/my/path/1 -F+  data=@estuary -s

+ file1=58
+ echo added file1 (58) to estuary
added file1 (58) to estuary
+ curl -X POST -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/content/add?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366&dir=/my/path/2 -F data=@estuary -s

+ file2=59
+ echo added file2 (59) to estuary
added file2 (59) to estuary
+ curl -X GET -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/content?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366 -s
+ jq
[
  {
    "id": 58,
    "updatedAt": "2022-09-14T18:40:21.279946787Z",
    "cid": "bafybeiezx5rydlqhyqzlqnc7zz7eawqowyzfjoon6b43dbkvws5rop3ivy",
    "name": "estuary",
    "userId": 1,
    "description": "",
    "size": 193993620,
    "type": 0,
    "active": true,
    "offloaded": false,
    "replication": 6,
    "aggregatedIn": 0,
    "aggregate": false,
    "pinning": false,
    "pinMeta": "",
    "replace": false,
    "origins": "",
    "failed": false,
    "location": "local",
    "dagSplit": false,
    "splitFrom": 0,
    "path": "/my/path/1/estuary"
  },
  {
    "id": 59,
    "updatedAt": "2022-09-14T18:40:22.910605199Z",
    "cid": "bafybeiezx5rydlqhyqzlqnc7zz7eawqowyzfjoon6b43dbkvws5rop3ivy",
    "name": "estuary",
    "userId": 1,
    "description": "",
    "size": 193993620,
    "type": 0,
    "active": true,
    "offloaded": false,
    "replication": 6,
    "aggregatedIn": 0,
    "aggregate": false,
    "pinning": false,
    "pinMeta": "",
    "replace": false,
    "origins": "",
    "failed": false,
    "location": "local",
    "dagSplit": false,
    "splitFrom": 0,
    "path": "/my/path/2/estuary"
  }
]
+ curl -X DELETE -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/5b73a91a-0dfc-4f6b-94d1-aad8c9881366/content/58 -s
+ echo deleted file1 (58) to estuary
deleted file1 (58) to estuary
+ curl -X GET -H+  Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/content?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366 -s
jq
[
  {
    "id": 59,
    "updatedAt": "2022-09-14T18:40:22.910605199Z",
    "cid": "bafybeiezx5rydlqhyqzlqnc7zz7eawqowyzfjoon6b43dbkvws5rop3ivy",
    "name": "estuary",
    "userId": 1,
    "description": "",
    "size": 193993620,
    "type": 0,
    "active": true,
    "offloaded": false,
    "replication": 6,
    "aggregatedIn": 0,
    "aggregate": false,
    "pinning": false,
    "pinMeta": "",
    "replace": false,
    "origins": "",
    "failed": false,
    "location": "local",
    "dagSplit": false,
    "splitFrom": 0,
    "path": "/my/path/2/estuary"
  }
]

gmelodie avatar Sep 05 '22 20:09 gmelodie

Looks good. I can test out the third TODO in a moment, however I mentioned that I had issues building estuary on WSL, which version of golang should I be using to build the project?

Secondly, I think it's important to consider permissions before pushing these features to the main branch because this could cause issues with malicious users removing content from collections which should not be removed and/or removing content from the database that is critical to a application's operation.

  1. There should be a permission flag when generating an API KEY that says whether or not the user can remove content from the database.
  2. Ideally, there should be collection-level permissions too, i.e. when generating an API KEY I can set some collections with read, add and/or delete permissions.

pythonicode avatar Sep 05 '22 23:09 pythonicode

@gmelodie to your 3rd point, yes content can be in a collection multiple times as long as the path is different:

{
  .....
 "2": {
  "id": 30001344,
  "updatedAt": "2022-05-26T04:09:42.650004Z",
  "cid": "bafkreidifokvrano355jsrh3kc2ehka565qy7jvw5rr5fgcfrfnz6i2wre",
  "name": "starfire.png.enc",
   .....
  "path": ""
 },
 "3": {
  "id": 30001344,
  "updatedAt": "2022-05-26T04:09:42.650004Z",
  "cid": "bafkreidifokvrano355jsrh3kc2ehka565qy7jvw5rr5fgcfrfnz6i2wre",
  "name": "starfire.png.enc",
  .....
  "path": "/images/starfire.png.enc"
 },
 "4": {
  "id": 30001344,
  "updatedAt": "2022-05-26T04:09:42.650004Z",
  "cid": "bafkreidifokvrano355jsrh3kc2ehka565qy7jvw5rr5fgcfrfnz6i2wre",
  "name": "starfire.png.enc",
   .....
  "path": "/images/t.png.enc"
 },
 "5": {
  "id": 30001344,
  "updatedAt": "2022-05-26T04:09:42.650004Z",
  "cid": "bafkreidifokvrano355jsrh3kc2ehka565qy7jvw5rr5fgcfrfnz6i2wre",
  "name": "starfire.png.enc",
   ....
  "path": "/images/te.png.enc"
 },
 "6": {
  "id": 30001344,
  "updatedAt": "2022-05-26T04:09:42.650004Z",
  "cid": "bafkreidifokvrano355jsrh3kc2ehka565qy7jvw5rr5fgcfrfnz6i2wre",
  "name": "starfire.png.enc",
   ....
  "path": "/images/tes.png.enc"
 },
 "7": {
  "id": 30001344,
  "updatedAt": "2022-05-26T04:09:42.650004Z",
  "cid": "bafkreidifokvrano355jsrh3kc2ehka565qy7jvw5rr5fgcfrfnz6i2wre",
  "name": "starfire.png.enc",
  ...
  "path": "/images/test.png.enc"
 }
}

pythonicode avatar Sep 06 '22 00:09 pythonicode

@pythonicode thank you so much for reviewing this! To your point on security, I think that should be fine since GetCollection checks if the collection from which you're trying to delete belongs to you.

About the path thing. Thank you so much for checking! As a user, how would you expect the path to be specified? Here's a couple options I thought of:

  1. Required path field on the body of the request
  2. Optional path field on the body of the request (it'll delete the content if it only finds one, and if it finds more and there's no path specified, it errors out)

which version of golang should I be using to build the project?

I'm using go1.18.1

Ideally, there should be collection-level permissions too, i.e. when generating an API KEY I can set some collections with read, add and/or delete permissions.

Humm I think that's complicated. I think ideally collections would be an actual filesystem (that would make a lot of things easier), but at the same time that seems to be something easy to do in a client level. I had an interesting chat with @jimmylee about not having estuary do all the work recently, which I think applies here.

Also another question: can we have the same content on the same collection and on the same path?

gmelodie avatar Sep 06 '22 14:09 gmelodie

@gmelodie I think most prefer passing parameters in the URL when possible and I think that applies here.

Could you explain how to deal with security on the client side? Assuming that I expose an API KEY to the client they can remove content from any collection associate with my account no? As far as I can tell there's no way to prevent the client from removing content from one collection and allowing them to remove content from another collection without some sort of serverside logic involved (ideally without having to setup my own server considering I'm running a standalone application).

And to your last question: no two pieces of content can exists in the same collection at the same path (this includes content with different IDs as well).

pythonicode avatar Sep 06 '22 17:09 pythonicode

@pythonicode I thought about URL but that would be weird for paths I think as they'd contain weird characters (dots, slashes, etc.).

As for the security, what you could is not give your key to your clients and only allow them to interact with your files through your application. Your application would then keep track of the permissions each user has.

gmelodie avatar Sep 06 '22 18:09 gmelodie

@gmelodie yeah I suppose that it you pass the path via body it can support UTF-8/UTF-16 characters and symbols. I think it should be optional and the behavior you described sounds good to me since some people don't use paths in their collections.

How can my application call the API without a key though? I would need to implement my own server to redirect client requests which is doable but not ideal since I would need my database to associate clients with collection IDs, I suppose this is too complicated to support natively as well?

pythonicode avatar Sep 06 '22 21:09 pythonicode

Yes you need to call the API through your own server.

There is an option for an upload only perms API key with expiration. Some other products take advantage of that and are less worried about the key being exposed since it will be used one time.

I'm a fan of the concept of a single use API key, that deletes itself after usage. We can explore that option in the future.

jimmylee avatar Sep 07 '22 03:09 jimmylee

@pythonicode I'd love your review as well!

gmelodie avatar Sep 14 '22 19:09 gmelodie

@pythonicode I'd love your review as well!

Hi,

It looks good. One comment I would make is that in my original PR I had the concept that a user can delete content by collection ID and path WITHOUT using content id at all.

That way you can recursively delete content (say you want to delete all content under the path /private)

Potentially that could be a separate endpoint and I'm happy to create a PR for that if you're interested.

pythonicode avatar Sep 14 '22 21:09 pythonicode

@gmelodie I think we should add support to delete by content ID (single content) and path (multiple contents) using the body to specify the intention

by content ID - deletes only that content of that collection

DELETE collections/{collection_id}/contents 
{
   "by": "id"
   "value": "content_id"
}

by path - deletes every content in that collection that has that path

DELETE collections/{collection_id}/contents
{
   "by": "path"
   "value": "path"
}

Also, how does this affect the committed collection state?

en0ma avatar Sep 15 '22 12:09 en0ma

Secondly, I think it's important to consider permissions before pushing these features to the main branch because this could cause issues with malicious users removing content from collections which should not be removed and/or removing content from the database that is critical to a application's operation.

  1. There should be a permission flag when generating an API KEY that says whether or not the user can remove content from the database.
  2. Ideally, there should be collection-level permissions too, i.e. when generating an API KEY I can set some collections with read, add and/or delete permissions.

Things like these are why I think we should Identify some boundaries around Estuary. Some domains will be small and focused on core protocols and the others will be product feature-rich (like a SaaS). This approach will allow us to innovate and build really cool things on what we currently have. It would easily have solved this ask by @pythonicode

I have mentioned this, but I have not been able to pass my idea better - I hope to do so soon enough.

en0ma avatar Sep 15 '22 13:09 en0ma

@en0ma and @pythonicode I love your suggestions! I'll implement them.

Also, how does this affect the committed collection state?

It shouldn't, I'll test

gmelodie avatar Sep 15 '22 16:09 gmelodie

Implemented remove by path. Validation:

+ APIKEY=EST599c28b4-fd89-4b44-b579-74af0997a5ecARY
+ path1=/my/path/1
+ path2=/my/path/2
+ curl -X POST -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/ -d { "name": "A new collection", "description": "A new collection test" } -s
+ jq
+ + grepcut uuid -d
 " -f 4
+ coluuid=d0cd317f-23a1-47de-af00-c96c08031529
+ echo coluuid: d0cd317f-23a1-47de-af00-c96c08031529
coluuid: d0cd317f-23a1-47de-af00-c96c08031529
+ curl -X POST -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/content/add?coluuid=d0cd317f-23a1-47de-af00-c96c08031529&dir=/my/path/1 -F data=@estuary -s
+ jq
+ grep estuaryId
+ cut -d   -f 4
+ tr -d ,
+ file1=95
+ echo added file1 (95) to estuary
added file1 (95) to estuary
+ curl -X POST -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/content/add?coluuid=d0cd317f-23a1-47de-af00-c96c08031529&dir=/my/path/2 -F data=@estuary -s
+ jq
+ grep estuaryId
+ cut -d   -f 4
+ tr -d ,
+ file2=96
+ echo added file2 (96) to estuary
added file2 (96) to estuary
+ curl -X GET -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/d0cd317f-23a1-47de-af00-c96c08031529 -s
+ jq
[
  {
    "id": 95,
    "updatedAt": "2022-09-15T17:09:11.282718151Z",
    "cid": "bafybeic7gxilewowgp3ms743jug7wqojl4wxnfypxmi522xk5rermby2wi",
    "name": "estuary",
    "userId": 1,
    "description": "",
    "size": 193998228,
    "type": 0,
    "active": true,
    "offloaded": false,
    "replication": 6,
    "aggregatedIn": 0,
    "aggregate": false,
    "pinning": false,
    "pinMeta": "",
    "replace": false,
    "origins": "",
    "failed": false,
    "location": "local",
    "dagSplit": false,
    "splitFrom": 0,
    "path": "/my/path/1/estuary"
  },
  {
    "id": 96,
    "updatedAt": "2022-09-15T17:09:11.901528982Z",
    "cid": "bafybeic7gxilewowgp3ms743jug7wqojl4wxnfypxmi522xk5rermby2wi",
    "name": "estuary",
    "userId": 1,
    "description": "",
    "size": 193998228,
    "type": 0,
    "active": true,
    "offloaded": false,
    "replication": 6,
    "aggregatedIn": 0,
    "aggregate": false,
    "pinning": false,
    "pinMeta": "",
    "replace": false,
    "origins": "",
    "failed": false,
    "location": "local",
    "dagSplit": false,
    "splitFrom": 0,
    "path": "/my/path/2/estuary"
  }
]
+ curl -X DELETE -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/d0cd317f-23a1-47de-af00-c96c08031529/content -d {"by": "path", "value": "/my/path"} -s
+ echo deleted files on /my/path from estuary
deleted files on /my/path from estuary
+ curl -X GET -H Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY http://localhost:3004/collections/d0cd317f-23a1-47de-af00-c96c08031529 -s

gmelodie avatar Sep 15 '22 17:09 gmelodie

Looks good to me. Nice work!

pythonicode avatar Sep 15 '22 17:09 pythonicode