rmapy WIP: New api

This PR is my WIP on the new api.

Closes #25

Progress:

[x] fetch metadata and create Collection using get_meta_items()
[x] speed up get_meta_items() -> currently too many requests
[x] fix missing metadata (possible key mismatch from old api)
[ ] upload files
[ ] download files

Jul 26 '21 10:07 AaronDavidSchneider

I have started to reverse engineer the new storage API. This is the basic structure:

1. Get the base storage url (currently https://rm-blob-storage-prod.appspot.com).
2. Query the storage url for the root folder and yield the id of the root folder.
3. Query the storage url for the id of the root folder and yield the ids of all files and folders on the rm.
4. Query the storage url for each of these ids to yield the metadata.

Each query has two parts (see get_url_response):

1. Query the storage url and retrieve a google cloud download link.
2. Query the google cloud download link and retrieve the result.

It looks to me like a lot of requests... I am not sure yet if there is a more straightforward way to retrieve the metadata.
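To make the request count concrete, here is a minimal sketch of the flow described above. It is not the actual rmapy code: the two transport callables (`request_download_url`, `fetch`), the `root_key` value, and `fetch_all_metadata` are hypothetical stand-ins, since the real endpoints and payloads are not spelled out here. Only the name `get_url_response` and the base url come from this thread.

```python
from typing import Callable, Dict

STORAGE_URL = "https://rm-blob-storage-prod.appspot.com"

# Hypothetical transport hooks; the real HTTP calls are not shown in this thread.
RequestLink = Callable[[str, str], str]   # (storage_url, key) -> download link
Fetch = Callable[[str], bytes]            # download link -> response body


def get_url_response(key: str, request_download_url: RequestLink,
                     fetch: Fetch, storage_url: str = STORAGE_URL) -> bytes:
    """One logical query, which costs two requests."""
    # Part 1: query the storage url and retrieve a google cloud download link.
    download_url = request_download_url(storage_url, key)
    # Part 2: query the download link and retrieve the result.
    return fetch(download_url)


def fetch_all_metadata(request_download_url: RequestLink, fetch: Fetch,
                       root_key: str = "root") -> Dict[str, bytes]:
    """Walk root -> root id -> listing -> one metadata query per entry."""
    def query(key: str) -> bytes:
        return get_url_response(key, request_download_url, fetch)

    # Step 2: ask for the root folder to learn its id ("root" is an assumption).
    root_id = query(root_key).decode().strip()
    # Step 3: the root id yields one line per file or folder.
    listing = query(root_id).decode()
    ids = [line.split(":")[0] for line in listing.splitlines() if line.strip()]
    # Step 4: one more query per id to get its metadata.
    return {file_id: query(file_id) for file_id in ids}
```

With N entries this makes 2 * (N + 2) requests in total, which is where the "a lot of requests" impression comes from.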

Jul 26 '21 10:07 AaronDavidSchneider

For reference: This is what a response from step 3 looks like (slightly adapted):

676b2ff23cbf0bec01ddf6cc7c63df82011cab8f418b822050e026a85293b0ef:80000000:ff2fa02f-a2ad-45f4-b65f-ee572e0c499b:4:0
aa0f54b98224e74bd5ab78f5f513b153c141ff86ac1493788914feae48cda5f7:80000000:ffbea9f9-7859-456b-b301-7dfdbbae059a:4:0

Each line represents one document or one folder. I am currently trying to make sense of it, since this seems to be everything that the client needs to know. The first part (up to the first :) is the file id, which can be used to download the metadata and the file itself. The third part is the filename on the remarkable (see e.g. .local/share/xochitl).
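A small parsing sketch for these lines, assuming the format is always five colon-separated columns as in the two samples. Only the first and third columns are understood so far, so the other field names (`column_2`, `column_4`, `column_5`) are placeholders, and `IndexEntry` / `parse_index` are hypothetical names, not existing rmapy API.

```python
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    file_id: str   # column 1: id used to download the metadata and the file
    column_2: str  # meaning unknown (80000000 in both samples)
    name: str      # column 3: filename on the device (.local/share/xochitl)
    column_4: str  # meaning unknown (4 in both samples)
    column_5: str  # meaning unknown (0 in both samples)


def parse_index(text: str) -> List[IndexEntry]:
    """Parse one entry per non-empty line of the listing."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split(":")
        if len(parts) != 5:
            raise ValueError(f"expected 5 columns, got {len(parts)}: {line!r}")
        entries.append(IndexEntry(*parts))
    return entries


SAMPLE = (
    "676b2ff23cbf0bec01ddf6cc7c63df82011cab8f418b822050e026a85293b0ef"
    ":80000000:ff2fa02f-a2ad-45f4-b65f-ee572e0c499b:4:0\n"
    "aa0f54b98224e74bd5ab78f5f513b153c141ff86ac1493788914feae48cda5f7"
    ":80000000:ffbea9f9-7859-456b-b301-7dfdbbae059a:4:0\n"
)

for entry in parse_index(SAMPLE):
    print(entry.name, "->", entry.file_id)
```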

I have no clue how you can get the visible name etc from this information alone.

My guess is that the remarkable client fetches this list and checks whether it differs from last time, in which case it will sync (download) the changed documents.
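If that guess is right, the comparison could look roughly like the sketch below. It assumes that the first column changes whenever a document changes and that the third column stays stable; both are assumptions, not confirmed behaviour. `index_by_name` and `changed_documents` are hypothetical helpers.

```python
from typing import Dict, List, Tuple


def index_by_name(listing: str) -> Dict[str, str]:
    """Map the third column (device filename) to the first column (file id)."""
    result = {}
    for line in listing.splitlines():
        parts = line.strip().split(":")
        if len(parts) >= 3:
            result[parts[2]] = parts[0]
    return result


def changed_documents(previous: str, current: str
                      ) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, modified, removed) names between two listings."""
    old, new = index_by_name(previous), index_by_name(current)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Assumption: a different first column means the document changed.
    modified = sorted(n for n in new.keys() & old.keys() if new[n] != old[n])
    return added, modified, removed
```

Only the names in `added` and `modified` would then need to be downloaded again, instead of the metadata for every single document.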

It looks odd that we need to download the metadata for every single document manually.

Jul 26 '21 14:07 AaronDavidSchneider

@ddvk are you able to make sense of the above information?

Jul 26 '21 14:07 AaronDavidSchneider

more or less...

Jul 26 '21 15:07 ddvk


676b2ff23cbf0bec01ddf6cc7c63df82011cab8f418b822050e026a85293b0ef:80000000:ff2fa02f-a2ad-45f4-b65f-ee572e0c499b:4:0

aa0f54b98224e74bd5ab78f5f513b153c141ff86ac1493788914feae48cda5f7:80000000:ffbea9f9-7859-456b-b301-7dfdbbae059a:4:0

@ddvk, could you tell me what the columns in these lines contain?

I still don't see how (or think that) it's possible to get all the necessary metadata from a single request, which is quite unfortunate...

Edit: I did a test and reinstalled the remarkable app on my iPhone. It seems that this downloads ALL files and all metadata!

I will continue tomorrow and rework rmapy to save the metadata locally and try to understand how the remarkable client keeps the metadata up to date.

Jul 26 '21 17:07 AaronDavidSchneider

i meant i'm experimenting. i added the calls to rmfakecloud, and i can play with it to some extent

Jul 26 '21 18:07 ddvk

So the metadata part should be working now. I have started to look into the upload process. It looks like the zip upload is deprecated. The client will now request multiple google cloud urls and upload the individual files to these urls.

However, there is one large issue: The client additionally alters the filestructure file (see above). I am not sure how we could do that.

Jul 27 '21 10:07 AaronDavidSchneider

Hello @AaronDavidSchneider have you had any luck with this?

I'm also trying to reverse engineer the new API communication but I haven't gone very far.

It seems to be quite messy with many calls as you said, but I'm getting lost on where some parts are coming from.

Aug 06 '21 08:08 jvlobo

Hi @jvlobo, I am still working on it with @sabidib. I think we now understand most of the relevant parts of the api. We will update this PR soon.

Aug 06 '21 09:08 AaronDavidSchneider

so I managed to get back to this.

The gcd, ids of the individual files are sha256 of the file content.
The nodes are sha256 of all the files' hashes (also for folders).
The root is sha256 of all the nodes' hashes (i think).
I think this is called a Merkle Tree.
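A rough sketch of that hashing scheme using Python's hashlib. How the child hashes are combined is not stated here, so the sketch assumes they are sorted, decoded from hex, and concatenated before hashing; the real client may do this differently. `file_hash`, `node_hash`, and `root_hash` are hypothetical names.

```python
import hashlib
from typing import Iterable


def file_hash(content: bytes) -> str:
    """Id of an individual file: sha256 of the file content."""
    return hashlib.sha256(content).hexdigest()


def node_hash(child_hashes: Iterable[str]) -> str:
    """Id of a document or folder node: sha256 over its files' hashes.

    Assumption: hashes are sorted, hex-decoded, and concatenated.
    """
    digest = hashlib.sha256()
    for child in sorted(child_hashes):
        digest.update(bytes.fromhex(child))
    return digest.hexdigest()


def root_hash(node_hashes: Iterable[str]) -> str:
    """Id of the root: sha256 over all the nodes' hashes (same assumption)."""
    return node_hash(node_hashes)


# Changing one file changes its node hash and therefore the root hash.
doc = node_hash([file_hash(b"page one"), file_hash(b"page two")])
edited = node_hash([file_hash(b"page one"), file_hash(b"page two, edited")])
print(root_hash([doc]) != root_hash([edited]))
```

This would also explain why comparing the root id is enough to tell whether anything on the device changed: any edit propagates up to the root.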

Sep 19 '21 23:09 ddvk

@AaronDavidSchneider Have you made any more progress in this regard? I have seen some more commits on your fork under the newapi branch. Is that related?

Sep 19 '22 17:09 opal06

@AaronDavidSchneider Have you made any more progress in this regard? I have seen some more commits on your fork under the newapi branch. Is that related?

@opal06 I didn't have the time to continue working on this and have therefore chosen to build a workaround using rmapi (which works just fine) for my workflows.

Sep 20 '22 06:09 AaronDavidSchneider
