drive icon indicating copy to clipboard operation
drive copied to clipboard

Downloading `drive_document` fails

Open michellealva opened this issue 2 years ago • 2 comments

Tried downloading a file and got the below error.

Traceback (most recent call last):
  File "apps/frappe/frappe/app.py", line 83, in application
    response = frappe.api.handle()
  File "apps/frappe/frappe/api.py", line 53, in handle
    return _RESTAPIHandler(call, doctype, name).get_response()
  File "apps/frappe/frappe/api.py", line 69, in get_response
    return self.handle_method()
  File "apps/frappe/frappe/api.py", line 79, in handle_method
    return frappe.handler.handle()
  File "apps/frappe/frappe/handler.py", line 48, in handle
    data = execute_cmd(cmd)
  File "apps/frappe/frappe/handler.py", line 86, in execute_cmd
    return frappe.call(method, **frappe.form_dict)
  File "apps/frappe/frappe/__init__.py", line 1646, in call
    return fn(*args, **newargs)
  File "apps/frappe/frappe/utils/typing_validations.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "apps/drive/drive/api/files.py", line 302, in get_file_content
    file = open(drive_entity.path, "rb")
TypeError: expected str, bytes or os.PathLike object, not NoneType

michellealva avatar Aug 15 '23 03:08 michellealva

The drive_document doctype doesn't currently support being downloaded. It's saved as a base64 string from a binary CRDT

https://github.com/frappe/drive/blob/cb09f99de669d20a862b8193a728080af44c45f8/frontend/src/components/DocEditor/TextEditor.vue#L284-L288

https://github.com/frappe/drive/blob/b095cd5b3ebd1f18b255ffbdf66b99933260ec5d/frontend/src/pages/Document.vue#L103-L109

All of this is done to support real time editing. Since the CRDT encodes and contains the entire document history and blame to prevent collisions.

Desired behaviour here would be downloading a PDF of the document perfectly reflecting the style and typography.

This is easy when done client side, provided the document is rendered. As opposed to requesting the download from the server. Which will require considerable amounts of work to do this correctly, decoding the CRDT to HTML and responding with a generated PDF.

Another potential solution would be just storing the HTML content (we can generate this during runtime) on disk along with the CRDT in the DB, and then generating a PDF from the saved HTML. This effectively doubles or close to doubles the file size. Which potentially isn't that big of an issue for unicode characters alone (which is all drive_document contains) since even after potentially doubling the file size we're still realistically talking about KBs worth of data. But this still isn't a perfect solution. Since this would involve tracking the file_size of both the HTML content and the CRDT. Could potentially cause issues/orphaned items I guess.

Implemented just disallowing & skipping them in batch downloads 2ce2c6d78dbeb4abe0bc78eb46a349912378fc61 for now. To prevent all of this from breaking so abruptly.

uhrjun avatar Aug 16 '23 10:08 uhrjun

A Document should be downloadable as a PDF. That's a basic expectation from a user.

Now, to implement it you can do JSON to HTML to PDF or store both JSON and HTML. Storing the data in two formats and having the PDF feature is better than not having the feature and saving some KBs.

netchampfaris avatar Aug 16 '23 10:08 netchampfaris