Downloading `drive_document` fails
Tried downloading a file and got the below error.
Traceback (most recent call last):
File "apps/frappe/frappe/app.py", line 83, in application
response = frappe.api.handle()
File "apps/frappe/frappe/api.py", line 53, in handle
return _RESTAPIHandler(call, doctype, name).get_response()
File "apps/frappe/frappe/api.py", line 69, in get_response
return self.handle_method()
File "apps/frappe/frappe/api.py", line 79, in handle_method
return frappe.handler.handle()
File "apps/frappe/frappe/handler.py", line 48, in handle
data = execute_cmd(cmd)
File "apps/frappe/frappe/handler.py", line 86, in execute_cmd
return frappe.call(method, **frappe.form_dict)
File "apps/frappe/frappe/__init__.py", line 1646, in call
return fn(*args, **newargs)
File "apps/frappe/frappe/utils/typing_validations.py", line 30, in wrapper
return func(*args, **kwargs)
File "apps/drive/drive/api/files.py", line 302, in get_file_content
file = open(drive_entity.path, "rb")
TypeError: expected str, bytes or os.PathLike object, not NoneType
The drive_document doctype doesn't currently support being downloaded. It's saved as a base64 string from a binary CRDT
https://github.com/frappe/drive/blob/cb09f99de669d20a862b8193a728080af44c45f8/frontend/src/components/DocEditor/TextEditor.vue#L284-L288
https://github.com/frappe/drive/blob/b095cd5b3ebd1f18b255ffbdf66b99933260ec5d/frontend/src/pages/Document.vue#L103-L109
All of this is done to support real time editing. Since the CRDT encodes and contains the entire document history and blame to prevent collisions.
Desired behaviour here would be downloading a PDF of the document perfectly reflecting the style and typography.
This is easy when done client side, provided the document is rendered. As opposed to requesting the download from the server. Which will require considerable amounts of work to do this correctly, decoding the CRDT to HTML and responding with a generated PDF.
Another potential solution would be just storing the HTML content (we can generate this during runtime) on disk along with the CRDT in the DB, and then generating a PDF from the saved HTML. This effectively doubles or close to doubles the file size. Which potentially isn't that big of an issue for unicode characters alone (which is all drive_document contains) since even after potentially doubling the file size we're still realistically talking about KBs worth of data. But this still isn't a perfect solution. Since this would involve tracking the file_size of both the HTML content and the CRDT. Could potentially cause issues/orphaned items I guess.
Implemented just disallowing & skipping them in batch downloads 2ce2c6d78dbeb4abe0bc78eb46a349912378fc61 for now. To prevent all of this from breaking so abruptly.
A Document should be downloadable as a PDF. That's a basic expectation from a user.
Now, to implement it you can do JSON to HTML to PDF or store both JSON and HTML. Storing the data in two formats and having the PDF feature is better than not having the feature and saving some KBs.