Integrate message parser into the core library

Open link2xt opened this issue 2 years ago • 6 comments

Message parser (https://github.com/deltachat/message-parser) is a crate used by Delta Chat Desktop to parse markdown-like formatting in messages. It is currently used by Delta Chat Desktop if the experimental setting "Render Markdown in Messages" is enabled.

Delta Chat Android currently uses regexp-based parsing to highlight URLs, and DeltaLab uses a Java library to parse markdown.

To use the message parser on Android we need a C API, similar to dc_msg_get_text, that returns the result of parsing the lightweight markup of the message. To simplify integration into the Android client, and possibly to use the same API in DeltaTouch, we need an API that returns HTML markup, so the API can have the following signature:

char*           dc_msg_get_html               (const dc_msg_t* msg);

Note that it is different from dc_get_msg_html() which returns the full message for display in the browser.

As a first step, it was decided to parse markdown from the text/plain part and not change the structure of the MIME message, so the only change on the UI side is to display HTML in the message bubble if dc_msg_get_html returns a non-empty string. In the future it may be possible to start sending HTML for display in classic MUAs and to integrate a WYSIWYG editor into the text field, similar to the ones used in XMPP clients supporting XHTML-IM, Telegram etc., but this is outside the scope of this issue and not necessary for the first step.

(as discussed with @adbenitez and @Simon-Laux)

link2xt, Aug 06 '23 23:08

We may also want to add a JSON-RPC API for Desktop to avoid duplicating the code; otherwise Desktop will ship the message parser twice, once compiled into the core library and once as WASM.

link2xt, Aug 06 '23 23:08

I opened an issue in the message-parser repository for adding HTML output on the Rust side: https://github.com/deltachat/message-parser/issues/37

link2xt, Aug 06 '23 23:08

On Sun, Aug 06, 2023 at 16:17 -0700, link2xt wrote:

> char* dc_msg_get_html (const dc_msg_t* msg);
>
> Note that it is different from `dc_get_msg_html()` which returns the full message for display in the browser.

maybe worthwhile to give it a name like dc_msg_as_html to make it more distinct?

hpk42, Aug 07 '23 10:08

rendering HTML in the bubbles is not directly possible on Android and iOS.

Android instead uses something like SpannableString, iOS has NSAttributedString & co. in contrast to desktop, there is not much HTML outside of a webview, which in turn probably cannot be used for bubbles for performance reasons.

therefore, a function that returns HTML that then needs to be parsed again into something else is not that helpful. it is also a waste of performance if the HTML was just generated in core (the waste would also be there if there is an HTML-to-SpannableString/NSAttributedString function). let alone the expectations raised when passing HTML to bubbles :)

instead, we need something more on-point, maybe already character-index based, a tree or list or so, maybe in JSON. something that can be converted easily, at best in a simple loop. generating HTML from that data should be straightforward then (generating HTML is easy :)
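To make the "character-index based list, converted in a simple loop" idea concrete, here is a minimal sketch in Rust. Everything in it is an assumption for illustration: the `Span` and `Kind` types, the `to_html` helper and the two span kinds are hypothetical and are not the message-parser's actual output format. It assumes spans are sorted, non-overlapping and within bounds, and it escapes all text so that the generated HTML cannot carry markup from the message itself.

```rust
// Hypothetical span kinds; the real parser knows more element types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Link,
    Bold,
}

// A span covers the characters [start, end) of the message text.
// Indices count Unicode scalar values (Rust `char`s), not bytes.
#[derive(Debug, Clone, Copy)]
struct Span {
    start: usize,
    end: usize,
    kind: Kind,
}

// Escape the characters that are significant in HTML text and attributes.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(c),
        }
    }
    out
}

// One pass over the spans: emit the plain text before each span, then the
// span wrapped in its tag. Assumes spans are sorted by `start`, do not
// overlap and lie within the text (out-of-range indices would panic).
fn to_html(text: &str, spans: &[Span]) -> String {
    let chars: Vec<char> = text.chars().collect();
    let slice = |a: usize, b: usize| -> String { chars[a..b].iter().collect() };
    let mut out = String::new();
    let mut pos = 0;
    for s in spans {
        out.push_str(&escape(&slice(pos, s.start)));
        let inner = escape(&slice(s.start, s.end));
        match s.kind {
            Kind::Link => out.push_str(&format!("<a href=\"{0}\">{0}</a>", inner)),
            Kind::Bold => out.push_str(&format!("<b>{}</b>", inner)),
        }
        pos = s.end;
    }
    out.push_str(&escape(&slice(pos, chars.len())));
    out
}
```

A mobile UI would run the same loop but apply a style to each range instead of emitting tags, so no HTML has to be generated and parsed again.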

r10s, Aug 07 '23 11:08

> It is currently used by Delta Chat Desktop if experimental setting "Render Markdown in Messages" is enabled.

No, it is always used in Desktop. There is a markdown mode that can be enabled, but the more important feature is parsing links, email addresses, hashtags, bot command suggestions, labeled links, and later things like mentions.

It is not only about markdown. The first step should not even be markdown, because stabilising it from its experimental state opens more questions.

Delta Chat iOS also uses something regex-based: https://github.com/deltachat/deltachat-ios/blob/d8f40df939ddf5a04d96ab90b94125fc8c5c4e0e/deltachat-ios/Chat/Views/DetectorType.swift#L37

Desktop does not want HTML output, it already works with JSON: it uses the JSON-serialised output of the message parser (which is a tree) to build React elements -> https://github.com/deltachat/deltachat-desktop/blob/5103652009ad4078c5c1878ec17014377e927794/src/renderer/components/message/MessageMarkdown.tsx#L30 Changing this to HTML makes no sense because (A) it opens XSS sanitization questions in core and (B) it makes custom elements, styles and behaviour much more difficult. Doing that makes no sense if we already have a solution that works perfectly fine.

The HTML idea was that Delta Chat Android could easily display basic HTML in a label, because @adbenitez said that might be possible. My initial idea, based on talking with @Hocuri, was to convert the output tree to spannable text. There might be issues with non-inline markdown such as code blocks, but we shouldn't start with or focus on markdown anyway.
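A rough sketch of what "converting the output tree to spannable text" could look like, written in Rust for illustration. The `Node` tree here is a hypothetical, much reduced stand-in and not the message-parser's real element type; `Style`, `StyleSpan`, `Flat` and `flatten` are likewise made-up names. Offsets are counted in UTF-16 code units on the assumption that the result feeds Android's SpannableString or iOS's NSAttributedString, both of which index text that way.

```rust
// Hypothetical, reduced parse tree; the real element set is larger.
enum Node {
    Text(String),
    Link(String),
    Bold(Vec<Node>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Style {
    Link,
    Bold,
}

// A styled range [start, end) over the flattened text, in UTF-16 code units.
#[derive(Debug, PartialEq)]
struct StyleSpan {
    start: usize,
    end: usize,
    style: Style,
}

// Flattened result: the plain text plus the style ranges to apply to it.
#[derive(Default)]
struct Flat {
    text: String,
    len_utf16: usize,
    spans: Vec<StyleSpan>,
}

impl Flat {
    // Append text and keep the running UTF-16 length in sync.
    fn push(&mut self, s: &str) {
        self.text.push_str(s);
        self.len_utf16 += s.encode_utf16().count();
    }
}

// Depth-first walk: leaves append text, styled nodes record the range that
// their content ended up occupying. Nested spans are pushed inner-first.
fn flatten(nodes: &[Node], out: &mut Flat) {
    for node in nodes {
        let start = out.len_utf16;
        match node {
            Node::Text(t) => out.push(t),
            Node::Link(url) => {
                out.push(url);
                let end = out.len_utf16;
                out.spans.push(StyleSpan { start, end, style: Style::Link });
            }
            Node::Bold(children) => {
                flatten(children, out);
                let end = out.len_utf16;
                out.spans.push(StyleSpan { start, end, style: Style::Bold });
            }
        }
    }
}
```

The UI side would then set `text` on the bubble and apply one native span per `StyleSpan`, without any HTML intermediate format.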

Simon-Laux, Aug 07 '23 14:08

> My initial idea based on talking with @Hocuri was converting the output tree to spannable text

that would also be the way i would think about it in the first place, for various reasons (less parsing, fewer problems, less code, no HTML intermediate format to generate, performance)

r10s, Aug 07 '23 17:08