Error in doc_parse_raw() when using manual <br> page breaks

Open wcmbishop opened this issue 5 years ago • 1 comment

I'm getting an error when trying to post a markdown file that includes manual page breaks (<br>). See the reproducible example below.

library(conflr)
md_text <- c("`Test Markdown`",
             "==============",
             "",
             "This is a test file.",
             "<br>",
             "It has a break in it.")
html_text <- commonmark::markdown_html(md_text)


page <- confl_update_page(
  id = "244877109",
  title = "Test",
  body = html_text)
#> Error in doc_parse_raw(x, encoding = encoding, base_url = base_url, as_html = as_html, : Opening and ending tag mismatch: br line 3 and p [76]

Created on 2019-03-08 by the reprex package (v0.2.1)

It looks like this error occurs in the private function translate_to_confl_macro, specifically in line 2 below:

html_doc <- xml2::read_xml(html_text, options = c("RECOVER", "NOERROR", "NOBLANKS"))

It seems I can resolve this error by replacing xml2::read_xml with xml2::read_html, which does not appear to require every tag to have a closing pair (a closing tag is not needed for a manual break written as <br>).

Is it possible to replace xml2::read_xml with xml2::read_html in the package source?
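
The difference between the two parsers can be seen in isolation with a minimal sketch. It assumes only that the xml2 package is installed; the `fragment` string is a made-up stand-in for the paragraph in the reprex, not the exact output of commonmark or the input conflr builds internally.

```r
library(xml2)

# Hypothetical fragment: a paragraph containing an unclosed <br>,
# similar in shape to the paragraph in the reprex above.
fragment <- "<p>This is a test file.\n<br>\nIt has a break in it.</p>"

# Strict XML parsing: <br> is never closed, so the parser reports a
# tag mismatch. Capture the message instead of stopping the script.
xml_result <- tryCatch(
  read_xml(fragment),
  error = function(e) conditionMessage(e)
)
print(xml_result)

# HTML parsing: <br> is a void element, so no closing tag is expected.
html_doc <- read_html(fragment)
br_nodes <- xml_find_all(html_doc, "//br")
print(length(br_nodes))
```

Note that read_html() also wraps the fragment in html and body elements, so the resulting tree is not the same shape as what read_xml() would return for well-formed input.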

wcmbishop, Mar 08 '19 23:03

Thanks for reporting. This should not raise an error, but I'm afraid it's difficult to use read_html() here... Let me think about how to fix this.

In the meantime, you can use <br />, but it ends up being removed anyway.

https://github.com/line/conflr/blob/d05c8f8057a65ae394be993711526aa03d830d12/R/translate.R#L25-L26
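
One way to apply the <br /> workaround without editing the markdown by hand is to rewrite the tags in the generated HTML before it is parsed. The sketch below is only an illustration: `close_br_tags` is a hypothetical helper, not part of conflr, and as noted above the break may still be dropped from the resulting page.

```r
library(xml2)

# Hypothetical helper (not part of conflr): rewrite unclosed <br> tags
# as self-closing <br /> so that a strict XML parser accepts them.
close_br_tags <- function(html) {
  gsub("<br\\s*>", "<br />", html, perl = TRUE, ignore.case = TRUE)
}

# Made-up fragment standing in for the HTML generated from the markdown.
fragment <- "<p>This is a test file.\n<br>\nIt has a break in it.</p>"
fixed <- close_br_tags(fragment)

# With the tag self-closed, read_xml() parses the fragment without error.
doc <- read_xml(fixed)
print(xml_name(doc))
```

Tags that are already written as <br /> are left untouched, because the pattern only allows whitespace between `br` and `>`.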

yutannihilation, Mar 09 '19 00:03