tinkr 0.3.0 roadmap

Open zkamvar opened this issue 3 months ago • 0 comments

I am planning a substantial update to {tinkr}, version 0.3.0, in the next few weeks, with new features and bug fixes. This issue describes my goals for it.

DOCUMENTATION

  • [ ] vignette to demonstrate applying all possible protections (#80)

BUG FIXES

  • [x] math with a single character will still be protected (thanks to @maelle, #101)
  • [x] (#112) filter out special control characters
  • [x] (#115) fixing bare links to return as <link> instead of [link](link)
  • [x] (#114) remove EOL warning from readLines()

NEW FEATURES

misc

  • find_between_nodes() will return a set of nodes that exist between two nodes in the same block. This will be most useful for finding the content between braces.
  • [ ] new methods to add markdown to specific places in the document (#119)

xml to markdown conversion/display

As of {tinkr} 0.2.0, it's not easy to convert a single XML node or set of nodes to markdown without converting the entire document. The following new features will help with that:

  • [x] the $show() method gains a new argument, lines, which subsets the output by line number in the document (#108)
  • [x] show_*() functions will operate on nodes and/or nodesets, returning a character vector that contains the markdown of just those nodes (#108)
    • show_censor() will show the nodes in the context of the rest of the document, with the rest of the content censored.
    • show_list() will show the nodes individually in separate paragraphs
    • show_block() will show the nodes in the context of the markdown block structure
  • [x] to_md_vec() will convert a node or nodelist to a character vector of the resulting markdown (#108)
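
A minimal sketch of how these helpers might be used together, assuming they are released with the behaviour described above. The temporary file and its contents are made up for illustration, and the exact argument names (other than lines) are assumptions:

```r
library(tinkr)
library(xml2)

# write a small, hypothetical markdown document to a temporary file
tmp <- tempfile(fileext = ".md")
writeLines(c(
  "# Title",
  "",
  "Some [link](https://example.com) and `code` in a paragraph."
), tmp)

doc <- yarn$new(tmp)

# show only the first two lines of the rendered markdown
doc$show(lines = 1:2)

# select the link nodes; md_ns() supplies the "md" namespace prefix
links <- xml_find_all(doc$body, ".//md:link", ns = md_ns())

# convert just those nodes to a character vector of markdown
md <- to_md_vec(links)

# display the nodes individually, then in place with everything else censored
show_list(links)
show_censor(links)
```

The point of the design is that the XPath selection and the markdown rendering stay separate: any nodeset found with {xml2} can be passed to these functions without rendering the whole document.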

Safeguards against upcoming changes to "asis" nodes

Version 1.0.0 will use attributes to protect nodes instead of splitting them into "asis" nodes (see #105 and #107). This will subtly break the way both {babeldown} and {pegboard} currently operate.

On the one hand, the "asis" nodes are useful because they allow patterns to find and protect nodes for translation:

  ## protect content inside curly braces and math ----
  woolish$body <- tinkr::protect_math(woolish$body)
  woolish$body <- tinkr::protect_curly(woolish$body)
  curlies <- xml2::xml_find_all(woolish$body, "//*[@curly]")
  purrr::walk(curlies, protect_curly)
  maths <- xml2::xml_find_all(woolish$body, "//*[@asis='true']")
  purrr::walk(maths, protect_math)

On the other hand, splitting up the nodes creates a bit of chaos, as shown in the documentation for pegboard's internal fix_links():

However, if a link uses liquid templating for a variable such as: 
`[Home]({{ page.root }}/index.html) and other text`, it will appear in XML as

```xml
...
<text asis="true">[</text>
<text>Home</text>
<text asis="true">]</text>
<text>({{ page.root }}/index.html) and other text</text>
...
```

I want to add an accessor for protected nodes from the yarn object to help prepare for the change.

  • [x] (#111) $get_protected() will return the protected nodes, which currently include curly, math, and square braces. The type and content of these nodes should not change when #107 is merged.
    • it will have a type argument, which accepts one or more of "curly", "math", and "square" to select the kinds of protected node to return. It defaults to NULL.
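
A short sketch of how the accessor might be called, assuming it lands with the type argument described above. The input file is hypothetical, and the exact nodes returned may differ once #107 is merged:

```r
library(tinkr)

# a hypothetical one-line document with inline math and a curly-brace attribute
tmp <- tempfile(fileext = ".md")
writeLines("Inline math $x + y$ and an attribute {#fig-one}.", tmp)

doc <- yarn$new(tmp)

# protect math and curly braces before asking for the protected nodes
doc$protect_math()$protect_curly()

# all protected nodes (type defaults to NULL)
all_protected <- doc$get_protected()

# only the math nodes, or a combination of types
math_only <- doc$get_protected(type = "math")
curly_and_math <- doc$get_protected(type = c("curly", "math"))
```

Going through the accessor rather than an XPath query on the asis attribute means downstream code should not need to change when the protection mechanism does.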

zkamvar, May 09 '24 16:05