node-html-to-text
Get text of elem from within formatters function
I'd like to get the text that is parsed for each element. I have formatter functions for format: a and format: *, and I want to access the walked text from within them. How do I do so? I thought I could do text = walk(elem.children, builder), but it always comes back undefined.
walk(elem.children, builder) doesn't return anything. It passes the content to the builder instead.
There are different ways to get the text in a formatter, depending on what you need.
The built-in formatter for blockquotes passes a blockTransform function when closing the block:
https://github.com/html-to-text/node-html-to-text/blob/5b7ca1c1a736a730c9a4fa1b6db6172e50f4ee3e/packages/html-to-text/src/text-formatters.js#L86-L99
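To illustrate the blockTransform approach, here is a minimal sketch of a formatter that captures the text of a block while closing it. The formatter uses only openBlock, closeBlock and the blockTransform option, as the linked blockquote formatter does. Everything else is an assumption made so the sketch runs on its own: StubBuilder and stubWalk are hypothetical stand-ins for the real builder and walk function (they ignore line breaks and wrapping), and makeCapturingBlockFormatter and onText are names invented for this example.

```javascript
// Hypothetical stand-in for the real builder, only so the sketch runs on its own.
// It models openBlock, addInline and closeBlock and ignores all layout options.
class StubBuilder {
  constructor() { this.stack = [[]]; }
  openBlock(_options) { this.stack.push([]); }
  addInline(str) { this.stack[this.stack.length - 1].push(str); }
  closeBlock({ blockTransform } = {}) {
    const text = this.stack.pop().join('');
    const out = blockTransform ? blockTransform(text) : text;
    this.stack[this.stack.length - 1].push(out + '\n');
  }
  toString() { return this.stack[0].join(''); }
}

// Hypothetical stand-in for walk: feeds text nodes to the builder, returns nothing.
function stubWalk(nodes, builder) {
  for (const node of nodes || []) {
    if (node.type === 'text') {
      builder.addInline(node.data);
    } else {
      stubWalk(node.children, builder);
    }
  }
}

// Builds a formatter that reports the block text to onText (an invented callback)
// and then prefixes each line with "> ", similar to the blockquote formatter.
function makeCapturingBlockFormatter(onText) {
  return function (elem, walk, builder, formatOptions) {
    builder.openBlock({ leadingLineBreaks: formatOptions.leadingLineBreaks || 2 });
    walk(elem.children, builder);
    builder.closeBlock({
      trailingLineBreaks: formatOptions.trailingLineBreaks || 2,
      blockTransform: (str) => {
        onText(str); // the text of this block is available here
        return str.split('\n').map((line) => '> ' + line).join('\n');
      }
    });
  };
}

const captured = [];
const formatter = makeCapturingBlockFormatter((text) => captured.push(text));
const builder = new StubBuilder();
const elem = {
  type: 'tag',
  name: 'blockquote',
  children: [
    { type: 'text', data: 'quoted ' },
    { type: 'tag', name: 'b', children: [{ type: 'text', data: 'text' }] }
  ]
};
formatter(elem, stubWalk, builder, {});
console.log(captured);
console.log(builder.toString());
```

With the real library, a formatter like this would presumably be registered through the formatters option and attached to a selector. The text is only available once the block is closed, so this suits cases where the formatted block text is needed after the children have been walked.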
The built-in formatter for anchors (a) uses a word-by-word transform function to just spy out the plain text (not the final formatted text that the builder will produce):
https://github.com/html-to-text/node-html-to-text/blob/5b7ca1c1a736a730c9a4fa1b6db6172e50f4ee3e/packages/html-to-text/src/text-formatters.js#L162-L170
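A minimal sketch of the word transform approach follows. The formatter itself relies only on pushWordTransform, popWordTransform and addInline, the way the linked anchor formatter does. The rest is assumed for the sake of a runnable example: StubBuilder and stubWalk are hypothetical stand-ins that split inline text on whitespace and apply the transforms to each word, and wordCountFormatter is an invented name. Since the transform sees individual words, the original whitespace is not part of what gets captured, and the sketch joins the words with single spaces.

```javascript
// Hypothetical stand-in for the real builder, only so the sketch runs on its own.
// It models pushWordTransform, popWordTransform and addInline.
class StubBuilder {
  constructor() {
    this.transforms = [];
    this.words = [];
  }
  pushWordTransform(fn) { this.transforms.push(fn); }
  popWordTransform() { return this.transforms.pop(); }
  addInline(str) {
    for (const word of str.split(/\s+/).filter(Boolean)) {
      // Apply the transforms to each word, most recently pushed first.
      this.words.push(this.transforms.reduceRight((w, fn) => fn(w), word));
    }
  }
  toString() { return this.words.join(' '); }
}

// Hypothetical stand-in for walk: feeds text nodes to the builder, returns nothing.
function stubWalk(nodes, builder) {
  for (const node of nodes || []) {
    if (node.type === 'text') {
      builder.addInline(node.data);
    } else {
      stubWalk(node.children, builder);
    }
  }
}

// Formatter that spies on the words of its children, then appends a word count.
function wordCountFormatter(elem, walk, builder, formatOptions) {
  const seen = [];
  builder.pushWordTransform((word) => {
    if (word) { seen.push(word); }
    return word; // pass the word through unchanged
  });
  walk(elem.children, builder);
  builder.popWordTransform();

  const text = seen.join(' '); // plain text of the element, whitespace normalized
  builder.addInline(`(${seen.length} words)`);
  return text; // returned only for the demo below, not something a formatter needs to do
}

const builder = new StubBuilder();
const elem = {
  type: 'tag',
  name: 'a',
  children: [
    { type: 'text', data: 'Hello ' },
    { type: 'tag', name: 'b', children: [{ type: 'text', data: 'world' }] }
  ]
};
const text = wordCountFormatter(elem, stubWalk, builder, {});
console.log(text);
console.log(builder.toString());
```

This variant gives the plain text as soon as walk has finished, before the block is closed, which is handy when the formatter wants to add something inline based on that text.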
The builder itself: https://github.com/html-to-text/node-html-to-text/blob/master/packages/base/src/block-text-builder.js
I may consider expanding the builder API when I run into use cases where existing methods are not enough.
Adding openInline and closeInline might be something for me to consider...