node-html-to-text
ANSI colors and styles
Is it possible to add ANSI styles to the output?
You can try to overwrite existing formatters from the formatter.js file.
But I think it will not be that easy, because I normally ignore all styles inside a file.
Actually, I keep this issue in mind for some possible future improvements.
OK, I will reopen it. I am just cleaning up old issues I opened a long time ago. 🙈
Happy new year! 🎆
By the way, with some caveats, it currently works like this:
    const { htmlToText } = require('html-to-text');

    const html = '<b>Hello</b> <span style="color:red;"><u>World</u>!</span><br/>';
    const options = {
      formatters: {
        // Each formatter wraps the element's children in an SGR on/off pair.
        'bold': function (elem, walk, builder, formatOptions) {
          builder.addLiteral('\x1b[1m');
          walk(elem.children, builder);
          builder.addLiteral('\x1b[22m');
        },
        'underline': function (elem, walk, builder, formatOptions) {
          builder.addLiteral('\x1b[4m');
          walk(elem.children, builder);
          builder.addLiteral('\x1b[24m');
        },
        'red': function (elem, walk, builder, formatOptions) {
          builder.addLiteral('\x1b[31m');
          walk(elem.children, builder);
          builder.addLiteral('\x1b[39m');
        }
      },
      // Assign the formatters to elements.
      selectors: [
        { selector: 'b', format: 'bold' },
        { selector: 'u', format: 'underline' },
        { selector: 'span[style*="color:red"i]', format: 'red' }
      ]
    };
    const text = htmlToText(html, options);
    console.log(text);
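Result (image): https://github.com/html-to-text/node-html-to-text/assets/13851064/aa3b188f-b8f4-4f7f-bbb8-aa814faa0b8d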
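The three formatters in the snippet differ only in the escape codes they emit, so they could be generated from a table. A minimal sketch, assuming the same formatter signature as in the snippet; `SGR` and `ansiFormatter` are hypothetical names and are not part of html-to-text:

```javascript
// SGR (Select Graphic Rendition) on/off pairs used in the snippet.
const SGR = {
  bold: ['\x1b[1m', '\x1b[22m'],
  underline: ['\x1b[4m', '\x1b[24m'],
  red: ['\x1b[31m', '\x1b[39m'],
};

// Hypothetical helper: builds a formatter that wraps the element's
// children in the given opening and closing escape sequences.
function ansiFormatter([open, close]) {
  return function (elem, walk, builder, formatOptions) {
    builder.addLiteral(open);
    walk(elem.children, builder);
    builder.addLiteral(close);
  };
}

// Same shape as the `formatters` option: { bold, underline, red }.
const formatters = Object.fromEntries(
  Object.entries(SGR).map(([name, codes]) => [name, ansiFormatter(codes)])
);
```

The resulting object would be passed as `options.formatters`, with the `selectors` list unchanged.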
Usable for crafted HTML. Not usable for arbitrary HTML:
- completely unaware of CSS that is outside of the style attribute and can't be captured with selectors - this is unlikely to change ever;
- html-to-text won't combine different selectors in case they happen to match the same tag, like <u style="color:red;">World</u>. Might be addressable to some extent - would require significantly rethinking how formatters work;
- Literals still affect computed line length. Might be addressable if I allow literals to be defined as invisible and alter the line length counting;
- I'm not aware of whether SGR parameters (https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_(Select_Graphic_Rendition)_parameters) can be stored to a stack and restored. Any "restore default" command is unaware of previous styles set by an outer tag (for example, it will break when nesting different colors). Such a stack can potentially be recreated at the level of formatters (maybe with some support from the block text builder, which also keeps a stack of tags).
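To make the last two caveats concrete, here is a rough sketch of what "invisible" literals and a formatter-level style stack could look like. It is standalone JavaScript and does not use html-to-text; `visibleLength` and `ColorStack` are hypothetical names, and the library itself would still count the escape sequences when wrapping lines.

```javascript
// Matches SGR sequences such as "\x1b[1m" or "\x1b[31;4m".
const SGR_PATTERN = /\x1b\[[0-9;]*m/g;

// Length of a string with SGR sequences ignored.
// Counts UTF-16 code units, so it is not aware of wide characters.
function visibleLength(text) {
  return text.replace(SGR_PATTERN, '').length;
}

// Hypothetical helper: remembers the active foreground colors so that
// closing an inner element restores the outer color, not the default.
class ColorStack {
  constructor() {
    this.stack = [];
  }

  // Returns the sequence to emit when an element with this color opens.
  push(code) {
    this.stack.push(code);
    return `\x1b[${code}m`;
  }

  // Returns the sequence to emit when that element closes.
  pop() {
    this.stack.pop();
    const outer = this.stack[this.stack.length - 1];
    // 39 is "default foreground color".
    return `\x1b[${outer === undefined ? 39 : outer}m`;
  }
}

// Nested colors: red outside, green inside.
const colors = new ColorStack();
const styled =
  colors.push(31) + 'red ' +
  colors.push(32) + 'green' +
  colors.pop() + ' red again' +
  colors.pop();

console.log(styled);
console.log(visibleLength(styled)); // 19
```

Inside a formatter, the strings returned by `push` and `pop` would presumably be passed to `builder.addLiteral` in place of the fixed codes; that wiring is an assumption and is not shown here.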
Wow! That is amazing! Thank you for this!
On Thu, Jan 4, 2024 at 1:17 AM KillyMXI wrote:
> By the way, with some caveats, it currently works like this: [...]
> [...] But I'm not very invested to explore it further for this niche use case, with all other limitations still in place.