reversemarkdown-net
Add optional support for underline elements
Currently, <u> elements (used for underlining) are not supported. The difficulty is that Markdown has no underline equivalent. One idea would be to add an option, disabled by default, that converts them to italics.
Acknowledged. Let me see what can best be done for this case. What you have suggested is one possible approach.
After opening the issue I thought of another approach that would be more future-proof, but I don't know if it's feasible to implement:
There could be an UnknownTagsReplacer property in the converter Config. It could be a simple key-value list that takes the element name as the key and the replacement Markdown delimiter as the value.
For example:
// Proposed API only: UnknownTagsReplacer and replacer do not exist in the library yet
var htmlToMarkdownConverter = new Converter();
var newReplacer = new replacer("u", "*"); // "u" is the HTML underline tag; "*" is the Markdown italic delimiter
htmlToMarkdownConverter.Config.UnknownTagsReplacer.Add(newReplacer);
// or
htmlToMarkdownConverter.Config.UnknownTagsReplacer.Add("u", "*");
During conversion, it would convert this element:
<u>Some underline text</u>
To:
*Some underline text*
I made a rudimentary workaround for my use case like this, but I don't know how it should be approached if done properly:
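// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
// Maps a regex matching an unsupported element to its Markdown replacement.
// [\s\S]*? lazily matches any characters, including newlines.
var unsupportedElemsRegexFormatters = new Dictionary<string, string>
{
    { @"<strike class=""bb_strike"">([\s\S]*?)</strike>", "~~$1~~" },
    { @"<u>([\s\S]*?)</u>", "*$1*" }
};
// Some elements not supported by the converter need to be converted manually.
// text holds the HTML string before it is passed to the converter.
foreach (var item in unsupportedElemsRegexFormatters)
{
    text = Regex.Replace(text, item.Key, item.Value);
}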
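The tag-to-delimiter mapping proposed earlier can be approximated outside the library as a pre-processing step. The following is only a sketch under assumptions: TagReplacer and ReplaceUnknownTags are hypothetical names that are not part of ReverseMarkdown, and the regex does not handle nested occurrences of the same tag.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of ReverseMarkdown.
public static class TagReplacer
{
    // For every (tag, delimiter) entry, rewrites <tag ...>content</tag>
    // as delimiter + content + delimiter. Attributes on the opening tag are dropped.
    public static string ReplaceUnknownTags(string html, IDictionary<string, string> replacers)
    {
        foreach (var pair in replacers)
        {
            var tag = Regex.Escape(pair.Key);
            var delimiter = pair.Value;

            // \b keeps "u" from matching longer tag names such as "ul".
            var pattern = $@"<{tag}\b[^>]*>([\s\S]*?)</{tag}\s*>";

            // A MatchEvaluator is used so the delimiter is inserted literally,
            // even if it contains characters like '$'.
            html = Regex.Replace(
                html,
                pattern,
                m => delimiter + m.Groups[1].Value + delimiter,
                RegexOptions.IgnoreCase);
        }
        return html;
    }
}
```

Calling TagReplacer.ReplaceUnknownTags with a dictionary containing { "u", "*" } and { "strike", "~~" } on the input <u>Some underline text</u> yields *Some underline text*, which can then be passed to the converter as usual.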
You could use a regex as you have done, or load the HTML via HtmlAgilityPack and replace the elements as you need. As for the idea you have outlined, I will implement it in the next release, in a month or so.
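For reference, the HtmlAgilityPack route could look roughly like the sketch below. It assumes the HtmlAgilityPack NuGet package is installed; UnderlinePreprocessor and UnderlineToEmphasis are hypothetical names chosen for illustration. Renaming the node, rather than rebuilding it from a string, keeps nested markup and attributes intact.

```csharp
using System.IO;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of ReverseMarkdown.
public static class UnderlinePreprocessor
{
    // Renames every <u> element to <em> so that a later HTML-to-Markdown
    // conversion can treat it as emphasis.
    public static string UnderlineToEmphasis(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//u");
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                node.Name = "em";
            }
        }

        // Serialize the modified document back to an HTML string.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

The returned string can then be handed to the converter, for example new Converter().Convert(UnderlinePreprocessor.UnderlineToEmphasis(html)).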