markdig
Is there a way to render a summary by short circuiting or truncating the rendering?
Is there an extension to render a summary of markdown text by truncating the content after a set number of words?
For example, suppose I'm building a site where people write articles in markdown. But on the home page, I want to display the first 300 words of each article. I'd like to avoid rendering the whole thing and then parsing the HTML to find the first 300 words (while properly closing any open tags). It'd be nice if there's a way I could do it as Markdig is parsing (or rendering) the markdown.
If no such thing exists, I could try to implement this myself, but would appreciate some hints as to how I'd go about it.
By far the simplest way is for you to find the first 300 words and pass that substring to Markdig. The downside is that if links are defined after that point (as is common with Markdown, where the definitions sit at the bottom), those wouldn't work. Cutting the source text could also break headings, emphasis, and other constructs that happen to span the cut.
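To make the simple option concrete, here is a minimal sketch of the "cut the source, then render" idea. `SummaryText` and `TakeFirstWords` are hypothetical names for illustration, and a "word" is assumed to be any run of non-whitespace characters, so Markdown markup such as `#` or `*` counts toward the limit.

```csharp
using System;

static class SummaryText
{
    // Hypothetical helper: returns the prefix of `text` that contains
    // at most `maxWords` whitespace-separated words.
    public static string TakeFirstWords(string text, int maxWords)
    {
        int words = 0;
        bool inWord = false;
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsWhiteSpace(text[i]))
            {
                inWord = false;
            }
            else if (!inWord)
            {
                // A new word starts here; stop before it once the budget is used up.
                if (words == maxWords)
                    return text.Substring(0, i).TrimEnd();
                words++;
                inWord = true;
            }
        }
        return text; // fewer than maxWords words: keep everything
    }
}
```

The result can then be rendered with `Markdown.ToHtml(SummaryText.TakeFirstWords(markdown, 300))`. It has exactly the drawbacks described above: reference links defined later in the text are lost, and a cut inside emphasis or a heading changes how the prefix parses.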
A much more correct approach, if the first one isn't suitable for you, is to edit the syntax tree to remove paragraphs. This is a better approach IMO. Ping me if you wanna go this route.
I'm looking for a correct approach. This sounds like a good option.
I wrote a sample trimming implementation. Right now it will try to keep as many elements as will fit into the limit and then start discarding elements.
You can change how new lines affect the limit. The current implementation limits the output based on character count; to change it to word count, you only have to change the TrimSpan implementation.
I did a bit of testing, but there are probably some edge cases I didn't test that won't count towards the limit properly.
// pipeline setup
MarkdownDocument document = Markdown.Parse(markdown, pipeline);
TrimDocument(document, numberOfCharactersToKeep: 200);
// rendering to html
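For readers who want something self-contained around the snippet above, here is a rough sketch of the parse, trim, render flow. It is not the implementation linked in this thread: `TrimTopLevelBlocks` is a hypothetical, much coarser stand-in for `TrimDocument` that only drops whole top-level blocks once the character budget is spent, and it counts literal text only (code spans and code blocks are not charged). The pipeline configuration is likewise an assumption.

```csharp
using System;
using System.IO;
using System.Linq;
using Markdig;
using Markdig.Renderers;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;

static class SummaryRenderer
{
    // Hypothetical, simplified stand-in for TrimDocument:
    // keeps whole top-level blocks until the character budget runs out.
    static void TrimTopLevelBlocks(MarkdownDocument document, int numberOfCharactersToKeep)
    {
        int remaining = numberOfCharactersToKeep;
        int keep = 0;
        while (keep < document.Count)
        {
            // Charge only literal text; markup characters are not counted.
            int length = document[keep]
                .Descendants<LiteralInline>()
                .Sum(literal => literal.Content.Length);
            if (keep > 0 && length > remaining)
                break; // the first block is always kept, even if it is too long
            remaining -= length;
            keep++;
        }
        // Remove from the end so the remaining indices stay valid.
        for (int i = document.Count - 1; i >= keep; i--)
            document.RemoveAt(i);
    }

    public static string RenderSummary(string markdown, int numberOfCharactersToKeep)
    {
        // Pipeline setup (assumed configuration).
        MarkdownPipeline pipeline = new MarkdownPipelineBuilder().UseAdvancedExtensions().Build();
        MarkdownDocument document = Markdown.Parse(markdown, pipeline);

        TrimTopLevelBlocks(document, numberOfCharactersToKeep);

        // Rendering to HTML.
        var writer = new StringWriter();
        var renderer = new HtmlRenderer(writer);
        pipeline.Setup(renderer);
        renderer.Render(document);
        writer.Flush();
        return writer.ToString();
    }
}
```

Because the whole text is parsed before anything is removed, reference-style links in the kept blocks should still resolve even when their definitions sit at the bottom of the article, which is the main advantage over cutting the source string.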
There are some implementation details, such as whether you want to cut a link in the middle. If not, change this to charactersAvailable -= autoLink.Url.Length.