Which type of Markdown is supported?
Question
Which type of Markdown is the output? Some Markdown flavors support strikethrough (~~strikethrough~~), such as here on GitHub, while others don't. I'm asking because we work a lot with contract types where words are simply deleted with a strikethrough, which can drastically change the meaning of the sentence.
@thomasfrederikhoeck Good point! We have not yet considered strikethrough text explicitly, but I would assume it simply carries through. I know we looked carefully at bold and italic and made sure we preserved those.
What are your current findings?
@PeterStaar-IBM My current finding is that it is not preserved, but I don't know whether that is because the OCR does not extract or recognize it (OCR engines sometimes try to be clever and remove it) or because the conversion to Markdown removes the strikethrough.
Meanwhile we have added the possibility to represent these styles in DoclingDocument if the input format contains that information. The serializers should respect it.
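To narrow down where strikethrough gets lost, one option is to scan the exported Markdown for GFM-style `~~...~~` spans. The following is a minimal sketch, assuming the output uses the double-tilde syntax; `find_strikethrough` is a hypothetical helper written for illustration and is not part of Docling.

```python
import re

# GFM-style strikethrough: text wrapped in double tildes, e.g. ~~deleted~~.
# Hypothetical helper for illustration; not part of Docling.
STRIKE_RE = re.compile(r"~~(?=\S)(.+?)(?<=\S)~~")


def find_strikethrough(markdown: str) -> list[str]:
    """Return the inner text of every ~~...~~ span in a Markdown string."""
    return STRIKE_RE.findall(markdown)


sample = "The tenant ~~shall not~~ shall sublet the premises."
print(find_strikethrough(sample))
```

If this returns an empty list for a document known to contain struck-out words, the formatting was dropped somewhere before or during serialization; the check alone cannot tell which stage is responsible.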