ambuda
Add option to download text as markdown
Could be proofreading texts or published texts.
Desiderata:
- should include proofreader notes as footnotes
Right now there are three download cases on my roadmap:
- TEI XML (as the document of record that generates everything else)
- plain text (for parsing, general ease of use, etc.)
- PDF (for pleasant reading)
For Markdown, I have some questions:
- How do you read a Markdown text today? What programs, etc.
- What would a Markdown download give you that plain text and PDF would not?
- Do you know others who would prefer to read texts through markdown?
For Markdown, I have some questions:
* How do you read a Markdown text today? What programs, etc.
- With intellij idea (if not a plain text editor).
- Hugo static website generator
* What would a Markdown download give you that plain text and PDF would not?
- Sections, for one.
- Better preservation of important formatting (bold, italics, footnotes, etc.)
A better question might be - "What would a plain text download give you that Markdown doesn't?"
* Do you know others who would prefer to read texts through markdown?
Basically everyone who cares to read LARGE texts as plain text files, and everyone who uses static website generation. I recall that @shreevatsa and @drdhaval2785 have used Markdown.
So far, the most convenient way I've found to produce Markdown from TEI is https://github.com/sanskrit-coders/doc_curation/blob/master/doc_curation/tei.py . (So here it's mostly code reuse with minor modifications.)
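To make the shape of such a conversion concrete, here is a minimal sketch using only Python's standard library. It is not the `doc_curation` code linked above and does not reflect its API. The function names (`tei_to_markdown`, `inline_to_md`), the sample document, and the choice of `rend="bold"` as the bold marker are all assumptions for illustration. It maps nested `<div>`/`<head>` to headings, `<hi>` to bold or italics, and `<note>` to footnotes in the `[^n]` syntax, which is a widely supported Markdown extension rather than part of core CommonMark.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical sample; real TEI from a proofreading project will be richer.
SAMPLE_TEI = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div>
        <head>Chapter 1</head>
        <p>First <hi rend="bold">verse</hi> text<note>Proofreader: check sandhi.</note> continues.</p>
      </div>
    </body>
  </text>
</TEI>
"""


def inline_to_md(elem, notes):
    """Render an element's mixed content as inline Markdown.

    Note bodies are appended to `notes`; a [^n] marker is left in the text.
    """
    parts = [elem.text or ""]
    for child in elem:
        tag = child.tag.replace(TEI_NS, "")
        inner = inline_to_md(child, notes)
        if tag == "note":
            notes.append(" ".join(inner.split()))
            parts.append(f"[^{len(notes)}]")
        elif tag == "hi":
            # Assumption: rend="bold" marks bold; any other <hi> becomes italics.
            mark = "**" if child.get("rend") == "bold" else "*"
            parts.append(f"{mark}{inner}{mark}")
        else:
            parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


def tei_to_markdown(tei_xml):
    """Convert a small subset of TEI (div, head, lg, p, l, hi, note) to Markdown."""
    root = ET.fromstring(tei_xml)
    body = root.find(f".//{TEI_NS}body")
    if body is None:
        raise ValueError("no <body> element found")
    notes, blocks = [], []

    def walk(elem, depth):
        for child in elem:
            tag = child.tag.replace(TEI_NS, "")
            if tag == "div":
                walk(child, depth + 1)  # nested divs become deeper headings
            elif tag == "lg":
                walk(child, depth)  # line groups: visit the lines inside
            elif tag == "head":
                title = " ".join(inline_to_md(child, notes).split())
                blocks.append("#" * max(depth, 1) + " " + title)
            elif tag in ("p", "l"):
                blocks.append(" ".join(inline_to_md(child, notes).split()))

    walk(body, 0)
    if notes:
        blocks.append("\n".join(f"[^{i}]: {n}" for i, n in enumerate(notes, 1)))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(tei_to_markdown(SAMPLE_TEI))
```

Anything outside this small element subset is silently skipped at the block level, so a real converter would need a much fuller mapping, which is the maintenance cost discussed in this thread.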
- Sections, for one.
- Better preservation of important formatting (bold, italics, footnotes, etc.)
Why are PDFs inadequate here?
A better question might be - "What would a plain text download give you that Markdown doesn't?"
More useful for NLP applications and text mining. But that's about all that comes to mind.
So far, the most convenient way to produce markdown from TEI I've found
Wonderful! My main concern was that we'd have to maintain the TEI -> XML logic indefinitely going forward. If there's an out-of-the-box solution that someone else will maintain, I think there's no reason not to add it.
- Sections, for one.
- Better preservation of important formatting (bold, italics, footnotes, etc.)
Why are PDFs inadequate here?
- Text that reflows on various screen sizes.
- Modifiability.
- Search and copy-paste (this can be mitigated to some extent by being very careful in PDF generation).
- Smallest possible file size.
- Ability to plug into a variety of other tools, including converters.
So far, the most convenient way to produce markdown from TEI I've found
Wonderful! My main concern was that we'd have to maintain the TEI -> XML logic indefinitely going forward. If there's an out-of-the-box solution that someone else will maintain, I think there's no reason not to add it.
You mean TEI -> MD? I keep some custom variants of the main TEI stylesheets as well (e.g. sarit here). Maintenance might therefore not be a big issue.
You mean TEI -> MD?
Yes, my mistake.