simonwillisonblog
Support markdown in blogmarks and entries
https://simonwillison.net/2019/Nov/6/working-pdf/
Automate the Boring Stuff with Python: Working with PDF and Word Documents. I stumbled across this while trying to extract some data from a PDF file (the kind of file with actual text in it as opposed to dodgy scanned images) and it worked perfectly: PyPDF2.PdfFileReader(open(“file.pdf”, “rb”)).getPage(0).extractText()
Those curly quotes are bad news! This would be better:
Automate the Boring Stuff with Python: Working with PDF and Word Documents. I stumbled across this while trying to extract some data from a PDF file (the kind of file with actual text in it as opposed to dodgy scanned images) and it worked perfectly:
PyPDF2.PdfFileReader(open("file.pdf", "rb")).getPage(0).extractText()
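To show why the curly quotes matter for anyone copying that snippet, here is a minimal sketch using only the standard library. The helper names `straighten_quotes` and `is_valid_python` are made up for this illustration and are not part of the blog's code. It only parses the snippet, so PyPDF2 does not need to be installed.

```python
import ast

# Map typographic quotes back to the ASCII quotes Python expects.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(text):
    """Hypothetical helper: replace curly quotes with straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)


def is_valid_python(source):
    """Return True if the source parses; nothing is executed."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# The snippet as it was rendered in the blogmark, curly quotes included.
curly = (
    "PyPDF2.PdfFileReader(open(\u201cfile.pdf\u201d, \u201crb\u201d))"
    ".getPage(0).extractText()"
)

print(is_valid_python(curly))
print(is_valid_python(straighten_quotes(curly)))
```

The first check fails because Python rejects the curly quote characters as invalid syntax, while the straightened version parses. That is the reason for keeping code out of text that gets typographic quote conversion.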
While I'm at it, may as well support all of markdown in these. That would finally allow me to embed additional links (and paragraph breaks) in the descriptions.
I write most of my blog entries in markdown these days and then save them in the database as HTML - could support markdown there too.
Done:
- #419
- #451
https://simonwillison.net/2019/Nov/6/working-pdf/ is fixed now.