Question: Would this work with Markdown files?

Open lightningRalf opened this issue 1 year ago • 1 comments

I found a tool that creates Markdown from a PDF file: https://github.com/VikParuchuri/marker/blob/dev/README.md I prefer this method because it also extracts the images.

But now I have a lot of dissertations in Markdown files, and I would like to automate the summarization just like you did. (To be honest, I haven't tried just adding the Markdown files to privateGPT.)

lightningRalf avatar Jul 14 '24 19:07 lightningRalf

You could easily loop through and summarize certain headers of the Markdown. It's not implemented yet, but that will come.
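
To illustrate what looping through the headers could look like, here is a minimal sketch. It is not code from this repository: the function name split_by_heading and the max_level parameter are made up for this example, and it assumes ATX-style headings (lines starting with #) and does not treat # lines inside fenced code blocks specially.

```python
import re

# Matches ATX headings such as "## Methods"; group 1 is the run of '#', group 2 the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_by_heading(markdown_text, max_level=3):
    """Split Markdown into (title, body) chunks at headings of depth <= max_level.

    Hypothetical helper for illustration only. Deeper headings stay inside
    the body of the chunk they belong to.
    """
    chunks = []
    title, body = "", []
    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match and len(match.group(1)) <= max_level:
            # Close the previous chunk unless it is completely empty.
            if title or any(part.strip() for part in body):
                chunks.append((title, "\n".join(body).strip()))
            title, body = match.group(2), []
        else:
            body.append(line)
    if title or any(part.strip() for part in body):
        chunks.append((title, "\n".join(body).strip()))
    return chunks
```

Each (title, body) pair could then be sent to the summarizer on its own, so a long dissertation is summarized section by section rather than in one pass.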

Now there is some real code you can play with here.

cognitivetech avatar Aug 10 '24 02:08 cognitivetech

Now you can use chunkbyline.py --md=3 some.md, where 3 is the highest level of heading you want to split by.

Then run the resulting CSV through python3 sum.py -c some_chunkd2.csv

And you are good to go!
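
For a whole folder of Markdown files, the chunking step could be scripted along these lines. This is only a sketch under assumptions: chunk_commands is a hypothetical helper, it assumes chunkbyline.py sits in the current directory and is run with python3, and it only builds the command lines. It does not guess the name of the CSV that chunkbyline.py writes, so the sum.py step still has to be pointed at the actual output file.

```python
from pathlib import Path


def chunk_commands(folder, level=3):
    """Return one argument list per .md file in folder, sorted by file name.

    Hypothetical helper: it only builds the commands, it does not run them.
    The --md flag is used as described in this thread.
    """
    return [
        ["python3", "chunkbyline.py", f"--md={level}", str(path)]
        for path in sorted(Path(folder).glob("*.md"))
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) to execute the chunking for that file.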

Let me know if you have any trouble.

Eventually I'll update some of that stuff, but this is just coming together bit by bit.

cognitivetech avatar Oct 10 '24 06:10 cognitivetech