
[Feature request]: Sample use case documentation

Open iplayfast opened this issue 1 year ago • 5 comments

What do you need?

I've seen videos showing how to use fabric from a command prompt, but I can't remember where I saw them. I would like the README updated to show examples of how to use fabric:

  1. To get the wisdom from a YouTube video
  2. To get the wisdom from a PDF
  3. And all the other things you use fabric for (I don't know enough to ask).

iplayfast avatar May 20 '24 04:05 iplayfast

Did you look at the readme? Extracting wisdom from a YouTube video among other examples is provided right there.

fail-open avatar May 20 '24 16:05 fail-open

I'm testing number 2 now. Would something like less document.pdf | fabric -p extract_wisdom work? I will give it a go when I can. Or is there a better way to extract and load a PDF or doc?

MichaelCade avatar May 29 '24 21:05 MichaelCade

I very much agree it would be useful to have an example per pattern, even a simple one. The patterns themselves I find very insightful, but sometimes a small example of the type of data that would be expected as the input would make it more obvious how to use them effectively.

E.g., for summarize_git_diff, something like git diff main | fabric -p summarize_git_diff

kdubb1337 avatar Jun 05 '24 20:06 kdubb1337

Update: You have to get the PDF into text somehow. This closed Issue has some suggestions.

I thought I could analyze a PDF using this command:

 fabric --pattern=analyze_paper ~/Downloads/filename.pdf

However, it output some Markdown analysis of a different paper than what was in ~/Downloads/filename.pdf. What's going on there?

Using this command:

less ~/Downloads/filename.pdf | fabric --pattern=analyze_paper

yielded this, which is not what I'd like to see either. I'd love more guidance here for PDF ingestion.

 This is a PDF file trailer, which provides metadata about the document and contains references to other parts of the PDF file. The Size value indicates that the entire PDF file (header, body, and trailer) is 3368 bytes long. The Root object has an ID [<954405187512094D937EB2391F542B95><954405187512094D937EB2391F542B95>], and there is an Info dictionary with 574 bytes of data. The Prev value in the trailer refers to the XRefStm object located at offset 768496, which contains cross-reference table information for the PDF file. The startxref value of 843608 indicates that the body of the PDF file starts at this byte offset.
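The output above is consistent with the model being handed the raw PDF container (header, objects, trailer) rather than the document's text. A minimal sketch of a guard that detects this case before piping anything to fabric, assuming a POSIX-like shell; is_pdf is a hypothetical helper name, not part of fabric:

```shell
# Hypothetical helper (not part of fabric): succeeds if the file starts with
# the PDF magic bytes "%PDF-", meaning it needs text extraction first.
is_pdf() {
  # Read only the first 5 bytes; strip NUL bytes so the shell can compare them.
  [ "$(head -c 5 "$1" 2>/dev/null | tr -d '\000')" = "%PDF-" ]
}

# Usage idea (commented out so sourcing this file has no side effects):
#   if is_pdf ~/Downloads/filename.pdf; then
#     echo "Convert this PDF to text before piping it to fabric" >&2
#   fi
```

A check like this only says that conversion is needed. It does not extract any text itself.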

annegentle avatar Jan 04 '25 23:01 annegentle

Why not use pdftotext:

pdftotext ./document.pdf - | fabric -p extract_wisdom

You could probably also use marker-pdf to go from PDF to Markdown:

marker_single ./document.pdf --disable_image_extraction --output_format markdown --output_dir .
cat ./document/document.md | fabric -p extract_wisdom
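If you do this often, the pdftotext pipeline could be wrapped in a small shell function. This is only a sketch: pdf_to_fabric is a hypothetical name, and it assumes both pdftotext (which ships with Poppler or Xpdf) and fabric are on your PATH. The pattern defaults to extract_wisdom when none is given.

```shell
# Hypothetical wrapper (not part of fabric): extract text from a PDF and
# pipe it to a fabric pattern. Assumes pdftotext and fabric are on PATH.
pdf_to_fabric() {
  pdf=$1
  pattern=${2:-extract_wisdom}   # default pattern if none is given

  if [ ! -f "$pdf" ]; then
    echo "pdf_to_fabric: no such file: $pdf" >&2
    return 1
  fi
  if ! command -v pdftotext >/dev/null 2>&1; then
    echo "pdf_to_fabric: pdftotext not found" >&2
    return 1
  fi

  # The trailing "-" makes pdftotext write the text to stdout
  # instead of creating a .txt file next to the PDF.
  pdftotext "$pdf" - | fabric -p "$pattern"
}

# Usage (commented out so sourcing this file has no side effects):
#   pdf_to_fabric ./document.pdf
#   pdf_to_fabric ./document.pdf analyze_paper
```

Scanned PDFs with no text layer will likely come out empty with this approach, so they would need OCR or a tool like marker-pdf instead.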

monkeymonk avatar Feb 24 '25 13:02 monkeymonk