uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering

LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!

Results: 18 uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering issues

splitter="fads" in pipeline_pdf.ipynb

Written by Huarong Zhang

In `example/rater/generated_answer.ipynb`, for an input whose true label is `equivalent`, the model sometimes generates `accept` or `reject`, so the majority vote can give the wrong result. Input: ``` ("Vitamin C (also known...
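The failure mode described in this issue can be illustrated with a plain majority vote. This is a minimal sketch, not uniflow's rater code: the `majority_vote` helper and the sample labels are hypothetical and chosen only to show how two off-label generations can outvote the true label.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label (hypothetical helper, not uniflow's API)."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical model outputs for a single input whose true label is "equivalent".
samples = ["accept", "equivalent", "accept"]

voted = majority_vote(samples)
print(voted)  # the vote follows the two "accept" generations, not the true label
```

In this sketch the vote cannot recover the true label, because the errors are not independent noise around `equivalent`: the model repeats the same wrong label.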

WIP for now - added TransformQuestionExtractionOpenAIFlow to generate questions from previous reports - added FeedOpenAIFlow to use questions from the previous flow to generate responses for the news feed - added corresponding...

### 🚀 The feature, motivation and pitch Is it possible to define the number of question-answer pairs? Also, is there an option to load the models in q4 or q8...

### 🐛 Describe the bug I use the base [example extract pdf](https://github.com/CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering/blob/main/example/transform/nougat_huggingface_QAs.ipynb) - I am using the nvcr.io/nvidia/pytorch:24.07-py3 Docker container - I have installed the latest Anaconda version - I have a...

### 🚀 The feature, motivation and pitch @CallmeNafiy Per our discussion, we would love to support multi-flow configuration in the future. ### Alternatives _No response_ ### Additional context _No response_