uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering issues

splitter="fads" in pipeline_pdf.ipynb

assignment

Written by Huarong Zhang

RGG2000

Majority vote gives wrong label due to model inconsistency.

In `example/rater/generated_answer.ipynb`, for an input whose true label is `equivalent`, the model sometimes generates `accept` or `reject`, so the majority vote can return the wrong label. Input: ``` ("Vitamin C (also known...

Panzy-18
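The failure mode reported above can be illustrated with a minimal sketch. It is not the rater code from the repository: the `majority_vote` helper and the sample labels are hypothetical, and only the label names (`equivalent`, `accept`, `reject`) come from the issue. It shows how a plain majority vote over repeated model outputs settles on a wrong label when the model is inconsistent, and how the agreement ratio can flag such low-confidence votes.

```python
from collections import Counter


def majority_vote(labels):
    """Return (winning_label, agreement_ratio) for a list of sampled labels.

    Hypothetical helper for illustration only; not part of uniflow.
    """
    if not labels:
        raise ValueError("need at least one label")
    winner, count = Counter(labels).most_common(1)[0]
    return winner, count / len(labels)


# Assumed outputs from three repeated model calls on one input
# whose true label is "equivalent".
samples = ["accept", "equivalent", "accept"]
label, agreement = majority_vote(samples)

# The vote picks "accept" even though the true label is "equivalent".
# A low agreement ratio is one possible signal that the vote is unreliable.
print(label, round(agreement, 2))
```

One option under these assumptions is to treat votes below a chosen agreement threshold as undecided rather than accepting the winner outright.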

feat: WIP news feed and report generation

WIP for now:
- added TransformQuestionExtractionOpenAIFlow to generate questions from previous reports
- added FeedOpenAIFlow to use questions from the previous flow to generate responses for the news feed
- added corresponding...

CallmeNafiy

Define the number of Questions-Answers pairs and q4 or q8 quantization

### 🚀 The feature, motivation and pitch

Is it possible to define the number of question-answer pairs? Also, is there an option to load the models in q4 or q8...

Chasapas

OSError: 0.1.0-small is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

### 🐛 Describe the bug

I use the base [example extract pdf](https://github.com/CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering/blob/main/example/transform/nougat_huggingface_QAs.ipynb)
- I am using the nvcr.io/nvidia/pytorch:24.07-py3 docker container
- I have installed the latest Anaconda version
- I have a...

C0casio45

Refactor to support multi flow configuration

### 🚀 The feature, motivation and pitch

@CallmeNafiy Per our discussion, we would love to support multi-flow configuration in the future.

### Alternatives

_No response_

### Additional context

_No response_

CambioML


add example of generating Q&As on IRS pdf