Logan comments

Results 685 comments of



How to see the plotted bboxes on the image as a result along with the JSON output

Donut doesn't predict boxes, hence the "OCR-Free" part. You can, however, use the attention scores to create "heatmaps" of what the model thought the answer was on the page ->...
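As a rough illustration of the heatmap idea, the sketch below turns one flat vector of attention weights into a normalized 2D grid that could be overlaid on the page. It assumes you have already pulled one attention weight per image patch out of the model for a given output token; how to extract those from Donut is not shown here, and `attention_to_heatmap`, `grid_h`, and `grid_w` are hypothetical names, not part of any library.

```python
def attention_to_heatmap(weights, grid_h, grid_w):
    """Reshape a flat list of per-patch attention weights into a
    grid_h x grid_w grid, min-max scaled to the range [0, 1].

    Assumption: `weights` has one value per image patch, in row-major order.
    """
    if len(weights) != grid_h * grid_w:
        raise ValueError("number of weights must equal grid_h * grid_w")

    lo, hi = min(weights), max(weights)
    span = hi - lo
    # If every weight is identical there is nothing to highlight.
    scaled = [(w - lo) / span if span else 0.0 for w in weights]

    # Cut the flat list into rows of grid_w values each.
    return [scaled[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]
```

The resulting grid would still need to be upsampled to the image size before plotting it over the document.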

RuntimeError: CUDA out of memory

14GB of VRAM will be difficult. If I'm remembering correctly, I trained with default settings (batch size 2, default input image size) and used about 40GB of VRAM.

React-native render

Just giving this a bump... would be a fantastic addition

Is there any way of speeding things up?

I've spent the day testing, and tesserocr seems to be slower than pytesseract. I need the boxes, so I'm comparing to image_to_data from pytesseract. Here's my quick benchmark script (the...
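The original benchmark script is cut off in this excerpt, so the following is only a minimal sketch of how such a comparison could be timed, not a reconstruction of it. The `benchmark` and `compare` helpers are hypothetical names; the OCR calls themselves are left out and would be passed in as callables.

```python
import time
from statistics import mean


def benchmark(fn, *args, repeats=5):
    """Return the mean wall-clock seconds per call of fn(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def compare(candidates, *args, repeats=5):
    """Time each named callable and return {name: mean_seconds}, fastest first."""
    results = {
        name: benchmark(fn, *args, repeats=repeats)
        for name, fn in candidates.items()
    }
    return dict(sorted(results.items(), key=lambda item: item[1]))
```

In practice, `candidates` might map `"pytesseract"` to a small wrapper around `pytesseract.image_to_data` and `"tesserocr"` to an equivalent wrapper that returns boxes, both called on the same image so the timings are comparable.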

Is there any way of speeding things up?

Interesting findings, thanks for the follow-up! 💪🏻

Object index improvements

@jerryjliu we can add an option for that, but that really only makes sense for objects you can define on the fly (i.e. a query engine tool). This wouldn't really...

Object index improvements

@jerryjliu Added `FnNodeMapping` and an example + test 👍🏻 Should be good to go

[Bug]: github reader quickstart throws error

If you change the repo to `run-llama` instead of `jerryliu` it works, otherwise you get a `301 response, moved`

Question: How does the number of categories affect the training and accuracy?

I'm currently training a setfit model with 4500 classes, 10 samples per class (using a proprietary dataset). I think it is still generating pairs though? I just see endless tqdm...
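To get a feel for why pair generation can seem endless at this scale, here is a back-of-the-envelope count of the unique sample pairs such a dataset allows. This is only an upper bound under the assumption that every unordered pair is a candidate; it does not claim to reflect how SetFit actually samples pairs, and `count_pairs` is a hypothetical helper.

```python
from math import comb


def count_pairs(num_classes, samples_per_class):
    """Count unique unordered sample pairs in a balanced dataset.

    Returns (positive, negative): pairs from the same class and
    pairs from different classes.
    """
    total_samples = num_classes * samples_per_class
    all_pairs = comb(total_samples, 2)
    positive = num_classes * comb(samples_per_class, 2)
    negative = all_pairs - positive
    return positive, negative


positive, negative = count_pairs(4500, 10)
print(f"positive pairs: {positive:,}")  # 202,500
print(f"negative pairs: {negative:,}")  # 1,012,275,000
```

With 4500 classes and 10 samples each, there are roughly a billion possible negative pairs against about two hundred thousand positive ones, so any strategy that tries to cover a large share of them will run for a very long time.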

Question: How does the number of categories affect the training and accuracy?

@grofte yea it never worked well for me either. I think the dataset is just too big haha