
How to run Surya OCR on 8GB or 6GB VRAM NVIDIA/AMD GPUs

Open kkailaasa opened this issue 1 year ago • 7 comments

Hello, I'm interested in using Surya OCR, but I have two systems with less VRAM than the default requirements (> 24 GB VRAM):

  • One system with an 8GB VRAM NVIDIA graphics card
  • Another with a 6GB VRAM AMD Radeon graphics card

From my reading of the project description, I understand that Surya can potentially run with lower VRAM by adjusting batch sizes.

  • Is it possible to run Surya OCR on either my 6GB or 8GB VRAM GPU by changing the batch size settings?
  • If so, could you recommend appropriate RECOGNITION_BATCH_SIZE and DETECTOR_BATCH_SIZE values for each of my setups? (My rough plan is sketched after this list.)
  • I'm okay with slower processing times. What kind of performance impact should I expect compared to the default settings?
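
For reference, here is roughly how I was planning to lower the batch sizes. The environment variable names come from the README; the values below are my own guesses for 6-8GB cards, not tested recommendations:

import os

# Set these before importing surya so its settings pick them up.
# The values are guesses for a 6-8GB card, not recommendations.
os.environ["RECOGNITION_BATCH_SIZE"] = "32"
os.environ["DETECTOR_BATCH_SIZE"] = "6"

from surya.ocr import run_ocr  # import after setting the variables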

Thank you for your help and for creating Surya OCR.

kkailaasa avatar Aug 23 '24 21:08 kkailaasa

@VikParuchuri Hi, could you please share some insights on this?

sharabheshwara avatar Aug 28 '24 02:08 sharabheshwara

@kkailaasa I'm running compiled surya-ocr on an RTX 3050 with 8GB VRAM and getting decent speed.

snowfluke avatar Sep 09 '24 08:09 snowfluke

@kkailaasa can you tell us how you did it? Thanks

newsyh avatar Oct 14 '24 03:10 newsyh

> I'm running compiled surya-ocr on an RTX 3050 with 8GB VRAM and getting decent speed.

@snowfluke can you tell us how you did it? Thanks

RedwindA avatar Oct 18 '24 18:10 RedwindA

Hello, thanks for the good software. Before putting it into production use I ran a small test (below). I'm on Linux with an NVIDIA 4090 (24GB). Surya uses only about 6.2 GB of VRAM, and while processing it saturates one CPU thread at 100% while the GPU shows between 0% and 1% load. Recognizing one page (the recognized text is 19KB) takes 70 seconds. Detection is fast, but recognition is quite slow.

Loaded detection model vikp/surya_det3 on device cuda with dtype torch.float16
Loaded recognition model vikp/surya_rec2 on device cuda with dtype torch.float16
Using device: cuda
Detecting bboxes: 100%|██████████████████████████████████████████| 1/1 [00:00<00:00, 2.79it/s]
Recognizing Text: 100%|██████████████████████████████████████████| 1/1 [01:08<00:00, 68.17s/it]

Is it because the recognition step requires more VRAM than I have? If so, can it be configured to use more CPU threads? I also have a second (slower) GPU, a P40. Is it possible to configure it so that, for example, detection runs on the P40 and recognition on the 4090?

from PIL import Image, ImageDraw, ImageFont
from surya.ocr import run_ocr
from surya.model.detection.model import load_model as load_det_model, load_processor as load_det_processor
from surya.model.recognition.model import load_model as load_rec_model
from surya.model.recognition.processor import load_processor as load_rec_processor
import json
import time  # was missing; needed for time.time() below
import torch

IMAGE_PATH = "scan.JPEG"

image = Image.open(IMAGE_PATH)
langs = ["pl"]  # Replace with your languages - optional but recommended
det_processor, det_model = load_det_processor(), load_det_model()
rec_model, rec_processor = load_rec_model(), load_rec_processor()

# Check GPU availability
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Move models to the appropriate device
det_model = det_model.to(device)
rec_model = rec_model.to(device)

# Add timing for text recognition
start_time = time.time()
predictions = run_ocr([image], [langs], det_model, det_processor, rec_model, rec_processor)
end_time = time.time()
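
For the two-GPU question, this is the kind of thing I had in mind. An untested sketch on my part; I don't know whether run_ocr accepts models placed on different devices:

det_model = det_model.to("cuda:1")  # detection on the P40 (assuming it is cuda:1; check nvidia-smi)
rec_model = rec_model.to("cuda:0")  # recognition on the 4090
predictions = run_ocr([image], [langs], det_model, det_processor, rec_model, rec_processor)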

waan1 avatar Oct 24 '24 13:10 waan1

Resolved. For some reason installing from a git clone was extremely slow; installing via pip fixed the issue.
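
In case it helps anyone else, the plain PyPI install is:

pip install surya-ocr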

waan1 avatar Oct 29 '24 15:10 waan1

@snowfluke How do you install surya-ocr on a Windows 11 machine with an RTX 3060 8GB?

insinfo avatar Jun 26 '25 23:06 insinfo