ocr-vqgan icon indicating copy to clipboard operation
ocr-vqgan copied to clipboard

OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Perceptual loss for clear text-within-image generation. Fork from VQGA...

Results 2 ocr-vqgan issues
Sort by recently updated
recently updated
newest added

Hi! I attempted to train the model using the main.py script. The script starts off well but seems to get stuck when attempting to download the ocr_craft model. There's no...

Thank you for sharing the code,I used taming-tranformer to did the image reconstruction for Street View,but smaller text sections don't work well. If i use this model to train this...