
Production of data sets (making your own dataset)

Open YUAN-ANAN opened this issue 4 years ago • 6 comments

Hi wenjia, I would like to follow your method and build my own dataset, shooting images at a long focal length for HR and at a short focal length for LR. How did you preprocess the images? Specifically, how did you crop the same region, at the same size, from the short-focal-length and long-focal-length images to form image pairs? Looking forward to your reply, thank you.

YUAN-ANAN avatar Feb 14 '21 13:02 YUAN-ANAN

Hi yuan-anan, I suggest you look at the GitHub repos of SRRAW and realSR. They have code that uses traditional corner-point matching to align a HR-LR image pair and crop the matching regions. It is not very accurate, though; there can be an offset of ten to several tens of pixels.

WenjiaWang0312 avatar Feb 14 '21 16:02 WenjiaWang0312
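The alignment step described above can be sketched in a simplified, numpy-only form. This is not the SRRAW/RealSR code (they use feature-based corner matching); it is an assumed brute-force search for the integer pixel shift that best aligns an LR crop to an HR crop, scored by normalized cross-correlation, which models the "tens of pixels of offset" mentioned above:

```python
import numpy as np

def best_shift(hr_patch, lr_patch, max_shift=20):
    """Find the integer (dy, dx) shift that best aligns lr_patch inside
    hr_patch, scored by normalized cross-correlation.

    hr_patch must be larger than lr_patch by 2*max_shift pixels on each
    axis, so every candidate shift in [-max_shift, +max_shift] fits.
    """
    h, w = lr_patch.shape
    best_score, best_dyx = -np.inf, (0, 0)
    b = lr_patch - lr_patch.mean()
    for dy in range(2 * max_shift + 1):
        for dx in range(2 * max_shift + 1):
            win = hr_patch[dy:dy + h, dx:dx + w]
            a = win - win.mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum())
            score = (a * b).sum() / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_dyx = score, (dy - max_shift, dx - max_shift)
    return best_dyx

# Example: recover a known (3, -2) pixel offset from a synthetic image.
rng = np.random.default_rng(0)
img = rng.random((120, 120))
hr = img[20:80, 20:80]    # 60x60 search window (40 + 2*10)
lr = img[33:73, 28:68]    # 40x40 crop, shifted by (+3, -2)
print(best_shift(hr, lr, max_shift=10))  # → (3, -2)
```

In practice a feature-based method (as in SRRAW/RealSR) also handles sub-pixel shifts, rotation, and scale differences between the two focal lengths, which this brute-force translation search does not.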

Hi Jason, I am trying to use TPL for document super-resolution, but I am not able to work out most of it from this repo. Can you help me include the TPL loss in my model?

Thanks,

m-ali-awan avatar Feb 28 '21 08:02 m-ali-awan

Hi Jason, following up on YUAN-ANAN's question: maybe you could add to the repo the code you used to create the dataset from the original images (detect text and crop it in one image, then detect the same text and crop it in the second image)? Thanks.

dkaliroff avatar Mar 04 '21 08:03 dkaliroff

Hi Jason, I am trying to use TPL for document super-resolution, but I am not able to work out most of it from this repo. Can you help me include the TPL loss in my model?

Thanks,

I think TPL is not that useful. I have quit working on OCR now.

WenjiaWang0312 avatar Mar 12 '21 06:03 WenjiaWang0312
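The thread never shows what TPL actually computes, so for readers who still want to experiment, here is one assumed, generic form of a recognizer-based text loss: the KL divergence between the per-timestep character distributions that a frozen text recognizer predicts on the SR output and on the HR ground truth. The function name and shapes are hypothetical, not taken from this repo:

```python
import numpy as np

def text_perceptual_loss(p_sr, p_hr, eps=1e-8):
    """KL divergence KL(p_hr || p_sr), averaged over timesteps.

    p_sr, p_hr: arrays of shape (T, C) holding the character probability
    distributions a frozen recognizer predicts at each of T timesteps
    over a charset of size C, for the SR image and the HR image.
    """
    p_sr = np.clip(p_sr, eps, 1.0)
    p_hr = np.clip(p_hr, eps, 1.0)
    return float(np.sum(p_hr * (np.log(p_hr) - np.log(p_sr))) / p_sr.shape[0])

# Identical predictions give (near-)zero loss; mismatched ones do not.
uniform = np.full((5, 4), 0.25)
peaked = np.tile([0.7, 0.1, 0.1, 0.1], (5, 1))
print(text_perceptual_loss(uniform, uniform))  # → 0.0
print(text_perceptual_loss(uniform, peaked) > 0)  # → True
```

Any loss of this shape needs the recognizer's weights frozen during SR training, otherwise the recognizer can collapse to make the loss trivially small.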

Hi Jason, following up on YUAN-ANAN's question: maybe you could add to the repo the code you used to create the dataset from the original images (detect text and crop it in one image, then detect the same text and crop it in the second image)? Thanks.

It is very easy. I did not detect the text bboxes with detection networks; I cropped the text regions using manual annotations.

WenjiaWang0312 avatar Mar 12 '21 06:03 WenjiaWang0312
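Cropping a pair from a manual annotation, as described above, reduces to scaling one bbox by the resolution ratio of the two images. A minimal sketch (the function name and bbox convention are assumptions, not the repo's code):

```python
import numpy as np

def crop_pair(hr_img, lr_img, hr_bbox):
    """Crop the same manually annotated text region from an HR/LR pair.

    hr_bbox = (x1, y1, x2, y2) in HR pixel coordinates; the LR bbox is
    obtained by scaling with the height/width ratio of the two images.
    """
    sy = lr_img.shape[0] / hr_img.shape[0]
    sx = lr_img.shape[1] / hr_img.shape[1]
    x1, y1, x2, y2 = hr_bbox
    hr_crop = hr_img[y1:y2, x1:x2]
    lr_crop = lr_img[int(round(y1 * sy)):int(round(y2 * sy)),
                     int(round(x1 * sx)):int(round(x2 * sx))]
    return hr_crop, lr_crop

# Example: a 2x HR/LR pair; the LR crop is half the size of the HR crop.
hr = np.zeros((200, 400))
lr = np.zeros((100, 200))
hr_crop, lr_crop = crop_pair(hr, lr, (40, 20, 120, 60))
print(hr_crop.shape, lr_crop.shape)  # → (40, 80) (20, 40)
```

Note this assumes the two images are already globally aligned; in a long/short focal length setup you would still need an alignment step first, since the misalignment mentioned earlier can be tens of pixels.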

Hi Jason, I am trying to use TPL for document super-resolution, but I am not able to work out most of it from this repo. Can you help me include the TPL loss in my model? Thanks,

I think TPL is not that useful. I have quit working on OCR now.

Thanks for the reply. I have tried all the usual approaches (SRGAN, ESRGAN, U-Net models) and also different kinds of losses. The model is able to reconstruct blurry images, but when it has to come up with letters where the source is almost completely distorted, it cannot (as the model is not aware of English letters). For example, this is the source image: test_full

And this is the model output: Genertated-Client-Image. For input that is merely blurry, however, you can see it manages something:

generated-biteasy-blurry

input-biteasy-blurry

So I think that if the model had some sense of English letters, it could come up with something. Thanks for any help.

muhammadali-awan avatar Mar 12 '21 06:03 muhammadali-awan