TextRecognitionDataGenerator
Handwriting generation process is successful but no image appears
Hello @Belval and community,
I run this code to generate handwritten images in Colab. The process completes successfully, but no image appears in the output directory. When I generate an image with a regular font, without -hw, the image does appear in the output directory.
Can anyone help?
Handwritten text generation is not working well right now, it's very possible that something broke in the latest TensorFlow version. I'll check if I can repro.
In my case, I downgraded Python to 3.7 and installed Tensorflow==1.13.1, and the model works fine.
In my case, I downgraded Python to 3.7 and installed Tensorflow==1.13.1, but I still don't get an image. So I looked at the code and found this error in the section "# Comparing average pixel value of text and background image #" when generating handwritten text (I temporarily commented out the try/except). I'm not using the CLI but a Python program, with the dict generator for Indonesian ("id").
(generate) D:\ocr_handwriting\generate_data\TextRecognitionDataGenerator>python test_hw.py
2022-08-16 10:20:19.738744: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
WARNING:tensorflow:From C:\Users\1000.conda\envs\generate\lib\site-packages\tensorflow\python\training\saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
image <PIL.Image.Image image mode=RGBA size=419x336 at 0x2470D1A3348>
mask <PIL.Image.Image image mode=RGB size=640x480 at 0x2470D171308>
background_img <PIL.Image.Image image mode=RGBA size=37x32 at 0x2470704B3C8>
background_mask <PIL.Image.Image image mode=RGB size=37x32 at 0x24707051C48>
Traceback (most recent call last):
  File "test_hw.py", line 22, in
    for img, lbl in generator:
  File "D:\ocr_handwriting\generate_data\TextRecognitionDataGenerator\trdg\generators\from_strings.py", line 97, in next
    return self.next()
  File "D:\ocr_handwriting\generate_data\TextRecognitionDataGenerator\trdg\generators\from_strings.py", line 134, in next
    self.output_bboxes,
  File "D:\ocr_handwriting\generate_data\TextRecognitionDataGenerator\trdg\data_generator.py", line 191, in generate
    resized_img_px_mean = sum(resized_img_st.mean[:2]) / 3
  File "C:\Users\1000.conda\envs\generate\lib\site-packages\PIL\ImageStat.py", line 47, in getattr
    v = getattr(self, "_get" + id)()
  File "C:\Users\1000.conda\envs\generate\lib\site-packages\PIL\ImageStat.py", line 103, in _getmean
    v.append(self.sum[i] / self.count[i])
ZeroDivisionError: float division by zero
I don't know whether this helps, but I don't know how to fix it. My code:
from trdg.generators import GeneratorFromDict

generator = GeneratorFromDict(
    count=1,
    fonts=["./fonts/arial.ttf"],
    language="id",
    background_type=1,
    is_handwritten=True,
)

for img, lbl in generator:
    # Do something with the pillow images here.
    img.save(f"./z_out/{lbl}.jpg")
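The last traceback frame divides a per-band sum by a per-band pixel count, so the error is consistent with a mask that selects no pixels at all. The sketch below is a plain-Python model of that arithmetic, not Pillow's code; `masked_mean` and the sample values are hypothetical names and data chosen for illustration.

```python
def masked_mean(pixels, mask):
    """Mean of the pixel values whose mask entry is non-zero.

    Hypothetical helper: models sum / count over masked pixels.
    """
    selected = [p for p, m in zip(pixels, mask) if m != 0]
    total = float(sum(selected))
    count = len(selected)
    # With an all-zero mask, count is 0 and this line raises ZeroDivisionError.
    return total / count


pixels = [10, 20, 30, 40]

# A mask that keeps the first two pixels gives an ordinary mean.
print(masked_mean(pixels, [255, 255, 0, 0]))

# A mask that keeps nothing reproduces the failure mode from the traceback.
try:
    masked_mean(pixels, [0, 0, 0, 0])
except ZeroDivisionError as err:
    print("empty mask:", err)
```

If this is what happens for handwritten output, the mask channel passed to the statistics call would be empty for those images, which may be worth checking before changing the comparison itself.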
I made the following modifications and it works fine:
##############################################################
# Comparing average pixel value of text and background image #
##############################################################
try:
    resized_img_st = ImageStat.Stat(resized_img, resized_mask.split()[2])
    background_img_st = ImageStat.Stat(background_img)

    if is_handwritten:
        resized_img_px_mean = 25
    else:
        resized_img_px_mean = sum(resized_img_st.mean[:2]) / 3
    # print(resized_img_px_mean)
    background_img_px_mean = sum(background_img_st.mean) / 3
    # print(background_img_px_mean)
I tried your fix and now I get the image, but I found another issue with the width for handwriting: it always generates an image with a resolution of 37x32 for every word. Did you edit any other lines of code?
Edit: I can resize it using the size parameter. So my question is: what does the "# Comparing average pixel value of text and background image #" section actually do? Is it okay to hardcode the value like you did, @simmerken?
Actually, a ZeroDivisionError occurs, but since it is caught, it is not visible. So far, the solution suggested in this issue has solved the problem for me. I hope the author applies the suggested solution and saves many people a headache.
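To illustrate why a caught exception looks like "success with no image", here is a minimal sketch that contrasts a broad `except` with an explicit empty-count check. All function names (`render_swallowing`, `render_reporting`, `mean_or_none`) and the returned dict are hypothetical and are not part of trdg; the sketch only assumes that the failing step is a sum divided by a count.

```python
import logging

logger = logging.getLogger("hw_sketch")  # hypothetical logger name


def mean_or_none(total, count):
    """Return total / count, or None when nothing was counted."""
    if count == 0:
        return None
    return total / count


def render_swallowing(total, count):
    """Broad except: a failure is indistinguishable from 'no sample'."""
    try:
        return {"text_mean": total / count}
    except Exception:
        return None  # the caller sees no image and no error message


def render_reporting(total, count):
    """Same step, but the empty-count case is detected and logged."""
    mean = mean_or_none(total, count)
    if mean is None:
        logger.warning("mask selected no pixels; skipping contrast check")
        return {"text_mean": None}
    return {"text_mean": mean}


print(render_swallowing(0.0, 0))   # the error is hidden
print(render_reporting(0.0, 0))    # the sample survives and a warning is logged
print(render_reporting(30.0, 2))   # the normal case is unchanged
```

Compared with hardcoding a mean for handwritten text, a check like this keeps the comparison active whenever the mask does contain pixels, at the cost of deciding what to do when it does not.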
Related issue: #251. So this issue might be closed as a duplicate.