CRNN_Chinese_Characters_Rec

Training with labels of different lengths

Open Springzcf opened this issue 4 years ago • 18 comments

How should I change the code to support variable-length training? Thanks for any guidance.

Springzcf avatar Jul 30 '20 11:07 Springzcf

Modify the dataloader so that it pads the images to a fixed length.

Sierkinhane avatar Aug 02 '20 14:08 Sierkinhane

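Modify the dataloader so that it pads the images to a fixed length.

Where exactly should this change go? I couldn't find it. Could you be more specific? Thanks!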

Lie-huo avatar Aug 04 '20 07:08 Lie-huo

https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec/tree/stable/lib/dataset, in the __getitem__ method of _360cc.py

Sierkinhane avatar Aug 04 '20 07:08 Sierkinhane

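The original poster is talking about training on images of different lengths. So how do you train with labels of different lengths?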

zhunge avatar Aug 06 '20 06:08 zhunge

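@zhunge I haven't tried this myself, but images of different lengths will likely have labels of different lengths too. Both the images and the labels need to be padded before they can be concatenated into a batch.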

Sierkinhane avatar Aug 06 '20 13:08 Sierkinhane
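A minimal sketch of the padding idea in the reply above. It is not code from this thread or from the repository: `pad_collate`, `pad_value`, and `blank` are hypothetical names, and it assumes every image has already been resized to the same height and normalized. It pads each image on the right to the widest image in the batch, pads each label to the longest label, and returns the true label lengths so a CTC-style loss can ignore the padding.

```python
import numpy as np


def pad_collate(samples, pad_value=0.0, blank=0):
    """Pad variable-width images and variable-length labels into one batch.

    samples: list of (image, label) pairs. Each image is a float array of
    shape (H, W) with the same H for every sample; each label is a list of
    integer class indices. (Hypothetical helper, for illustration only.)
    """
    max_w = max(img.shape[1] for img, _ in samples)
    max_len = max(len(lab) for _, lab in samples)
    height = samples[0][0].shape[0]

    # Pre-fill with the padding values, then copy each sample into place.
    images = np.full((len(samples), 1, height, max_w), pad_value, dtype=np.float32)
    labels = np.full((len(samples), max_len), blank, dtype=np.int64)
    label_lengths = np.zeros(len(samples), dtype=np.int64)

    for i, (img, lab) in enumerate(samples):
        images[i, 0, :, :img.shape[1]] = img   # pad on the right
        labels[i, :len(lab)] = lab             # pad at the end
        label_lengths[i] = len(lab)            # true length, without padding
    return images, labels, label_lengths


batch = [
    (np.ones((32, 100), dtype=np.float32), [5, 2, 9]),
    (np.ones((32, 280), dtype=np.float32), [7]),
]
images, labels, lengths = pad_collate(batch)
print(images.shape)      # (2, 1, 32, 280)
print(labels.tolist())   # [[5, 2, 9], [7, 0, 0]]
print(lengths.tolist())  # [3, 1]
```

With PyTorch, the same logic could be wrapped in a function passed as `collate_fn` to `torch.utils.data.DataLoader`, converting the arrays to tensors at the end.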

If anyone gets this working, feel free to post the code :)

Sierkinhane avatar Aug 06 '20 13:08 Sierkinhane

If anyone gets this working, feel free to post the code :)

        # Requires: import cv2; import numpy as np; from PIL import Image
        # cv2.COLOR_RGB2BGR expects a 3-channel array, so `image` must be an RGB PIL image here.
        w, h = image.size  # PIL returns (width, height)
        if w / h < 280 / 32:
            # Narrower than 280:32: scale to height 32, then pad the right edge with black up to width 280.
            image_temp = image.resize((max(1, int(32 / h * w)), 32), Image.BILINEAR)
            w_temp, h_temp = image_temp.size
            image_temp = cv2.cvtColor(np.asarray(image_temp), cv2.COLOR_RGB2BGR)
            image_temp = cv2.copyMakeBorder(
                image_temp, 0, 0, 0, 280 - w_temp, cv2.BORDER_CONSTANT, value=0)
            res_image = Image.fromarray(cv2.cvtColor(image_temp, cv2.COLOR_BGR2RGB))
            res_image = res_image.resize(self.size, Image.BILINEAR)
            res_image = res_image.convert('L')
            res_image = self.toTensor(res_image)
            res_image.sub_(0.5).div_(0.5)  # normalize to [-1, 1]
            return res_image
        if w / h > 280 / 32:
            # Wider than 280:32: scale to width 280, then pad the bottom edge with black up to height 32.
            # max(1, ...) keeps the resized height from rounding down to 0 for extremely wide images.
            image_temp = image.resize((280, max(1, int(280 / w * h))), Image.BILINEAR)
            w_temp, h_temp = image_temp.size
            image_temp = cv2.cvtColor(np.asarray(image_temp), cv2.COLOR_RGB2BGR)
            image_temp = cv2.copyMakeBorder(
                image_temp, 0, 32 - h_temp, 0, 0, cv2.BORDER_CONSTANT, value=0)
            res_image = Image.fromarray(cv2.cvtColor(image_temp, cv2.COLOR_BGR2RGB))
            res_image = res_image.resize(self.size, Image.BILINEAR)
            res_image = res_image.convert('L')
            res_image = self.toTensor(res_image)
            res_image.sub_(0.5).div_(0.5)  # normalize to [-1, 1]
            return res_image
        if w / h == 280 / 32:
            # Already 280:32: no padding needed, only resize and normalize.
            image = image.resize(self.size, Image.BILINEAR)
            image = image.convert('L')
            image = self.toTensor(image)
            image.sub_(0.5).div_(0.5)  # normalize to [-1, 1]
            return image

zcswdt avatar Aug 14 '20 03:08 zcswdt

@zcswdt good job

Sierkinhane avatar Aug 14 '20 05:08 Sierkinhane

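The original poster is talking about training on images of different lengths. So how do you train with labels of different lengths?

Does anyone have a method that supports training with labels of different lengths?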

Ryansanity avatar Oct 21 '20 10:10 Ryansanity
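One common way to handle labels of different lengths with a CTC loss is to skip label padding entirely: `torch.nn.CTCLoss` accepts the targets either as a padded (N, S) tensor or as a single 1-D tensor of all labels concatenated, together with `target_lengths`. The sketch below shows only the encoding step in plain Python. `encode_labels` and the toy `alphabet` are hypothetical and are not taken from this repository; index 0 is assumed to be reserved for the CTC blank.

```python
def encode_labels(texts, alphabet):
    """Map each string to class indices and flatten them for a CTC-style loss.

    Index 0 is reserved for the CTC blank, so characters start at 1.
    (Hypothetical helper, for illustration only.)
    """
    char_to_idx = {ch: i + 1 for i, ch in enumerate(alphabet)}
    targets = []          # all label indices, concatenated
    target_lengths = []   # length of each individual label
    for text in texts:
        indices = [char_to_idx[ch] for ch in text]
        targets.extend(indices)
        target_lengths.append(len(indices))
    return targets, target_lengths


alphabet = "abc中文"
targets, target_lengths = encode_labels(["ab", "中文c", "a"], alphabet)
print(targets)         # [1, 2, 4, 5, 3, 1]
print(target_lengths)  # [2, 3, 1]
```

Converted to integer tensors, `targets` and `target_lengths` could then be passed to the loss alongside the network output and its `input_lengths`, so labels never have to share one length.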

Hello, does this code switch the image loading over to PIL.Image? And is this a change to the __getitem__ function in _own.py?

chichuhu avatar Nov 09 '20 09:11 chichuhu

@zcswdt good job

Where should this code go?

chichuhu avatar Nov 09 '20 09:11 chichuhu

Hello, does this code switch the image loading over to PIL.Image? And is this a change to the __getitem__ function in _own.py?

def __getitem__(self, idx):

zcswdt avatar Dec 03 '20 03:12 zcswdt

@zcswdt good job

Where should this code go?

[image]

zcswdt avatar Dec 03 '20 03:12 zcswdt

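@zcswdt Sorry to bother you, but when I run your code on Python 3.6 it doesn't seem quite right. On line 38, h, w can't take image.shape directly, can it? And on line 42, Image.size should return an hwc value, so how can it be unpacked into w_temp and h_temp? Am I misunderstanding something?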

19ethan avatar Apr 19 '21 06:04 19ethan

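@zcswdt Sorry to bother you, but when I run your code on Python 3.6 it doesn't seem quite right. On line 38, h, w can't take image.shape directly, can it? And on line 42, Image.size should return an h_w_c value, so how can it be unpacked into w_temp and h_temp? Am I misunderstanding something?

The image read in here should be a grayscale image.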

zcswdt avatar Apr 19 '21 08:04 zcswdt

@zcswdt good job

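Where should this code go?

[image]

Hello, this code is different from the earlier one. What is the difference? And how can I use your method to train on a dataset like ICDAR2015, where both the images and the labels vary in length? I get an error when I put it in.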

lrfighting avatar May 19 '21 10:05 lrfighting

@zcswdt good job

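Where should this code go?

[image]

Hello, what are your images like? Are they all the same length, or do they vary in size and contain different characters?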

lrfighting avatar May 19 '21 12:05 lrfighting

It doesn't need to be that complicated. Just set the batch size to 1; the only downside is that training is slow.

MarStarck avatar Sep 01 '21 09:09 MarStarck