CRNN_Chinese_Characters_Rec
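Training with variable-length labels
How can I change the code to support variable-length training? Thanks in advance for your guidance.
Modify the dataloader so that every image is padded to a fixed length (width).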
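Where exactly should I make this change? I couldn't find it. Could you be more specific? Thanks!
https://github.com/Sierkinhane/CRNN_Chinese_Characters_Rec/tree/stable/lib/dataset in the __getitem__ method of _360cc.py
The original post is about training on images of different lengths. How do you train with labels of different lengths?
@zhunge I haven't tried this myself, but images of different lengths are likely to have labels of different lengths. Both the images and the labels need to be padded before they can be concatenated into a batch.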
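The label half of that padding can be sketched as follows. This is only an illustration: `pad_labels` is a hypothetical helper that is not part of the repository, labels are assumed to be already encoded as lists of integer class indices, and the pad value of 0 is an assumption.

```python
from typing import List, Sequence, Tuple


def pad_labels(
    labels: Sequence[Sequence[int]], pad_value: int = 0
) -> Tuple[List[List[int]], List[int]]:
    """Right-pad every label to the length of the longest label in the batch.

    Returns the padded labels and the true length of each label, which the
    loss needs so that it can ignore the padding.
    """
    lengths = [len(label) for label in labels]
    max_len = max(lengths, default=0)
    padded = [list(label) + [pad_value] * (max_len - len(label)) for label in labels]
    return padded, lengths


# Three labels with 3, 1 and 4 characters (hypothetical class indices).
batch = [[5, 2, 9], [7], [3, 3, 8, 1]]
padded, lengths = pad_labels(batch)
print(padded)   # [[5, 2, 9, 0], [7, 0, 0, 0], [3, 3, 8, 1]]
print(lengths)  # [3, 1, 4]
```

In PyTorch, `torch.nn.CTCLoss` accepts padded targets of shape (N, S) together with `target_lengths`, so the two return values could be wrapped in tensors and passed to it. Whether the training loop in this repository expects that form is not confirmed here.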
If anyone gets this working, please post your code :)
import cv2
import numpy as np
from PIL import Image

# Fragment for the dataset's __getitem__. `image` must be an RGB PIL.Image
# (the RGB2BGR conversion needs three channels), and self.size is the
# (width, height) passed to the final resize.
w, h = image.size  # PIL returns (width, height)
if w / h < 280 / 32:
    # Narrower than 280:32. Scale the height to 32, then pad the right
    # edge with black up to a width of 280.
    image_temp = image.resize((int(32 / h * w), 32), Image.BILINEAR)
    w_temp, h_temp = image_temp.size
    image_temp = cv2.cvtColor(np.asarray(image_temp), cv2.COLOR_RGB2BGR)
    image_temp = cv2.copyMakeBorder(
        image_temp, 0, 0, 0, 280 - w_temp, cv2.BORDER_CONSTANT, value=0)
    res_image = Image.fromarray(cv2.cvtColor(image_temp, cv2.COLOR_BGR2RGB))
elif w / h > 280 / 32:
    # Wider than 280:32. Scale the width to 280, then pad the bottom
    # edge with black up to a height of 32.
    image_temp = image.resize((280, int(280 / w * h)), Image.BILINEAR)
    w_temp, h_temp = image_temp.size
    image_temp = cv2.cvtColor(np.asarray(image_temp), cv2.COLOR_RGB2BGR)
    image_temp = cv2.copyMakeBorder(
        image_temp, 0, 32 - h_temp, 0, 0, cv2.BORDER_CONSTANT, value=0)
    res_image = Image.fromarray(cv2.cvtColor(image_temp, cv2.COLOR_BGR2RGB))
else:
    # Aspect ratio is already 280:32, so no padding is needed.
    res_image = image

# Common tail: resize to the network input size, convert to grayscale,
# turn into a tensor and normalize from [0, 1] to [-1, 1].
res_image = res_image.resize(self.size, Image.BILINEAR)
res_image = res_image.convert('L')
res_image = self.toTensor(res_image)
res_image.sub_(0.5).div_(0.5)
return res_image
@zcswdt good job
Do you have a way to support training with labels of different lengths?
Hello, does this code switch the image loading to PIL.Image? And is this a change to the __getitem__ function in _own.py?
Where should this code go?
def __getitem__(self, idx):
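@zcswdt Sorry to bother you, but your code doesn't seem to work for me on Python 3.6. On line 38, h, w shouldn't be able to take image.shape directly, should they? And on line 42, Image.size should return an h, w, c value, so how can it be unpacked into w_temp and h_temp? Or am I misunderstanding something?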
The image read in here should be a grayscale image.
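The unpacking question comes down to how many values the shape holds. A NumPy image array (for example, what `cv2.imread` returns) has shape (height, width) when grayscale and (height, width, channels) when colour, while a PIL image's `size` is always a (width, height) pair. A minimal sketch with synthetic arrays standing in for loaded images:

```python
import numpy as np

gray = np.zeros((32, 280), dtype=np.uint8)      # grayscale: (height, width)
color = np.zeros((32, 280, 3), dtype=np.uint8)  # colour: (height, width, channels)

h, w = gray.shape  # works: the shape holds exactly two values

try:
    h, w = color.shape  # fails: three values cannot unpack into two names
    unpack_ok = True
except ValueError:
    unpack_ok = False

# Taking only the first two entries works for both layouts.
h_gray, w_gray = gray.shape[:2]
h_color, w_color = color.shape[:2]
print(unpack_ok, (h_gray, w_gray), (h_color, w_color))
# False (32, 280) (32, 280)
```

So `h, w = image.shape` is only safe when the image was read as grayscale; `image.shape[:2]` avoids depending on that.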
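Hello, this code is different from the earlier version. What is the difference? How can I use your method to train on a dataset like ICDAR2015, where both the images and the labels vary in length? I get an error when I put it in.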
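Hello, what are your images like? Are they all the same length or do they vary in size, and do the characters in them differ too?
It doesn't need to be that complicated. Just set the batch size to 1; the only drawback is that it's slow.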
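A possible middle ground between padding everything to a fixed width and using a batch size of 1 is to pad each batch only up to its widest image. The sketch below is an assumption-laden illustration: `pad_batch_to_max_width` is a hypothetical helper, not part of the repository, and the images are assumed to be grayscale NumPy arrays already resized to a common height.

```python
import numpy as np


def pad_batch_to_max_width(images, pad_value=0):
    """Stack (H, W) grayscale arrays into one (N, H, max_W) array.

    Each image is right-padded with pad_value up to the widest image in
    the batch. The original widths are returned as well.
    """
    heights = {img.shape[0] for img in images}
    if len(heights) != 1:
        raise ValueError("all images must share the same height")
    widths = [img.shape[1] for img in images]
    height, max_w = heights.pop(), max(widths)
    batch = np.full((len(images), height, max_w), pad_value, dtype=images[0].dtype)
    for i, img in enumerate(images):
        batch[i, :, : img.shape[1]] = img  # copy the image into the left part
    return batch, widths


# Two synthetic images of height 32 and widths 100 and 280.
images = [np.ones((32, 100), dtype=np.uint8), np.ones((32, 280), dtype=np.uint8)]
batch, widths = pad_batch_to_max_width(images)
print(batch.shape, widths)  # (2, 32, 280) [100, 280]
```

A function like this could be called from a custom `collate_fn` passed to the PyTorch `DataLoader`. Whether the model in this repository handles input widths that change from batch to batch has not been checked here.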