echochoc comments

Results: 5 comments by echochoc

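A problem with focal loss

> According to the principle of focal loss, it should be written as follows:
> focal_loss = tf.abs(target - alpha) * tf.pow(tf.abs(target - actual), gamma)
>
> # For positive samples this is (1-alpha) * tf.pow(actual, gamma)
> # For negative samples this is (alpha) * tf.pow((1-actual), gamma)
> ...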
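To see what the quoted expression evaluates to, here is a minimal sketch that replaces the TensorFlow ops with plain Python floats. The function name `focal_weight`, the defaults `alpha=0.25` and `gamma=2.0`, and the sample prediction `0.9` are assumptions made for illustration; none of them come from the quoted comment.

```python
def focal_weight(target, actual, alpha=0.25, gamma=2.0):
    """Evaluate abs(target - alpha) * abs(target - actual) ** gamma.

    Plain-float stand-in for the quoted tf.abs / tf.pow expression.
    alpha and gamma defaults are illustrative assumptions.
    """
    return abs(target - alpha) * abs(target - actual) ** gamma


# Same predicted score, evaluated once with target 1 and once with target 0.
actual = 0.9
weight_target_1 = focal_weight(1.0, actual)  # equals (1 - alpha) * (1 - actual) ** gamma
weight_target_0 = focal_weight(0.0, actual)  # equals alpha * actual ** gamma

print("target=1:", weight_target_1)
print("target=0:", weight_target_0)
```

Written out, the expression reduces to `(1 - alpha) * (1 - actual) ** gamma` when `target` is 1 and to `alpha * actual ** gamma` when `target` is 0. Whether that matches the inline comments in the quote depends on how `target` and `actual` are defined in the part of the thread that is truncated here.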

loss does not converge

> Hi authors, I tried DINO with my dataset of 4000000 images of people. But after 30 epochs, the loss function does not decrease anymore. Do you have any idea...

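How should text with complex layouts be annotated for recognition?

> Currently only single-line text recognition is supported; complex multi-line layouts require annotating multiple detection boxes. Are all the images in your scenario like this? If the layout is fixed, you can write a unified processing routine, for example splitting each image into a price part and a unit part, running OCR on each part separately, and concatenating the results at the end.

Thanks for the reply. The real scenario varies; price tags come in many different layouts.

@tink2123 I see that PPOCRv4 uses the SVTR architecture, which applies patch-wise image tokenization to the image. Could that solve this problem to some extent? Also, the price sometimes appears in this form: ![image](https://github.com/PaddlePaddle/PaddleOCR/assets/18675077/6b757227-a76a-4a8b-a5aa-8a2170c142f3) I would like the prediction to be 370.00. Can I simply label it as "370.00"?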
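The split-and-concatenate idea from the quoted reply could look roughly like the sketch below. It is not PaddleOCR code: `recognize_in_parts`, `split_col`, and `fake_recognizer` are hypothetical names, the image is a plain list of pixel rows, and the recognizer is a stand-in where a real single-line OCR call would go. It also assumes a fixed layout in which one vertical cut separates the two parts, which the reply above says is not always the case.

```python
def recognize_in_parts(image, split_col, recognize_line):
    """Cut an image at column `split_col`, recognize each part, join the text.

    image          : list of pixel rows (a stand-in for a real image array)
    split_col      : hypothetical fixed column separating the two parts
    recognize_line : any callable that maps a single-line crop to a string
    """
    left = [row[:split_col] for row in image]
    right = [row[split_col:] for row in image]
    return recognize_line(left) + recognize_line(right)


def fake_recognizer(part):
    # Stand-in for a single-line OCR model: it only reports the crop width.
    return f"[{len(part[0])} cols]"


# A dummy 4x10 "image"; the cut at column 6 is an arbitrary example value.
image = [[0] * 10 for _ in range(4)]
result = recognize_in_parts(image, 6, fake_recognizer)
print(result)
```

With a real recognizer passed as `recognize_line`, the two returned strings would be the recognized price and unit, joined in left-to-right order.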

Possible to continue pretraining?

In my case, the training dataset will grow periodically over time, continuously adding images from new categories. I would like to train the model in a lifelong-learning setting. Any ideas?