
Can MGP-STR deal with Chinese text? Can I train MGP-STR with Huggingface version?

Open BrianPYChen opened this issue 1 year ago • 1 comment

Hi AlibabaResearch,

I have a few questions, listed below:

  1. Can MGP-STR handle Chinese text, using either the GitHub code or the Huggingface version?
  2. Can I train MGP-STR with the Huggingface version?

Thanks.

BrianPYChen avatar Mar 06 '24 02:03 BrianPYChen

Hi. Currently, MGP-STR cannot process Chinese text: the model has not been trained on Chinese data, and we have not found an effective method for segmenting Chinese words. If you have found one, we would welcome an exchange of ideas.

The version on Huggingface is only capable of inference; for training purposes, you may refer to the instructions provided on GitHub.
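
For reference, inference with the Huggingface version might look roughly like the sketch below. It assumes the `transformers` classes `MgpstrProcessor` and `MgpstrForSceneTextRecognition` and the `alibaba-damo/mgp-str-base` checkpoint; the helper names `load_mgp_str` and `recognize` and the image path are hypothetical, chosen only for illustration.

```python
MODEL_ID = "alibaba-damo/mgp-str-base"  # assumed checkpoint name on the Hub


def load_mgp_str(model_id=MODEL_ID):
    """Load the processor and model (downloads weights on first use)."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import MgpstrForSceneTextRecognition, MgpstrProcessor

    processor = MgpstrProcessor.from_pretrained(model_id)
    model = MgpstrForSceneTextRecognition.from_pretrained(model_id)
    model.eval()  # inference only, as noted above
    return processor, model


def recognize(image, processor, model):
    """Return the recognized text for a single cropped word image."""
    # The processor resizes and normalizes the image into a tensor batch.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    outputs = model(pixel_values)
    # outputs.logits holds the character, BPE, and WordPiece predictions;
    # batch_decode fuses them into one string per image.
    return processor.batch_decode(outputs.logits)["generated_text"][0]


# Example usage (hypothetical image path):
#   from PIL import Image
#   processor, model = load_mgp_str()
#   image = Image.open("word_crop.png").convert("RGB")
#   print(recognize(image, processor, model))
```

Since the released checkpoint was trained on English data, this should only be expected to work on Latin-script word crops.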

wdp-007 avatar Mar 12 '24 08:03 wdp-007