RNNSharp

Total system failure due to double conversion problems for non-"en-US" cultures

Open andy-soft opened this issue 5 years ago • 0 comments

Hi, I trained the sample for the English BIO NER labeler, and the training was OK. But when running the TEST batch, it fails TOTALLY! The labels I got were absolutely NUTS!

So I started to dig inside the algorithms and found a BIG PROBLEM in how the training and model files are written and read: doubles are stored as strings. In the "en-US" culture the decimal point is a ".", while in Spanish cultures it is a ",". So when the files are written and then read back as strings, the values come out inconsistent. There are 2 ways to solve it:

  1. Parse with the invariant culture when reading (modelReader.cs):

             // Read cost_factor
             strLine = sr.ReadLine();
             cost_factor_ = double.Parse(strLine.Split(':')[1].Trim(), CultureInfo.InvariantCulture);
    
  2. Add this as the first thing that runs inside Main() for console apps:

    // Requires: using System.Globalization; using System.Threading;
    static void Main(string[] args)
    {
        // Force en-US number formatting on this thread and on threads created later.
        CultureInfo culture = CultureInfo.CreateSpecificCulture("en-US");
        CultureInfo.DefaultThreadCurrentCulture = culture;
        CultureInfo.DefaultThreadCurrentUICulture = culture;
        Thread.CurrentThread.CurrentCulture = culture;
        Thread.CurrentThread.CurrentUICulture = culture;
        .....
    

And now it works!
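For anyone who wants to reproduce the mismatch outside RNNSharp, here is a minimal sketch. The class `CultureRoundTrip` and its `Write`/`Read` helpers are hypothetical names made up for this illustration and are not part of the RNNSharp code base; "es-ES" is used only as an example of a culture with "," as the decimal separator. It assumes the runtime has culture data available (not running in invariant-globalization mode).

```csharp
using System;
using System.Globalization;

// Hypothetical helper class for illustration only; not part of RNNSharp.
public static class CultureRoundTrip
{
    // Format a double using an explicit culture instead of the thread's current one.
    public static string Write(double value, CultureInfo culture) =>
        value.ToString(culture);

    // Parse a double using an explicit culture instead of the thread's current one.
    public static double Read(string text, CultureInfo culture) =>
        double.Parse(text, culture);

    public static void Main()
    {
        CultureInfo invariant = CultureInfo.InvariantCulture;
        CultureInfo spanish = CultureInfo.GetCultureInfo("es-ES"); // example culture

        // A value stored with "." as the decimal separator, as in an en-US model file.
        string stored = Write(0.5, invariant);

        // Reading it back with the same culture recovers the original value.
        double same = Read(stored, invariant);

        // Reading it back under a "," culture treats "." differently, so the
        // value is typically not 0.5 any more.
        double other = Read(stored, spanish);

        Console.WriteLine("stored text:      " + stored);
        Console.WriteLine("invariant parse:  " + same.ToString(invariant));
        Console.WriteLine("es-ES parse:      " + other.ToString(invariant));
    }
}
```

Passing the culture explicitly on both the write and the read side (option 1) keeps the file format independent of the machine's regional settings, while option 2 only pins the culture for the running process.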

andy-soft, Nov 03 '18 13:11