
Add autotune option

Open alanault opened this issue 6 years ago • 11 comments

Hi there,

I saw on the fastText page (linked below) that they've added an autotune feature, which automatically optimizes the various hyperparameters.

It seems it can be activated with the -autotune-validation option, which isn't currently supported in fastrtext. I wondered if this could be added along with the updates for CRAN?

https://fasttext.cc/docs/en/autotune.html

best

Alan

alanault avatar Sep 19 '19 10:09 alanault

Hi,

I will work on it, since I need to fix something else to get back on CRAN anyway. I hope to have some time for it next week.

pommedeterresautee avatar Sep 20 '19 12:09 pommedeterresautee

Sounds great - intrigued to see the results of this output

alanault avatar Sep 24 '19 12:09 alanault

Hi there, just wondered how this was coming on?

I'm not familiar with Rcpp, but happy to help on any R-based areas if that's useful?

alanault avatar Oct 21 '19 08:10 alanault

Hello, I have updated the C++ code. It should work using the command line. Do you think it would make sense to have an R API for this?

pommedeterresautee avatar Oct 28 '19 08:10 pommedeterresautee

Hi: that's fantastic!

Yes, I was thinking it could be exactly the same as the other calls, with -autotune-validation passed as an option in the execute command, along with the validation file.

So, to use the supervised example: execute(commands = c("supervised", "-input", train_tmp_file_txt, "-output", tmp_file_model, "-autotune-validation", valid_tmp_file_text))

That way everything is consistent within the execute command?
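To make the proposal concrete, here is a minimal sketch of how the argument vector could be assembled in plain R before being handed to execute. The helper name build_autotune_commands and the file names are hypothetical and not part of fastrtext. The -autotune-duration flag comes from the fastText autotune documentation; whether it passes through fastrtext correctly is an assumption that has not been tested here.

```r
# Hypothetical helper (not part of fastrtext): assembles the character
# vector that would be passed to execute(commands = ...).
build_autotune_commands <- function(train_file, model_prefix, valid_file,
                                    extra = character()) {
  # Each path must be a single string
  stopifnot(is.character(train_file), length(train_file) == 1L,
            is.character(model_prefix), length(model_prefix) == 1L,
            is.character(valid_file), length(valid_file) == 1L,
            is.character(extra))
  c("supervised",
    "-input", train_file,
    "-output", model_prefix,
    "-autotune-validation", valid_file,
    extra)
}

# Placeholder file names, for illustration only
cmds <- build_autotune_commands("train.txt", "model", "valid.txt",
                                extra = c("-autotune-duration", "60"))
print(cmds)

# With fastrtext installed, the vector would be passed on unchanged:
# fastrtext::execute(commands = cmds)
```

Keeping the flags in a single character vector mirrors the existing execute calls, so nothing else in the R API would need to change.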

alanault avatar Oct 28 '19 08:10 alanault

I think so. I haven't tried it yet. If you check, can you let me know if it works?

pommedeterresautee avatar Oct 28 '19 11:10 pommedeterresautee

I installed 0.3.4, and the call itself seems to work just fine. However, each time I try it I get a crash with a "C stack usage is too close to the limit" error.

I'm only training on 100k short sentences and it crashes within seconds (at 0.8% completion), so I wonder if something else is going on?
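For anyone trying to reproduce this, a small diagnostic sketch: base R's Cstack_info() reports the C stack size and current usage that R compares against when it raises this error. This only inspects the limit from the R side; it is not a fix, and it assumes nothing about where the wrapper actually overflows.

```r
# Inspect R's view of the C stack (base R, no packages needed)
info <- Cstack_info()
print(info)

# Fraction of the stack currently in use; size can be NA on some platforms
if (!is.na(info[["size"]])) {
  usage <- info[["current"]] / info[["size"]]
  cat(sprintf("C stack usage: %.2f%% of limit\n", 100 * usage))
} else {
  cat("C stack size is not available on this platform\n")
}
```

Running this just before the execute call shows how much headroom the session has when training starts.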

alanault avatar Oct 28 '19 15:10 alanault

Does it crash in other situations?

pommedeterresautee avatar Oct 28 '19 17:10 pommedeterresautee

No - if I remove the -autotune-validation argument and make the same training call, it runs just fine on the same data, so it must be related to the validation...

alanault avatar Oct 28 '19 18:10 alanault

I did a test using the command-line version and it worked fine, so the issue must be somewhere inside the Rcpp wrapper.

alanault avatar Oct 30 '19 08:10 alanault

Any luck with this?

alanault avatar Nov 19 '19 14:11 alanault