Kenneth Heafield
Fixed gcc 4 support with static (it can't do avx512 but it will compile everything else). I was even able to compile intgemm on machines maintained by our computing history...
It's currently hardcoded to use 8-bit on everything (autotuning still kicks in, but it shouldn't do much): https://github.com/marian-nmt/marian-dev/blob/intgemm/src/graph/expression_operators.cpp#L269 and https://github.com/marian-nmt/marian-dev/blob/intgemm/src/graph/expression_operators.cpp#L309 . I'm writing code to optimize the minmax.
Now with vectorized max absolute value for various instruction sets. I also fixed some 7-bit testing that I had accidentally checked in, which was damaging BLEU. Should be much...
Column selection in PrepareB quantized format is tested with these functions:
```
static void Int16::SelectColumnsB(const int16_t *input, int16_t *output, int rows, const int *cols_begin, const int *cols_end);
static void Int8::SelectColumnsB(const...
```
Only 8-bit does max absolute value to pick a quantization multiplier, which is never going to be fast. 16-bit just uses 1024. Standard practice is to do it once on...
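To make the difference concrete, here is a minimal scalar sketch of the two ways of picking a multiplier. It is not intgemm's code: `MaxAbsolute`, `QuantMult8`, `QuantMult16` and `Quantize8` are hypothetical names, and mapping the largest magnitude onto 127 is an assumed convention for the 8-bit case. The real routines are vectorized per instruction set.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference for the largest |x| in [begin, end).
// Returns 0 for an empty range. (Hypothetical helper, not intgemm's API.)
float MaxAbsolute(const float *begin, const float *end) {
  float result = 0.0f;
  for (const float *p = begin; p != end; ++p) {
    result = std::max(result, std::fabs(*p));
  }
  return result;
}

// Assumed 8-bit convention: scale so the largest magnitude lands on 127.
// Falls back to 1 when the input is all zeros to avoid dividing by zero.
float QuantMult8(float max_abs) {
  return max_abs > 0.0f ? 127.0f / max_abs : 1.0f;
}

// 16-bit skips the scan entirely and uses a fixed multiplier.
float QuantMult16() {
  return 1024.0f;
}

// Multiply, round to nearest, and saturate to [-127, 127].
std::vector<std::int8_t> Quantize8(const std::vector<float> &input, float quant_mult) {
  assert(quant_mult > 0.0f);
  std::vector<std::int8_t> out(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    float scaled = std::nearbyint(input[i] * quant_mult);
    scaled = std::min(127.0f, std::max(-127.0f, scaled));
    out[i] = static_cast<std::int8_t>(scaled);
  }
  return out;
}
```

The 8-bit path needs a full pass over the data before it can quantize anything, which is the cost being complained about; the fixed 16-bit multiplier needs no such pass.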
Ok, I have the post-quantization column selection working. Turns out I didn't think to define hash() and equals() for the column selection operator. It's probably cleaner to put this behind...
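For anyone following along, a rough sketch of what column selection after quantization means, written against a plain row-major matrix. This is an assumption-laden illustration: `SelectColumnsRowMajor` is a hypothetical name, and the real PrepareB output uses a rearranged layout, so this is not how `SelectColumnsB` indexes memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: b is a rows x cols row-major matrix that has
// already been quantized to 8-bit. Returns a rows x selected.size() matrix
// whose j-th column is column selected[j] of b.
std::vector<std::int8_t> SelectColumnsRowMajor(const std::vector<std::int8_t> &b,
                                               std::size_t rows,
                                               std::size_t cols,
                                               const std::vector<std::size_t> &selected) {
  assert(b.size() == rows * cols);
  std::vector<std::int8_t> out(rows * selected.size());
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t j = 0; j < selected.size(); ++j) {
      assert(selected[j] < cols);  // every requested column must exist
      out[r * selected.size() + j] = b[r * cols + selected[j]];
    }
  }
  return out;
}
```

Selecting after quantization means the matrix is quantized once and only the needed columns are copied, rather than re-quantizing a fresh slice each time.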
By the way, if you want the really noisy stuff before cleaning: https://s3.amazonaws.com/web-language-models/paracrawl/bonus/en-ha.classified.gz https://s3.amazonaws.com/web-language-models/paracrawl/bonus/en-ig.classified.gz . Skimming the top sites: gospelgo.com is just Bible quotes; islamhouse.com is religious but...
License is the usual one on paracrawl.eu. So much of this is machine translated though. Most likely [they are watermarked by Google](https://www.aclweb.org/anthology/D11-1126/), but Google has not to my knowledge documented...
This is a problem. We're not consistent between training and test. We're also giving Mozilla the impression that this file doesn't exist when it needs to, which will bite...
Cleanest solution is probably to ship the file with the MT models. Or (and this is crazy) stuff it in the yaml somehow.