Kenneth Heafield comments

Results: 290 comments of Kenneth Heafield

Choose code path for use of CPU features at run time instead of at compile time

There's getting CPU features, which is relatively easy: ``` template <class T> T ChooseCPU(T avx512vnni, T avx512, T avx2, T ssse3, T sse2, T unsupported) { // TODO: don't catch Knights processors...
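
To make the idea of picking a code path at run time concrete, here is a minimal sketch, not the code from the comment above. It assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available. The `CpuFeatures` struct and the `DetectCpuFeatures` and `ChooseFor` helpers are hypothetical names introduced only for illustration, and treating "avx512" as AVX512BW is likewise an assumption. It does nothing about the Knights processors mentioned in the TODO.

```cpp
// Hypothetical feature flags; the names are illustrative only.
struct CpuFeatures {
  bool avx512vnni = false;
  bool avx512 = false;
  bool avx2 = false;
  bool ssse3 = false;
  bool sse2 = false;
};

// Query the CPU the program is running on. Uses the GCC/Clang builtin on x86;
// on other compilers or architectures every flag stays false, so callers fall
// through to the `unsupported` choice.
inline CpuFeatures DetectCpuFeatures() {
  CpuFeatures f;
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  f.avx512vnni = __builtin_cpu_supports("avx512vnni") != 0;
  f.avx512 = __builtin_cpu_supports("avx512bw") != 0;  // assumption: "avx512" means AVX512BW
  f.avx2 = __builtin_cpu_supports("avx2") != 0;
  f.ssse3 = __builtin_cpu_supports("ssse3") != 0;
  f.sse2 = __builtin_cpu_supports("sse2") != 0;
#endif
  return f;
}

// Pure selection logic: return the argument for the best supported feature,
// checking from the newest instruction set down to the oldest.
template <class T>
T ChooseFor(const CpuFeatures &f, T avx512vnni, T avx512, T avx2, T ssse3, T sse2, T unsupported) {
  if (f.avx512vnni) return avx512vnni;
  if (f.avx512) return avx512;
  if (f.avx2) return avx2;
  if (f.ssse3) return ssse3;
  if (f.sse2) return sse2;
  return unsupported;
}

// Same argument order as the signature quoted in the comment, with detection
// done at the time of the call.
template <class T>
T ChooseCPU(T avx512vnni, T avx512, T avx2, T ssse3, T sse2, T unsupported) {
  return ChooseFor(DetectCpuFeatures(), avx512vnni, avx512, avx2, ssse3, sse2, unsupported);
}
```

A caller would typically pass one function pointer per kernel variant and store the result once at startup, so the check is not repeated on every call. Keeping `ChooseFor` separate from detection is a design choice made here so the selection order can be tested without depending on the host CPU.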

Use bias epilogue in GPU affine operation if CUDA >= 10.1

Regarding MKL, paging @sidkashyap-at-Intel

What's the effect of decoder --mini-batch size?

> So, the takeaway here is, we should make `--maxi-batch-sort src` the default for translation. It is `trg` by default for training. Should we go further and make batched translation the default,...

Implement Constrained Beam Search (Disjunctive Positive Constraint Decoding)

Aren't beam-search-based approaches deprecated in favor of model-based approaches? See https://aclanthology.org/P19-1294/ Here's Marian's implementation of the above paper: https://github.com/marian-nmt/marian-examples/tree/master/forced-translation Regarding "disjunctive" constraints, it would seem the natural...

Fast implementation of Select for most cases on CPU

I know, will fix it this week.

Marian won't compile on KNL processors

Ironically KNL was the original CPU port of Marian. Then we lost interest in it but there are still 2 KNLs and 8 KNMs floating around. How is the updated...

problem with workspace > 26000

IIRC GPUs don't have a native 64-bit int type which is why you would see a penalty.

Quirky --max-length in marian-decoder

"(and fail if out of memory)"

Quirky --max-length in marian-decoder

The dominant use case is translating things, where you want one line in and one line out. You also want that for backtranslation. https://github.com/kpu/preprocess/blob/master/preprocess/remove_long_lines_main.cc

Quirky --max-length in marian-decoder

Marian should preserve the principle of one line in, one line out. If you're backtranslating web data, the long line should have been removed from the input; there's nothing consistent...