Young Jin Kim comments

Results: 17 comments by Young Jin Kim

Add RTS and token masking to top-2 gating + configurable jitter epsilon

Hi @awan-10. I am trying to add some missing features to top-2 gating which are currently only available for top-1 gating. Please take a look and let me know what...

Consider oneDNN instead of MKL for SGEMM

> Had an internal discussion with @vpirogov from the oneDNN team, unfortunately the support for FAISS MKL dependencies cannot be addressed in oneDNN as it is outside the Deep Learning...

Intgemm disabled when GENERATE_MARIAN_INSTALL_TARGETS=TRUE

@XapaJIaMnu When it's compiled with install target, there was a cmake error. Do you know how to fix this error? `CMake Error: install(EXPORT "marian-targets" ...) includes target "marian" which requires...

Choose code path for use of CPU features at run time instead of at compile time

This works well: https://github.com/pytorch/cpuinfo/blob/d5e37adf1406cf899d7d9ec1d317c47506ccb970/src/x86/isa.c#L391 and https://github.com/pytorch/cpuinfo/blob/d5e37adf1406cf899d7d9ec1d317c47506ccb970/include/cpuinfo.h#L1001. It is also used in FBGEMM: https://github.com/marian-nmt/FBGEMM/blob/84e66a976046180187724aff60a236c5378fde7c/src/Utils.cc#L201
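For readers unfamiliar with the idea in this thread, here is a minimal sketch of choosing a code path at run time rather than at compile time. It does not use the cpuinfo library linked above; as a stand-in it assumes the GCC/Clang x86 builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, and falls back to a generic path on other compilers or architectures. The kernel names (`kernel_generic`, `kernel_avx2`, `kernel_avx512`) and the helpers `select_kernel` and `detect_kernel` are hypothetical and exist only for illustration; they are not Marian or FBGEMM functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical kernels: each stands in for an implementation compiled
// for a specific instruction set. Here they only report their name.
static std::string kernel_generic() { return "generic"; }
static std::string kernel_avx2()    { return "avx2"; }
static std::string kernel_avx512()  { return "avx512"; }

using KernelFn = std::string (*)();

// Pick the widest kernel allowed by the given feature flags.
// Kept separate from detection so the choice logic can be tested anywhere.
KernelFn select_kernel(bool has_avx2, bool has_avx512f) {
    if (has_avx512f) return kernel_avx512;
    if (has_avx2)    return kernel_avx2;
    return kernel_generic;
}

// Query the CPU at run time. Assumption: GCC or Clang on x86;
// every other platform gets the generic path.
KernelFn detect_kernel() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return select_kernel(__builtin_cpu_supports("avx2") != 0,
                         __builtin_cpu_supports("avx512f") != 0);
#else
    return select_kernel(false, false);
#endif
}

// Resolve the kernel once and reuse the function pointer afterwards.
std::string run_best_kernel() {
    static const KernelFn fn = detect_kernel();
    assert(fn != nullptr);
    return fn();
}
```

Resolving the function pointer once keeps the feature check out of the hot path. A real build would also need each kernel compiled with its own target flags, which this sketch leaves out.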

Choose code path for use of CPU features at run time instead of at compile time

@kpu As long as it works, I don't have any preference. One thing I'd like to check: is this available on gcc only? Then, do we want to have...

Use bias epilogue in GPU affine operation if CUDA >= 10.1

Regarding the CPU, I recall the current implementation was efficient. I have run several different options to add bias, but this one was fastest for the student models on single...

Use bias epilogue in GPU affine operation if CUDA >= 10.1

@emjotde yes, that makes sense. fbgemm also has a bias epilogue, but it didn't help.

Marian won't compile on KNL processors

Just out of curiosity, how's marian's performance on KNL? Is there any benchmark for it?

Marian won't compile on KNL processors

@XapaJIaMnu Yes, that's what I thought. I was curious why someone's trying to use marian on KNL. I guess even compilation takes extremely long.

Marian won't compile on KNL processors

@kpu Ah, that's an interesting story. It might have been really hard to get performance out of them for the transformer/RNN architectures. It might have some performance benefits for CNNs....