Junseong Kim
## Describe the bug I tried to run inference on a GPT-2 model with the code below. The code uses the DeepSpeed inference optimization. When I repeated model inference continuously, `floating point exception(core dump)`...
**Describe the bug** https://github.com/microsoft/DeepSpeed/pull/1705 added a line that overwrites the input_mask (attention_mask) in DeepSpeedSelfAttentionFunction with a dummy attention mask. Because of this code, the `attention_mask` input is ignored in the forward pass of all transformer models....
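To illustrate why an ignored `attention_mask` matters, here is a minimal pure-Python sketch. It is not DeepSpeed's code: `attention_weights` is a hypothetical helper, the scores are made up, and the all-ones `dummy_mask` is an assumption about how an overwritten mask behaves.

```python
import math

def attention_weights(scores, mask):
    """Softmax over scores; positions where mask is 0 get zero weight.

    Assumes at least one position is unmasked.
    """
    masked = [s if m else float("-inf") for s, m in zip(scores, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.5, 0.5]   # made-up raw attention scores for one query
attention_mask = [1, 1, 0, 0]   # the last two tokens are padding
dummy_mask = [1, 1, 1, 1]       # assumption: a dummy mask masks nothing

respected = attention_weights(scores, attention_mask)
ignored = attention_weights(scores, dummy_mask)

print("mask respected:", respected)  # padding positions receive weight 0.0
print("mask ignored:  ", ignored)    # padding positions receive nonzero weight
```

With the real mask, the padded positions contribute nothing; with the dummy mask, they take attention weight away from the real tokens, so padded batches can produce different outputs.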
## What changed? - This PR fixes #1925 - Remove `layer_past` storing in transformer_inference ## Why? - `layer_past` should not be stored in the model. It should be given as input. -...
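The reasoning about `layer_past` can be sketched with a toy example. This is not the transformer_inference code: `StatefulLayer`, `stateless_forward`, and the list-of-ints `Cache` are hypothetical stand-ins for a layer and its cached key/value tensors.

```python
from typing import List, Optional, Tuple

Cache = List[int]  # stand-in for cached key/value tensors


class StatefulLayer:
    """Hypothetical layer that keeps the cache on the module itself."""

    def __init__(self) -> None:
        self.layer_past: Cache = []

    def forward(self, token: int) -> int:
        self.layer_past = self.layer_past + [token]
        return len(self.layer_past)  # number of positions attended to


def stateless_forward(
    token: int, layer_past: Optional[Cache] = None
) -> Tuple[int, Cache]:
    """Hypothetical layer step where the caller owns and passes the cache."""
    present = (layer_past or []) + [token]
    return len(present), present


# Stored cache: a new, unrelated request still sees the previous tokens.
layer = StatefulLayer()
layer.forward(1)
layer.forward(2)
leaked = layer.forward(7)

# Cache passed as input: each request controls its own history.
_, past = stateless_forward(1)
_, past = stateless_forward(2, past)
fresh, _ = stateless_forward(7)  # new request, no cache passed

print(leaked, fresh)
```

In the stored variant, the third call attends to three positions even though it belongs to a new request; in the input-driven variant, the same call attends to one.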
As you all know, it's nearly impossible to train from scratch because of the lack of computation power. So I'm going to implement the transfer code for making...
Is it possible to achieve the same result as the paper in a short time? Well, I don't have enough GPU and computation power to see results as good as Google...
Building the same corpus as the original paper. Please share your tips for preprocessing and downloading the files. It would be great to share the preprocessed data via Dropbox or Google Drive...
## What is this paper about? 👋 - This paper first uses a language translation model so that the latent representation of a sentence can be learned better. In particular, the stylistic information within this latent representation...
## What is this paper about? 👋 why previous VAEs on text cannot learn controllable latent representation as on images, as well as a fix to enable the first success towards controlled...
## What is this paper about? 👋 - Error correction and text style transfer can be viewed as monolingual seq2seq problems. - Because parallel corpora are scarce for both tasks, the difficult part...