Junki Ohmura
FreeLB: Enhanced Adversarial Training for Language Understanding Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu, ICLR 2020 - https://arxiv.org/abs/1909.11764 - [openreview (8-8-8)](https://openreview.net/forum?id=BygzbyHFvB) ## Overview FreeLB (Free...
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut https://arxiv.org/abs/1909.11942 code: https://github.com/google-research/google-research/tree/master/albert ## Overview For pretrained models, increasing the model size tends to improve performance, but it also makes GPU/TPU memory costly and can lead to unexpected performance degradation. To cut memory consumption, this paper proposes two parameter-reduction methods, which also speed up BERT training (A Lite BERT,...
2019: Modeling Semantic Relationship in Multi-turn Conversations with Hierarchical Latent Variables
Modeling Semantic Relationship in Multi-turn Conversations with Hierarchical Latent Variables Lei Shen, Yang Feng, Haolan Zhan 6 pages, accepted by ACL 2019 https://arxiv.org/abs/1906.07429 ## Overview Proposes the Conversational Semantic Relationship RNN (CSRR), a model for multi-turn dialogue systems. It can capture the relationships between utterances hierarchically. The model represents them with latent variables at three levels....
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio To appear at ACL 2019 https://arxiv.org/abs/1906.01603 code: https://github.com/chinnadhurai/ParlAI/...
Attention over Parameters for Dialogue Systems Andrea Madotto, Zhaojiang Lin, Chien-Sheng Wu, Jamin Shin, Pascale Fung NeurIPS Conversational AI Workshops (Best Paper Award) https://arxiv.org/abs/2001.01871
TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang 10 pages, 3 figures https://arxiv.org/abs/2001.00725
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples Anish Athalye, Nicholas Carlini, David Wagner ICML 2018 https://arxiv.org/abs/1802.00420
Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Shin Ishii To appear in IEEE Transactions on Pattern Analysis and Machine...
Towards Deep Learning Models Resistant to Adversarial Attacks Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu ICLR'18 https://arxiv.org/abs/1706.06083
Understanding Knowledge Distillation in Non-autoregressive Machine Translation Chunting Zhou, Graham Neubig, Jiatao Gu https://arxiv.org/abs/1911.02727