Yi-Chen (Howard) Lo
### Summary - This line of work starts from [*Local Binary Convolutional Neural Networks (LBCNN)*](https://arxiv.org/abs/1608.06049) by the same author, which proposes a non-learnable spatial conv layer as an...
### Summary #### Contribution - Propose a novel formulation of object detection: detect each bounding box as a pair of keypoints (the top-left and bottom-right corners), which does away with anchor boxes. -...
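To make the paired-keypoint idea concrete, here is a minimal sketch of how a box can be recovered once a top-left and a bottom-right corner have been matched. The names `Box` and `box_from_corners` are hypothetical and used only for illustration; how the corners are detected and paired is not covered here.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward


class Box(NamedTuple):
    """Axis-aligned box; a hypothetical container used only in this sketch."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def width(self) -> float:
        return self.x_max - self.x_min

    @property
    def height(self) -> float:
        return self.y_max - self.y_min


def box_from_corners(top_left: Point, bottom_right: Point) -> Box:
    """Build a box from one (top-left, bottom-right) corner pair.

    No anchor box is involved: the two keypoints fully determine the box.
    """
    (x1, y1), (x2, y2) = top_left, bottom_right
    # A valid pair must have the bottom-right corner strictly below and
    # to the right of the top-left corner.
    if x2 <= x1 or y2 <= y1:
        raise ValueError("corner pair does not form a valid box")
    return Box(x1, y1, x2, y2)


box = box_from_corners((10.0, 20.0), (50.0, 80.0))
print(box, box.width, box.height)
```

The validity check also hints at why pairing matters: an arbitrary top-left and bottom-right keypoint do not necessarily belong to the same object.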
### Summary - This paper proposes to use a target-domain language model as a discriminator in GAN training. - The motivation: the error signal for the generator provided by a binary classifier...
### Summary - The word embeddings are derived from a Bi-LM (bidirectional language model), hence the name Embeddings from Language Models (ELMo). Specifically, a linear combination of the vectors stacked above each...
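The linear combination can be sketched as follows for a single token. This is a minimal illustration, assuming the common formulation in which per-layer scalar scores are softmax-normalized and the result is scaled by a scalar `gamma`; the function names and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(
    layer_vectors: Sequence[Sequence[float]],
    layer_scores: Sequence[float],
    gamma: float = 1.0,
) -> List[float]:
    """Weighted sum of one token's per-layer vectors.

    layer_vectors[j] is the Bi-LM representation of the token at layer j;
    layer_scores[j] is a scalar (learned in a real model) for that layer.
    """
    if len(layer_vectors) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    dim = len(layer_vectors[0])
    return [
        gamma * sum(w * vec[d] for w, vec in zip(weights, layer_vectors))
        for d in range(dim)
    ]


# Two toy layers with equal scores: the result is their average.
print(combine_layers([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0]))
```

In a real model the scores and `gamma` would be trained with the downstream task while the Bi-LM layers supply the vectors.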
### Summary This paper proposes to initialize the weights of the encoder and decoder in a neural sequence-to-sequence (Seq2Seq) model with two pre-trained language models, and then fine-tune them with labeled data. **During...
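The initialization step can be sketched with plain dictionaries standing in for parameter tables. This is only an illustration under stated assumptions: the parameter names, the `encoder.`/`decoder.` prefixes, and the choice of a source-side language model for the encoder and a target-side one for the decoder are assumptions of this sketch, not details taken from the summary.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # parameter name -> flat list of weights


def init_from_language_models(
    seq2seq: Params, source_lm: Params, target_lm: Params
) -> List[str]:
    """Copy pre-trained LM weights into matching Seq2Seq entries, in place.

    Returns the names of the Seq2Seq parameters that were overwritten.
    Parameters with no LM counterpart keep their existing initialization.
    """
    copied: List[str] = []
    for prefix, lm in (("encoder.", source_lm), ("decoder.", target_lm)):
        for name, values in lm.items():
            key = prefix + name
            # Only copy when the target exists and the sizes agree.
            if key in seq2seq and len(seq2seq[key]) == len(values):
                seq2seq[key] = list(values)
                copied.append(key)
    return copied


model: Params = {
    "encoder.lstm": [0.0, 0.0],
    "decoder.lstm": [0.0, 0.0],
    "attention.w": [0.5],  # no LM counterpart, left untouched
}
print(init_from_language_models(model, {"lstm": [1.0, 2.0]}, {"lstm": [3.0, 4.0]}))
print(model)
```

After this step, all parameters would be fine-tuned together on the labeled data.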
### Summary - Novel framework for image captioning that can produce natural language explicitly grounded in entities that object detectors find in the image. - Two-step approach: First generate...
@luulinh90s Hi, what I meant here is that ResNet-101 serves as the backbone image encoder for Faster-RCNN. They are not used separately. Maybe writing "Faster-RCNN with ResNet-101 backbone" would...
@luulinh90s: ### How are visual words generated? To my understanding (I might be wrong), we first get the region proposals from Faster-RCNN. For example, we can index...
@luulinh90s I will stop the discussion here since I've been busy recently. Hope my answers help! :-)
### TL;DR - Supervised data augmentation: Current data augmentation methods for labeled data provide a steady but limited performance boost, since the labeled set is usually small. - Unsupervised data...