research-GANwriting
Errors in Architecture Overview
When looking at the image of the architecture overview, I noticed two things that were reflected differently in the code.
- The noise injection shown in the figure is not present in the code. Am I missing something here?
- The cubes that represent the feature-map shapes are misleading because they imply that the channel dimension changes when hat{Fs} and Fc are merged. However, the linear layer in the code halves the number of channels of the combined feature maps, so the resulting number of channels is 512.
If I am not mistaken and have not missed something, would it be possible to fix these issues? Apart from those minor flaws, the graphic is really beautiful and provides a great overview of the network's architecture.
Thanks for pointing out the useful details!
- Yes, you are right: in the code we didn't introduce the noise explicitly. Since Xi is a subset of images, shuffling Xi is a way to introduce noise implicitly, which was our original intuition. I agree with you that the noise injection arrow in the Figure might mislead people.
- Yes, F is the concatenation of hat{Fs} and Fc along channels, which gives 1024 channels. What the Figure is missing is the Linear layer between F and G that projects the 1024 channels back to 512.
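For readers following the channel arithmetic, here is a minimal, dependency-free sketch of the two steps described above: concatenation along channels (512 + 512 = 1024) followed by a linear projection back to 512. It is not the repository's code. The function names, the random weights, and the choice of two spatial positions are assumptions made for illustration only. In PyTorch the same two steps would typically be written with `torch.cat` and `nn.Linear`.

```python
import random


def concat_channels(style_feat, content_feat):
    """Concatenate two feature maps along the channel axis.

    Each feature map is a list of spatial positions, and each position
    is a list of channel values.
    """
    assert len(style_feat) == len(content_feat), "spatial sizes must match"
    return [s + c for s, c in zip(style_feat, content_feat)]


def linear(features, weight, bias):
    """Apply y = W x + b to the channel vector at every spatial position.

    weight has shape (out_channels, in_channels), bias has out_channels entries.
    """
    return [
        [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]
        for vec in features
    ]


def random_map(rng, positions, channels):
    """Hypothetical stand-in for an encoder output (random values)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(channels)] for _ in range(positions)]


def channel_walkthrough(positions=2, channels=512, seed=0):
    """Return (channels after concatenation, channels after projection)."""
    rng = random.Random(seed)
    style = random_map(rng, positions, channels)    # plays the role of hat{Fs}
    content = random_map(rng, positions, channels)  # plays the role of Fc

    combined = concat_channels(style, content)      # F: 2 * channels per position

    in_ch = 2 * channels
    # Random weights, for shape illustration only (not trained parameters).
    weight = [[rng.uniform(-0.01, 0.01) for _ in range(in_ch)] for _ in range(channels)]
    bias = [0.0] * channels
    projected = linear(combined, weight, bias)      # input to G: channels per position

    return len(combined[0]), len(projected[0])


if __name__ == "__main__":
    concat_ch, out_ch = channel_walkthrough()
    print(f"after concatenation: {concat_ch} channels, after Linear: {out_ch} channels")
```

The projection acts only on the channel axis, so the spatial layout of the feature map is unchanged.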
We will try to update the Figure in the next version, cheers:-)