swift-apis
Support Advanced Layers
Now that we support a multitude of basic layers, I'd like this issue to serve as the discussion for supporting advanced layers. #54 tracked basic layers, and most of them have been implemented, so we now need to decide which of the following to add to swift-apis. tensorflow/swift-models#231 added BERT, and with it support for transformer and attention layers. Having reviewed a variety of current frameworks, I've made the following list:
- [ ] Masking
- [ ] Spatial Dropout 1D, 2D, 3D
- [ ] Cropping 1D, 2D, 3D
- [ ] Locally-Connected 1D, 2D
- [ ] ConvLSTM 1D, 2D
- [ ] Concatenate
- [ ] Gaussian Noise
- [ ] Gaussian Dropout
- [ ] Alpha Dropout
- [ ] TimeDistributed
- [ ] Bidirectional
- [ ] Dilation2D
- [ ] Erosion2D
- [ ] MaxUnpool 1D, 2D, 3D
- [ ] ReflectionPad, ReplicationPad and ConstantPadding
- [x] GroupNorm, InstanceNorm
- [ ] Transformer, TransformerEncoder, TransformerDecoder
- [ ] PixelShuffle
- [ ] Attention
- [ ] LSTMCell & GRUCell
- [ ] Recursive Neural Nets #68
- [ ] Neural Turing Machine #52
If anyone else has any feature/layer requests, please add them to this issue.
There are Attention and MultiHeadAttention layers defined in tensorflow/swift-models/Transformer. Maybe those can graduate into ‘/swift-apis’ in some shape or form (see struct Attention: ParameterlessLayer..., for example).
@8bitmp3 yeah, that's the plan for the transformer and attention layers afaik: we stage the layers there and eventually move the fleshed-out versions to this repository.
could use a UpSampling2D to do an autoencoder demo.
@brettkoonce UpSampling1D, 2D and 3D already exist.
@Shashi456 not sure how i missed that 😓
@dan-zheng could you add a good first issue tag to this PR? And add any other comments if you have any.
@Shashi456 Is this list up to date? If so, I would like to contribute.
swift-models/Models/Text/GPT2/TransformerLM.swift has a TimeDistributed layer. I believe this is a specific case of the TensorFlow Keras TimeDistributed, which takes an arbitrary layer and applies it across the temporal dimension. In the swift-models case it specifically wraps a given Dense<Float> layer.
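
To make the difference concrete, here is a rough sketch of what a generic TimeDistributed wrapper could look like. It is plain Swift over nested arrays, not the swift-apis Layer protocol or the swift-models implementation; `StepLayer`, `ToyDense`, and the `[batch][time][features]` layout are all hypothetical names and assumptions made up for illustration.

```swift
// Hypothetical stand-in for a layer applied to one time step.
// This is NOT the swift-apis `Layer` protocol.
protocol StepLayer {
    func callAsFunction(_ input: [Float]) -> [Float]
}

// Toy dense layer (illustration only): output[i] = bias[i] + sum_j weight[i][j] * input[j].
struct ToyDense: StepLayer {
    var weight: [[Float]]  // assumed shape: [outputSize][inputSize]
    var bias: [Float]      // assumed shape: [outputSize]

    func callAsFunction(_ input: [Float]) -> [Float] {
        zip(weight, bias).map { row, b in
            zip(row, input).reduce(b) { sum, pair in sum + pair.0 * pair.1 }
        }
    }
}

// Generic wrapper: applies the same wrapped layer independently at every time step.
struct TimeDistributed<Wrapped: StepLayer> {
    var wrapped: Wrapped

    // Input layout is assumed to be [batch][time][features].
    func callAsFunction(_ input: [[[Float]]]) -> [[[Float]]] {
        input.map { sequence in
            sequence.map { step in wrapped(step) }
        }
    }
}

// Usage: one batch element, three time steps, two features per step.
let dense = ToyDense(weight: [[1, 1]], bias: [0.5])
let layer = TimeDistributed(wrapped: dense)
let output = layer([[[1, 2], [3, 4], [5, 6]]])
print(output)  // [[[3.5], [7.5], [11.5]]]
```

Because the wrapper is generic over the wrapped layer, the Dense<Float>-only version would just be one instantiation of it. A tensor-based version would more likely fold the batch and time dimensions together and call the wrapped layer once rather than loop, but that is a guess at the design, not something taken from either repository.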