David Landup
Oh, I meant this: > Ps. Tested on actual training, works fine so far for 2D and 3D cases; now needs some polish. Sorry to bug you while you're on...
Of course! I'll get an example up and running as soon as I finish the test cases for the other PR
@tanzhenyu, added an example run in the new PR #968 based on your training script for DeepLabV3
Changes were requested and it's still WIP
As ViTs are finished, I'll be working on this one now ;) If anyone wants to collab, let me know. (@ayulockin wanted to work on this a while back)
If nobody else signs up for it by the time MaxViT is done, I'd gladly hop onto MAXIM too :)
Since MaxViT uses MBConvs, which we have in EfficientNets, and which originated in MobileNets, we'll have three architectures reusing the same blocks. Additionally, having them as a layer would...
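For context, a rough sketch of what sharing the block as a standalone layer could look like; the `MBConvBlock` name and constructor arguments below are assumptions for illustration, not the actual keras_cv API:

```
import tensorflow as tf
from tensorflow.keras import layers


class MBConvBlock(layers.Layer):
    """Simplified MBConv: expand -> depthwise conv -> project (squeeze-excite omitted)."""

    def __init__(self, filters, expand_ratio=4, strides=1, **kwargs):
        super().__init__(**kwargs)
        self.filters = filters
        self.expand_ratio = expand_ratio
        self.strides = strides

    def build(self, input_shape):
        expanded = input_shape[-1] * self.expand_ratio
        self.expand_conv = layers.Conv2D(expanded, 1, padding="same", use_bias=False)
        self.expand_bn = layers.BatchNormalization()
        self.depthwise = layers.DepthwiseConv2D(3, strides=self.strides, padding="same", use_bias=False)
        self.depthwise_bn = layers.BatchNormalization()
        self.project_conv = layers.Conv2D(self.filters, 1, padding="same", use_bias=False)
        self.project_bn = layers.BatchNormalization()

    def call(self, inputs, training=False):
        x = tf.nn.swish(self.expand_bn(self.expand_conv(inputs), training=training))
        x = tf.nn.swish(self.depthwise_bn(self.depthwise(x), training=training))
        x = self.project_bn(self.project_conv(x), training=training)
        # Residual connection only when spatial and channel shapes line up.
        if self.strides == 1 and inputs.shape[-1] == self.filters:
            x = x + inputs
        return x


# The same block could then be imported and reused by the EfficientNet,
# MobileNet, and MaxViT backbones instead of each defining its own copy.
block = MBConvBlock(filters=64)
features = block(tf.random.normal((1, 32, 32, 64)))
```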
Done in new PR :) #1146
Thanks for tagging and awesome work on `RelativeMultiHeadAttention`! Question for the Keras team: do we want to make `RelativeMultiHeadAttention` part of core Keras? MHA already is, and the relative...
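For reference, the non-relative layer already ships with core Keras, which is what the parity question is about. A minimal sketch of the existing core API (the relative-position variant from the PR is not shown, since its signature isn't confirmed here):

```
import tensorflow as tf

# MultiHeadAttention is already part of core Keras.
mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)

# Standard self-attention over a batch of 197 tokens with 512 channels.
tokens = tf.random.normal((1, 197, 512))
outputs = mha(tokens, tokens)
```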
For reference, this is the constructor:

```
def __init__(
    self,
    project_dim,
    num_heads,
    mlp_dim,
    mlp_dropout=0.1,
    attention_dropout=0.1,
    activation=tf.keras.activations.gelu,
    layer_norm_epsilon=1e-06,
    attention_type='mha',
    **kwargs,
):
```

Though, because of the defaults, usage can be as...
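A minimal usage sketch under the assumption that the constructor above is the keras_cv `TransformerEncoder`-style layer and that only the non-default arguments need to be supplied (the import path and layer name are assumptions):

```
import tensorflow as tf
from keras_cv.layers import TransformerEncoder  # assumed import path

# Only the required arguments; dropout rates, activation, layer_norm_epsilon,
# and attention_type='mha' all fall back to their defaults.
encoder = TransformerEncoder(project_dim=768, num_heads=12, mlp_dim=3072)

# Forward pass on a batch of 197 patch tokens with 768 channels.
tokens = tf.random.normal((1, 197, 768))
outputs = encoder(tokens)
```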