GLU
An easy-to-use library for GLU (Gated Linear Units) and GLU variants in TensorFlow.
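As a rough illustration of what a gated linear unit computes, here is a minimal pure-Python sketch that does not use the library itself. The helper names (`gated`, `glu`, `swiglu`) are hypothetical, and it assumes the first half of the input is the value and the second half is the gate; which half plays which role is a convention that differs between implementations.

```python
import math


def sigmoid(v):
    """Logistic function, used as the gate activation in plain GLU."""
    return 1.0 / (1.0 + math.exp(-v))


def swish(v):
    """Swish (SiLU) activation, used as the gate activation in SwiGLU."""
    return v * sigmoid(v)


def gated(x, activation):
    """Split x into two halves and multiply the first by activation(second).

    Assumption: first half = value, second half = gate.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even to split it in two")
    half = len(x) // 2
    out, gate = x[:half], x[half:]
    return [o * activation(g) for o, g in zip(out, gate)]


def glu(x):
    return gated(x, sigmoid)


def swiglu(x):
    return gated(x, swish)


# The output has half as many elements as the input.
print(glu([2.0, 0.0]))     # 2.0 * sigmoid(0.0)
print(swiglu([2.0, 0.0]))  # 2.0 * swish(0.0)
```

The only difference between the variants in this sketch is the activation applied to the gate half, which is why they share one helper.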
You are not using the dense layer in SwiGLU

class SwiGLU(tf.keras.layers.Layer):
    def __init__(self, bias=True, dim=-1, **kwargs):
        super(SwiGLU, self).__init__(**kwargs)
        self.bias = bias
        self.dim = dim
        self.dense = tf.keras.layers.Dense(2, use_bias=bias)

    def call(self,...
Update tf.split keyword arg

## :page_facing_up: Context
Initializing the layers raises a TypeError.

## :pencil: Changes
Change `tf.split(x, num_split=2, axis=self.dim)` to `tf.split(x, num_or_size_splits=2, axis=self.dim)`.

## :no_entry_sign: Breaking
Since TensorFlow 2.3...
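The keyword change described in this PR can be sketched with a small stand-alone Keras layer. This is an illustrative sketch, not the library's actual code: the class name `GatedSplit` is hypothetical, and the choice of which half acts as the gate is an assumption.

```python
import tensorflow as tf


class GatedSplit(tf.keras.layers.Layer):
    """Hypothetical GLU-style layer: split the input in two and gate one half."""

    def __init__(self, dim=-1, **kwargs):
        super().__init__(**kwargs)
        self.dim = dim

    def call(self, x):
        # tf.split takes `num_or_size_splits`; per the PR, passing
        # `num_split=2` instead raises a TypeError.
        out, gate = tf.split(x, num_or_size_splits=2, axis=self.dim)
        # Assumption: the second half is the gate.
        return out * tf.sigmoid(gate)


x = tf.ones((3, 8))
y = GatedSplit()(x)
# The split axis is halved: 8 input features become 4 output features.
print(y.shape)
```

Because the layer splits along `dim`, the size of that axis must be even.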