NeuralNetworks
Multiple biases per layer
Hi, I'm trying to use your library for TinyML on the Arduino Uno. I have a pre-trained autoencoder model with multiple biases per layer. Do you have an example of how to use such a model with the library? I only see examples with one bias per layer.
Thank you!
At the moment, the library doesn't support this feature, but it seems fairly easy to implement, so I'll probably add it.
(To be honest, I didn't know that neural networks with multiple biases per layer existed.)
I'm mainly self-taught on this subject, and it usually takes me quite a while to fully grasp and carefully implement new ideas, so if you could provide some links or insights about it, that would be great!
Finally, just out of curiosity: are multiple biases per layer really needed in your model? And if so, why?
Hi, thanks for the quick response. I think it's standard to have a bias vector per layer, i.e. one (float) value per neuron in the layer. I was also looking at the implementation below; I extracted my weights and biases the same way, and it indeed produced one bias per neuron in each layer.
https://github.com/hollance/TinyML-HelloWorld-ArduinoUno/blob/master/train_hello_world_model.ipynb
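For reference, the extraction looks roughly like this (a minimal sketch of my own, assuming a standard Keras Dense model; it is not code from that notebook): get_weights() on each Dense layer returns a weight matrix plus a bias vector with one float per unit.

import tensorflow as tf

# Build a small Dense model; each Dense layer keeps a weight matrix of shape
# (inputs, units) and a bias vector with one value per unit (neuron).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(5, activation='sigmoid'),
    tf.keras.layers.Dense(3, activation='sigmoid'),
])

for idx, layer in enumerate(model.layers):
    weights, biases = layer.get_weights()
    print(f"Layer {idx}: weights {weights.shape}, biases {biases.shape}")
    # e.g. Layer 0: weights (1, 5), biases (5,)  -> one bias per neuron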
(sorry for this late reply, my phone screen just broke and I had to deal with that issue first)
I see... I'll do my best to release a new version as soon as possible.
I've just trained a neural network using TensorFlow in Python, and that's the standard behaviour: you get a separate bias term for each node. I'd love to see this feature added! Without it, it's basically impossible to use pre-trained networks from popular libraries like PyTorch or TensorFlow without writing custom code to train with a single bias.
Here's an example of how many parameters TensorFlow calculates for a 1 x 5 x 3 (3-layer) network:
Layer (type)    Output Shape    Param #
L1 (Dense)      (None, 1)       2
L2 (Dense)      (None, 5)       10
L3 (Dense)      (None, 3)       18
=================================================================
Total params: 30
Trainable params: 30
Non-trainable params: 0
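Those counts line up with one bias per unit. A quick sketch of the arithmetic (my own, not TensorFlow output):

# Each Dense layer has inputs * units weights plus one bias per unit.
def dense_params(inputs, units):
    return inputs * units + units

print(dense_params(1, 1))  # L1: 1*1 + 1 = 2
print(dense_params(1, 5))  # L2: 1*5 + 5 = 10
print(dense_params(5, 3))  # L3: 5*3 + 3 = 18, for a total of 30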
@pjurczen sorry for my late reply; I am planning to do so, and to implement both multi-bias and no-bias layers. However, until I do, you might be interested in playing with something like this instead:
TensorFlow example:
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import LearningRateScheduler
import tensorflow as tf
import numpy as np

def lr_schedule(epoch, lr):
    if epoch < 2000:
        return 0.01
    elif epoch < 4000:
        return 0.001
    elif epoch < 7000:
        return 0.0001
    else:
        return 0.00001

tf.keras.backend.set_floatx('float32')

# Define the XOR gate inputs and outputs
input_size = 3
inputs = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
                   [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=np.float32)
outputs = np.array([[0], [1], [1], [0], [1], [0], [0], [1]], dtype=np.float32)

# Create a simple feed-forward neural network (no biases)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(input_size,)),                      # Input layer (no bias)
    tf.keras.layers.Dense(5, activation='sigmoid', use_bias=False),  # Hidden layer with 5 units and sigmoid activation
    tf.keras.layers.Dense(1, activation='sigmoid', use_bias=False)   # Output layer with 1 unit and sigmoid activation (binary classification)
])

# Compile the model
optimizer = Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
lr_callback = LearningRateScheduler(lr_schedule)
model.fit(inputs, outputs, epochs=9000, callbacks=[lr_callback], verbose=0)  # proof of concept :P

# Evaluate the model on the training data
loss, accuracy = model.evaluate(inputs, outputs)
print(f"Model accuracy: {accuracy * 100:.2f}%")

# Predict XOR gate outputs
predictions = model.predict(inputs)
print("Predictions:")
for i in range(len(inputs)):
    print(f"Input: {inputs[i]}, Predicted Output: {predictions[i][0]:.7f}")

# Print weights
print()
all_layers_units = [input_size] + [layer.units for layer in model.layers]  # input_size because `keras.Sequential` merges the Input layer into the first Dense one
for l in range(0, len(all_layers_units) - 1):
    input_units = all_layers_units[l]
    output_units = all_layers_units[l + 1]
    weights = model.layers[l].get_weights()[0]
    print(f"// LAYER: {l} -> {l+1}:")
    for j in range(0, output_units):
        for i in range(0, input_units):
            print(f"{weights[i][j]:.7f}", end=', ')
        print()
    print()
By the way, a fun realization I had: TensorFlow by default returns each layer's weights as a 2D matrix of shape i x j (inputs x units) instead of j x i.
I chose the reverse layout for optimization reasons.
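Roughly, the difference looks like this (a small sketch of my own, not code from the library):

import numpy as np

# TensorFlow/Keras stores a Dense layer's weights with shape (inputs, units), i.e. i x j.
# A row-per-output-neuron layout (j x i) is simply the transpose of that matrix.
weights_tf = np.arange(6, dtype=np.float32).reshape(3, 2)  # 3 inputs, 2 units
weights_per_neuron = weights_tf.T                          # shape (2, 3): one row per neuron

print(weights_tf.shape)          # (3, 2)
print(weights_per_neuron.shape)  # (2, 3)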
Hey, thanks for the reply! I ended up implementing the feed-forward pass myself for the small network I was using.
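For anyone else landing here, the per-neuron-bias forward pass is only a few lines. Roughly what I did (a NumPy sketch under my own assumptions, using the weights and biases exported from Keras; it is not part of the library):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    # layers: list of (weights, biases) pairs as returned by Keras get_weights(),
    # with weights shaped (inputs, units) and one bias value per unit.
    for weights, biases in layers:
        x = sigmoid(x @ weights + biases)
    return x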