
Multiple biases per layer

Open expeon07 opened this issue 3 years ago • 3 comments

Hi, I'm trying to use your library for TinyML on the Arduino Uno. I have a pre-trained autoencoder model with multiple biases per layer. Do you have an example of how to use such a model with the library? I only see examples with 1 bias per layer.

Thank you!

expeon07 avatar Feb 18 '22 21:02 expeon07

As of right now, the library doesn't support this feature, but it seems fairly easy to implement, so I'll probably add it.

(To be honest with you, I didn't know that neural networks with multiple biases per layer existed.)

I'm mainly self-taught on this subject, and it usually takes me quite a while to fully grasp and carefully implement new ideas, so if you could provide me with some links or insights about the subject, that would be great!

To conclude (just out of curiosity), my question is: are multiple biases per layer really needed in your model? And if so, why?

GiorgosXou avatar Feb 19 '22 11:02 GiorgosXou

Hi, thanks for the quick response. I think it's common to have a bias vector per layer, i.e. one (float) value per neuron in the layer. I was checking out this implementation as well; I extracted my weights and biases in the same way, and indeed it output one bias per neuron in each layer.

https://github.com/hollance/TinyML-HelloWorld-ArduinoUno/blob/master/train_hello_world_model.ipynb
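
(For illustration, a minimal sketch, assuming TensorFlow/Keras and NumPy are available, of how a Dense layer's parameters come out with one bias value per neuron; the layer sizes are just example choices:)

```python
import numpy as np
import tensorflow as tf

# A tiny 2-input -> 3-unit dense layer; Keras keeps one bias per neuron (3 here)
layer = tf.keras.layers.Dense(3, activation='sigmoid')
layer.build((None, 2))

kernel, bias = layer.get_weights()
print(kernel.shape)  # (2, 3): inputs x units
print(bias.shape)    # (3,):  one bias value per neuron

# Forward pass for one sample: y = sigmoid(x @ W + b), with b a vector
x = np.array([[0.5, -1.0]], dtype=np.float32)
y = 1.0 / (1.0 + np.exp(-(x @ kernel + bias)))
print(y)
```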

expeon07 avatar Feb 21 '22 11:02 expeon07

(sorry for this late reply, my phone screen just broke and I had to deal with that issue first)

I see... I'll do my best to bring a new version as soon as possible

GiorgosXou avatar Feb 23 '22 08:02 GiorgosXou

I've just trained a neural network using TensorFlow in Python, and separate bias terms for each node are pretty much standard behaviour there. I'd love to see this feature added! Without it, it's essentially not possible to use pre-trained networks from popular libraries like PyTorch or TensorFlow without writing custom code to restrict training to a single bias.

Here's an example of how many parameters TensorFlow calculates for a 1 x 5 x 3 (3-layer) network:


| Layer (type) | Output Shape | Param # |
| --- | --- | --- |
| L1 (Dense) | (None, 1) | 2 |
| L2 (Dense) | (None, 5) | 10 |
| L3 (Dense) | (None, 3) | 18 |

Total params: 30
Trainable params: 30
Non-trainable params: 0
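
(Not from the thread, but a minimal sketch, assuming TensorFlow, that reproduces these counts; each Dense layer contributes inputs × units weights plus one bias per unit:)

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1, name="L1"),  # 1*1 weights + 1 bias   = 2
    tf.keras.layers.Dense(5, name="L2"),  # 1*5 weights + 5 biases = 10
    tf.keras.layers.Dense(3, name="L3"),  # 5*3 weights + 3 biases = 18
])
model.summary()  # Total params: 30
```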

pjurczen avatar Feb 18 '24 10:02 pjurczen

@pjurczen sorry for my late reply. I am planning to implement both multi-bias and non-bias layers. Until I do so, though, you might be interested in playing with something like this instead: [screenshot]

Click to expand Tensorflow example
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import LearningRateScheduler
import tensorflow as tf
import numpy as np


def lr_schedule(epoch, lr):
    if epoch < 2000:
        return 0.01
    elif epoch < 4000:
        return 0.001
    elif epoch < 7000:
        return 0.0001
    else:
        return 0.00001


tf.keras.backend.set_floatx('float32')

# Define the XOR gate inputs and outputs
input_size = 3
inputs  = np.array([[ 0, 0, 0 ], [ 0, 0, 1 ], [ 0, 1, 0 ], [ 0, 1, 1 ], [ 1, 0, 0 ], [ 1, 0, 1 ], [ 1, 1, 0 ], [ 1, 1, 1 ]], dtype = np.float32)
outputs = np.array([[0], [1], [1], [0], [1], [0], [0], [1]], dtype = np.float32)


# Create a simple fully-connected (dense) neural network
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(input_size,)), # Input layer (no bias)
    tf.keras.layers.Dense(5, activation='sigmoid', use_bias=False), # Hidden dense layer with 5 units and sigmoid activation
    tf.keras.layers.Dense(1, activation='sigmoid', use_bias=False)  # Output layer with 1 unit and sigmoid activation (binary classification)
])

# Compile the model
optimizer = Adam(learning_rate=0.01)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
lr_callback = LearningRateScheduler(lr_schedule)
model.fit(inputs, outputs, epochs=9000, callbacks=[lr_callback], verbose=0) # proof of concept :P

# Evaluate the model on the training data
loss, accuracy = model.evaluate(inputs, outputs)
print(f"Model accuracy: {accuracy * 100:.2f}%")

# Predict XOR gate outputs
predictions = model.predict(inputs)
print("Predictions:")
for i in range(len(inputs)):
    print(f"Input: {inputs[i]}, Predicted Output: {predictions[i][0]:.7f}")

# Print weights
print()
all_layers_units = [input_size] + [layer.units for layer in model.layers] # input_size, because `keras.Sequential` merges the Input layer into the first Dense one
for l in range(0,len(all_layers_units)-1):
    input_units  = all_layers_units[l]
    output_units = all_layers_units[l+1]
    weights = model.layers[l].get_weights()[0]

    print(f"// LAYER: {l} -> {l+1}:")
    for j in range(0,output_units):
        for i in range(0,input_units):
            print(f"{weights[i][j]:.7f}", end=', ')
        print()
    print()

GiorgosXou avatar Feb 28 '24 17:02 GiorgosXou

btw, a fun realization I had... TensorFlow by default returns the weights of each layer as a 2D matrix of shape i*j (inputs x outputs) instead of j*i: [image]

I chose the reverse way for optimization reasons.
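
(If it helps, a small sketch of that transposition, assuming `layer` is a Keras Dense layer; the name is just illustrative:)

```python
kernel = layer.get_weights()[0]  # shape (inputs, units), i.e. i*j
kernel_t = kernel.T              # shape (units, inputs), i.e. j*i
for row in kernel_t:             # one row per output neuron
    print(', '.join(f'{w:.7f}' for w in row))
```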

GiorgosXou avatar Feb 28 '24 17:02 GiorgosXou

Hey, thanks for the reply! I ended up implementing the forward feed myself for the small network that I was using.
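
(For anyone who lands here with the same problem, a rough NumPy sketch of such a manual feed-forward pass with one bias per neuron; the weight/bias values below are placeholders, not from a real model:)

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Placeholder parameters for a 1 -> 5 -> 3 network, one bias per neuron
W1, b1 = np.random.randn(1, 5), np.random.randn(5)
W2, b2 = np.random.randn(5, 3), np.random.randn(3)

def forward(x):
    h = sigmoid(x @ W1 + b1)     # hidden layer
    return sigmoid(h @ W2 + b2)  # output layer

print(forward(np.array([[0.5]])))
```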

pjurczen avatar Mar 01 '24 12:03 pjurczen