eis_toolkit
Add Bayesian NN
Hi! Let's start to check the code together. I'm not an expert in Bayesian NNs, but I studied the original documentation, which is here:
- https://www.tensorflow.org/probability/overview
- https://keras.io/examples/keras_recipes/bayesian_neural_networks/
One important thing: I had trouble with the package version to install. With the current eis env I suggest you install 0.22; the newest one crashed. There are still some commits from old stuff in this PR, and I don't get why! On Monday I will copy and paste some other documentation here to explain what this Bayesian network truly does :-)
I put two functions there:
- generate_prediction_using_traditional_arrays: I think this is the version to use for the toolkit, because it takes a np array as input, like my CNN.
- generate_predictions_with_tensor_api: this one uses the tf API to feed the Bayesian network. Both of them do exactly the same thing; only how the input is presented to the NN changes (see the sketch below).
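A minimal sketch of the difference between the two input styles; the tiny placeholder network and dummy data below are illustrative stand-ins, not the PR's actual functions:

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins: 100 samples with 8 attributes each.
X = np.random.rand(100, 8).astype("float32")

# Placeholder network; the actual Bayesian NN would go here.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 1) "traditional arrays": feed the NumPy array directly.
preds_from_array = model.predict(X)

# 2) "tensor API": wrap the same array in a tf.data pipeline.
dataset = tf.data.Dataset.from_tensor_slices(X).batch(32)
preds_from_dataset = model.predict(dataset)

# Both paths run the same network; only the input presentation differs.
```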
PS: I am starting on the last PR today!
Hi @zelioluca, and sorry for the late reply! We agreed with @msmiyels that he will first work on the CNN and after that on the BNN if he has time. The plan is that Micha would finish the development of these 2 tools, so you might not need to do programming work on these anymore :). We'll ask some questions and help if the need arises!
Hello Niko and Micha, how are you? OK, anything you need, I am here! 😁
Hi @nmaarnio, @zelioluca,
sorry for the long waiting time ⌛ and especially @zelioluca for building this up. As mentioned last week in the WP3 meeting, I got my hands on the BNN code for review. Besides some minor EIS-specific style guidance and toolkit things, there are a few points that I came across. I'm not a Bayesian expert either, so take them with a grain of salt.
Overview:
- Train size must be the number of samples instead of the number of attributes when calculating the Kullback-Leibler weights (see the sketch after this list)
- Prediction and results are structured a bit weirdly: they have each statistic per sample (pixel) in a nested structure, which is not what we aim for
- Activation must have other options (like `relu` or `linear`, where the latter is the default if `None` is provided) or even no specification at all
- Use of `last_activation` is misleading, since the selected activation is only applied to the hidden layers, while `None` is applied to the output layer(s)
- Do not understand why a non-Bayesian layer is used for the output layer(s) definition
- Unfortunately, the choice of loss and distributions depends on the goal, so I'm not sure that the "simple" negative log likelihood is the best choice for everything, especially for binary classification problems
- Would stick to the approach that has been implemented in the other tools as well, i.e., using arrays instead of TF generator objects for data input (which makes the `generate_predictions_with_tensor_api` functionality obsolete)
- The way of defining the input layer technically works but looks overcomplicated: never used a dictionary with names etc. to get the number of attributes. Usually, the `input_shape` is just one parameter that is provided to the network or function
- Is it intended that the batch normalization is done right after the input layer definition? This way, it will not affect the batches during training between the hidden layers
- None of the test runs provided expected or usable results for the binary classification problem, which is the main case people are going to work with
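A minimal TensorFlow Probability sketch of how several of the points above could look in practice; the `DenseFlipout` layers, widths, and names are illustrative assumptions, not the PR's or the toolkit's code. It scales the KL divergence by the number of training samples, places batch normalization between the hidden layers, takes `n_features` as a single plain parameter, uses a Bayesian output layer, and pairs a Bernoulli head with a negative log likelihood loss (which for a Bernoulli coincides with binary cross-entropy):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions


def build_bnn(n_features: int, n_train_samples: int) -> tf.keras.Model:
    # Scale the KL term by the number of TRAINING SAMPLES, not the number
    # of attributes, so the prior does not dominate the total loss.
    kl_fn = lambda q, p, _: tfd.kl_divergence(q, p) / n_train_samples

    # Input shape as one plain parameter, no dictionary of names needed.
    inputs = tf.keras.Input(shape=(n_features,))
    x = tfp.layers.DenseFlipout(
        16, activation="relu", kernel_divergence_fn=kl_fn)(inputs)
    # Batch normalization BETWEEN hidden layers, where it affects training.
    x = tf.keras.layers.BatchNormalization()(x)
    x = tfp.layers.DenseFlipout(
        16, activation="relu", kernel_divergence_fn=kl_fn)(x)
    # Bayesian output layer as well, parameterizing a Bernoulli head.
    logits = tfp.layers.DenseFlipout(1, kernel_divergence_fn=kl_fn)(x)
    outputs = tfp.layers.DistributionLambda(
        lambda t: tfd.Bernoulli(logits=t))(logits)
    return tf.keras.Model(inputs, outputs)


def negative_log_likelihood(y_true, y_pred_dist):
    # For a Bernoulli output this is exactly binary cross-entropy.
    return -y_pred_dist.log_prob(y_true)


model = build_bnn(n_features=8, n_train_samples=1000)
model.compile(optimizer="adam", loss=negative_log_likelihood)
```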
Tests:
I ran a slightly modified version of the code on a test bench using real-world data from two mineral assessments for which we have clear expectations of how the results should look (both for an ANN and a BNN). Technically, the code runs, but I wasn't able to reproduce the expected outputs (or even come close).
I also put some effort into changing the behavior of the BNN by addressing some of the points listed above, but that did not work out either. I once started with BNNs the same way, but it seems that the "classical" approach, or what we find in the Keras and TF docs, just does not work very well for this particular problem. I guess the regression case is easier to solve than binary classification.
How to proceed:
So what I offered in the meeting was to integrate a Bayesian NN we used in another project and refactor it to achieve EIS conformity by the end of September for review. @nmaarnio: do we want to close this PR and open a new branch for the other version?
I would be happy to merge the two code bases in the future (say, the basic idea of this code here, but as a working version based on the Bayesian NN we use), but since all of my efforts to do so ended in a dead end, it seems easier to substitute than to spend any more time on solving this for now.
Hello there, how are you? Yes, in my opinion we should use the one that works. I'm not a Bayesian guy either... I took the code from another guy and dropped it into the plugin.
Hi, yes, I think it is best to close this PR now, and you can start with a new branch, @msmiyels. We don't need to delete this branch, so if there is time and will in the future, we can of course return to this implementation if we see a need.
Closing now