eis_toolkit
Add Bayesian NN
Hi! Let's start to check the code together. I'm not an expert in Bayesian NNs, but I studied the original documentation, which is here:
- https://www.tensorflow.org/probability/overview
- https://keras.io/examples/keras_recipes/bayesian_neural_networks/
One important thing: I had trouble with the package version to install. With the current eis env I suggest you install 0.22; the newest one crashed. There are still some commits from old stuff in this PR, and I don't get why! On Monday I will copy and paste some other documentation here to explain what this Bayesian network truly does :-)
I put two functions there:
- generate_prediction_using_traditional_arrays: I think this is the version to use for the toolkit, because it takes a np array as input, like my CNN.
- generate_predictions_with_tensor_api: this one uses the tf API to feed the Bayesian network. Both of them do exactly the same thing; only how the input is presented to the NN changes (see the sketch below).
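A minimal sketch of the difference between the two input styles; the tiny placeholder network and dummy data below are illustrative stand-ins, not the PR's actual functions:

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins: 100 samples with 8 attributes each.
X = np.random.rand(100, 8).astype("float32")

# Placeholder network; the actual Bayesian NN would go here.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 1) "traditional arrays": feed the NumPy array directly.
preds_from_array = model.predict(X)

# 2) "tensor API": wrap the same array in a tf.data pipeline.
dataset = tf.data.Dataset.from_tensor_slices(X).batch(32)
preds_from_dataset = model.predict(dataset)

# Both paths run the same network; only the input presentation differs.
```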
PS: I am starting on the last PR today!
Hi @zelioluca, and sorry for the late reply! We agreed with @msmiyels that he will first work on the CNN and after that on the BNN if he has time. The plan is that Micha would finish the development of these 2 tools, so you might not need to do programming work on these anymore :). We'll ask some questions and help if the need arises!
Hello Niko and Micha, how are you? OK, anything you need, I am here! 😁
Hi @nmaarnio, @zelioluca,
sorry for the long waiting time ⌛ and especially @zelioluca for building this up. As mentioned last week in the WP3 meeting, I got my hands on the BNN code for review. Besides some minor EIS-specific style guidance and toolkit things, there are a few points that I came across. I'm not a Bayesian expert either, so take them with a grain of salt.
Overview:
- Train size must be the number of samples instead of the number of attributes when calculating the Kullback-Leibler weights (see the sketch after this list)
- Prediction and results are structured a bit weirdly: they have each statistic per sample (pixel) in a nested structure, which is not what we aim for
- Activation must have other options (like `relu` or `linear`, where the latter is the default if `None` is provided) or even no specification at all
- Use of `last_activation` is misleading, since the selected activation is only applied to the hidden layers, while `None` is applied to the output layer(s)
- Do not understand why a non-Bayesian layer is used for the output layer(s) definition
- Unfortunately, the choice of loss and distributions depends on the goal, so I'm not sure that the "simple" negative log likelihood is the best choice for everything, especially for binary classification problems
- Would stick to the approach that has been implemented in the other tools as well, i.e., using arrays instead of TF generator objects for data input (which makes the `generate_predictions_with_tensor_api` functionality obsolete)
- The way of defining the input layer technically works but looks overcomplicated: never used a dictionary with names etc. to get the number of attributes. Usually, the `input_shape` is just one parameter that is provided to the network or function
- Is it intended that the batch normalization is done right after the input layer definition? This way, it will not affect the batches during training between the hidden layers
- None of the test runs provided expected or usable results for the binary classification problem, which is the main case people are going to work with
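A minimal TensorFlow Probability sketch of how several of the points above could look in practice; the `DenseFlipout` layers, widths, and names are illustrative assumptions, not the PR's or the toolkit's code. It scales the KL divergence by the number of training samples, places batch normalization between the hidden layers, takes `n_features` as a single plain parameter, uses a Bayesian output layer, and pairs a Bernoulli head with a negative log likelihood loss (which for a Bernoulli coincides with binary cross-entropy):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions


def build_bnn(n_features: int, n_train_samples: int) -> tf.keras.Model:
    # Scale the KL term by the number of TRAINING SAMPLES, not the number
    # of attributes, so the prior does not dominate the total loss.
    kl_fn = lambda q, p, _: tfd.kl_divergence(q, p) / n_train_samples

    # Input shape as one plain parameter, no dictionary of names needed.
    inputs = tf.keras.Input(shape=(n_features,))
    x = tfp.layers.DenseFlipout(
        16, activation="relu", kernel_divergence_fn=kl_fn)(inputs)
    # Batch normalization BETWEEN hidden layers, where it affects training.
    x = tf.keras.layers.BatchNormalization()(x)
    x = tfp.layers.DenseFlipout(
        16, activation="relu", kernel_divergence_fn=kl_fn)(x)
    # Bayesian output layer as well, parameterizing a Bernoulli head.
    logits = tfp.layers.DenseFlipout(1, kernel_divergence_fn=kl_fn)(x)
    outputs = tfp.layers.DistributionLambda(
        lambda t: tfd.Bernoulli(logits=t))(logits)
    return tf.keras.Model(inputs, outputs)


def negative_log_likelihood(y_true, y_pred_dist):
    # For a Bernoulli output this is exactly binary cross-entropy.
    return -y_pred_dist.log_prob(y_true)


model = build_bnn(n_features=8, n_train_samples=1000)
model.compile(optimizer="adam", loss=negative_log_likelihood)
```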
Tests:
I ran a slightly modified version of the code on a test bench using real-world data from two mineral assessments for which we have clear expectations of how the results should look (both for an ANN and a BNN). Technically, the code runs, but I wasn't able to reproduce the expected outputs (or even come close).
I also put some effort into changing the behavior of the BNN by addressing some of the points listed above, but that did not work out either. I once started with BNNs the same way, but it seems that the "classical" approach, or what we find in the Keras and TF docs, just does not work very well for this particular problem. I guess the regression case is easier to solve than binary classification.
How to proceed:
So what I offered in the meeting was to integrate a Bayesian NN we used in another project and refactor it to achieve EIS conformity by the end of September for review. @nmaarnio: do we want to close this PR and open a new branch for the other version?
I would be happy to merge the two code bases in the future (say, the basic idea of this code here, but as a working version based on the Bayesian NN we use), but since all of my efforts to do so ended in a dead end, it seems easier to substitute than to spend any more time on solving this for now.
Hello there, how are you? Yes, in my opinion we should use the one that works. I'm not a Bayesian guy either... I took the code from another guy and dropped it into the plugin.
Hi, yes, I think it is best to close this PR now, and you can start with a new branch, @msmiyels. We don't need to delete this branch, so if there is time and will in the future, we can of course return to this implementation if we see a need.
Closing now