bnlearn icon indicating copy to clipboard operation
bnlearn copied to clipboard

Continous data

Open PARODBE opened this issue 4 years ago • 19 comments

Hi again!

I was thinking in using bayesian networks for continous data, do you think that an implementation in order to work with these kind of data could be possible?

Thanks!

PARODBE avatar Sep 09 '21 19:09 PARODBE

At the moment bnlearn can only be used for discrete/categorical analysis. Implementing models that can work with continuous values is on my (long) todo list.

Nevertheless, there are possibilities to model your data if you have continuous data. You may can discretize your variables with domain/experts knowledge. For example, if you have temperature, you can mark temperature < 0 as freezing, and

0 as normal. Or many more smaller categories.

erdogant avatar Sep 17 '21 20:09 erdogant

Any update as of today regarding to implementing the continuous values?

rafmora avatar Feb 01 '22 08:02 rafmora

Not yet but I'm planning to give it a start soonish. If you have any suggestions, working examples or methods to start with, let me know.

erdogant avatar Feb 01 '22 09:02 erdogant

Yes. Happy to help with examples. I am currently modeling soil corrosivity in Python. Please let me when you would like to start so I can prepare a easy-to-start example.

rafmora avatar Feb 01 '22 14:02 rafmora

Cheers for your work @erdogant . Do you have any kind of ETA on this ? :-)

thomasfrederikhoeck avatar Jul 05 '22 08:07 thomasfrederikhoeck

Yes Soonish! Hopefully shortly after the summer break in august! #wishfullthinking

erdogant avatar Jul 05 '22 15:07 erdogant

Please count on me for adding the feature to the model. Thanks

samanemami avatar Jul 21 '22 16:07 samanemami

@erdogant Have you seen this continuous approach?

samanemami avatar Aug 06 '22 16:08 samanemami

Thank you for spotting this repo! But the repo seems to support also only discrete models.

erdogant avatar Aug 07 '22 07:08 erdogant

I am thinking of applying their continuous approach to bnlean!

samanemami avatar Aug 07 '22 16:08 samanemami

If you can get it up and running with an example for continuous data, It would be great! But the name of the repo I am seeing is [LearnDiscreteBayesNets.jl], which hints towards discrete models. Can you paste the name of the function(s) of interest to be sure I am looking at the right repo.

erdogant avatar Aug 07 '22 22:08 erdogant

@erdogant This is its related paper. And here, the used function, I guess!

samanemami avatar Aug 08 '22 19:08 samanemami

@samanemami I think the function you are referring to simply discretizes the data before learning the network.

fimselamse avatar Aug 09 '22 11:08 fimselamse

@fimselamse Yes, that is the idea. Couldn't we use it?

samanemami avatar Aug 09 '22 15:08 samanemami

Very interesting though! I am going to read the paper first :-)

erdogant avatar Aug 09 '22 19:08 erdogant

@samanemami Yes, this is definitely way to create a network from continous data. However, it is not what's being done in the R bnlearn library. They use a special gaussian case of the scoring function, if I understand it correctly.

fimselamse avatar Aug 10 '22 06:08 fimselamse

Perfect @erdogant, count on me.

@fimselamse, Yes. This is not the method used in R, but this package could add other methods to handle the continuous data as well.

samanemami avatar Aug 10 '22 09:08 samanemami

@samanemami Definitely! I have tried some information preserving discretization methods myself. Some results were promising, but not as good as the method used in R. Would love to see the results with this one, though!

fimselamse avatar Aug 10 '22 09:08 fimselamse

I have been looking into this peace of code and it is written in Julia (I was hoping Python #wishfullthinking). Maybe not a huge issue but manually converting is kind of error-prone and time-consuming. I did find documentation about Julia to Python. I will give that a try first.

If someone here is familiar with Julia, it would be very helpful!

erdogant avatar Aug 27 '22 18:08 erdogant

@erdogant regarding real case examples of continuous data (e.g., corrosion), please let me know if I can be of help. Interested to test them using bnlearn. When would you expect to start the new version testing?

rafmora avatar Dec 28 '22 12:12 rafmora

I added the julia code in the github. I did an attempt to manually rewrite it to Python but without being experienced in Julia, it was quite intensive. Now I am thinking to create a small use case to use ChatGPT to let it rewrite it to Python. However, In my experience, the code that chatGPT generates can be buggy and not always correct. Nevertheless, it may be a good starting point to bugfix and rewrite!

erdogant avatar Jan 08 '23 10:01 erdogant

It took a while but in the latest release, there is now an implementation for continuous data modeling! There are many unit tests to ensure the quality and validity of the results but I did not use it in a real-life scenario yet.

Check it out in the docs.

erdogant avatar May 17 '23 11:05 erdogant

I am closing this one as there is now a basis functionality that can process continous data! See the documentations for more details.

erdogant avatar Jul 19 '23 07:07 erdogant