fusion
Perform correlation on features from combined dataset
Once a single combined dataset exists, we need to run correlation analysis on the data to:
- know which events are tightly correlated
- do trend analysis: how does a sequence of these events affect an outcome
 
Approaches
- (more articles from imessage to add here)
 - https://www.questionpro.com/blog/pearson-correlation-coefficient/
 
something to try out: calculate r & p value
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.correlate.html
some interesting examples from the oura dashboard
(there's now an option I see there that covers the Pearson correlation coefficient between different states)
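
A minimal sketch of the "calculate r & p value" idea, assuming the combined dataset ends up as a pandas DataFrame with one row per day. The column names `sleep_hours` and `readiness` and the synthetic data are hypothetical stand-ins, not fields from the actual dataset. Note that `scipy.signal.correlate` (linked above) computes the cross-correlation of two signals; the call that returns Pearson's r together with a p-value is `scipy.stats.pearsonr`, which is what this sketch uses.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the combined dataset; column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
sleep_hours = rng.normal(7.0, 1.0, n)
readiness = 10.0 * sleep_hours + rng.normal(0.0, 5.0, n)
df = pd.DataFrame({"sleep_hours": sleep_hours, "readiness": readiness})


def pearson_r_p(frame, x, y):
    """Return (r, p) for two columns, dropping rows where either value is missing."""
    pair = frame[[x, y]].dropna()
    r, p = stats.pearsonr(pair[x], pair[y])
    return float(r), float(p)


r, p = pearson_r_p(df, "sleep_hours", "readiness")
print(f"r={r:.3f}, p={p:.3g}")
```

For a quick look at every pair of numeric columns at once, `df.corr()` gives the matrix of Pearson r values, though without p-values.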


we have basic correlations complete here. see jupyter notebook
One question we've been banging our heads against is whether we should run correlations over the whole (Feb-Oct) period or pick smaller time samples and then average. Well, guess what? Today I found this paper that could likely be our answer! https://link.springer.com/content/pdf/10.3758%2FBF03334037.pdf
Edit: I think I got a little too excited. Upon reading the paper again, tbh it sounds like averaging over samples will likely be biased, but you know, it is what it is lmao
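
To make the whole-period vs. averaged-samples question concrete, here is a rough sketch that computes both on the same data. Everything in it is an assumption for illustration: the daily series is synthetic, the columns `x` and `y` are hypothetical, and calendar months are just one possible choice of "smaller time sample". It also shows averaging through Fisher's z transform (`arctanh`, average, `tanh`), which is a commonly used alternative to averaging r values directly; this is not presented as what the linked paper recommends.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical daily series covering Feb-Oct; values are synthetic.
dates = pd.date_range("2022-02-01", "2022-10-31", freq="D")
x = rng.normal(size=len(dates))
y = 0.6 * x + rng.normal(scale=0.8, size=len(dates))
df = pd.DataFrame({"x": x, "y": y}, index=dates)

# Option 1: one Pearson correlation over the whole period.
r_whole = df["x"].corr(df["y"])

# Option 2: one correlation per calendar month, then average.
monthly_r = pd.Series(
    {str(month): g["x"].corr(g["y"]) for month, g in df.groupby(df.index.to_period("M"))}
)
mean_r = monthly_r.mean()  # plain average of the monthly r values

# Average in Fisher z space, then transform back to r.
mean_r_fisher = float(np.tanh(np.arctanh(monthly_r).mean()))

print(f"whole period r:      {r_whole:.3f}")
print(f"mean of monthly r:   {mean_r:.3f}")
print(f"Fisher-z averaged r: {mean_r_fisher:.3f}")
```

Running this on the real data with a few different window sizes would show how far the averaged values drift from the whole-period value.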
Predictive power score is sounding interesting - https://towardsdatascience.com/rip-correlation-introducing-the-predictive-power-score-3d90808b9598
This does indeed look interesting!
(in reply to Ore Ogundipe's comment of Tue, Jan 3, 2023, 14:17)
We need to be able to get the probability of something happening given another event (conditional probability). We can do this by running the approaches referenced below on:
- the feature combinations available, then highlighting the most important / exciting ones
 
intro article - https://towardsdatascience.com/conditional-probability-with-a-python-example-fd6f5937cd2
stack overflow ref - https://stackoverflow.com/questions/33468976/pandas-conditional-probability-of-a-given-specific-b
pandas documentation - https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html
extra - https://corporatefinanceinstitute.com/resources/knowledge/other/conditional-probability/
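
A small sketch of the conditional-probability idea using `pandas.crosstab`, assuming each feature has already been reduced to a binary event column (1 = happened that day, 0 = did not). The column names and the eight rows of data are made up for illustration, and the helper `conditional_probability` is hypothetical. With `normalize="index"`, each row of the crosstab sums to 1, so the cell at (1, 1) is an estimate of P(event | given).

```python
from itertools import permutations

import pandas as pd

# Hand-made example data; event names are hypothetical.
df = pd.DataFrame({
    "late_caffeine": [1, 1, 1, 0, 0, 0, 0, 1],
    "poor_sleep":    [1, 1, 0, 0, 0, 1, 0, 1],
    "workout":       [0, 1, 0, 1, 1, 0, 1, 0],
})


def conditional_probability(frame, given, event):
    """Estimate P(event == 1 | given == 1) from row counts.

    Assumes both columns contain at least one 0 and one 1.
    """
    table = pd.crosstab(frame[given], frame[event], normalize="index")
    return float(table.loc[1, 1])


# Score every ordered pair of features and list the strongest first.
ranked = sorted(
    (
        (given, event, conditional_probability(df, given, event))
        for given, event in permutations(df.columns, 2)
    ),
    key=lambda item: item[2],
    reverse=True,
)

for given, event, prob in ranked:
    print(f"P({event} | {given}) = {prob:.2f}")
```

With real data, a minimum count of rows where the `given` event occurs would probably be worth enforcing before ranking, since a probability estimated from a handful of days is not very trustworthy.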