Thomas Roder
I have been using [GaussianMixture](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html#sklearn.mixture.GaussianMixture) to split by a continuous trait. This is a histogram of an example trait:  - Blue is group 1 - Green is group 2...
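A minimal sketch of how such a split could look, assuming a two-component mixture fitted on a one-dimensional trait. The function name `split_trait`, the choice of `n_components=2`, and the relabelling by component mean are illustrative assumptions, not necessarily what is used here:

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def split_trait(values, random_state=0):
    """Binarize a continuous trait with a 2-component Gaussian mixture.

    Hypothetical helper: returns a boolean array that is True for samples
    assigned to the component with the higher mean.
    """
    # scikit-learn expects a 2D array of shape (n_samples, n_features)
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=random_state).fit(x)
    labels = gmm.predict(x)
    # Component indices are arbitrary, so identify the higher-mean component
    high_component = int(np.argmax(gmm.means_.ravel()))
    return labels == high_component


# Usage with synthetic data: two well-separated groups
rng = np.random.default_rng(0)
trait = np.concatenate([rng.normal(0, 1, 100), rng.normal(10, 1, 100)])
is_high = split_trait(trait)
```

Relabelling by mean keeps the group assignment stable across runs, since the mixture model itself does not guarantee which component comes first.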
I wanted to compare Fisher's vs Boschloo's test. To do this, I simulated 10 pangenomes for each combination of sample size: `[25, 50, 75, 100, 150, 200]` and penetrance: `[90,...
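For reference, a small sketch of running both tests on a single 2x2 contingency table with scipy. The table values, the helper name `compare_tests`, and the two-sided alternative are assumptions for illustration; they are not taken from the simulation:

```python
from scipy.stats import boschloo_exact, fisher_exact


def compare_tests(table, alternative="two-sided"):
    """Return (Fisher p-value, Boschloo p-value) for a 2x2 table.

    Hypothetical helper for illustration only.
    """
    # fisher_exact returns (odds ratio, p-value)
    _, p_fisher = fisher_exact(table, alternative=alternative)
    # boschloo_exact returns a result object with .statistic and .pvalue
    p_boschloo = boschloo_exact(table, alternative=alternative).pvalue
    return p_fisher, p_boschloo


# Made-up example: rows = gene present/absent, columns = trait present/absent
table = [[12, 3], [2, 13]]
p_fisher, p_boschloo = compare_tests(table)
print(f"Fisher: {p_fisher:.2e}, Boschloo: {p_boschloo:.2e}")
```

`boschloo_exact` requires scipy 1.7 or newer and is considerably slower than `fisher_exact`, which matters when testing every gene in a pangenome.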
Updated plot with improved ranking, based on pvalue instead of position in table. Didn't change the result.  `pvalue=0.0024`
I performed the same analysis with my [fast-fisher](https://github.com/MrTomRod/fast-fisher) library. It is now _incredibly_ fast. The causal gene always got the same rank as with scipy's implementation, except for two simulated...
You want to run Scoary on continuous traits? I'm working on an update for Scoary, but it's not ready yet. Approximately another month until testing makes sense. I will use...
@davidemms @Phhere Why open all files at the same time? Is it much slower to open the necessary files in append mode? Something like this: ```python def WriteOlogLinesToFile(file: str, text:...
> From what I can see from the script it looks like it takes the 'description' attribute for each gene by reading the fasta file using Bio.SeqIO and uses anything...
I don't fully understand... The input fastas only contain the gene name (`ENSDARG00000098423.2`)? Could you provide me with such a file? How would you design the solution? Add an ensembl-mode...
> This is awsome! The outputs of Orthofinder have been slightly updated now, do you figure you could update your scripts to accommodate that? I tried a bit but struggled.....
I had an accident and thought I'd spend some of my free time working on non-stressful projects like this, but I just learned that this (screen time) may slow my...