Thomas Roder comments

Results: 37 comments of Thomas Roder

Support for non-binary traits

I have been using [GaussianMixture](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html#sklearn.mixture.GaussianMixture) to split by a continuous trait. This is a histogram of an example trait: ![image](https://user-images.githubusercontent.com/40867365/143433794-e9eff39e-5143-47da-9da1-7ba42eaf5ece.png) - Blue is group 1 - Green is group 2...
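A minimal sketch of splitting a continuous trait into two groups with scikit-learn's `GaussianMixture`. The trait values here are synthetic (two well-separated normal distributions), and the variable names are illustrative assumptions, not taken from the original comment or from Scoary.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic bimodal trait: purely illustrative data, not from the original comment.
rng = np.random.default_rng(0)
trait = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(10.0, 1.0, 100)])

# Fit a two-component mixture; scikit-learn expects a 2D array of shape (n_samples, n_features).
gmm = GaussianMixture(n_components=2, random_state=0)
labels = gmm.fit_predict(trait.reshape(-1, 1))

# Component numbering is arbitrary, so identify the "low" group via the smallest trait value.
low_label = labels[np.argmin(trait)]
is_low_group = labels == low_label

print(f"low group: {is_low_group.sum()} samples, high group: {(~is_low_group).sum()} samples")
```

Real traits are rarely this cleanly separated; in that case `gmm.predict_proba` gives per-sample membership probabilities that could be used to leave ambiguous samples unassigned.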

Support for non-binary traits

I wanted to compare Fisher's vs Boschloo's test. To do this, I simulated 10 pangenomes for each combination of sample size: `[25, 50, 75, 100, 150, 200]` and penetrance: `[90,...
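As a rough illustration of the two tests being compared, the sketch below runs SciPy's `fisher_exact` and `boschloo_exact` on a single 2x2 contingency table. The counts are made up for the example and are not taken from the simulated pangenomes described above.

```python
from scipy.stats import boschloo_exact, fisher_exact

# Hypothetical 2x2 table: rows = gene present/absent, columns = trait positive/negative.
table = [[12, 3],
         [4, 11]]

# fisher_exact returns (statistic, p-value); tuple unpacking works across SciPy versions.
_, fisher_p = fisher_exact(table, alternative="two-sided")

# boschloo_exact (SciPy >= 1.7) returns a result object with a .pvalue attribute.
boschloo_p = boschloo_exact(table, alternative="two-sided").pvalue

print(f"Fisher p-value:   {fisher_p:.4g}")
print(f"Boschloo p-value: {boschloo_p:.4g}")
```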

Support for non-binary traits

Updated plot with improved ranking, based on p-value instead of position in the table. This didn't change the result. ![fisher-boschloo-fig1](https://user-images.githubusercontent.com/40867365/148211480-7bd26dda-ffaf-438c-aac2-186084b2f58f.png) `pvalue=0.0024`

Support for non-binary traits

I performed the same analysis with my [fast-fisher](https://github.com/MrTomRod/fast-fisher) library. It is now _incredibly_ fast. The causal gene always got the same rank as with scipy's implementation, except for two simulated...

Support for non-binary traits

You want to run Scoary on continuous traits? I'm working on an update for Scoary, but it's not ready yet. Approximately another month until testing makes sense. I will use...

Reduce number of required file handles

@davidemms @Phhere Why open all files at the same time? Is it much slower to open the necessary files in append mode? Something like this: ```python def WriteOlogLinesToFile(file: str, text:...
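The general idea of opening each file only when needed can be sketched as follows. This is not the truncated function from the comment: `append_text` and the file names are hypothetical, chosen only to show that append mode keeps at most one file handle open at a time.

```python
import os
import tempfile


def append_text(path: str, text: str) -> None:
    """Open a single file in append mode, write to it, and close it right away."""
    with open(path, "a") as handle:
        handle.write(text)


# Hypothetical output files, written round-robin without holding them all open.
out_dir = tempfile.mkdtemp()
file_names = ("OG0000000.txt", "OG0000001.txt")
for i in range(3):
    for name in file_names:
        append_text(os.path.join(out_dir, name), f"line {i}\n")
```

The trade-off is one open/close per write, which may be slower than keeping handles open; buffering several lines per file before appending would reduce that cost.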

Feature request: Additional output file with consensus gene name per orthogroup

> From what I can see from the script it looks like it takes the 'description' attribute for each gene by reading the fasta file using Bio.SeqIO and uses anything...

Feature request: Additional output file with consensus gene name per orthogroup

I don't fully understand... The input fastas only contain the gene name (`ENSDARG00000098423.2`)? Could you provide me with such a file? How would you design the solution? Add an ensembl-mode...

Feature request: Additional output file with consensus gene name per orthogroup

> This is awesome! The outputs of OrthoFinder have been slightly updated now, do you figure you could update your scripts to accommodate that? I tried a bit but struggled.....

Feature request: Additional output file with consensus gene name per orthogroup

I had an accident and thought I'd spend some of my free time working on non-stressful projects like this, but I just learned that this (screen time) may slow my...