neusomatic
Multi-patient training
I would like to run a large-scale training task over thousands of labelled patient BAMs. Is this currently supported by neusomatic in any way, or will I have to write some custom code to recombine the generated training data?
@chrissype happy to see your interest in NeuSomatic. Yes, you can train on multiple samples as follows:
- For each sample, run preprocess.py. This will give you per-sample candidate TSV files in the following paths: sample_i_output/dataset/work.*/candidates*.tsv
- Use all the candidate TSV files from multiple samples together to build a NeuSomatic model using train.py. As the --candidates_tsv argument, you can provide paths to all candidate TSVs, like:
--candidates_tsv sample_*_output/dataset/work.*/candidates*.tsv
OR
--candidates_tsv sample_1_output/dataset/work.*/candidates*.tsv \
sample_2_output/dataset/work.*/candidates*.tsv ... \
sample_n_output/dataset/work.*/candidates*.tsv ... \
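With thousands of samples, it may be easier to collect the TSV paths in a script than to rely on shell globbing. The following is only a minimal sketch, assuming the sample_*_output directory layout shown above; the helper names collect_candidate_tsvs and build_train_args are hypothetical and not part of NeuSomatic, and any other arguments train.py needs are left out.

```python
import glob
import os


def collect_candidate_tsvs(root, pattern="sample_*_output/dataset/work.*/candidates*.tsv"):
    """Return a sorted list of candidate TSV paths found under `root`.

    The default pattern mirrors the layout described above; adjust it
    if your per-sample output directories are named differently.
    """
    return sorted(glob.glob(os.path.join(root, pattern)))


def build_train_args(tsv_paths):
    """Build the --candidates_tsv portion of a train.py command line.

    Hypothetical helper: it only assembles the one argument discussed
    here and does not run anything.
    """
    if not tsv_paths:
        raise ValueError("no candidate TSV files found")
    return ["--candidates_tsv", *tsv_paths]


# Example usage (other train.py arguments omitted):
#   tsvs = collect_candidate_tsvs("/path/to/all_samples")
#   cmd = ["python", "train.py", *build_train_args(tsvs)]
```

Building the argument list explicitly also makes it easy to check how many TSV files were found before starting a long training run.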
That's amazing, many thanks!