Jian Zhou comments

Results: 24 comments by Jian Zhou

Support NN models with multiple inputs

Thank you for bringing up the need for multiple inputs! As discussed with @kathyxchen we are planning to support multiple inputs as well as multiple targets. Currently, we have a...

Could ExPecto accept FASTA input and return the predicted tissue-specific expression values?

Do you want to use sequences that are not in the TSSes that we used for training and validation? (If so, you can refer to train.py for how to...

averaging forward and reverse strand from output of chromatin.py

Thanks for the question! Yes, that is right. The first half of the chromatin predictions is computed from the forward strand sequences and the second half is for the same...
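A minimal sketch of the strand averaging described above. It assumes the predictions are already loaded as a NumPy array in which rows [0, n) are the forward-strand predictions and rows [n, 2n) are the predictions for the same sequences in the same order. The function name `average_strands` and the toy values are hypothetical; how the output of chromatin.py is stored on disk is not covered here.

```python
import numpy as np


def average_strands(preds: np.ndarray) -> np.ndarray:
    """Average the first half of the rows with the second half.

    Assumption: rows [0, n) are forward-strand predictions and rows
    [n, 2n) are the matching second-half predictions, in the same order.
    """
    if preds.shape[0] % 2 != 0:
        raise ValueError("expected an even number of rows (two halves)")
    n = preds.shape[0] // 2
    return (preds[:n] + preds[n:]) / 2.0


# Toy example: 2 sequences x 2 chromatin features, stacked as two halves.
preds = np.array([
    [0.2, 0.4],  # sequence 0, first half
    [0.6, 0.8],  # sequence 1, first half
    [0.4, 0.2],  # sequence 0, second half
    [0.2, 0.4],  # sequence 1, second half
])
averaged = average_strands(preds)  # shape (2, 2)
```

The result has half as many rows as the input, one row per original sequence.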

Discrepancy between architecture in paper and architecture in deepsea.beluga.2002.cpu

Thanks for reporting this! You are right - this is a mistake in the paper supplement. I will contact the journal to correct this.

The representative TSS site

Thanks for the question. The representative TSS site positions are determined from FANTOM CAGE data, which measures the 5' end more precisely, so yes it can be different from the gene...

How to reproduce the file Xreducedall.2002.npy for another organism

To generate the equivalent of Xreducedall.2002.npy for a new organism, you will first need to train a sequence model to predict chromatin profiles in your organism of interest,...

How to reproduce the file Xreducedall.2002.npy for another organism

If you want to train new sequence models for epigenetics data, feel free to check out https://github.com/FunctionLab/selene (there are tutorials and manuscript examples provided). Note that for the ExPecto model there...

Sequences < 1,000bp

Beluga requires a 2kb sequence. Padding with N is not guaranteed to give meaningful results. If your sequence has any flanking sequence in the genomic context, you can add that to...
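One way to act on the flanking-sequence suggestion is to compute a 2kb window centered on the region of interest and extract that window from the genome instead of padding. The sketch below only does the coordinate arithmetic. The helper name `centered_window`, the 0-based half-open coordinate convention, and the clamping at chromosome ends are assumptions for illustration, not part of Beluga or ExPecto.

```python
def centered_window(start: int, end: int, chrom_length: int, window: int = 2000):
    """Return (win_start, win_end) for a `window`-bp span centered on [start, end).

    Coordinates are 0-based, half-open (an assumption of this sketch).
    The window is shifted inward if it would run past either chromosome end.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if end - start > window:
        raise ValueError("region is longer than the window")
    if chrom_length < window:
        raise ValueError("chromosome is shorter than the window")
    center = (start + end) // 2
    win_start = center - window // 2
    # Clamp so that the full window stays inside [0, chrom_length).
    win_start = max(0, min(win_start, chrom_length - window))
    return win_start, win_start + window


# A 600 bp region extended to 2 kb using its genomic flanks.
region_window = centered_window(10_000, 10_600, chrom_length=1_000_000)
```

The returned coordinates can then be passed to whatever genome reader you already use to fetch the 2kb sequence.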

Sequences < 1,000bp

Thanks for letting us know. It should actually only allow sequences >2kb; we are looking into this and will update here once it's fixed.

Sequences < 1,000bp

Sorry for the late update. Currently, if the input is smaller than 2kb, it will be padded with "N"s. I don't recommend using FASTA input smaller than 2kb unless it is...
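To make the padding behavior concrete, here is a small sketch that pads a short sequence with "N" up to 2kb and reports how much of the result is padding. The helper names `pad_to_length` and `n_fraction` are hypothetical, and padding equally on both sides is an assumption of this sketch; the comment above does not say on which side the tool pads.

```python
def pad_to_length(seq: str, target: int = 2000, pad_char: str = "N") -> str:
    """Pad `seq` with `pad_char` on both sides until it is `target` long.

    Sequences already at or above `target` are returned unchanged.
    Symmetric padding is an assumption of this sketch.
    """
    if len(seq) >= target:
        return seq
    missing = target - len(seq)
    left = missing // 2
    right = missing - left
    return pad_char * left + seq + pad_char * right


def n_fraction(seq: str) -> float:
    """Fraction of positions in `seq` that are 'N' (case-insensitive)."""
    return seq.upper().count("N") / len(seq) if seq else 0.0


short_seq = "ACGT" * 100             # 400 bp input
padded = pad_to_length(short_seq)    # 2000 bp after padding
padding_share = n_fraction(padded)   # share of the input that is uninformative
```

A high padding share is a quick signal that predictions for that input should be treated with caution.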