Jian Zhou

22 comments by Jian Zhou

Thank you for bringing up the need for multiple inputs! As discussed with @kathyxchen, we are planning to support multiple inputs as well as multiple targets. Currently, we have a...

Do you want to use sequences that are not in the TSSes that we used for training and validation (if so, you can refer to train.py for how to...

Thanks for the question! Yes, that is right. The first half of the chromatin predictions is computed from the forward strand sequences and the second half is for the same...

Thanks for reporting this! You are right - this is a mistake in the paper supplement. I will contact the journal to correct this.

Thanks for the question. The representative TSS positions are determined from FANTOM CAGE data, which measures the 5' end more precisely, so yes, it can be different from the gene...

To generate the equivalent of Xreducedall.2002.npy for new organisms, you will first need to train a sequence model to predict chromatin profiles in the organism of interest,...

If you want to train new sequence models for epigenetics data, feel free to check out https://github.com/FunctionLab/selene (there are tutorials and manuscript examples provided). Note that for the ExPecto model there...

Beluga requires a 2kb sequence. Padding with N is not guaranteed to give meaningful results. If your sequence has any flanking sequence in the genomic context, you can add that to...
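One way to read the advice about flanking sequence is to widen a short region to a full 2kb window using the surrounding genome rather than N padding. The sketch below is only an illustration in plain Python: the function name `expand_to_window`, the 0-based half-open coordinates, and the choice to center the window on the region are all assumptions, not part of Beluga or ExPecto.

```python
def expand_to_window(chrom_seq, start, end, window=2000):
    """Return (new_start, new_end) for a `window`-bp interval around [start, end).

    Hypothetical helper: coordinates are 0-based, half-open, and the window is
    centered on the region (an assumption, not something the tool prescribes).
    """
    if end - start > window:
        raise ValueError("region is longer than the window")
    if len(chrom_seq) < window:
        raise ValueError("chromosome is shorter than the window")
    center = (start + end) // 2
    new_start = center - window // 2
    # Shift the window back inside the chromosome if it runs off either edge.
    new_start = max(0, min(new_start, len(chrom_seq) - window))
    return new_start, new_start + window


# Toy 4 kb "chromosome" standing in for a real genome sequence.
chrom = "ACGT" * 1000
lo, hi = expand_to_window(chrom, 1500, 1600)
window_seq = chrom[lo:hi]  # 2000 bp of real sequence, no N padding needed
```

With a real genome you would fetch `chrom_seq` (or just the interval) from an indexed FASTA file; that step is left out here to keep the example self-contained.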

Thanks for letting us know. It should actually only allow sequences >2kb - we are looking into this and will update here once it's fixed.

Sorry for the late update. Currently, if the input is smaller than 2kb, it will be padded with "N"s. I don't recommend using FASTA input smaller than 2kb unless it is...
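To make the padding behavior concrete, here is a minimal sketch of what padding a short sequence with "N"s up to 2kb could look like, plus a check of how much of the result is padding. The helper names `pad_to_length` and `n_fraction` are hypothetical, and splitting the padding evenly between both sides is an assumption; the comment above does not say where the tool places the "N"s.

```python
def pad_to_length(seq, target=2000, pad_char="N"):
    """Pad `seq` with `pad_char` to `target` bases (assumed symmetric padding)."""
    if len(seq) >= target:
        return seq
    missing = target - len(seq)
    left = missing // 2
    right = missing - left
    return pad_char * left + seq + pad_char * right


def n_fraction(seq):
    """Fraction of the sequence made up of N bases (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


short = "ACGT" * 100            # 400 bp input, well under 2 kb
padded = pad_to_length(short)   # 800 N + 400 bp + 800 N
frac = n_fraction(padded)       # share of the model input that carries no signal
```

A high `frac` is a quick signal that most of the input is uninformative, which is consistent with the warning that padded inputs are not guaranteed to give meaningful results.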