
Unsupervised ECNet

Open eric-jm-lang opened this issue 3 years ago • 7 comments

Hello, in the ECNet paper, you built an unsupervised ECNet model that does not require DMS data for training. It uses the predicted probability of an amino acid at a position as a proxy for fitness. Is there specific code for this unsupervised model? Or can the current ECNet code produce an unsupervised model if a different input is passed to --train? Could you please provide more details on how to build such a model? Many thanks in advance.

eric-jm-lang avatar Oct 11 '21 22:10 eric-jm-lang

+1

NOforgetQY avatar Aug 08 '22 07:08 NOforgetQY

I have the same confusion!

TernencezzZ avatar Oct 14 '22 03:10 TernencezzZ

Same here!

meehljd avatar Oct 14 '22 18:10 meehljd

Maybe one could hack a training file with a bunch of neutral mutations

mutation    score
M1M         1.0
F12F;L30L   1.0
G89G        1.0
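
For anyone who wants to reproduce this hack, here is a minimal sketch of how such a file could be generated. It is not part of ECNet: the function names `neutral_mutation_rows` and `write_tsv` are hypothetical, the tab delimiter is an assumption based on the layout above, and the example sequence is made up.

```python
import csv
import random


def neutral_mutation_rows(wt_seq, n_rows, max_sites=2, seed=0):
    """Build (mutation, score) rows where every 'mutation' keeps the wild-type residue.

    Assumes 1-based positions and ';' as the separator for multi-site variants,
    as in the example rows above. max_sites must not exceed len(wt_seq).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        k = rng.randint(1, max_sites)  # number of sites in this variant
        positions = sorted(rng.sample(range(len(wt_seq)), k))
        mutation = ";".join(f"{wt_seq[p]}{p + 1}{wt_seq[p]}" for p in positions)
        rows.append((mutation, 1.0))  # every neutral variant gets the same score
    return rows


def write_tsv(rows, path):
    """Write the rows with a 'mutation<TAB>score' header (delimiter is an assumption)."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        writer.writerow(["mutation", "score"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sequence, for illustration only.
    write_tsv(neutral_mutation_rows("MKTAYIAKQRQISFVKSHFSRQ", n_rows=5), "neutral_train.tsv")
```

Note that every row carries the same score, so a model trained on this file has no fitness signal to learn from.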

meehljd avatar Oct 14 '22 18:10 meehljd

Looks like it worked. Tested with a separate test file with random mutations. Still need to validate with experimental data.

Prediction from Training File with neutral mutations:

mutation      score       prediction
D36D;G142G    1.00000000  1.04933691
E145E;S128S   1.00000000  1.04933691
L19L;N152N    1.00000000  1.04933691
E237E;P12P    1.00000000  1.04933691

Prediction from Test File with random mutations:

mutation      score       prediction
A9D;T27L      1.00000000  0.51061106
A124A;I3T     1.00000000  0.98425829
V258L;A211L   1.00000000  -0.28957328
A276R;K252E   1.00000000  1.15801334
E175E;F14A    1.00000000  1.18147123

meehljd avatar Oct 14 '22 19:10 meehljd

Ignore my previous naive attempt. I re-read the paper and recalled @luoyunan used homologous sequences to train a bidirectional model on masked amino acid residues. I reviewed the ECNet and Dataset classes. The provided model can only process mutation-feature paired TSV files for training. Training on homologous sequences must be in a different code base.
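
To illustrate the idea of using predicted amino acid probabilities as a fitness proxy, here is a rough sketch of how a variant could be scored once some model has produced per-position probabilities. This is not the authors' code. The log-ratio against the wild-type residue is my assumption about a reasonable scoring rule, and `parse_mutations`, `score_variant`, and the `probs` layout (one dict of residue probabilities per position) are hypothetical.

```python
import math


def parse_mutations(mutation):
    """Split a string such as 'A9D;T27L' into (wild_type, position, mutant) tuples."""
    parsed = []
    for token in mutation.split(";"):
        parsed.append((token[0], int(token[1:-1]), token[-1]))
    return parsed


def score_variant(mutation, wt_seq, probs):
    """Sum log p(mutant) - log p(wild type) over the mutated positions.

    probs[i] is assumed to map each residue letter to the model's predicted
    probability at 0-based position i. Positions in `mutation` are 1-based.
    """
    score = 0.0
    for wt, pos, mut in parse_mutations(mutation):
        if wt_seq[pos - 1] != wt:
            raise ValueError(f"{wt}{pos}{mut} does not match the wild-type sequence")
        p = probs[pos - 1]
        score += math.log(p[mut]) - math.log(p[wt])
    return score
```

Under this rule a neutral "mutation" such as `M1M` scores exactly 0, and substitutions the model considers less likely than the wild-type residue score below 0.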

meehljd avatar Oct 20 '22 17:10 meehljd

Thank you for your input @meehljd. Hope @luoyunan can provide more information on how to do this.

eric-jm-lang avatar Nov 10 '22 15:11 eric-jm-lang