rgn
Converting Predicted Output To PDB File
Do you have a script lying around that converts the predicted output to a PDB file?
This would be really helpful. I have no idea how to interpret the content of the tertiary file. Is there some documentation?
Hi, I found myself in the same situation and wrote a small R script that converts tertiary files to PDB format. It seems to be enough to visualize predictions in PyMOL. @alquraishi probably has a script that does a more complete conversion, but feel free to use or edit this one in the meantime if you find it useful.
https://github.com/lafita/rgn/blob/master/data_processing/tertiary2pdb.R
I used the documentation here: https://github.com/aqlaboratory/proteinnet/blob/master/docs/proteinnet_records.md
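For anyone who prefers Python, here is a minimal sketch of the same idea. It is not the R script above and not an official converter. It assumes the tertiary file holds three rows (x, y, z) of 3L values each, ordered N, CA, C per residue, with coordinates in picometers as I read the ProteinNet documentation, and that lines starting with `#` are headers. The function names `parse_tertiary` and `tertiary_to_pdb` are made up for this example.

```python
# One-letter to three-letter residue names for the 20 standard amino acids.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}
BACKBONE_ATOMS = ("N", "CA", "C")  # assumed order within each residue


def parse_tertiary(text):
    """Read three rows of numbers (x, y, z), skipping '#' header lines."""
    rows = [
        [float(value) for value in line.split()]
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    if len(rows) != 3 or len({len(row) for row in rows}) != 1 or len(rows[0]) % 3:
        raise ValueError("expected 3 rows of equal length, a multiple of 3")
    return rows


def tertiary_to_pdb(rows, sequence, chain="A", scale=0.01):
    """Return PDB ATOM lines; scale=0.01 converts picometers to angstroms."""
    xs, ys, zs = rows
    if len(xs) != 3 * len(sequence):
        raise ValueError("coordinate count does not match 3 * sequence length")
    lines = []
    for i, (x, y, z) in enumerate(zip(xs, ys, zs)):
        res_index, atom_index = divmod(i, 3)
        name = BACKBONE_ATOMS[atom_index]
        res = THREE_LETTER.get(sequence[res_index], "UNK")
        lines.append(
            f"ATOM  {i + 1:5d}  {name:<3s} {res:>3s} {chain}{res_index + 1:4d}    "
            f"{x * scale:8.3f}{y * scale:8.3f}{z * scale:8.3f}{1.0:6.2f}{0.0:6.2f}"
            f"          {name[0]:>2s}"
        )
    lines.append("END")
    return lines
```

To use it, read the tertiary file into a string, pass it through `parse_tertiary`, and write `"\n".join(tertiary_to_pdb(rows, sequence))` to a `.pdb` file. If the structure looks 100 times too large or too small in the viewer, the unit assumption is wrong for your file and `scale` needs adjusting.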
Can the script 'https://github.com/lafita/rgn/blob/master/data_processing/tertiary2pdb.R' convert atoms other than 'N', 'CA', and 'C'?
The RGN model predicts only those three backbone atoms. To add side chains you would need other software, such as Rosetta.
@lafita Hi, I would like to ask how to convert the pdb file into a cif file? Or to convert a tertiary file into a cif file?
Converting PDB to mmCIF format should be straightforward, and many tools can do it. For example, you can open the PDB file in PyMOL and save it as mmCIF, or use this website: https://mmcif.pdbj.org/converter
@lafita Thank you so much
@Inspriatio Hi, I think you have succeeded in running predictions with RGN. Could you please help me with the prediction stage? I tried to predict the structure of a single protein sequence by following the "Predict structure of a single new sequence using a trained model" section of the README. The PSSM computation and the conversion to tfrecord both succeeded. However, after running protling.py, no output was generated in the outputsTesting directory. Can you share your prediction process? A separate issue has been open for about 3 weeks here without any response, so I am trying here. Thank you in advance.
@stardustcx Hi, here is how I predict the tertiary structure from an amino acid sequence:
1. Place the sequence file to be predicted in the data_processing folder and run
sh jackhmmer.sh xxxx.fasta.txt proteinnet7.fa
2. Generate a proteinnet file with the command
~/anaconda3/envs/tf/bin/python convert_to_proteinnet.py xxxx.fasta.txt
3. Generate a tfrecord file with the command
~/anaconda3/envs/tf/bin/python convert_to_tfrecord.py xxxx.fasta.txt.proteinnet xxxx.fasta.txt.tfrecord 42
4. Put the tfrecord file under <baseDirectory>/data/<datasetName>/testing/ and run this command from the model folder
~/anaconda3/envs/tf/bin/python ~/rgn-master/model/protling.py ~/rgn-master/configurations/CASP7.config -d ~/rgn-master/RGN7/ -p -e weighted_testing -g0 -f 0.8
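If it helps, the four steps above can be collected in one place. This is only a sketch that assembles the command lines from this comment and prints them. It runs nothing, and it does not copy the tfrecord file into the testing directory, which still has to be done by hand between steps 3 and 4. The function name `build_commands` and its parameters are invented for illustration; the script names, flags, and the literal `42` are copied as-is from the steps above.

```python
def build_commands(
    fasta,
    python="~/anaconda3/envs/tf/bin/python",  # interpreter path from the post
    rgn_root="~/rgn-master",                  # checkout location from the post
    database="proteinnet7.fa",
    config="CASP7.config",
    run_dir="RGN7",
):
    """Return the four commands from the steps above as argument lists."""
    return [
        # Step 1: run from the data_processing folder.
        ["sh", "jackhmmer.sh", fasta, database],
        # Step 2: produces <fasta>.proteinnet.
        [python, "convert_to_proteinnet.py", fasta],
        # Step 3: produces <fasta>.tfrecord; "42" is taken verbatim from the post.
        [python, "convert_to_tfrecord.py",
         f"{fasta}.proteinnet", f"{fasta}.tfrecord", "42"],
        # Step 4: run only after moving the tfrecord into the testing directory.
        [python, f"{rgn_root}/model/protling.py",
         f"{rgn_root}/configurations/{config}",
         "-d", f"{rgn_root}/{run_dir}/",
         "-p", "-e", "weighted_testing", "-g0", "-f", "0.8"],
    ]


if __name__ == "__main__":
    for command in build_commands("xxxx.fasta.txt"):
        print(" ".join(command))
```

Printing the commands first makes it easy to compare them against what you actually ran before looking for differences in the generated files.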
Thanks for the reply. I did almost the same as you. I compared the .tfrecord files in the validation directory, which do produce results when placed in the testing directory, with the one I generated myself, and found that my file lacks the tertiary field. Is this field present in your .tfrecord file?
Hi, can you check the log output in baseDirectory/logs after running protling.py? I also had some problems in the beginning, and the logs gave me the right hints. In my case I had to switch to tensorflow-gpu and set some environment variables so that the GPU was found.
I checked the log. It found the GPU correctly, and I could also see the process on the GPU while it was running. I really cannot find any useful information in the log. The run just finished with a normal exit but without any prediction.
Why do I get the following errors when I run jackhmmer.sh, and what can I do about them?
jackhmmer.sh: 10: jackhmmer.sh: jackhmmer: not found
jackhmmer.sh: 11: jackhmmer.sh: esl-reformat: not found
jackhmmer.sh: 12: jackhmmer.sh: esl-weight: not found
jackhmmer.sh: 13: jackhmmer.sh: esl-alistat: not found
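A "not found" message like this usually means the shell cannot locate the program on your PATH. As far as I know, jackhmmer is part of HMMER and the esl-* programs are the Easel miniapps that ship with the HMMER source, so the likely cause is that they are not installed or their directory is not on PATH. Here is a small sketch to check which of the four are visible; the tool list is taken from the error messages, and the helper name `missing_tools` is made up.

```python
import shutil

# Programs named in the error messages from jackhmmer.sh.
REQUIRED_TOOLS = ["jackhmmer", "esl-reformat", "esl-weight", "esl-alistat"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If any are reported missing, install them or add the directory that contains them to PATH, then rerun jackhmmer.sh.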