How to generate non-natural sequences for a protein of interest using ProGen
When running the script with the command 'python3 sample.py --model ${model} --t 0.8 --p 0.9 --max-length 1024 --num-samples 2 --context "1"', the program runs successfully. If I want to generate non-natural amino acid sequences for a novel protein, what steps should I take? Specifically, should I prepare an input file containing the natural amino acid sequence in FASTA format or a PDB file containing the protein structure? Please provide guidance on the appropriate input format for generating non-natural sequences using ProGen.
How should I write the command to run the program for generating non-natural sequences using ProGen?
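One possible way to write such a command, sketched in Python so the pieces are explicit. It reuses only the flags from the command above. The key assumption, not confirmed in this thread, is that `--context` takes a plain string (the leading `1` from the original command followed by one-letter residue codes from the start of the protein of interest) rather than a FASTA or PDB file. The helper name `build_sample_command`, the model name `my-model`, and the example prefix are placeholders.

```python
import shlex

# The 20 standard one-letter residue codes. Restricting the prefix to these
# is a conservative choice of this sketch, not a documented ProGen rule.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def build_sample_command(model, prefix, t=0.8, p=0.9, max_length=1024, num_samples=2):
    """Build a sample.py command line seeded with a sequence prefix.

    Assumption: --context accepts the leading "1" from the original command
    followed directly by the residues to condition on.
    """
    prefix = prefix.strip().upper()
    unknown = set(prefix) - STANDARD_RESIDUES
    if unknown:
        raise ValueError(f"non-standard residue codes in prefix: {sorted(unknown)}")
    argv = [
        "python3", "sample.py",
        "--model", model,
        "--t", str(t),
        "--p", str(p),
        "--max-length", str(max_length),
        "--num-samples", str(num_samples),
        "--context", "1" + prefix,
    ]
    # shlex.join quotes arguments only where the shell would need it.
    return shlex.join(argv)


if __name__ == "__main__":
    # "my-model" and the prefix below are placeholders, not real inputs.
    print(build_sample_command("my-model", "MKTAYIAKQR"))
```

If the assumption holds, a longer prefix should keep the samples closer to the natural protein, while a short prefix (or just `1`) leaves the model free to generate more divergent sequences.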
Did you also find that the generated sequences are often quite "unrealistic", e.g. with the same residue type repeated over and over?
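To put a number on "repeated over and over", a minimal sketch that flags low-complexity samples by their longest single-residue run and by the Shannon entropy of their residue composition. This is a generic post-filter, not part of ProGen, and the thresholds `max_run` and `min_entropy` are arbitrary starting points that would need tuning.

```python
import math
from collections import Counter


def longest_run(seq):
    """Length of the longest stretch of one repeated residue."""
    best = run = 0
    prev = None
    for ch in seq:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best


def shannon_entropy(seq):
    """Residue composition entropy in bits; 0.0 means a single residue type."""
    if not seq:
        return 0.0
    n = len(seq)
    return sum((c / n) * math.log2(n / c) for c in Counter(seq).values())


def looks_repetitive(seq, max_run=6, min_entropy=3.0):
    """Flag a sequence with a long homopolymer run or a skewed composition.

    Both thresholds are assumptions of this sketch, not validated cutoffs.
    """
    return longest_run(seq) > max_run or shannon_entropy(seq) < min_entropy


if __name__ == "__main__":
    # Hypothetical samples for illustration only.
    for s in ["MKTAYIAKQRQISFVKSHFSRQ", "MKAAAAAAAAAAAAAAAAAAAA"]:
        print(s, longest_run(s), looks_repetitive(s))
```

Reporting how many samples are flagged at a given `--t` and `--p` would make it easier to compare observations between runs.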