ProteinLM
Protein Language Model
Hello, I found a similar issue (#5), but when I try to extract the embedding following your instructions, the dimension of the `transformer_output` doesn't seem to fit the...
roteinlm)xxxx@quant:~/ProteinLM/pretrain$ sh examples/pretrain_tape.sh
using world size: 1, data-parallel-size: 1, tensor-model-parallel size: 1, pipeline-model-parallel size: 1
using torch.float16 for parameters ...
WARNING: overriding default arguments for tokenizer_type:BertWordPieceLowerCase with tokenizer_type:BertWordPieceCase
------------------------ arguments...
Which format should the sequence JSON file use? Do we need to add spaces between amino acids?
In https://github.com/THUDM/ProteinLM/tree/main/pretrain the sequences are written without spaces:
{"text": "GCTVEDRCLIGMGAILLNGCVIGSGSLVAAGALITQ"}
{"text": "RTIKVRILHAIGFEGGLMLLTIPMVAYAMDMTLFQAILLDLSMTTCILVYTFIFQWCYDILENR"}
In https://github.com/THUDM/ProteinLM/tree/main/pretrain/protein_tools they are space-separated:
{"text": "G C T V E D...
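
In case it helps while waiting for an answer, here is a minimal sketch for converting between the two layouts, so the same data can be tried either way. It does not settle which layout the pretraining script expects. The helper names (`space_residues`, `unspace_residues`, `to_json_line`) are my own and are not part of ProteinLM.

```python
import json


def space_residues(sequence: str) -> str:
    """Return the sequence with one space between residues: 'GCT' -> 'G C T'."""
    # Drop any existing whitespace first so already-spaced input is not double-spaced.
    return " ".join("".join(sequence.split()))


def unspace_residues(sequence: str) -> str:
    """Return the sequence with all whitespace removed: 'G C T' -> 'GCT'."""
    return "".join(sequence.split())


def to_json_line(sequence: str, spaced: bool) -> str:
    """Serialize one sequence as a single {"text": ...} JSON line (hypothetical helper)."""
    text = space_residues(sequence) if spaced else unspace_residues(sequence)
    return json.dumps({"text": text})


if __name__ == "__main__":
    # Example sequence taken from the pretrain README quoted above.
    seq = "GCTVEDRCLIGMGAILLNGCVIGSGSLVAAGALITQ"
    print(to_json_line(seq, spaced=False))
    print(to_json_line(seq, spaced=True))
```

Both functions tolerate input in either layout, so they can be run over an existing file line by line without first checking which form it is in.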