BioGPT "data" is not found after executing the code on Github

Open thiptanawatp opened this issue 2 years ago • 4 comments

Hello,

Could anybody please explain how I can run the standard BioGPT model using the code below?

    import torch
    from fairseq.models.transformer_lm import TransformerLanguageModel

    m = TransformerLanguageModel.from_pretrained(
        "checkpoints/Pre-trained-BioGPT",
        "checkpoint.pt",
        "data",
        tokenizer='moses',
        bpe='fastbpe',
        bpe_codes="data/bpecodes",
        min_len=100,
        max_len_b=1024)
    m.cuda()
    src_tokens = m.encode("COVID-19 is")
    generate = m.generate([src_tokens], beam=5)[0]
    output = m.decode(generate[0]["tokens"])
    print(output)

After running this, I always get an error saying that "data" is not found. I'm not sure whether I have to download the data separately from an external source.

Thanks

Feb 18 '23 16:02 thiptanawatp

I am getting the same error

Feb 22 '23 08:02 Dontmindmes

@thiptanawatp did you clone the repo itself? It contains the data.

Feb 22 '23 13:02 ahvdk

@ahvdk I tried both downloading the .ZIP file manually and git clone ..., but nothing appeared under the BioGPT/data folder except bpecodes and dict.txt. Any suggestions?

Thanks so much

Feb 23 '23 03:02 thiptanawatp

Did you cd into the repo before running your script, or at least add the repo's path to Python's path?

Feb 24 '23 03:02 thisismygitrepo
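
To make the last suggestion concrete: the snippet in the question passes relative paths ("checkpoints/Pre-trained-BioGPT", "data", "data/bpecodes"), so they can only be found if the script runs from the directory that contains them. Below is a minimal sketch of a pre-flight check, not something from the BioGPT repo itself. The helper `find_missing` and the `BIOGPT_ROOT` environment variable are hypothetical names introduced here for illustration, and the assumption is that the checkpoint sits under the clone at the path used in the question.

```python
import os
from pathlib import Path

# Relative paths copied from the snippet in the question. They are resolved
# against the current working directory, not against the script's location.
REQUIRED_PATHS = [
    "checkpoints/Pre-trained-BioGPT/checkpoint.pt",
    "data",
    "data/bpecodes",
]


def find_missing(repo_root, required=REQUIRED_PATHS):
    """Return the entries of `required` that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in required if not (root / p).exists()]


if __name__ == "__main__":
    # BIOGPT_ROOT is a hypothetical variable; point it at your local clone.
    repo_root = Path(os.environ.get("BIOGPT_ROOT", "."))
    missing = find_missing(repo_root)
    if missing:
        print(f"Missing under {repo_root.resolve()}:")
        for p in missing:
            print("  ", p)
    else:
        # Change directory so the relative paths in the snippet resolve.
        os.chdir(repo_root)
        print("All paths found; working directory set to", Path.cwd())
```

If this reports "data" as missing, the script is most likely being started from a different directory than the clone. If only the checkpoint entry is missing, the working directory is probably fine and the model file is what is absent.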
