dynalang
Compiler error when running Messenger env
Hi,

`max_len` in `_embed`, surrounded by the `>>>>>` markers in the block below, hasn't been declared yet. I am assuming it should be `self.max_token_seqlen`?

The code is from here:
```python
def _embed(self, string):
    if f"{string}_{self.max_token_seqlen}" not in self.token_cache \
            or string not in self.embed_cache:
        print(string)
        tokens = self.tokenizer(string, return_tensors="pt",
                                add_special_tokens=True)  # add </s> separators
        import torch
        with torch.no_grad():
            # (seq, dim)
            embeds = self.encoder(**tokens).last_hidden_state.squeeze(0)
        self.embed_cache[string] = embeds.cpu().numpy()
>>>>>
        self.token_cache[f"{string}_{max_len}"] = {
            k: v.squeeze(0).cpu().numpy() for k, v in tokens
        }
>>>>>
    return (
        self.embed_cache[string],
        self.token_cache[f"{string}_{self.max_token_seqlen}"]
    )
```
Sorry about that, fixed!

FYI, I haven't tested this code path for a while; the default setting should use the cached embeddings from the messenger_embeds.pkl file, which will be much faster than embedding the sentences online in the environment.
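
For anyone hitting this, a minimal sketch of that default path, assuming the pickle simply maps sentence strings to precomputed numpy embeddings (the filename comes from the comment above; the layout is an assumption, not taken from the repo):

```python
import pickle

# Assumed layout: {sentence: np.ndarray of shape (seq, dim)} -- hypothetical,
# check the actual file contents before relying on this.
with open("messenger_embeds.pkl", "rb") as f:
    embed_cache = pickle.load(f)

# With the cache populated up front, the environment can look up embeddings
# instead of running the encoder online for every sentence.
```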
Thank you!
And also `tokens` in

```python
self.token_cache[f"{string}_{max_len}"] = {
    k: v.squeeze(0).cpu().numpy() for k, v in tokens
}
```

I assume it should be `tokens.items()`?
Yep! Also fixed now.
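
Putting both fixes together, a minimal self-contained sketch of the corrected `_embed` (the T5 encoder and the model name are assumptions suggested only by the `</s>` comment in the original snippet, not confirmed by the repo):

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

class Embedder:
    def __init__(self, model_name="t5-small", max_token_seqlen=40):
        # model_name and max_token_seqlen are placeholder values.
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = T5EncoderModel.from_pretrained(model_name).eval()
        self.max_token_seqlen = max_token_seqlen
        self.token_cache = {}
        self.embed_cache = {}

    def _embed(self, string):
        key = f"{string}_{self.max_token_seqlen}"  # fix 1: not max_len
        if key not in self.token_cache or string not in self.embed_cache:
            tokens = self.tokenizer(string, return_tensors="pt",
                                    add_special_tokens=True)
            with torch.no_grad():
                # (seq, dim)
                embeds = self.encoder(**tokens).last_hidden_state.squeeze(0)
            self.embed_cache[string] = embeds.cpu().numpy()
            # fix 2: iterating a BatchEncoding yields only its keys, so
            # unpacking `k, v` requires .items().
            self.token_cache[key] = {
                k: v.squeeze(0).cpu().numpy() for k, v in tokens.items()
            }
        return self.embed_cache[string], self.token_cache[key]
```

(Iterating a Hugging Face `BatchEncoding` directly yields only key names, which is why the original `for k, v in tokens` fails to unpack.)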
Also, is there any possibility that you could publish the trained weights? Thank you for your help!