maybe_tensor() fails with the ESMC 6B SageMaker model.
I'm using the ESM Cambrian 6B model via SageMaker to embed proteins using (in part) the following code:
client = ESM3SageMakerClient(endpoint_name=endpoint_name, model=model_name)
config = LogitsConfig(sequence=True, return_embeddings=True)
seq = "AAAAA"
protein = ESMProtein(sequence=seq)
protein_tensor = client.encode(protein)
logits = client.logits(protein_tensor, config)
Which will throw the following error:
TypeError: only integer tensors of a single element can be converted to an index
This does not happen with the 300M and 600M models.
As far as I can tell, the issue is that the larger model returns a list of tensors instead of a single tensor. When maybe_tensor is called with convert_none_to_nan=False, the list of tensors is never converted to a numpy array, and torch.tensor(x) fails.
I patched the function with this code and it seems to have solved the problem for me:
import numpy as np
import torch

def patched_maybe_tensor(x, convert_none_to_nan=False):
    if x is None:
        return None
    if convert_none_to_nan:
        # Replace None elements with NaN before the float32 cast,
        # since np.asarray cannot cast None to float32 directly
        x = [np.nan if v is None else v for v in x]
    # Coercing through numpy also stacks a list of tensors into a
    # single array, which torch.tensor() accepts
    x = np.asarray(x, dtype=np.float32)
    return torch.tensor(x)
Thanks for reporting this, we'll be able to fix it early next week.
Apologies for the delay here. We pushed version 1.02 on Friday, which resolves this issue.
We've also made changes to our SageMaker release process so this doesn't happen again.