pyMSA

Type equality fails with 'is'; use '==' instead

Open BAW2501 opened this issue 1 year ago • 0 comments

Issue with NumPy array and character comparison

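When working with rather large sequences, we store them in NumPy arrays to save memory and make some manipulations faster. However, substitution_matrix.py detects gaps with char1 is self.gap_character. The is operator compares object identity, not value, so when char1 is an np.str_ taken from an array and self.gap_character is a plain str (which it always is), the check fails even though both hold the same character.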
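The mismatch can be reproduced outside pyMSA. The following is a minimal sketch, assuming NumPy is installed; the variable names (`gap_character`, `sequence`, `char1`) are illustrative and not taken from the library.

```python
import numpy as np

# A plain Python string, as self.gap_character is described in this issue.
gap_character = "-"

# Storing a sequence in a NumPy array makes each element an np.str_ object.
sequence = np.array(list("S-"))
char1 = sequence[1]

# Identity check: False, because char1 is a different object than gap_character.
same_object = char1 is gap_character

# Value check: True, because both hold the character "-".
same_value = bool(char1 == gap_character)

print(type(char1).__name__, same_object, same_value)
```

With a plain `str` on both sides the identity check tends to succeed in CPython only because short strings are often shared objects, which is an implementation detail rather than a guarantee.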

The error message is as follows:

File "c:\Users\Dell\PycharmProjects\MSA_Gym_ENV\MultipleSequenceAlignmentEnv.py", line 92, in calculate_reward
    return SumOfPairs(msa_obj, Blosum62()).compute()
File "C:\Users\Dell\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymsa\core\score.py", line 40, in compute
    final_score += self.get_column_score(k)
File "C:\Users\Dell\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymsa\core\score.py", line 126, in get_column_score
    score_of_column += get_score_of_two_chars(self.substitution_matrix, char_a, char_b)
File "C:\Users\Dell\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymsa\core\score.py", line 27, in get_score_of_two_chars
    return int(substitution_matrix.get_distance(char_a, char_b))
File "C:\Users\Dell\AppData\Local\Programs\Python\Python311\Lib\site-packages\pymsa\core\substitution_matrix.py", line 30, in get_distance
    raise Exception('The pair ({0},{1}) couldn't be found in the substitution matrix'.format(char1, char2))
Exception: The pair (S,-) couldn't be found in the substitution matrix

The current code (from substitution_matrix.py) is:

if char1 is self.gap_character and char2 is self.gap_character:
    distance = 1
elif char1 is self.gap_character or char2 is self.gap_character:
    distance = self.gap_penalty

A suggested fix is:

if char1 == self.gap_character and char2 == self.gap_character:
    distance = 1
elif char1 == self.gap_character or char2 == self.gap_character:
    distance = self.gap_penalty

This fixes the issue because the == operator compares the characters by value, whereas the is operator only succeeds when both operands are the same object.
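To check the suggested comparison in isolation, here is a small self-contained sketch. `GapAwareScores` and `StrSubclass` are hypothetical stand-ins written for this example, not pyMSA classes, and the score and penalty values are illustrative only. `StrSubclass` mimics the relevant property of np.str_ (a str subclass whose instances are distinct objects from the plain gap string) so that the snippet runs without NumPy.

```python
class StrSubclass(str):
    """Hypothetical stand-in for np.str_: equal in value to a str, but a distinct object."""


class GapAwareScores:
    """Hypothetical miniature of the gap handling discussed above (not pyMSA's class)."""

    def __init__(self, scores, gap_penalty=-8, gap_character="-"):
        self.scores = scores                # illustrative {(char1, char2): score} table
        self.gap_penalty = gap_penalty      # illustrative penalty value
        self.gap_character = gap_character  # plain str

    def get_distance(self, char1, char2):
        # Value comparison, as in the suggested fix.
        if char1 == self.gap_character and char2 == self.gap_character:
            return 1
        if char1 == self.gap_character or char2 == self.gap_character:
            return self.gap_penalty
        return self.scores[(char1, char2)]


matrix = GapAwareScores({("S", "S"): 4})
gap_like = StrSubclass("-")

mixed_score = matrix.get_distance("S", gap_like)          # one gap: gap penalty
both_gaps_score = matrix.get_distance(gap_like, gap_like)  # two gaps: 1
print(mixed_score, both_gaps_score)
```

With the original `is` checks, neither gap branch would be taken for `gap_like`, and the lookup would fall through to the score table, which is where the reported exception comes from.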

BAW2501 • Jun 18 '23 17:06