coursera-deep-learning-specialization
C5_W4_A1_Transformer_EX3_scaled_attention_logits
Your scaled_attention_logits is computed incorrectly, since the unit test gives:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-41-00665b20febb> in <module>
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)
~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
73 assert np.allclose(weights, [[0.30719590187072754, 0.5064803957939148, 0.0, 0.18632373213768005],
74 [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862],
---> 75 [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862]]), "Wrong masked weights"
76 assert np.allclose(attention, [[0.6928040981292725, 0.18632373213768005],
77 [0.6163482666015625, 0.2326965481042862],
AssertionError: Wrong masked weights
The correct masking step should be:

if mask is not None:  # Don't replace this None
    scaled_attention_logits += (1 - mask) * -1e9
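
For reference, here is a minimal sketch of how that line fits into the whole scaled_dot_product_attention function, assuming TensorFlow and the assignment's convention that mask entries equal to 1 mark positions to keep (0 marks positions to mask out); treat it as an illustration, not the exact notebook solution:

import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    # Raw attention scores: Q . K^T
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # Scale by sqrt(d_k) to keep the logits in a reasonable range
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Push masked positions (mask == 0) toward -inf so softmax assigns them ~0 weight
    if mask is not None:
        scaled_attention_logits += (1 - mask) * -1e9

    # Normalize over the key axis, then weight the values
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)
    return output, attention_weights

With this masking term, the weights for masked positions come out essentially zero, which is what the "Wrong masked weights" assertion in the unit test checks.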
Cheers,