ᴸᵘᶜʳᵉᶜᵉ ˢʰⁱⁿ comments


Some problems about Bert

How about just: `index = randint(4, vocab_size - 1)`
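
A minimal sketch of what that one-liner could look like in context. It assumes `randint` is Python's standard-library `random.randint`, which includes both endpoints, and that ids 0 to 3 are reserved for special tokens. The comment states neither, and the names `NUM_RESERVED` and `sample_replacement_id` are hypothetical.

```python
import random

# Assumption: the first four ids (0..3) are special tokens that should
# never be drawn as random replacements.
NUM_RESERVED = 4


def sample_replacement_id(vocab_size, rng=random):
    """Draw a random token id from NUM_RESERVED to vocab_size - 1, inclusive."""
    # random.randint includes both endpoints, so vocab_size - 1 is the
    # largest id that can be returned.
    return rng.randint(NUM_RESERVED, vocab_size - 1)
```

If the `randint` in question is NumPy's `numpy.random.randint` instead, the upper bound is exclusive, so the same call would never draw the last id in the vocabulary.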

REINFORCE Correction

I agree. Each log p(A_t|S_t) should be multiplied by the **expected return G_t (the sum of FUTURE rewards)**, which decreases as t increases, not by the cumulative episode reward R, which is the same for...
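
To make the distinction concrete, here is a rough sketch of the reward-to-go weighting in plain Python. The function names and the `gamma` discount parameter are assumptions for illustration and are not taken from the issue. The sketch also assumes `rewards[t]` is the reward received after action A_t, so it counts toward G_t.

```python
def rewards_to_go(rewards, gamma=1.0):
    """Return G_t for every step t: the (discounted) sum of rewards from t onward."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each G_t reuses G_{t+1}: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, rewards, gamma=1.0):
    """Weight each log p(A_t|S_t) by its own G_t instead of the episode total R."""
    returns = rewards_to_go(rewards, gamma)
    # Negated because optimizers minimize, while REINFORCE maximizes return.
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

With rewards `[1.0, 1.0, 1.0]` and no discounting, the per-step weights are `[3.0, 2.0, 1.0]`, whereas weighting by the episode total would use 3.0 at every step.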

can't work at all

Such as?