transformer
Use zero_padding_mask together with a bias term to replace the bias-free linear projection.
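
One possible reading of this note, as a minimal sketch: a bias-free projection maps all-zero padding vectors to zero, while a projection with a bias maps them to the bias vector, so a mask would be needed to zero the padded positions again. The helper names `linear` and `masked_linear`, the pad id of 0, and the convention that the mask is 1 for real tokens and 0 for padding are assumptions made for illustration, not the repository's actual code.

```python
def linear(x, weight, bias=None):
    """Project one vector: y = x @ weight (+ bias).

    weight is a list of rows with shape (d_in, d_out).
    """
    out = [sum(xi * w for xi, w in zip(x, col)) for col in zip(*weight)]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out


def zero_padding_mask(token_ids, pad_id=0):
    """Hypothetical mask: 1.0 for real tokens, 0.0 for padding (assumed pad_id)."""
    return [0.0 if t == pad_id else 1.0 for t in token_ids]


def masked_linear(xs, mask, weight, bias):
    """Biased projection followed by zeroing of the padded positions."""
    return [[m * v for v in linear(x, weight, bias)] for x, m in zip(xs, mask)]


token_ids = [5, 7, 0]                          # last position is padding
xs = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]      # padding embedded as zeros
weight = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.5, 0.25]

no_bias = [linear(x, weight) for x in xs]          # padding stays zero
with_bias = [linear(x, weight, bias) for x in xs]  # padding becomes the bias
masked = masked_linear(xs, zero_padding_mask(token_ids), weight, bias)
```

Under these assumptions, the bias-free projection keeps padded rows at zero without any mask, which is presumably the property the mask restores once a bias is introduced.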