llama
Why is the value of hidden_dim in FeedForward calculated this way?
Why is the value of hidden_dim calculated this way?
    hidden_dim = int(2 * hidden_dim / 3)
    # custom dim factor multiplier
    if ffn_dim_multiplier is not None:
        hidden_dim = int(ffn_dim_multiplier * hidden_dim)
    hidden_dim = multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)
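To make the arithmetic concrete, here is a minimal sketch that wraps the same three steps in a standalone function and runs them on example numbers. The function name `ffn_hidden_dim` and the inputs (`dim = 4096`, an incoming `hidden_dim` of `4 * dim`, `multiple_of = 256`) are assumptions chosen for illustration, not values taken from the question. A commonly cited reading is that the 2/3 factor offsets the third weight matrix of a gated (SwiGLU-style) feed-forward layer, keeping the parameter count close to a plain two-matrix layer with width `4 * dim`, and that the last line rounds up to a multiple of `multiple_of`.

```python
from typing import Optional


def ffn_hidden_dim(
    hidden_dim: int,
    multiple_of: int,
    ffn_dim_multiplier: Optional[float] = None,
) -> int:
    """Hypothetical helper reproducing the three steps from the snippet above."""
    # Step 1: scale the incoming width by 2/3 (truncating toward zero).
    hidden_dim = int(2 * hidden_dim / 3)
    # Step 2: optional custom multiplier, applied after the 2/3 scaling.
    if ffn_dim_multiplier is not None:
        hidden_dim = int(ffn_dim_multiplier * hidden_dim)
    # Step 3: round up to the nearest multiple of `multiple_of`.
    return multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)


# Assumed example values, for illustration only.
dim = 4096
print(ffn_hidden_dim(4 * dim, multiple_of=256))  # 16384 -> 10922 -> 11008
```

With these assumed inputs, the 2/3 scaling gives 10922, and rounding up to the next multiple of 256 gives 11008.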