clang8
Hyperparameters for prediction
Can you tell me which hyperparameters were used for beam search at inference time, and whether any length or repetition penalty was applied? Thanks!
Hi, we used greedy decoding for inference.
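For readers unfamiliar with the term: greedy decoding picks the single highest-scoring token at every step, which is the same as beam search with a beam size of 1, so there is no beam width to tune. The reply does not mention any length or repetition penalty. Below is a minimal sketch of the idea, not the actual inference code used for clang8; the `greedy_decode` and `toy_scores` names, the token ids, and the score table are all hypothetical and only there to make the example runnable.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_scores: Callable[[List[int]], Sequence[float]],
    bos_id: int,
    eos_id: int,
    max_len: int = 32,
) -> List[int]:
    """Greedy decoding: at each step, append the highest-scoring next token.

    `next_token_scores` is a hypothetical stand-in for a model call that
    returns one score per vocabulary id given the tokens decoded so far.
    """
    tokens = [bos_id]
    for _ in range(max_len):
        scores = next_token_scores(tokens)
        # Argmax over the vocabulary; no beam, no length or repetition penalty.
        best = max(range(len(scores)), key=lambda i: scores[i])
        tokens.append(best)
        if best == eos_id:
            break
    return tokens


def toy_scores(prefix: List[int]) -> Sequence[float]:
    # Hypothetical 4-token vocabulary: 0 = BOS, 1 = EOS, 2 and 3 = ordinary tokens.
    # Scores depend only on the last token, purely for illustration.
    table = {
        0: [0.0, 0.1, 0.7, 0.2],
        2: [0.0, 0.2, 0.1, 0.7],
        3: [0.0, 0.9, 0.05, 0.05],
    }
    return table[prefix[-1]]


result = greedy_decode(toy_scores, bos_id=0, eos_id=1)
print(result)
```

In a real setup the scoring function would wrap the model's forward pass; the loop itself stays the same.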