char-rnn-tensorflow
Implement temperature
Implement temperature #22, with a default value of 1. The following description is cited from karpathy/char-rnn.
Temperature. An important parameter you may want to play with is -temperature, which takes a number in the range (0, 1] (0 not included), default = 1. The temperature divides the predicted log probabilities before the Softmax, so a lower temperature will cause the model to make more likely, but also more boring and conservative, predictions. Higher temperatures cause the model to take more chances and increase the diversity of results, but at the cost of more mistakes.
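For intuition, here is a minimal standalone numpy sketch (toy values, not code from this PR) of what dividing the logits by the temperature does to the resulting softmax distribution:

import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    # scale the logits, then apply a numerically stable softmax
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    exp = np.exp(scaled - np.max(scaled))
    return exp / np.sum(exp)

logits = np.log([0.5, 0.3, 0.2])               # toy predicted log probabilities
print(softmax_with_temperature(logits, 1.0))   # [0.5 0.3 0.2], unchanged
print(softmax_with_temperature(logits, 0.5))   # ~[0.66 0.24 0.11], sharper and more conservative
print(softmax_with_temperature(logits, 0.01))  # ~[1. 0. 0.], nearly greedy

At temperature 1 the distribution is unchanged; as the temperature drops toward 0 the probability mass collapses onto the most likely character.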
Looks good!
I think this line is undesirable:
self.probs = tf.nn.softmax(tf.div(self.logits, temperature))
This overwrites self.probs, which looks to be used later (when using the network as a language model), meaning it is impossible to do generation and get probs at the same time without trickery. It may make sense to add a new placeholder variable for temperature in the model __init__, which can be fed, along with a new temperature-scaled probs tensor that can be fetched (1). Or it may make sense to do the temperature sampling outside of the TensorFlow graph (2).
(1)
# add a new placeholder for temperature, defaulting to 1.0
def __init__(...):
    ...
    self.temperature = tf.placeholder_with_default(
        tf.constant(1.0, dtype=tf.float32), None)
    self.temp_probs = tf.nn.softmax(tf.div(self.logits, self.temperature))
    ...

def sample(...):
    # same as this PR, but use self.temp_probs where appropriate
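As a rough usage sketch for option (1), something like the following would fetch temperature-scaled probabilities; the feed keys (x, state) and placeholder names are assumptions for illustration, not code from the PR:

# hypothetical usage of option (1)
# feeding self.temperature overrides the default of 1.0; omitting it
# from the feed keeps the default, so plain language-model queries
# against self.probs are unaffected
feed = {self.input_data: x, self.initial_state: state,
        self.temperature: 0.5}
probs, state = sess.run([self.temp_probs, self.final_state], feed)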
(2)
# do the temperature sampling outside the tf graph
import numpy as np

def __init__(...):
    # same as before

def sample(...):
    ...
    # when appropriate, run the graph to get the raw self.logits
    logits, state = sess.run([self.logits, self.final_state], feed)
    logits = logits[0]
    if temperature == 0.0:
        # treat temperature 0 as greedy (argmax) decoding
        sample = np.argmax(logits)
    else:
        # numerically stable softmax over temperature-scaled logits
        scale = logits / temperature
        exp = np.exp(scale - np.max(scale))
        soft = exp / np.sum(exp)
        sample = np.random.choice(len(soft), p=soft)
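The trade-off between the two sketches: option (1) keeps the scaling inside the TensorFlow graph, so self.probs stays untouched and the scaled distribution is computed wherever the graph runs; option (2) leaves the graph completely unchanged but pulls the raw logits back to the host in numpy on every sampling step. The temperature == 0.0 branch is a convenience: as the temperature approaches 0 the softmax collapses onto the largest logit, so argmax is the greedy limiting behavior.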
@vanechu This PR has merge conflicts.
I made a new PR to implement this here, with the merge conflicts fixed.
@fujimotomh Good implementation.