seq2seq
`templatemethod` does not work as expected
```python
import tensorflow as tf
from seq2seq.graph_utils import templatemethod

@templatemethod("trial")
def trial(x):
  w = tf.get_variable('w', [])
  return tf.reduce_sum(x) * w

y = tf.placeholder(tf.float32, [None])
z = tf.placeholder(tf.float32, [None])
a_y = trial(y)
a_z = trial(z)

s = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(tf.global_variables())
print(a_y.eval(feed_dict={y: [1.1, 1.9]}))
print(a_z.eval(feed_dict={z: [1.9, 1.1]}))
```
The above code produces the following output (TensorFlow CPU-feature warnings omitted):

```
[<tf.Variable 'trial/w:0' shape=() dtype=float32_ref>, <tf.Variable 'trial_1/w:0' shape=() dtype=float32_ref>]
-2.49474
4.86785
```
Clearly, it creates two different variables!
Testing `tf.make_template` directly, like:

```python
def my_trail(x, share_variable_name):
  var1 = tf.get_variable(share_variable_name, shape=[])
  return tf.reduce_sum(x) * var1

template_my = tf.make_template("template_my", my_trail, share_variable_name="my_v")

y = tf.placeholder(tf.float32, [None])
z = tf.placeholder(tf.float32, [None])
a_y = template_my(y)
a_z = template_my(z)
```
The result is reasonable:

```
[<tf.Variable 'template_my/my_v:0' shape=() dtype=float32_ref>]
0.983209
0.983209
```
So, any ideas? Or is this a real bug?
I found that calling `trial` decorated with `templatemethod` twice generates two different templates (`tf.make_template` is called twice), so the variables differ. But now I'm wondering what the intended usage of `templatemethod` is.
And the source calls the `encode` and `decode` template methods only once each. So why use templates at all, instead of calling the methods directly?
Yes, exactly: directly calling `tf.make_template` on the function works, but the point here is that the `@templatemethod` decorator does not. You can't just wrap it around any method, because the decorator calls `tf.make_template` on every invocation of the method, which is not what we want! My guess is that you could use a decorator that wraps your method with `tf.make_template` the way lazily computed properties are implemented, so the templated function is created once and acts as a lazily computed property.
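A minimal pure-Python sketch of that caching idea (hypothetical, not the library's code; the counter stands in for `tf.make_template(name, func)`, which in the real fix would be called exactly once per instance):

```python
import functools

def templatemethod_cached(name):
  """Hypothetical decorator: build the template once per instance,
  cache it on the instance, and reuse it on every later call."""
  def decorator(func):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
      attr = "_template_" + name
      if getattr(self, attr, None) is None:
        # Stand-in for tf.make_template(name, func): runs only once.
        self.templates_created = getattr(self, "templates_created", 0) + 1
        setattr(self, attr, functools.partial(func, self))
      return getattr(self, attr)(*args, **kwargs)
    return wrapper
  return decorator

class Model:
  @templatemethod_cached("trial")
  def trial(self, x):
    return sum(x)

m = Model()
m.trial([1, 2])
m.trial([3, 4])
print(m.templates_created)  # 1: the "template" was built only once
```

With `tf.make_template` cached like this, both calls would reuse the same variable scope instead of creating `trial` and `trial_1`.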
@amanjitsk Thanks. So the template usage really is buggy here? I also found a related bug in sharing the source and target embeddings, in models/seq2seq_model.py, lines 126-150:
```python
@property
@templatemethod("source_embedding")
def source_embedding(self):
  """Returns the embedding used for the source sequence."""
  return tf.get_variable(
      name="W",
      shape=[self.source_vocab_info.total_size, self.params["embedding.dim"]],
      initializer=tf.random_uniform_initializer(
          -self.params["embedding.init_scale"],
          self.params["embedding.init_scale"]))

@property
@templatemethod("target_embedding")
def target_embedding(self):
  """Returns the embedding used for the target sequence."""
  if self.params["embedding.share"]:
    return self.source_embedding
  return tf.get_variable(
      name="W",
      shape=[self.target_vocab_info.total_size, self.params["embedding.dim"]],
      initializer=tf.random_uniform_initializer(
          -self.params["embedding.init_scale"],
          self.params["embedding.init_scale"]))
```
This can't guarantee that the source embedding and the target embedding are the same variable, because they are created in different name scopes (one under encode, one under decode). I found this while debugging during model analysis.
So is there a nice way to solve it?
I used this:
```python
@property
@templatemethod("source_embedding")
def source_embedding(self):
  """Returns the embedding used for the source sequence."""
  res = tf.get_variable(
      name="W",
      shape=[self.source_vocab_info.total_size, self.params["embedding.dim"]],
      initializer=self.source_default_embedding)
  self._source_embedding = res
  return res

@property
@templatemethod("target_embedding")
def target_embedding(self):
  """Returns the embedding used for the target sequence."""
  if self.params["embedding.share"]:
    return self._source_embedding
  return tf.get_variable(
      name="W",
      shape=[self.target_vocab_info.total_size, self.params["embedding.dim"]],
      initializer=self.target_default_embedding)
```