
`templatemethod` does not work as expected

Open amanjitsk opened this issue 7 years ago • 5 comments

import tensorflow as tf
from seq2seq.graph_utils import templatemethod

@templatemethod("trial")
def trial(x):
    w = tf.get_variable('w', [])
    return tf.reduce_sum(x) * w

y = tf.placeholder(tf.float32, [None])
z = tf.placeholder(tf.float32, [None])
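# If the template worked as expected, these two calls would share the
# same variable 'w'.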
a_y = trial(y)
a_z = trial(z)

s = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(tf.global_variables())
print(a_y.eval(feed_dict={y: [1.1, 1.9]}))
print(a_z.eval(feed_dict={z: [1.9, 1.1]}))

The above code produces the following output

2017-06-20 17:17:45.724766: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 17:17:45.724804: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 17:17:45.724812: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 17:17:45.724819: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[<tf.Variable 'trial/w:0' shape=() dtype=float32_ref>, <tf.Variable 'trial_1/w:0' shape=() dtype=float32_ref>]
-2.49474
4.86785

Clearly, it creates two different variables!

amanjitsk avatar Jun 20 '17 21:06 amanjitsk

Testing tf.make_template directly, like:

def my_trail(x, share_variable_name):
    var1 = tf.get_variable(share_variable_name, shape=[])
    return tf.reduce_sum(x) * var1

template_my = tf.make_template("template_my", my_trail, share_variable_name="my_v")
y = tf.placeholder(tf.float32, [None])
z = tf.placeholder(tf.float32, [None])
a_y = template_my(y)
a_z = template_my(z)

The result is reasonable:

[<tf.Variable 'template_my/my_v:0' shape=() dtype=float32_ref>]
0.983209
0.983209

So, any ideas? Or is this a big bug?

liyi193328 avatar Jul 03 '17 13:07 liyi193328

I found that calling trial decorated with templatemethod twice generates two different templates (tf.make_template is called twice), so the variables differ. But now I'm wondering what the intended usage of templatemethod is.

And the source code calls the encode and decode template methods only once each. So why use templates at all, instead of calling the functions directly?

liyi193328 avatar Jul 03 '17 13:07 liyi193328

Yes, exactly. Directly calling tf.make_template on the function works, but the point here is that the @templatemethod decorator does not: you can't just wrap it around any method, because the decorator calls tf.make_template on every invocation of the method, which is not what we want. My guess is that you could wrap your methods with tf.make_template the same way lazily computed properties are implemented, so the template is created once and then reused, as in the sketch below.
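For example, a minimal sketch of such a decorator, assuming it is acceptable to cache the template on the wrapper function itself (the actual seq2seq helper may handle methods differently):

import functools
import tensorflow as tf

def templatemethod(name_):
    """Wraps `func` with `tf.make_template`, creating the template only once.

    Because the template is cached, every call to the decorated function
    reuses the same variables instead of creating a new scope per call.
    """
    def template_decorator(func):
        @functools.wraps(func)
        def func_wrapper(*args, **kwargs):
            # Create the template lazily on the first call, then reuse it.
            if not hasattr(func_wrapper, "_template"):
                func_wrapper._template = tf.make_template(name_, func)
            return func_wrapper._template(*args, **kwargs)
        return func_wrapper
    return template_decorator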

amanjitsk avatar Jul 04 '17 17:07 amanjitsk

@amanjitsk Thanks. So, aha, the template usage is a big bug here? And I found a related bug in sharing the source and target embeddings, in the code in models/seq2seq_model.py, lines 126-150:

@property
@templatemethod("source_embedding")
def source_embedding(self):
  """Returns the embedding used for the source sequence.
  """
  return tf.get_variable(
      name="W",
      shape=[self.source_vocab_info.total_size, self.params["embedding.dim"]],
      initializer=tf.random_uniform_initializer(
          -self.params["embedding.init_scale"],
          self.params["embedding.init_scale"]))

@property
@templatemethod("target_embedding")
def target_embedding(self):
  """Returns the embedding used for the target sequence.
  """
  if self.params["embedding.share"]:
    return self.source_embedding
  return tf.get_variable(
      name="W",
      shape=[self.target_vocab_info.total_size, self.params["embedding.dim"]],
      initializer=tf.random_uniform_initializer(
          -self.params["embedding.init_scale"],
          self.params["embedding.init_scale"]))

It can't guarantee that the source and target embeddings are the same, because they are created in different name scopes (one under encode, one under decode). I found this while debugging a model analysis.

So is there a nice way to solve it?

liyi193328 avatar Jul 07 '17 06:07 liyi193328

I used this:


  @property
  @templatemethod("source_embedding")
  def source_embedding(self):
    """Returns the embedding used for the source sequence.
    """

    res = tf.get_variable(
        name="W",
        shape=[self.source_vocab_info.total_size, self.params["embedding.dim"]],
        initializer=self.source_default_embedding)
    self._source_embedding = res
    return res


  @property
  @templatemethod("target_embedding")
  def target_embedding(self):
    """Returns the embedding used for the target sequence.
    """
    if self.params["embedding.share"]:
      return self._source_embedding
    return tf.get_variable(
        name="W",
        shape=[self.target_vocab_info.total_size, self.params["embedding.dim"]],
        initializer=self.target_default_embedding)
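
Note this relies on source_embedding being accessed before target_embedding, so that self._source_embedding is already set. A variant that avoids the ordering dependence might be a single shared template for the embedding. A hypothetical sketch (the names are illustrative, not from the library), assuming embedding.share implies the two vocabularies are identical:

  # In the model's constructor: one template, so both properties resolve
  # to the same variable no matter which is accessed first.
  self._embedding_template = tf.make_template(
      "shared_embedding",
      lambda: tf.get_variable(
          name="W",
          shape=[self.source_vocab_info.total_size,
                 self.params["embedding.dim"]]))

  @property
  def source_embedding(self):
    return self._embedding_template()

  @property
  def target_embedding(self):
    if self.params["embedding.share"]:
      return self._embedding_template()
    # Fall back to a separate target embedding when not sharing.
    return tf.get_variable(
        name="W",
        shape=[self.target_vocab_info.total_size, self.params["embedding.dim"]])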

davidpham87 avatar Aug 08 '17 16:08 davidpham87