
Type mismatch when trying to use L-BFGS

Open • jdellag opened this issue 1 year ago • 5 comments

I've been working on a Navier-Stokes problem and would like to further optimize my model after training with Adam, as in many of the examples I have seen. However, when I try to use this optimizer after setting the default precision to float64, I receive the error shown below.
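Roughly, the pattern I'm following is the standard two-stage Adam-then-L-BFGS training. Here is a minimal sketch of that pattern on a placeholder 1D problem (not my actual Navier-Stokes setup; every name and number below is illustrative only):

import numpy as np
import deepxde as dde
from deepxde.backend import tf

dde.config.set_default_float("float64")

# Placeholder 1D Poisson-type problem, only to show the two-stage training pattern.
geom = dde.geometry.Interval(0, 1)

def pde(x, y):
    dy_xx = dde.grad.hessian(y, x)
    return dy_xx + np.pi ** 2 * tf.sin(np.pi * x)

bc = dde.icbc.DirichletBC(geom, lambda x: 0, lambda x, on_boundary: on_boundary)
data = dde.data.PDE(geom, pde, bc, num_domain=32, num_boundary=2)
net = dde.nn.FNN([1, 32, 32, 1], "tanh", "Glorot uniform")
model = dde.Model(data, net)

# Stage 1: Adam
model.compile("adam", lr=1e-3)
model.train(iterations=5000)

# Stage 2: L-BFGS-B; in my Navier-Stokes script this is the compile call that fails
model.compile("L-BFGS-B")
losshistory, train_state = model.train()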

Compiling model...

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[18], line 18
     15 for beta in beta_values:
     16     print(f"INTIALIZING RUN ******* {frame} ** Beta = {beta} ** Re = {Re} *******")
---> 18     losshistory, train_state, model = train_for_Re(Re, beta)
     19     # Update the progress count
     20     progress_count += 1

Cell In[17], line 38, in train_for_Re(Re, beta)
     32 variable = dde.callbacks.VariableValue(U, period = 500, filename = fnamevar, precision = 32)
     34 #losshistory, train_state = model.train(iterations = num_iter, callbacks = [variable], display_every = 500)
     35 #dde.utils.external.saveplot(losshistory, train_state, 
     36                             #issave = True, isplot = True, train_fname = train_filename, 
     37                             #    test_fname = test_filename, loss_fname = loss_filename, output_dir = "run_data")
---> 38 model.compile('L-BFGS-B')
     39 losshistory, train_state = model.train(iterations = num_iter, callbacks = [variable])
     41 return losshistory, train_state, model

File ~/python_environments/deepxde2/lib/python3.11/site-packages/deepxde/utils/internal.py:22, in timing.<locals>.wrapper(*args, **kwargs)
     19 @wraps(f)
     20 def wrapper(*args, **kwargs):
     21     ts = timeit.default_timer()
---> 22     result = f(*args, **kwargs)
     23     te = timeit.default_timer()
     24     if config.rank == 0:

File ~/python_environments/deepxde2/lib/python3.11/site-packages/deepxde/model.py:137, in Model.compile(self, optimizer, lr, loss, metrics, decay, loss_weights, external_trainable_variables)
    134     self.external_trainable_variables = external_trainable_variables
    136 if backend_name == "tensorflow.compat.v1":
--> 137     self._compile_tensorflow_compat_v1(lr, loss_fn, decay)
    138 elif backend_name == "tensorflow":
    139     self._compile_tensorflow(lr, loss_fn, decay)

File ~/python_environments/deepxde2/lib/python3.11/site-packages/deepxde/model.py:194, in Model._compile_tensorflow_compat_v1(self, lr, loss_fn, decay)
    192 self.outputs_losses_train = [self.net.outputs, losses_train]
    193 self.outputs_losses_test = [self.net.outputs, losses_test]
--> 194 self.train_step = optimizers.get(
    195     total_loss, self.opt_name, learning_rate=lr, decay=decay
    196 )

File ~/python_environments/deepxde2/lib/python3.11/site-packages/deepxde/optimizers/tensorflow_compat_v1/optimizers.py:22, in get(loss, optimizer, learning_rate, decay)
     20     if learning_rate is not None or decay is not None:
     21         print("Warning: learning rate is ignored for {}".format(optimizer))
---> 22     return ScipyOptimizerInterface(
     23         loss,
     24         method="L-BFGS-B",
     25         options={
     26             "maxcor": LBFGS_options["maxcor"],
     27             "ftol": LBFGS_options["ftol"],
     28             "gtol": LBFGS_options["gtol"],
     29             "maxfun": LBFGS_options["maxfun"],
     30             "maxiter": LBFGS_options["maxiter"],
     31             "maxls": LBFGS_options["maxls"],
     32         },
     33     )
     35 if isinstance(optimizer, tf.train.AdamOptimizer):
     36     optim = optimizer

File ~/python_environments/deepxde2/lib/python3.11/site-packages/deepxde/optimizers/tensorflow_compat_v1/scipy_optimizer.py:102, in ExternalOptimizerInterface.__init__(self, loss, var_list, equalities, inequalities, var_to_bounds, **optimizer_kwargs)
     95 inequalities_grads = [
     96     _compute_gradients(inequality, self._vars)
     97     for inequality in self._inequalities
     98 ]
    100 self.optimizer_kwargs = optimizer_kwargs
--> 102 self._packed_var = self._pack(self._vars)
    103 self._packed_loss_grad = self._pack(loss_grads)
    104 self._packed_equality_grads = [
    105     self._pack(equality_grads) for equality_grads in equalities_grads
    106 ]

File ~/python_environments/deepxde2/lib/python3.11/site-packages/deepxde/optimizers/tensorflow_compat_v1/scipy_optimizer.py:251, in ExternalOptimizerInterface._pack(cls, tensors)
    249 else:
    250     flattened = [tf.reshape(tensor, [-1]) for tensor in tensors]
--> 251     return tf.concat(flattened, 0)

File ~/python_environments/deepxde2/lib/python3.11/site-packages/tensorflow/python/util/traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    151 except Exception as e:
    152   filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153   raise e.with_traceback(filtered_tb) from None
    154 finally:
    155   del filtered_tb

File ~/python_environments/deepxde2/lib/python3.11/site-packages/tensorflow/python/framework/op_def_library.py:500, in _ExtractInputsAndAttrs(op_type_name, op_def, allowed_list_attr_map, keywords, default_type_attr_map, attrs, inputs, input_types)
    497     raise TypeError(f"{prefix} that do not match type {dtype.name} "
    498                     "inferred from earlier arguments.")
    499   else:
--> 500     raise TypeError(f"{prefix} that don't all match.")
    501 else:
    502   raise TypeError(f"{prefix} that are invalid. Tensors: {values}")

TypeError: Tensors in list passed to 'values' of 'ConcatV2' Op have types [float32, float64, float64, float64, float64, float64, float64] that don't all match.

I am puzzled as to why this is happening. FWIW, I have run into this error both on the DeepXDE Docker image and on an M1 Mac (both with the tensorflow.compat.v1 backend). What surprises me even more is that I have not seen this specific error on this GitHub repo, or anywhere else on the internet for that matter. Any ideas on what this could be?

jdellag • Nov 14 '23

"However, when trying to use this optimizer after setting the default precision to float64"

Did you call deepxde.config.real.set_float64() right after import deepxde? If not, do it.
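For example (a minimal placement sketch; the rest of the script stays the same):

import deepxde as dde

# Set the default real dtype first, before any networks, data, or variables are created.
dde.config.real.set_float64()

# ... build geometry, data, net, and model after this point ...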

vl-dud • Nov 16 '23

Yes, trying that, as well as every combination of the following three:

deepxde.config.set_default_float("float64")
tf.keras.backend.set_floatx("float64")
deepxde.config.real.set_float64()

has not yielded anything other than the error posted above.

jdellag • Nov 16 '23

Can you show all the code?

vl-dud • Nov 16 '23

I finally nailed down what it was, and I'm surprised I didn't catch it earlier. I have a trainable variable U that I am interested in, and I noticed that when writing out the value of U every 500 iterations it was a float32, even though I had explicitly set all reals to float64. Specifying the dtype of U when initializing it, as U = dde.Variable(1.0, dtype='float64'), did the trick.
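In code, the workaround looks roughly like this (the rest of my setup is omitted, and the filename here is just a placeholder):

import deepxde as dde

dde.config.set_default_float("float64")

# Without an explicit dtype the trainable variable is created as float32, and
# tf.concat later fails when the L-BFGS-B interface packs it together with the
# float64 network weights.
U = dde.Variable(1.0, dtype="float64")

fnamevar = "variables.dat"  # placeholder path
variable = dde.callbacks.VariableValue(U, period=500, filename=fnamevar, precision=32)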

Can we get this change committed in the codebase?

jdellag • Nov 16 '23

Good point. @jdellag Would you like to submit a PR to fix it?

You only need to modify this line: https://github.com/lululxvi/deepxde/blob/5b21146dd2c73e8df7d31acaad8b5604d30fc3e8/deepxde/backend/tensorflow_compat_v1/tensor.py#L83
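One possible shape of the change (just a sketch, assuming the problem is that tf.Variable infers float32 from a plain Python float when no dtype is passed; the final patch may look different):

# Hypothetical sketch for deepxde/backend/tensorflow_compat_v1/tensor.py,
# where tf is tensorflow.compat.v1.
def Variable(initial_value, dtype=None):
    if dtype is None:
        # Fall back to the configured default float instead of letting
        # tf.Variable infer float32 from a plain Python float.
        dtype = tf.keras.backend.floatx()
    return tf.Variable(initial_value=initial_value, trainable=True, dtype=dtype)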

lululxvi • Nov 28 '23