L-BFGS iteration records
Hi, I am using a local Windows system with a CPU, and I have a question. The Adam optimizer has a "display_every" option that records the results at whatever interval we want. But what about L-BFGS? It records the results every 1000 iterations. How do we change that to something else, such as every 100 iterations? I tried
"
model.compile("L-BFGS")
model.train(display_every=100)
"
but it does not work.
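(In case it helps others, here is a possible workaround, just a sketch rather than an official option: lower maxiter via set_LBFGS_options and call model.train() repeatedly, so results are recorded after every short chunk. It assumes data and net are already defined as in the example below; the chunk size and loop count are made up for illustration.)
"
# Workaround sketch: instead of one long L-BFGS run, train in chunks of
# 100 iterations and record the results after each chunk.
# Assumes `data` and `net` are already defined as in the example below.
import deepxde as dde

dde.optimizers.config.set_LBFGS_options(maxiter=100)  # 100 L-BFGS iterations per train() call

model = dde.Model(data, net)
model.compile("L-BFGS")

for chunk in range(50):  # 50 chunks x 100 iterations = 5000 iterations in total
    losshistory, train_state = model.train()
    # train_state now holds the state after this chunk; the loss is printed per chunk
"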
Also, when I use Google Colab, after a few thousand steps the loss information starts printing step by step (this also happened on the local CPU), and somehow it does not change after that. This happens for some examples and only sometimes. Here is the Elastoplastic example with the following settings:
"
layers = [2] + [200] * 30 + [5]
activation = "tanh"
initializer = "Glorot uniform"
net = dde.nn.FNN(layers, activation, initializer)

model = dde.Model(data, net)

dde.optimizers.config.set_LBFGS_options(
    maxcor=100, ftol=0, gtol=1e-08, maxiter=5000, maxfun=None, maxls=50
)
model.compile("L-BFGS", metrics=["l2 relative error"])
losshistory, train_state = model.train()
"
(the rest is the same as the main example), and here are the results:
"
Using backend: pytorch
Other supported backends: tensorflow.compat.v1, tensorflow, jax, paddle.
paddle supports more examples now and is recommended.
Compiling model...
'compile' took 1.470951 s
Training model...
Step Train loss Test loss Test metric
0 [1.81e+03, 2.67e+02, 3.87e-02, 1.56e-01, 1.19e-02] [1.76e+03, 2.59e+02, 4.26e-02] [1.00e+00]
1000 [3.86e-03, 5.74e-03, 5.17e-03, 6.31e-03, 5.54e-03] [6.26e-03, 9.26e-03, 7.78e-03] [4.23e-02]
2000 [7.91e-04, 7.89e-04, 8.83e-04, 1.06e-03, 7.21e-04] [1.12e-03, 1.04e-03, 1.49e-03] [2.53e-02]
3000 [3.00e-04, 3.07e-04, 3.40e-04, 4.22e-04, 3.02e-04] [4.26e-04, 3.56e-04, 5.86e-04] [1.97e-02]
4000 [1.38e-04, 1.98e-04, 1.63e-04, 2.91e-04, 1.58e-04] [2.23e-04, 2.85e-04, 2.52e-04] [1.51e-02]
4732 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4733 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4734 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4735 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4736 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4737 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4738 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4739 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4740 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4741 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4742 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4743 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4744 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4745 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4746 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4747 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
4748 [9.51e-05, 1.59e-04, 1.44e-04, 1.75e-04, 1.03e-04] [1.65e-04, 2.46e-04, 2.26e-04] [1.26e-02]
"
As you can see, after step 4732 the lines are printed one by one and the loss values do not change at all, even if I let it run for thousands more steps.
Thanks for your time.
Yes, I believe the PyTorch implementation has a bug where L-BFGS does not improve the loss any further, so switch to TensorFlow. You will also get NaN at the end.
Yes, this bug was introduced in PyTorch 2.x. If you use PyTorch 1.x or TensorFlow, it should be fine.
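(For reference, a minimal way to switch backends is the DDE_BACKEND environment variable, which DeepXDE reads at import time; the log above lists the supported values. This is just a sketch of that approach:)
"
# Select the TensorFlow backend instead of PyTorch; the environment variable
# must be set before deepxde is imported.
import os
os.environ["DDE_BACKEND"] = "tensorflow"

import deepxde as dde  # now uses the tensorflow backend
"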