CrossQ
nan values in networks
Hello,

When running the code on deepmind/pendulum-swingup, training crashes because the action becomes nan. I attach the stack trace below (I added extra logging to catch exactly which part of the agent produces the nan action; the original error surfaced later, while interacting with the environment, but the cause is here). I believe more environments share this problem: in my previous runs I saw the same thing, mostly on the dog tasks, but since I was using a custom wrapper instead of shimmy I assumed my wrapper might be at fault. It now happens with shimmy as well, so it is not the wrapper but probably a numerical instability (maybe in BatchNorm?).
Traceback (most recent call last):
  File "/home/src/crossq/train.py", line 264, in <module>
    model.learn(total_timesteps=total_timesteps, progress_bar=True, callback=callback_list)
  File "/home/src/crossq/sbx/sac/sac.py", line 187, in learn
    return super().learn(
           ^^^^^^^^^^^^^^
  File "/home/miniconda3/envs/crossq/lib/python3.11/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 312, in learn
    rollout = self.collect_rollouts(
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/miniconda3/envs/crossq/lib/python3.11/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 541, in collect_rollouts
    actions, buffer_actions = self._sample_action(learning_starts, action_noise, env.num_envs)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/miniconda3/envs/crossq/lib/python3.11/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 373, in _sample_action
    unscaled_action, _ = self.predict(self._last_obs, deterministic=False)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/miniconda3/envs/crossq/lib/python3.11/site-packages/stable_baselines3/common/base_class.py", line 555, in predict
    return self.policy.predict(observation, state, episode_start, deterministic)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/src/crossq/sbx/common/policies.py", line 64, in predict
    actions = self._predict(observation, deterministic=deterministic)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/src/crossq/sbx/sac/policies.py", line 482, in _predict
    self.debug_log_action(observation, action, "_predict")
  File "/home/src/crossq/sbx/sac/policies.py", line 531, in debug_log_action
    raise ValueError("Action is None")
ValueError: Action is None
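The error above is raised by my own debugging guard, not by SB3 itself. For anyone reproducing this, a minimal framework-free sketch of such a fail-fast check (the helper name and the `[[a0, a1, ...]]` action shape are assumptions for illustration, not the repo's actual code):

```python
import math

def assert_finite_action(action, where):
    """Fail fast with a descriptive error instead of letting a nan
    action propagate into env.step(); `where` names the call site."""
    flat = [x for row in action for x in row]  # action shaped [[a0, a1, ...]]
    if not all(math.isfinite(x) for x in flat):
        raise ValueError(f"Non-finite action {action} produced in {where}")
    return action

assert_finite_action([[0.5]], "_predict")             # passes through
# assert_finite_action([[float("nan")]], "_predict")  # would raise ValueError
```

Placing a check like this right after the actor forward pass moves the crash from env.step() back to the true source.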
When the error happens, I additionally print the actor state and the observation. The nan values are mostly concentrated in the BatchRenorm layers:
Observations: [[-0.98452299 -0.17525546 -0.13700339]]
Actor state: ActorTrainState(step=Array(72082, dtype=int32, weak_type=True), apply_fn=<bound method Module.apply of Actor(
# attributes
net_arch = [256, 256]
action_dim = 1
batch_norm_momentum = 0.99
log_std_min = -20
log_std_max = 2
use_batch_norm = True
bn_mode = 'brn_actor'
)>, params={
'BatchRenorm_0': {'bias': Array([nan, nan, nan], dtype=float32), 'scale': Array([nan, nan, nan], dtype=float32)},
'BatchRenorm_1': {'bias': Array([nan, nan, ..., nan], dtype=float32), 'scale': Array([nan, nan, ..., nan], dtype=float32)}, [all 256 entries nan]
'BatchRenorm_2': {'bias': Array([nan, nan, ..., nan], dtype=float32), 'scale': Array([nan, nan, ..., nan], dtype=float32)}, [all 256 entries nan]
'Dense_0': {'bias': Array([nan, nan, -0.72661287, nan, ..., nan], dtype=float32), [almost all of the 256 entries are nan; only a handful of finite values remain: -0.72661287, -0.7797777, -0.81983244, -0.65440995, -0.5785902, -0.3541636]
'kernel': Array([[nan, nan, -0.14071447, nan, ..., nan],
                 [nan, nan, 0.07693207, nan, ..., nan],
                 [nan, nan, -0.03049578, nan, ..., nan]], dtype=float32)}, [3x256; nan everywhere except the same handful of columns in each of the three rows]
'Dense_1': {'bias': Array([-3.55079502e-01, -2.61681288e-01, -3.62839073e-01, -1.15681604e-01,
-5.08833826e-01, -1.33646116e-01, nan, -1.64892986e-01,
-9.87671614e-02, -3.30210567e-01, 6.11294284e-02, -2.26123795e-01,
-2.73534119e-01, -3.34397793e-01, nan, -8.75263959e-02,
-1.58562064e-01, -3.51377517e-01, -1.74645379e-01, -8.94286670e-03,
-1.91893145e-01, -1.28213629e-01, 2.03128159e-02, -2.56696284e-01,
-1.50657192e-01, -3.45063061e-01, -2.13366076e-01, -1.69571996e-01,
-3.34517241e-01, -3.00842196e-01, -1.06576160e-01, -1.35408074e-01,
-6.20634668e-02, -5.48866615e-02, -2.52332807e-01, -1.78462148e-01,
-2.34845892e-01, -1.56766266e-01, -4.78359222e-01, -1.16198920e-01,
-1.25731722e-01, -2.61006474e-01, nan, -6.05887733e-02,
-2.15052500e-01, nan, -1.48657292e-01, -3.27274710e-01,
1.07243955e-01, -1.11210242e-01, -3.31136845e-02, -5.49518578e-02,
-2.12549612e-01, -2.13353574e-01, -1.78537995e-01, -2.18994096e-02,
-7.21647069e-02, -1.74253643e-01, -3.13391834e-01, 2.16715410e-02,
-1.14866629e-01, -4.00419235e-01, -2.60464311e-01, -3.07593644e-01,
nan, -2.45736688e-01, -1.73763752e-01, -6.66186884e-02,
1.08856119e-01, nan, -2.16983825e-01, 2.44164586e-01,
nan, nan, -3.26141238e-01, -8.73360708e-02,
-3.75555217e-01, nan, -3.91870797e-01, -2.39072606e-01,
-1.24068327e-01, -4.32559103e-01, 2.30513979e-02, nan,
-2.23912150e-01, -2.17534795e-01, -1.92928210e-01, -1.64950922e-01,
nan, -3.08977306e-01, -3.68163049e-01, 2.81006261e-03,
-5.07392526e-01, -1.65657967e-01, -3.15613002e-01, -1.74545765e-01,
-2.78588176e-01, -4.34532404e-01, -2.61619866e-01, -1.43855408e-01,
nan, nan, -3.72981817e-01, -1.94371045e-01,
1.83636006e-02, -4.24602851e-02, -8.58307257e-02, -2.71321237e-01,
-1.97004348e-01, -4.95876729e-01, -4.74496722e-01, nan,
-3.33254598e-02, -4.79034781e-01, -2.55109280e-01, -1.87851325e-01,
-3.39175999e-01, -4.80552763e-01, -4.50025231e-01, -1.03966720e-01,
-7.74463296e-01, -5.16545363e-02, 1.01213539e-02, nan,
-1.24744892e-01, -2.21584707e-01, -2.19108924e-01, -4.01318192e-01,
-2.04100892e-01, -2.66580433e-01, -7.59028137e-01, nan,
-2.50042826e-01, -4.02819782e-01, -2.02461675e-01, -3.39741558e-01,
-5.28345779e-02, -8.42932388e-02, -1.62568614e-01, 1.58098206e-01,
-1.10761724e-01, 2.35181837e-03, -3.26542675e-01, nan,
5.05282357e-03, -1.25751108e-01, nan, -3.03706586e-01,
nan, -2.41995305e-01, -2.53088415e-01, -2.43461326e-01,
-2.04102136e-02, -1.84795737e-01, -2.18806162e-01, nan,
-2.36812025e-01, -1.80641860e-01, -3.41657400e-01, -3.14457595e-01,
-2.63056546e-01, -3.97427410e-01, -2.54380584e-01, nan,
nan, -1.74986079e-01, -2.74913579e-01, -1.29359856e-01,
-3.59678119e-02, 3.35261613e-01, -8.78777653e-02, -5.04442751e-01,
-4.00152236e-01, -2.42632881e-01, nan, -4.05929804e-01,
-2.45563030e-01, -1.88916773e-01, -2.40435839e-01, -1.00784957e-01,
nan, nan, -3.35613132e-01, -1.17802337e-01,
nan, nan, -2.27645561e-01, nan,
-3.03044826e-01, 5.99465857e-04, -1.56689212e-01, -1.43252518e-02,
-1.03414640e-01, -3.61972488e-02, -2.86053956e-01, 1.54133691e-02,
-2.91877747e-01, -4.44084078e-01, -1.14257067e-01, -3.59545112e-01,
-1.86518461e-01, -4.90693688e-01, -1.78244472e-01, -4.35604304e-01,
-1.25659660e-01, -1.01315916e-01, -1.45916626e-01, -2.43625432e-01,
-1.30847663e-01, nan, -1.70976147e-01, -1.98871285e-01,
nan, 3.85484435e-02, -3.26892465e-01, -2.91502178e-01,
nan, -1.78116813e-01, -1.97384760e-01, -2.32053742e-01,
-2.82236040e-01, -1.08087726e-01, -4.09883052e-01, -5.29915988e-01,
-3.24332803e-01, -1.00257874e-01, nan, nan,
nan, -2.54310161e-01, nan, -5.29110789e-01,
-2.61053085e-01, -5.08699298e-01, -2.21153900e-01, -5.59086382e-01,
-2.46261105e-01, nan, nan, -3.05840541e-02,
-2.34860018e-01, -3.22149009e-01, -3.99790168e-01, -2.31906787e-01,
-3.94329689e-02, 8.35715458e-02, -2.45865479e-01, -4.13744181e-01,
nan, -4.18445647e-01, nan, -3.28062594e-01,
nan, -4.72936690e-01, -1.81261748e-01, 1.46970540e-01], dtype=float32), 'kernel': Array([[ 4.8442278e-02, -2.5370871e-04, 1.9979328e-01, ...,
1.5068804e-01, 3.6767788e-02, 1.3192339e-01],
[-6.7832903e-03, -5.0304595e-02, -2.1431591e-01, ...,
-1.1181718e-01, 2.0614813e-01, 1.2734850e-01],
[ 1.3255176e-01, 5.4206248e-02, 1.9638033e-01, ...,
-5.9157098e-03, -1.5652535e-02, -5.7662982e-03],
...,
[ 2.7093706e-03, -4.8780489e-01, -1.7505699e-01, ...,
7.1161285e-02, 2.8860131e-02, 5.7024822e-02],
[ 8.8524841e-02, -6.1251257e-02, -2.8650817e-02, ...,
-9.4492398e-02, 2.4803801e-01, 7.7640779e-02],
[-1.5573186e-01, -1.6367893e-01, -1.5592015e-01, ...,
-1.1927266e-01, -2.0962511e-01, -9.1291368e-02]], dtype=float32)}, [skipped many lines]
Action: [[nan]]
The log is not complete, as the full version is more than 100 KB; I attach just the beginning.
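To locate the bad parameter arrays without reading the whole 100 KB dump, one can walk the params tree and report only the leaves that contain nan. Below is a stdlib-only sketch of that idea (in real code this would be jax.tree_util over the actual train-state params; the nested-list toy tree and made-up values here are assumptions for illustration):

```python
import math

def find_nan_leaves(tree, path=""):
    """Walk a nested dict of (flat or 2-D) float lists and return the
    paths of every leaf array that contains at least one nan."""
    if isinstance(tree, dict):
        bad = []
        for key, sub in tree.items():
            bad.extend(find_nan_leaves(sub, f"{path}/{key}" if path else key))
        return bad
    # Leaf: flatten one level if it is a 2-D list (e.g. a kernel matrix).
    flat = [x for row in tree for x in row] if tree and isinstance(tree[0], list) else tree
    return [path] if any(math.isnan(x) for x in flat) else []

# Toy params tree mirroring the structure of the dump above (values made up):
params = {
    "BatchRenorm_0": {"bias": [float("nan")] * 3, "scale": [float("nan")] * 3},
    "Dense_0": {"bias": [0.1, -0.73], "kernel": [[0.2, float("nan")]]},
    "Dense_1": {"bias": [-0.35, -0.26]},
}
print(find_nan_leaves(params))
# → ['BatchRenorm_0/bias', 'BatchRenorm_0/scale', 'Dense_0/kernel']
```

Running this kind of check periodically during training would also pinpoint the exact step at which the BatchRenorm statistics first blow up.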