Albert Zeyer comments

Results: 972 comments of Albert Zeyer

pytest tests/test_rf_array.py crash in _GLOBAL__sub_I_IpcFabricConfigClient.cpp

Maybe the problem is actually in `std::random_device::_M_getval`? Some related issues: https://github.com/h2oai/datatable/issues/2453 https://github.com/RobJinman/pro_office_calc/issues/5 https://github.com/boostorg/fiber/issues/249 https://github.com/microsoft/LightGBM/issues/1516 **Edit:** I also posted it here: https://github.com/pytorch/pytorch/issues/102360

pytest tests/test_rf_array.py crash in _GLOBAL__sub_I_IpcFabricConfigClient.cpp

The error message passed to `std::__throw_runtime_error` (found via `info registers` and then trial-and-error `print (const char*)...` on the register values) is: `"random_device could not be read"`. The code is [here](https://github.com/gcc-mirror/gcc/blob/d156c6054200237b707f9fb44ae9958d926b0745/libstdc%2B%2B-v3/src/c%2B%2B11/random.cc#L595).

pytest tests/test_rf_array.py crash in _GLOBAL__sub_I_IpcFabricConfigClient.cpp

Searching for that last error message gives some further, possibly interesting results: https://discuss.pytorch.org/t/random-device-could-not-be-read/138697 One interesting bit: "I am also using tensorflow along with pytorch in the script." We also do the...

pytest tests/test_rf_array.py crash in _GLOBAL__sub_I_IpcFabricConfigClient.cpp

Interestingly, maybe using TensorFlow 2.10 does not cause the problem with the hang in PyTorch? At least I don't get the hang then. However, I don't have the proper CUDA...

Introduce native assign_mul for more efficient weight decay

Note, the relevant code: https://github.com/tensorflow/tensorflow/blob/9959f963a0afe0a5a24cb9913998fe89169df252/tensorflow/core/kernels/resource_variable_ops.cc#L630 https://github.com/tensorflow/tensorflow/blob/27dc409fcfcc538cce7447b9637a8f727ef6a123/tensorflow/core/ops/resource_variable_ops.cc#L217 Note that our `assign_mul` should support broadcasting, or at least support a scalar as argument. This is actually the only relevant use case for...
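To illustrate the scalar case, here is a minimal pure-Python sketch, not the actual native op. The `Variable` class, the `apply_weight_decay` helper, and the decay factor `1 - learning_rate * weight_decay` are assumptions made for illustration only; the comment above only states that `assign_mul` should accept a scalar.

```python
from typing import List, Sequence, Union

Number = Union[int, float]


class Variable:
    """Hypothetical toy stand-in for a framework variable (flat list of weights)."""

    def __init__(self, values: Sequence[float]):
        self.values: List[float] = list(values)

    def assign_mul(self, factor: Union[Number, Sequence[float]]) -> "Variable":
        """Multiply the stored values in place by `factor`.

        A scalar factor is broadcast to all elements; a sequence must match
        the number of elements.
        """
        if isinstance(factor, (int, float)):
            # Scalar case: the same factor is applied to every element.
            for i in range(len(self.values)):
                self.values[i] *= factor
        else:
            if len(factor) != len(self.values):
                raise ValueError("factor must be a scalar or match the variable size")
            for i, f in enumerate(factor):
                self.values[i] *= f
        return self


def apply_weight_decay(var: Variable, learning_rate: float, weight_decay: float) -> Variable:
    """Assumed decay rule: w <- w * (1 - learning_rate * weight_decay)."""
    return var.assign_mul(1.0 - learning_rate * weight_decay)
```

With a scalar factor, the whole decay step is a single in-place multiply, with no need to materialize a separate decay tensor of the same shape as the variable.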

SprintInterface demo() depends on Theano

It should just be kept, as it is just a demo, i.e. for demonstration purposes. Yes, it would be nicer if it also worked, but I think this is not...

learning_rate_control_error_measure should match exactly

No feedback here? We can also first introduce such an option and later change the default of this option via a new behavior version. Or maybe also right away, and...
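A minimal sketch of that rollout idea, assuming a hypothetical option name `error_measure_exact_match` and a hypothetical behavior version threshold; neither is taken from the actual code base. An explicitly set option always wins, and only the default depends on the behavior version.

```python
from typing import Dict

# Hypothetical: first behavior version in which exact matching becomes the default.
EXACT_MATCH_MIN_BEHAVIOR_VERSION = 99


def resolve_exact_match(config: Dict[str, object], behavior_version: int) -> bool:
    """Return whether the error measure key should be matched exactly.

    `error_measure_exact_match` is a hypothetical option name used only
    for this illustration.
    """
    explicit = config.get("error_measure_exact_match")
    if explicit is not None:
        # The user set the option explicitly, so respect it in any version.
        return bool(explicit)
    # Otherwise the default is derived from the behavior version.
    return behavior_version >= EXACT_MATCH_MIN_BEHAVIOR_VERSION
```

Existing configs keep their old behavior under this scheme, while configs that opt into the newer behavior version get exact matching unless they disable it.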

learning_rate_control_error_measure should match exactly

Apart from this, there is also the inconsistency between single and multiple losses. In the single case, it would just store `dev_score` etc., and only with multiple losses it appends...

Non-deterministic training

Ok, this uses our native-CTC, which we know has some non-determinism. Maybe that causes the big effect here?
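As a small illustration of how such non-determinism can arise in general: this assumes, without confirmation from the comment above, that the kernel accumulates floating-point values in an order that can vary between runs (e.g. through parallel accumulation). Floating-point addition is not associative, so a different order can change the result in the last bits.

```python
# Floating-point addition is not associative: the grouping changes the rounding.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Both groupings are mathematically equal but differ after rounding.
orders_differ = left_to_right != right_to_left
difference = abs(left_to_right - right_to_left)
```

Such tiny differences can then be amplified over many training steps, which would be one possible explanation for a large effect on the final result.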

Non-deterministic training

@mmz33 @JackTemaki @Marvin84 @christophmluscher @ZhouW321 have you recently looked into this, or just tested it?