meta-dataset
Meta-Dataset in TFDS: [F tensorflow/core/platform/default/env.cc:73] Check failed: ret == 0 (11 vs. 0)Thread tf_data_iterator_resource creation via pthread_create() failed.
When training on Meta-Dataset episodes (with all of the training datasets) using the TFDS APIs, the reader fails after only a few tasks with the following error:
[F tensorflow/core/platform/default/env.cc:73] Check failed: ret == 0 (11 vs. 0)Thread tf_data_iterator_resource creation via pthread_create() failed.
This is on Linux with the latest TensorFlow and TensorFlow Datasets releases installed.
Is there some limit that needs to be increased to accommodate all the thread usage?
The TFDS implementation unfortunately creates a lot of threads because it builds one dataset per class. I'm not sure what the best solution would be, but I'll look into it and report back.
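In the meantime, one thing that might reduce per-pipeline thread usage (untested, and only a sketch that assumes you can get at the underlying tf.data.Dataset) is to cap tf.data's threading options:

```python
import tensorflow as tf

def cap_dataset_threads(dataset: tf.data.Dataset,
                        num_threads: int = 1) -> tf.data.Dataset:
    """Limit the number of threads a single tf.data pipeline may spawn.

    This is a generic tf.data knob, not a Meta-Dataset-specific API, so
    whether it helps depends on where the per-class datasets are built.
    """
    options = tf.data.Options()
    # Use a small private thread pool for this pipeline instead of the default.
    options.threading.private_threadpool_size = num_threads
    # Also cap intra-op parallelism for ops that run inside this pipeline.
    options.threading.max_intra_op_parallelism = num_threads
    return dataset.with_options(options)

# Hypothetical usage: `episode_dataset` stands in for whatever tf.data.Dataset
# your Meta-Dataset/TFDS reader returns in your setup.
# episode_dataset = cap_dataset_threads(episode_dataset, num_threads=1)
```

Whether this avoids the pthread_create() failure depends on where the threads are actually created, so treat it as an experiment rather than a fix.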
https://github.com/tensorflow/tensorflow/issues/41532#issuecomment-759075803 suggests that TF may use more than the number of available threads, and suggests things to check.
You could try using ulimit -u, as explained here (in another context), to raise that limit if that's the issue.
If that doesn't work, could you share the limits you see?
Unfortunately I'm not aware of a way to ask TF to be more frugal.
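For reference, here is a minimal way to print those limits from inside the Python process (Linux only, purely illustrative), so the numbers reflect the environment the reader actually runs in:

```python
import resource

# Per-user process/thread limit, i.e. what `ulimit -u` reports.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("RLIMIT_NPROC (ulimit -u): soft=%s hard=%s" % (soft, hard))
# Note: -1 here corresponds to resource.RLIM_INFINITY, i.e. 'unlimited'.

# System-wide cap on the total number of threads (Linux only).
with open("/proc/sys/kernel/threads-max") as f:
    print("kernel.threads-max:", f.read().strip())
```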
For me, ulimit -u gives 'unlimited'. I also checked /etc/security/limits.conf (everything is commented out) and this:
cat /proc/sys/kernel/threads-max
3976018
Still getting the same error as reported above.
@lehrig Hi, I ran into the same problem in a different setting. Have you found a solution? I'd really appreciate it if you could share it.