Albert Zeyer

938 comments by Albert Zeyer

From your logs, it does not seem like the error is in `_TouchFilesThread.run`. It seems the error occurred in some subprocess. Can you give details on which subprocess...

I assume some bad environment configuration. Maybe weird ulimits or so. Or wrong tmp path. I wonder a bit where it tries to bind the socket (`self._socket.bind(address)`), i.e. what is...

Note the temp dir logic of Python, i.e. where it would create those temp files/dirs:

```python
def _candidate_tempdir_list():
    """Generate a list of candidate temporary directories which
    _get_default_tempdir will try."""
    dirlist...
```
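To see what Python actually resolves on the affected machine, something like the following could be run in the same environment as the training. This is only a minimal sketch using the public `tempfile` API; the helper name `describe_tempdir` is made up for illustration and is not part of any library.

```python
import os
import tempfile


def describe_tempdir():
    """Collect the inputs that influence Python's temp dir choice.

    Hypothetical helper for debugging, not part of tempfile itself.
    """
    return {
        # Environment variables that tempfile consults before the
        # platform defaults such as /tmp.
        "env": {name: os.environ.get(name) for name in ("TMPDIR", "TEMP", "TMP")},
        # The directory tempfile settled on (cached after the first call).
        "tempdir": tempfile.gettempdir(),
        # The current working directory is the last-resort candidate.
        "cwd": os.getcwd(),
    }


if __name__ == "__main__":
    for key, value in describe_tempdir().items():
        print(f"{key}: {value}")
```

If `tempdir` equals `cwd`, none of the earlier candidates was accepted, which would point to an environment problem rather than a bug in the calling code.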

But in any case, what is `address` actually? Then we would know, and would not need to guess.

> I just got bitten by the same error in a training not using the new dataset or caching mechanism.

So what is `address` in your case?

> `tempdir._get_default_tempdir` returns the project folder, i.e. `/home/mgunz/setups/2024-06-24--[redacted]`

Yeah, that's a problem. It should definitely not use that.

> ```
> >>> _candidate_tempdir_list()
> ['/tmp', '/tmp', '/var/tmp', '/usr/tmp', '/home/mgunz/setups/2024-06-24--[redacted]']
> ...
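Since the project folder is only the last entry of that candidate list, the earlier entries were presumably rejected. A quick way to check why is to try creating a file in each candidate. The sketch below is an assumption about how one might debug this, not what `tempfile` does internally line by line; `probe_writable` is a hypothetical helper name.

```python
import os
import tempfile


def probe_writable(dirs):
    """Return (directory, error) pairs; error is None if a file could be created."""
    results = []
    for d in dirs:
        try:
            # Create and immediately delete a real file in the directory.
            with tempfile.NamedTemporaryFile(dir=d):
                pass
            results.append((d, None))
        except OSError as exc:
            # Covers missing directories, permission errors, full disks, etc.
            results.append((d, exc))
    return results


if __name__ == "__main__":
    # Candidates taken from the output quoted above, plus the current directory.
    candidates = ["/tmp", "/var/tmp", "/usr/tmp", os.getcwd()]
    for d, err in probe_writable(candidates):
        status = "ok" if err is None else repr(err)
        print(f"{d}: {status}")
```

The error reported for `/tmp` (if any) should tell whether this is a permission, quota, or ulimit issue.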

> There is a race condition when the file cache removes the lock file during a run of the mtime update thread.

I'm not sure I understand. So you mean:...

> I think for simplicity I'll use a single, global lock first

But this could cause some hangs. The list of files could be large, and the main thread should...

Btw, how often does this problem occur? How many people are affected by this? I thought many people had already been using DistributeFilesDataset/FileCache for a while, and since the last fixes...

> What is the syntax for the comments that turn off the linter for a specific line?

Why? Where do you want to turn it off? I'm pretty sure there...