
Cache incorrectly initialized when no space left on device

Open ahendriksen opened this issue 2 years ago • 1 comment

Describe the bug

The volume hosting my .cache directory was almost full, so building the cache failed (see Backtrace 0). After I freed up space and reran nixglhost, it failed again, this time with a different error (see Backtrace 1). I had expected it to rebuild the cache first.

To resolve, I deleted the cache and everything worked fine again:

$ rm -rf .cache/nix-gl-host
$ rm -rf .cache/nix-gl-host.lock
$ nixglhost -p
$ nixglhost nvidia-smi

Backtrace 0

$ nixglhost nvidia-smi                                                                                                                                                                                                                                             
Traceback (most recent call last):                                                                                                                                                                                                                                              
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 815, in move                                                                                                                                                                 
    os.rename(src, real_dst)                                                                                                                                                                                                                                                    
OSError: [Errno 18] Invalid cross-device link: '/tmp/tmpdf59_vtp/nix-gl-host' -> '/home/ahendriksen/.cache/nix-gl-host'                                                                                                                                                         
                                                                                                                                                                                                                                                                                
During handling of the above exception, another exception occurred:                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                
Traceback (most recent call last):                                                                                                                                                                                                                                              
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 680, in <module>                                                                                                                                                                       
    ret = main(args)                                                                                                                                                                                                                                                            
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 628, in main                                                                                                                                                                           
    new_env = nvidia_main(cache_dir, host_dsos_paths, args.print_ld_library_path)                                                                                                                                                                                               
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 576, in nvidia_main                                                                                                                                                                    
    shutil.move(tmp_cache_dir, os.path.split(cache_dir)[0])                                                                                                                                                                                                                     
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 831, in move                                                                                                                                                                 
    copytree(src, real_dst, copy_function=copy_function,                                                                                                                                                                                                                        
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 558, in copytree                                                                                                                                                             
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,                                                                                                                                                                                                      
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 512, in _copytree                                                                                                                                                            
    raise Error(errors)                                                                                                                                                                                                                                                         
shutil.Error: [('/tmp/tmpdf59_vtp/nix-gl-host/[.. snip ..]', 
'/home/ahendriksen/.cache/nix-gl-host/[.. snip ..]', 
'[Errno 28] No space left on device'), 
[.. snip more such tuples ..] ]   

Backtrace 1

[nix-shell:~]$ nixglhost nvidia-smi                                            
Traceback (most recent call last):                                             
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 680, in <module>                                                       
    ret = main(args)                                                                                                                                               
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 628, in main                                                                                                                                         
    new_env = nvidia_main(cache_dir, host_dsos_paths, args.print_ld_library_path)                                                                                                                                                  
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 583, in nvidia_main
    assert nix_gl_ld_library_path, "The nix-host-gl LD_LIBRARY_PATH is not set"
AssertionError: The nix-host-gl LD_LIBRARY_PATH is not set                                                                                                      

To Reproduce

I guess this can be reproduced by limiting the space available on the volume hosting the .cache directory and running nixglhost -p twice.

Expected behavior

I had expected nixglhost to rebuild the cache successfully on the second run of nixglhost -p (after I had made enough space).
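
For illustration only, here is a minimal sketch of one way a half-written cache could be avoided. It is not the actual nixglhost code. Backtrace 0 shows the cache being built under /tmp and then moved with shutil.move, which falls back to a file-by-file copy across devices and can fail partway through. The sketch instead builds the cache in a temporary directory next to the final location, so that a plain os.rename normally suffices, and removes the temporary directory if anything goes wrong. The names install_cache_atomically and populate are hypothetical, and the sketch assumes cache_dir does not exist yet.

```python
import os
import shutil
import tempfile


def install_cache_atomically(cache_dir, populate):
    """Build the cache in a sibling temp dir, then rename it into place.

    Hypothetical helper, not part of nixglhost. `populate` is an assumed
    callback that writes the cache contents into the directory it is given.
    Assumes `cache_dir` does not exist yet.
    """
    parent = os.path.dirname(os.path.abspath(cache_dir))
    os.makedirs(parent, exist_ok=True)
    # A sibling directory normally lives on the same filesystem as the
    # destination, so os.rename should not hit "Invalid cross-device link".
    tmp_dir = tempfile.mkdtemp(prefix=".nix-gl-host-tmp-", dir=parent)
    try:
        populate(tmp_dir)
        # The final path only appears once the cache is fully populated.
        os.rename(tmp_dir, cache_dir)
    except BaseException:
        # On any failure (e.g. ENOSPC), leave no partial cache behind.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
```

With this approach, a run that fails with "No space left on device" leaves no cache directory at the final path, so a later run would see a missing cache and rebuild it rather than trusting an incomplete one.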

Additional context

Thanks for developing this tool! I have only run it once and I already have the feeling it will be invaluable!

ahendriksen, Apr 21 '23 14:04

Haha, I did not see that one coming. I see what's happening here.

Thanks for the detailed bug report! I need to try to find a way to test that.


Note to future self: to reproduce, create a small tmpfs for the cache dir and pair it with a large tmpdir.

[Edit]: thanks for the kind words btw!

picnoir, Apr 22 '23 06:04