MemoryError on computing checksums for large files

Open oroszgy opened this issue 4 months ago • 2 comments

When a command depends on a file that is too large to fit into memory, weasel still tries to load the whole file to compute its checksum, which results in a MemoryError.

The traceback for such a run:


  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/bin/weasel", line 8, in <module>
    sys.exit(app())
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/cli/run.py", line 42, in project_run_cli
    project_run(
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/cli/run.py", line 88, in project_run
    project_run(
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/cli/run.py", line 113, in project_run
    update_lockfile(current_dir, cmd)
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/cli/run.py", line 270, in update_lockfile
    data[command["name"]] = get_lock_entry(project_dir, command)
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/cli/run.py", line 286, in get_lock_entry
    deps = get_fileinfo(project_dir, command.get("deps", []))
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/cli/run.py", line 308, in get_fileinfo
    md5 = get_checksum(file_path) if file_path.exists() else None
  File "/home/gorosz/workspace/ascend/invoice-ai-models/.venv/lib/python3.10/site-packages/weasel/util/hashing.py", line 33, in get_checksum
    return hashlib.md5(Path(path).read_bytes()).hexdigest()
  File "/home/gorosz/Applications/miniconda3/lib/python3.10/pathlib.py", line 1127, in read_bytes
    return f.read()

MemoryError
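
The last frames show the whole file being read with `Path(path).read_bytes()` before it is passed to `hashlib.md5`. One possible way around this, sketched here as an assumption rather than as weasel's actual fix, is to feed the hash in fixed-size chunks so memory use stays bounded regardless of file size. The function name `get_checksum_chunked` and the `chunk_size` parameter are hypothetical and do not exist in weasel, and the sketch handles single files only.

```python
import hashlib
from pathlib import Path
from typing import Union


def get_checksum_chunked(path: Union[str, Path], chunk_size: int = 1024 * 1024) -> str:
    """Hypothetical helper (not part of weasel): MD5 of a single file.

    Reads the file in blocks of `chunk_size` bytes, so at most one block
    is held in memory at a time.
    """
    md5 = hashlib.md5()
    with Path(path).open("rb") as f:
        # f.read() returns b"" at end of file, which stops the iterator.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()
```

Updating the hash incrementally yields the same digest as hashing the full contents in one call, so checksums already computed from the whole file's bytes should still match.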

oroszgy, Feb 21 '24 11:02