Results: 2 comments of LGLG42
Same situation here. I tried every option I could think of: gpu_memory_utilization from 0.2 to 0.9, enforce_eager, batch_size, a smaller model like "pretrained=facebook/opt-125m", max_model_len from 128 to 4096, etc. The only thing that worked was to hack manually...
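For anyone who wants to repeat that kind of sweep systematically, here is a minimal sketch that only builds the argument strings for each combination. It assumes the tool accepts comma-separated key=value pairs in the style of the "pretrained=facebook/opt-125m" string quoted above. The helper name `build_model_args` is hypothetical, and the intermediate values (0.5, 1024) are arbitrary picks inside the ranges mentioned. It does not launch anything.

```python
from itertools import product


def build_model_args(model, gpu_utils, max_lens, enforce_eager=True):
    """Return one comma-separated key=value string per combination.

    Hypothetical helper: the key names come from the comment above,
    the string format is an assumption about the calling tool.
    """
    combos = []
    for util, max_len in product(gpu_utils, max_lens):
        combos.append(
            f"pretrained={model},"
            f"gpu_memory_utilization={util},"
            f"max_model_len={max_len},"
            f"enforce_eager={enforce_eager}"
        )
    return combos


# Illustrative grid: endpoints from the comment, midpoints chosen arbitrarily.
grid = build_model_args(
    "facebook/opt-125m",
    gpu_utils=[0.2, 0.5, 0.9],
    max_lens=[128, 1024, 4096],
)

for args in grid:
    print(args)
```

Each printed line can then be tried one at a time, which makes it easier to report exactly which combinations fail.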
I've also been facing this issue for some time now. It makes remote notebooks unstable and has cost countless hours of lost work, since they're also not autosaved for some weird reason.