
RecursionError: maximum recursion depth exceeded while calling a Python object

Open kirilllzaitsev opened this issue 2 years ago • 1 comments

Hi, using accelerate 0.19.0 and torch 1.13.1+cu117, I encounter the following issue in a single-GPU environment at around the 740th epoch of training:

RecursionError: maximum recursion depth exceeded while calling a Python object

Below is part of the log where recursion starts to show up. There is nothing special about the 740th epoch, as the workflow is identical to all epochs before this one.

│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:521 in forward                                       │
│                                                                              │
│   518 │   model_forward = ConvertOutputsToFp32(model_forward)                │
│   519 │                                                                      │
│   520 │   def forward(*args, **kwargs):                                      │
│ ❱ 521 │   │   return model_forward(*args, **kwargs)                          │
│   522 │                                                                      │
│   523 │   # To act like a decorator so that it can be popped when doing `ext │
│   524 │   forward.__wrapped__ = model_forward                                │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:509 in __call__                                      │
│                                                                              │
│   506 │   │   update_wrapper(self, model_forward)                            │
│   507 │                                                                      │
│   508 │   def __call__(self, *args, **kwargs):                               │
│ ❱ 509 │   │   return convert_to_fp32(self.model_forward(*args, **kwargs))    │
│   510 │                                                                      │
│   511 │   def __getstate__(self):                                            │
│   512 │   │   raise pickle.PicklingError(                                    │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/torch/a │
│ mp/autocast_mode.py:14 in decorate_autocast                                  │
│                                                                              │
│    11 │   @functools.wraps(func)                                             │
│    12 │   def decorate_autocast(*args, **kwargs):                            │
│    13 │   │   with autocast_instance:                                        │
│ ❱  14 │   │   │   return func(*args, **kwargs)                               │
│    15 │   decorate_autocast.__script_unsupported = '@autocast() decorator is │
│    16 │   return decorate_autocast                                           │
│    17                                                                        │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:521 in forward                                       │
│                                                                              │
│   518 │   model_forward = ConvertOutputsToFp32(model_forward)                │
│   519 │                                                                      │
│   520 │   def forward(*args, **kwargs):                                      │
│ ❱ 521 │   │   return model_forward(*args, **kwargs)                          │
│   522 │                                                                      │
│   523 │   # To act like a decorator so that it can be popped when doing `ext │
│   524 │   forward.__wrapped__ = model_forward                                │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:509 in __call__                                      │
│                                                                              │
│   506 │   │   update_wrapper(self, model_forward)                            │
│   507 │                                                                      │
│   508 │   def __call__(self, *args, **kwargs):                               │
│ ❱ 509 │   │   return convert_to_fp32(self.model_forward(*args, **kwargs))    │
│   510 │                                                                      │
│   511 │   def __getstate__(self):                                            │
│   512 │   │   raise pickle.PicklingError(                                    │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/torch/a │
│ mp/autocast_mode.py:14 in decorate_autocast                                  │
│                                                                              │
│    11 │   @functools.wraps(func)                                             │
│    12 │   def decorate_autocast(*args, **kwargs):                            │
│    13 │   │   with autocast_instance:                                        │
│ ❱  14 │   │   │   return func(*args, **kwargs)                               │
│    15 │   decorate_autocast.__script_unsupported = '@autocast() decorator is │
│    16 │   return decorate_autocast                                           │
│    17                                                                        │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:521 in forward                                       │
│                                                                              │
│   518 │   model_forward = ConvertOutputsToFp32(model_forward)                │
│   519 │                                                                      │
│   520 │   def forward(*args, **kwargs):                                      │
│ ❱ 521 │   │   return model_forward(*args, **kwargs)                          │
│   522 │                                                                      │
│   523 │   # To act like a decorator so that it can be popped when doing `ext │
│   524 │   forward.__wrapped__ = model_forward                                │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:509 in __call__                                      │
│                                                                              │
│   506 │   │   update_wrapper(self, model_forward)                            │
│   507 │                                                                      │
│   508 │   def __call__(self, *args, **kwargs):                               │
│ ❱ 509 │   │   return convert_to_fp32(self.model_forward(*args, **kwargs))    │
│   510 │                                                                      │
│   511 │   def __getstate__(self):                                            │
│   512 │   │   raise pickle.PicklingError(                                    │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/torch/a │
│ mp/autocast_mode.py:14 in decorate_autocast                                  │
│                                                                              │
│    11 │   @functools.wraps(func)                                             │
│    12 │   def decorate_autocast(*args, **kwargs):                            │
│    13 │   │   with autocast_instance:                                        │
│ ❱  14 │   │   │   return func(*args, **kwargs)                               │
│    15 │   decorate_autocast.__script_unsupported = '@autocast() decorator is │
│    16 │   return decorate_autocast                                           │
│    17                                                                        │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:521 in forward                                       │
│                                                                              │
│   518 │   model_forward = ConvertOutputsToFp32(model_forward)                │
│   519 │                                                                      │
│   520 │   def forward(*args, **kwargs):                                      │
│ ❱ 521 │   │   return model_forward(*args, **kwargs)                          │
│   522 │                                                                      │
│   523 │   # To act like a decorator so that it can be popped when doing `ext │
│   524 │   forward.__wrapped__ = model_forward                                │
│                                                                              │
│ ~/venvs/ssdc/lib64/python3.10/site-packages/acceler │
│ ate/utils/operations.py:509 in __call__                                      │
│                                                                              │
│   506 │   │   update_wrapper(self, model_forward)                            │
│   507 │                                                                      │
│   508 │   def __call__(self, *args, **kwargs):                               │
│ ❱ 509 │   │   return convert_to_fp32(self.model_forward(*args, **kwargs))    │
│   510 │                                                                      │
│   511 │   def __getstate__(self):                                            │
│   512 │   │   raise pickle.PicklingError(       
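The frames above repeat in a fixed cycle (`forward` → `ConvertOutputsToFp32.__call__` → `decorate_autocast` → `forward` → ...), which is the pattern you would see if `model.forward` had been wrapped again and again, for example once per epoch. That cause is only a guess and is not confirmed anywhere in this thread. The sketch below uses plain Python with no accelerate or torch calls; `wrap_forward`, `TinyModel`, `call_after_wraps`, and `wrap_depth` are hypothetical names made up for illustration. It shows how each extra wrap adds one call frame until the interpreter's recursion limit is hit, and how the `__wrapped__` attribute (which the traceback shows being set on `forward`) can be walked to count the layers.

```python
import functools
import sys


def wrap_forward(fn):
    """Hypothetical stand-in for ONE layer of forward wrapping."""
    @functools.wraps(fn)  # sets forward.__wrapped__ = fn
    def forward(*args, **kwargs):
        return fn(*args, **kwargs)
    return forward


class TinyModel:
    """Toy model; not a torch.nn.Module."""
    def forward(self, x):
        return x + 1


def call_after_wraps(n_wraps, x=0):
    """Wrap forward n_wraps times (e.g. once per epoch), then call it."""
    model = TinyModel()
    for _ in range(n_wraps):
        model.forward = wrap_forward(model.forward)
    return model.forward(x)


def wrap_depth(fn):
    """Count wrapper layers by following the __wrapped__ chain."""
    depth = 0
    while hasattr(fn, "__wrapped__"):
        fn = fn.__wrapped__
        depth += 1
    return depth


if __name__ == "__main__":
    # A few wraps are harmless: the result is unchanged.
    print(call_after_wraps(10))

    # Enough wraps make a single call deeper than the recursion limit.
    try:
        call_after_wraps(sys.getrecursionlimit() + 100)
    except RecursionError as err:
        print("RecursionError:", err)
```

If something similar is happening in the real training loop, logging a value like `wrap_depth(model.forward)` at the start of each epoch should show it growing over time. A constant depth would rule this explanation out.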

kirilllzaitsev · Jun 03 '23 07:06

It's going to be hard to help you without any reproducer.

sgugger · Jun 05 '23 15:06

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Jul 03 '23 15:07