
[BUG] Infinite loop in `LazyRepeater` for non-re-iterable input.

Open chenjiasheng opened this issue 1 year ago • 2 comments

`LazyRepeater` is intended for a re-iterable iterable. If the input is not re-iterable, for example a generator created by a generator expression or by a function that uses the `yield` keyword, then iterating over the `LazyRepeater` misbehaves. When `times` is not specified, it hangs indefinitely after the first pass without yielding any further items. When `times > 1`, it yields fewer items than the user expects.

Here is a simple reproduction of the issue:

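from lhotse.lazy import LazyRepeater

# A generator expression can be consumed only once.
it = (x for x in range(10))
# The parameter is currently named `iterator` (see item 3 of the proposal below).
repeated = LazyRepeater(iterator=it, times=None, preserve_id=True)
for x in repeated:
    print(x)
# Prints the 10 numbers once, then hangs forever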
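
The same effect can be shown without lhotse. The sketch below uses a hypothetical stand-in, `naive_repeat`, which is not the library's implementation. It only assumes that a repeater re-iterates its input once per epoch, which is enough to show why a one-shot generator yields fewer items than expected when `times > 1`.

```python
def naive_repeat(iterable, times):
    """Hypothetical stand-in for a repeater: iterate over `iterable` `times` times."""
    for _ in range(times):
        # For a generator, the second and later passes find it already exhausted.
        yield from iterable


gen = (x for x in range(3))  # one-shot generator, not re-iterable
from_gen = list(naive_repeat(gen, times=2))

lst = [0, 1, 2]  # a list is re-iterable
from_list = list(naive_repeat(lst, times=2))

print(from_gen)   # [0, 1, 2]
print(from_list)  # [0, 1, 2, 0, 1, 2]
```

With `times=None`, the same loop would spin forever over the exhausted generator without yielding anything, which matches the hang described above.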

My proposed solution:

  1. Document in the docstring that the input iterable must be re-iterable and non-empty.
  2. At the start of each epoch, if fetching the very first item raises `StopIteration`, raise an exception stating that the input is empty (on the first epoch) or is not re-iterable (on the second and later epochs).
  3. Rename the parameter from `iterator` to `iterable` (although this may break backward compatibility).
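
Step 2 of the proposal could look roughly like the sketch below. It is a standalone illustration under stated assumptions, not a patch against lhotse: the function name `checked_repeat`, the choice of `ValueError`, and the error messages are all hypothetical, and the `preserve_id` handling of the real class is left out.

```python
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def checked_repeat(iterable: Iterable[T], times: Optional[int] = None) -> Iterator[T]:
    """Hypothetical repeater that fails loudly instead of hanging."""
    epoch = 0
    while times is None or epoch < times:
        it = iter(iterable)
        try:
            # Probe the first item so an empty epoch is detected immediately.
            first = next(it)
        except StopIteration:
            if epoch == 0:
                raise ValueError("The input iterable is empty.") from None
            raise ValueError(
                f"The input iterable yielded nothing in epoch {epoch}; "
                "it is probably not re-iterable (e.g. a generator)."
            ) from None
        yield first
        yield from it
        epoch += 1
```

With this check, the reproduction above would print its 10 numbers and then raise an error at the start of the second epoch, rather than hanging.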

chenjiasheng avatar Nov 25 '23 01:11 chenjiasheng

@oplatek @songmeixu @johnjosephmorgan @stachu86

chenjiasheng avatar Nov 25 '23 01:11 chenjiasheng

I am OK with your proposed solution. Could you make a PR?

pzelasko avatar Nov 30 '23 14:11 pzelasko