
Need to rewind Generators that are passed as argument

Open heri16 opened this issue 8 years ago • 8 comments

Found an issue while using tenacity: for a wrapped function that receives a generator as an argument, the generator is not rewound before each retry attempt, so subsequent attempts skip the items already consumed.

import random
from tenacity import retry

@retry
def do_something_unreliable(generator):
    for i in generator:
        if random.randint(0, 10) > 1:
            raise IOError("Broken sauce, everything is hosed!!!111one")
        else:
            return i

def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

generator = firstn(100)
print(do_something_unreliable(generator))

Not sure whether this can be solved within the core of this library.
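The failure mode can be reproduced without tenacity at all: once a generator has been consumed (even partially), a later loop over it yields nothing, so any retry that reuses the same generator object starts exhausted. A minimal sketch:

```python
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

gen = firstn(3)
print(list(gen))  # first "attempt" consumes the generator: [0, 1, 2]
print(list(gen))  # a "retry" sees an exhausted generator: []
```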

heri16 avatar Apr 23 '17 23:04 heri16

Not sure whether the solution here should belong in the library. http://stackoverflow.com/questions/1271320/reseting-generator-object-in-python

It would be good for retry to cater for this kind of side effect.

heri16 avatar Apr 24 '17 00:04 heri16

Seems to be affecting simple iterators too.

heri16 avatar Apr 24 '17 00:04 heri16

It's unreasonable to even try with the infrastructure available in Python.

RonnyPfannschmidt avatar Apr 24 '17 05:04 RonnyPfannschmidt

The only solution I see is to consume the iterator before ever making the first call, which sounds dangerous. The best thing is probably to document that the caller is responsible for not passing generators.
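The caller-side pattern suggested here can be sketched without tenacity: pass a zero-argument factory instead of a generator object, so each attempt builds a fresh generator. The `retry_with_fresh_generator` helper below is hypothetical, a stand-in for any retry loop:

```python
import random

def retry_with_fresh_generator(func, make_generator, attempts=500):
    # Hypothetical helper: each attempt gets a brand-new generator
    # from the factory, so no attempt sees a half-consumed one.
    last_exc = None
    for _ in range(attempts):
        try:
            return func(make_generator())
        except IOError as exc:
            last_exc = exc
    raise last_exc

def do_something_unreliable(generator):
    for i in generator:
        if random.randint(0, 10) > 1:
            raise IOError("Broken sauce!")
        return i

def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

# Pass a factory (lambda), not a generator object.
print(retry_with_fresh_generator(do_something_unreliable, lambda: firstn(100)))
```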

jd avatar Apr 24 '17 08:04 jd

Cloning an iterator seems possible in Python. http://stackoverflow.com/questions/20086679/python-itertools-tee-clones-and-caching

import itertools

# Side effect of Retrying is generator is not rewound on every retry attempt.
# Workaround when Retrying does not work with iterators as arguments.
class Wrapper(object):
    def __init__(self, f):
        self.f = f
        self.backup = None

    def __call__(self, iterator, *args, **kwargs):
        # Tee off a backup copy before each call so a retry can replay it.
        iterator, self.backup = itertools.tee(
            self.backup if self.backup else iterator)
        return self.f(iterator, *args, **kwargs)

retrying.call(Wrapper(grpc_stub.StreamFileContent), chunk_generator_object)
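The tee-based replay that the `Wrapper` above relies on can be demonstrated in isolation; `itertools.tee` buffers items consumed from one branch so the other branch can yield them again (at the cost of keeping them in memory). A dependency-free sketch:

```python
import itertools

def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

backup = None

def get_fresh(iterator):
    # Mirrors the Wrapper: tee off a backup copy before each use,
    # replaying from the previous backup on later calls.
    global backup
    iterator, backup = itertools.tee(backup if backup is not None else iterator)
    return iterator

src = firstn(3)
print(list(get_fresh(src)))  # first attempt: [0, 1, 2]
print(list(get_fresh(src)))  # "retry" replays from the backup: [0, 1, 2]
```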

heri16 avatar Apr 25 '17 08:04 heri16

We could have a clone=True flag in tenacity.Retrying.__init__ that turns on argument checking (only first-level) during call().

need_cloning = [arg for arg in args if inspect.isgenerator(arg)]

Or we could have a tee=[list of kwargs keys] that avoids the performance penalty of inspect.
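The first-level argument check suggested here could look something like the following sketch; `clone_generator_args` is a hypothetical helper, not part of tenacity:

```python
import inspect
import itertools

def clone_generator_args(args):
    # Hypothetical first-level check: tee any generator argument so a
    # saved copy could be replayed on a later retry attempt.
    cloned = []
    backups = []
    for arg in args:
        if inspect.isgenerator(arg):
            arg, backup = itertools.tee(arg)
            backups.append(backup)
        cloned.append(arg)
    return tuple(cloned), backups

def gen():
    yield from range(3)

args, backups = clone_generator_args((gen(), "plain", 42))
print(list(args[0]))     # [0, 1, 2]
print(list(backups[0]))  # replayable copy: [0, 1, 2]
print(args[1:])          # non-generator args untouched: ('plain', 42)
```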

heri16 avatar Apr 25 '17 09:04 heri16

It'd probably be better to have a flag, because iterators can have different side-effect behaviour, so it might do weird stuff otherwise.

jd avatar Apr 25 '17 09:04 jd

I realise I'm probably just making a can of worms worse, but if you have a feature to rewind a generator before retrying, should you also have a feature to record the position a seekable file-like object was in before the first attempt, and seek back to that before each further attempt? I'm sure we could find other examples of important stateful objects in core Python that could get special treatment.

Ultimately I think the "usual" semantics, if you put retry on a function with side effects, are that those side effects will occur multiple times if the function is retried (and why would you even need retry if the function doesn't have some sort of side effects!). Consuming a generator is one kind of side effect, but there are many. I 100% agree that any other behaviour (even something obviously needed for the retry to have a chance of success, such as rewinding a generator) should be opt-in, not the default.
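The seek-back idea above can be sketched outside the library; `retry_seekable` is a hypothetical helper that records `tell()` before the first attempt and restores it with `seek()` before each retry:

```python
import io

def retry_seekable(func, fileobj, attempts=3):
    # Hypothetical opt-in behaviour: remember the starting position of a
    # seekable file-like object and restore it before each further attempt.
    start = fileobj.tell() if fileobj.seekable() else None
    last_exc = None
    for attempt in range(attempts):
        if start is not None and attempt > 0:
            fileobj.seek(start)  # rewind before retrying
        try:
            return func(fileobj)
        except IOError as exc:
            last_exc = exc
    raise last_exc

calls = []

def flaky_read(f):
    data = f.read()
    calls.append(data)
    if len(calls) < 2:
        raise IOError("transient")
    return data

f = io.StringIO("payload")
print(retry_seekable(flaky_read, f))  # "payload"
print(calls)  # both attempts saw the same bytes: ['payload', 'payload']
```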

steve-mavens avatar May 05 '22 08:05 steve-mavens