padatious

test_train_timeout_subprocess fails randomly

Open pgajdos opened this issue 6 years ago • 2 comments

Hello,

The test in question fails with roughly 30% probability in our build system. I have extracted the following test case:

import random
from time import monotonic

from padatious.intent_container import IntentContainer

# Two intents, each with 300 samples of five random space-separated letters.
cont = IntentContainer('temp')
cont.add_intent('a',
        [' '.join(random.choice('abcdefghijklmnopqrstuvwxyz') for _ in range(5))
            for __ in range(300)])
cont.add_intent('b',
        [' '.join(random.choice('abcdefghijklmnopqrstuvwxyz') for _ in range(5))
            for __ in range(300)])

# Time ten training attempts with a 0.1 s timeout.
for x in range(10):
    a = monotonic()
    # The assertion expects training not to complete within the timeout.
    assert not cont.train_subprocess(timeout=0.1)
    b = monotonic()
    print(b - a)  # elapsed wall-clock time in seconds

When I run it, I get, for example:

 0.47674093791283667
 0.5609202678315341
 0.5488572919275612
 6.474134984891862
 0.4769664751365781
 0.45290810498408973
 0.470392829971388
 0.4690805918071419
 0.46847033803351223
 0.4608854129910469
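
To make the slow run easier to spot, the timings above can be compared against their median. This is only a minimal sketch: the helper name find_outliers and the factor of 3 are arbitrary choices for illustration, not part of padatious or its test suite.

```python
from statistics import median

# Timings in seconds, copied from the output above.
timings = [
    0.47674093791283667,
    0.5609202678315341,
    0.5488572919275612,
    6.474134984891862,
    0.4769664751365781,
    0.45290810498408973,
    0.470392829971388,
    0.4690805918071419,
    0.46847033803351223,
    0.4608854129910469,
]


def find_outliers(values, factor=3.0):
    """Return (index, value) pairs whose value exceeds factor * median.

    The factor is a hypothetical threshold chosen for illustration.
    """
    mid = median(values)
    return [(i, v) for i, v in enumerate(values) if v > factor * mid]


for index, value in find_outliers(timings):
    print(f"run {index}: {value:.2f} s")
```

With these numbers only the fourth run (index 3) is flagged; the other nine runs stay close to 0.5 s.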

pgajdos avatar Apr 12 '19 10:04 pgajdos

Sorry for the late reply. Curious, what platform is this build system running on?

MatthewScholefield avatar Aug 16 '19 19:08 MatthewScholefield

It is 32-bit or 64-bit Linux. I do not remember much more; when I run it on a live system, I get the following output, verbatim:

$ python3 test.py
Some objects timed out while training
Took too long to train a
Took too long to train b
0.46342682399972546
Some objects timed out while training
0.5481992479999462
Some objects timed out while training
0.474773013000231
Some objects timed out while training
0.5743695310002295
Some objects timed out while training
0.4770706409999548
Some objects timed out while training
0.5607837820007262
Some objects timed out while training
Regenerated b.
Regenerated a.
6.3757639979994565
0.45383565700012696
0.4491451869998855
0.46560020400011126
$

Unfortunately, I do not understand the module or neural networks well enough to be sure I am not doing anything wrong. But the fact that the test fails in a certain percentage of runs seems to be correct. Currently, we are just skipping the test.

See build logs: 32-bit, 64-bit

Note that in the build logs the measured time is only slightly more than 1 s, so this might be a different issue from the one above. The build system may be slower than my live system. We can either assign a better worker to this task or skip the test entirely.
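
For the skipping option, a conditional skip would keep the test active on fast machines. The following is only a sketch using the standard unittest module: the environment variable SLOW_BUILD_WORKER and the class name are hypothetical, and the test body is a stand-in rather than the real padatious test.

```python
import os
import unittest

# Hypothetical environment variable; a build system would have to set it
# on workers known to be too slow for timing-sensitive tests.
SLOW_WORKER = os.environ.get("SLOW_BUILD_WORKER") == "1"


class TimingSensitiveTests(unittest.TestCase):
    @unittest.skipIf(SLOW_WORKER, "timing-sensitive test skipped on slow worker")
    def test_train_timeout_subprocess(self):
        # Stand-in body; the real test would train with a timeout
        # and check the elapsed time.
        self.assertTrue(True)


def run_suite():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TimingSensitiveTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("skipped:", len(result.skipped))
```

The skip condition is evaluated once, when the class is defined, so the variable has to be set before the test module is imported.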

What do you suggest?

pgajdos avatar Aug 21 '19 10:08 pgajdos