
Performance issues for large AutodiffCompositions

Open • tylergiallanza opened this issue 4 years ago • 1 comment

I'm encountering performance issues with the first call to an AutodiffComposition's learn method. The time taken by that first call seems to depend strongly on the size of the mechanisms in the composition, and it scales super-linearly. Consider this code snippet, adapted from the documentation:

    import time
    import numpy as np
    import psyneulink as pnl

    # Layer sizes; I varied these values (see the table below)
    s1, s2 = 100, 200

    print(f's1={s1},s2={s2}')
    start_t = time.time()
    my_mech_1 = pnl.TransferMechanism(function=pnl.Linear, size=s1)
    my_mech_2 = pnl.TransferMechanism(function=pnl.Linear, size=s2)
    my_projection = pnl.MappingProjection(matrix=np.random.randn(s1, s2),
                                          sender=my_mech_1,
                                          receiver=my_mech_2)
    # Create AutodiffComposition
    my_autodiff = pnl.AutodiffComposition()
    my_autodiff.add_node(my_mech_1)
    my_autodiff.add_node(my_mech_2)
    my_autodiff.add_projection(sender=my_mech_1, projection=my_projection, receiver=my_mech_2)
    print(f'  Time to create: {time.time()-start_t}')

    # Random binary input and target patterns (10 trials each)
    in_patterns = np.random.binomial(1, .1, size=(10, s1))
    out_patterns = np.random.binomial(1, .2, size=(10, s2))

    start_t = time.time()
    k = my_autodiff.learn(inputs={'inputs': {my_mech_1: in_patterns},
                                  'targets': {my_mech_2: out_patterns}})
    print(f"  First run's time: {time.time()-start_t}")

    start_t = time.time()
    k = my_autodiff.learn(inputs={'inputs': {my_mech_1: in_patterns},
                                  'targets': {my_mech_2: out_patterns}})
    print(f"  Second run's time: {time.time()-start_t}")

I ran this code for different values of s1 and s2, yielding this:

| s1  | s2  | time to create composition | time for first call to learn | time for second call to learn |
|-----|-----|----------------------------|------------------------------|-------------------------------|
| 1   | 2   | 0.5 sec                    | 1 sec                        | 0.03 sec                      |
| 10  | 20  | 0.5 sec                    | 1 sec                        | 0.03 sec                      |
| 100 | 200 | 0.8 sec                    | 8 sec                        | 0.03 sec                      |
| 200 | 400 | 1.8 sec                    | 23 sec                       | 0.03 sec                      |
| 200 | 800 | 3.4 sec                    | 85 sec                       | 0.03 sec                      |

I also ran s1=200, s2=1600. The first call to learn ran for at least 5 minutes before I stopped execution, by which point Python had allocated 3.5 GB of memory.
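
For reference, here is a minimal sketch of a loop that reproduces the measurements above; the time_learn helper name and the print formatting are illustrative additions, while the PsyNeuLink calls are unchanged from the snippet at the top of this issue:

    import time
    import numpy as np
    import psyneulink as pnl

    def time_learn(s1, s2, n_patterns=10):
        # Build the two-mechanism AutodiffComposition from the snippet above
        # and time its creation plus two calls to learn().
        t0 = time.time()
        mech_1 = pnl.TransferMechanism(function=pnl.Linear, size=s1)
        mech_2 = pnl.TransferMechanism(function=pnl.Linear, size=s2)
        proj = pnl.MappingProjection(matrix=np.random.randn(s1, s2),
                                     sender=mech_1, receiver=mech_2)
        comp = pnl.AutodiffComposition()
        comp.add_node(mech_1)
        comp.add_node(mech_2)
        comp.add_projection(sender=mech_1, projection=proj, receiver=mech_2)
        create_t = time.time() - t0

        inputs = {'inputs': {mech_1: np.random.binomial(1, .1, size=(n_patterns, s1))},
                  'targets': {mech_2: np.random.binomial(1, .2, size=(n_patterns, s2))}}

        t0 = time.time()
        comp.learn(inputs=inputs)   # first call: slow, scales with mechanism size
        first_t = time.time() - t0

        t0 = time.time()
        comp.learn(inputs=inputs)   # second call: fast and roughly constant
        second_t = time.time() - t0
        return create_t, first_t, second_t

    for s1, s2 in [(1, 2), (10, 20), (100, 200), (200, 400), (200, 800)]:
        create_t, first_t, second_t = time_learn(s1, s2)
        print(f'{s1:>4} {s2:>4}  create={create_t:.2f}s  first={first_t:.2f}s  second={second_t:.2f}s')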

tylergiallanza · Mar 18, 2021

Thanks for the report. This is addressed in #1963, which in my case brings the numbers down to:

| s1  | s2   | time to create composition | time for first call to learn | time for second call to learn |
|-----|------|----------------------------|------------------------------|-------------------------------|
| 200 | 800  | 0.7 sec                    | 1.6 sec                      | 0.03 sec                      |
| 400 | 1600 | 0.70 sec                   | 2.4 sec                      | 0.04 sec                      |
| 400 | 3200 | 0.78 sec                   | 9.1 sec                      | 0.07 sec                      |
| 800 | 3200 | 0.98 sec                   | 15.8 sec                     | 0.1 sec                       |

I'm not sure what the expected scaling of learn is, though, and any test of 100/4800+ is killed by the OS, possibly due to memory.
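
As a rough check on the observed scaling, one could fit a power law to the first-call timings reported above. This is only a sketch: the assumption that the cost grows with the projection-matrix size s1*s2 is mine, not something established in this thread, and the timings are simply copied from the table.

    import numpy as np

    # Fit a power law t ~ c * (s1*s2)**k to the first-call timings above
    # and report the estimated exponent k.
    sizes = np.array([200 * 800, 400 * 1600, 400 * 3200, 800 * 3200], dtype=float)
    times = np.array([1.6, 2.4, 9.1, 15.8])

    k, log_c = np.polyfit(np.log(sizes), np.log(times), 1)
    print(f'estimated exponent: t ~ (s1*s2)^{k:.2f}')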

kmantel · Mar 20, 2021