Speed
Here is a profile of a run of DFlowData.clean_data() plus all of the methods in WalkingData (including the inverse dynamics computations).
7641430 function calls (7640347 primitive calls) in 181.271 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
320218 165.994 0.001 165.994 0.001 {method 'readline' of 'file' objects}
3 2.709 0.903 2.709 0.903 {method 'read' of 'pandas.parser.TextReader' objects}
36844 1.017 0.000 1.017 0.000 {pandas.lib.maybe_convert_objects}
19 0.879 0.046 168.193 8.852 session.py:494(evaluate)
36839 0.816 0.000 4.204 0.000 series.py:3129(interpolate)
37612 0.574 0.000 0.643 0.000 common.py:134(_isnull_ndarraylike)
120 0.530 0.004 0.532 0.004 fitpack2.py:441(__init__)
41185 0.385 0.000 0.385 0.000 {method 'reduce' of 'numpy.ufunc' objects}
320229 0.331 0.000 0.331 0.000 {_codecs.utf_8_decode}
320226 0.284 0.000 0.777 0.000 {method 'decode' of 'str' objects}
268534 0.267 0.000 0.485 0.000 {method 'view' of 'numpy.ndarray' objects}
37998 0.258 0.000 0.851 0.000 series.py:430(__new__)
37612 0.238 0.000 0.988 0.000 common.py:64(_isnull_new)
951420/951418 0.226 0.000 0.226 0.000 {isinstance}
38160 0.208 0.000 0.209 0.000 {method 'copy' of 'numpy.ndarray' objects}
111799 0.208 0.000 0.426 0.000 index.py:332(__getitem__)
36642 0.203 0.000 1.341 0.000 frame.py:1928(_ixs)
320110 0.185 0.000 0.297 0.000 __init__.py:1343(isEnabledFor)
98213 0.169 0.000 0.169 0.000 {numpy.core.multiarray.array}
320223 0.162 0.000 0.493 0.000 utf_8.py:15(decode)
320110 0.154 0.000 0.451 0.000 __init__.py:1128(debug)
151206 0.133 0.000 0.193 0.000 numeric.py:1810(isscalar)
38175 0.132 0.000 0.298 0.000 series.py:3299(_sanitize_array)
801 0.130 0.000 0.130 0.000 {pandas.algos.take_2d_axis0_float64_float64}
164141/164139 0.127 0.000 0.129 0.000 {getattr}
1 0.119 0.119 0.277 0.277 __init__.py:20(<module>)
1246 0.117 0.000 0.117 0.000 {numpy.core.multiarray.concatenate}
36839 0.112 0.000 4.317 0.000 frame.py:4389(<lambda>)
320110 0.112 0.000 0.112 0.000 __init__.py:1329(getEffectiveLevel)
197 0.110 0.001 6.391 0.032 frame.py:4433(_apply_standard)
36642 0.098 0.000 0.647 0.000 internals.py:1681(iget)
76780 0.095 0.000 0.108 0.000 index.py:318(__contains__)
75404 0.092 0.000 0.213 0.000 series.py:532(__array_finalize__)
I think using readline() to pull the events from the record files is a major slowdown. Also, all of the calls to session.py are oct2py calls, which are slow; I believe the slow parts there are the inverse dynamics real time filter and soder.m.
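Since soder.m runs through oct2py, a pure-NumPy port would remove those round trips for that piece. Here's a sketch, under the assumption that soder.m implements the Söderkvist & Wedin (1993) SVD method for the least-squares rigid body transformation between two marker clouds (function name and signature are mine, not the toolkit's):

```python
import numpy as np

def soder(A, B):
    """Find rotation R and translation d minimizing ||(R @ A.T).T + d - B||,
    given marker positions A (n x 3) in one frame and B (n x 3) in another.
    Sketch of the Soderkvist & Wedin SVD method, assumed to match soder.m."""
    A_mean = A.mean(axis=0)
    B_mean = B.mean(axis=0)
    # Cross-dispersion matrix of the centered marker clouds.
    C = (B - B_mean).T @ (A - A_mean)
    U, _, Vt = np.linalg.svd(C)
    # Force a proper rotation (det = +1) rather than a reflection.
    R = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    d = B_mean - R @ A_mean
    return R, d
```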
Calling .readlines() on a file, as is done for the event finding, is fast. I'm not sure where all of the .readline() calls are coming from.
In [7]: %time df = pandas.read_csv('/home/moorepants/Data/human-gait/gait-control-identification/T006/mocap-006.txt', delimiter='\t')
CPU times: user 1.49 s, sys: 60.8 ms, total: 1.55 s
Wall time: 1.55 s
Reading in the files with pandas is fast, and profiling that call doesn't show any readline() calls.
It could be this readline() call in oct2py/session.py:evaluate (line 551):
line = self.proc.stdout.readline().rstrip().decode('utf-8')
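That would fit the numbers: each evaluate() round-trips through the Octave subprocess and reads its stdout back one line at a time, so the readline() time is really interprocess latency. If that's the cause, batching several statements into one evaluate call should help. A sketch of the difference, using a plain echoing subprocess in place of Octave (the evaluate() helper below mimics the session.py pattern but is not the oct2py API):

```python
import subprocess
import sys
import time

# A child process that echoes each stdin line back, standing in for the
# Octave session behind oct2py.
child = subprocess.Popen(
    [sys.executable, "-u", "-c",
     "import sys\n"
     "for line in sys.stdin:\n"
     "    sys.stdout.write(line)\n"
     "    sys.stdout.flush()"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

def evaluate(cmd):
    # One write plus one blocking readline() per call, as in
    # session.py:evaluate.
    child.stdin.write(cmd + "\n")
    child.stdin.flush()
    return child.stdout.readline().rstrip()

start = time.time()
results = [evaluate("x = %d" % i) for i in range(100)]
per_call_time = time.time() - start

# Batched: send all the commands at once and read the replies together,
# paying the round-trip latency once instead of 100 times.
start = time.time()
child.stdin.write("".join("y = %d\n" % i for i in range(100)))
child.stdin.flush()
batched = [child.stdout.readline().rstrip() for _ in range(100)]
per_batch_time = time.time() - start

child.stdin.close()
child.wait()
```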
Here's profiling output of essentially just the DFlowData.clean_data() call:
1317864 function calls (1317260 primitive calls) in 62.622 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
128105 56.725 0.000 56.725 0.000 {method 'readline' of 'file' objects}
3 2.573 0.858 2.573 0.858 {method 'read' of 'pandas.parser.TextReader' objects}
120 0.560 0.005 0.562 0.005 fitpack2.py:441(__init__)
10 0.336 0.034 57.587 5.759 session.py:494(evaluate)
379 0.210 0.001 0.212 0.001 common.py:134(_isnull_ndarraylike)
128153 0.122 0.000 0.122 0.000 {_codecs.utf_8_decode}
128152 0.104 0.000 0.286 0.000 {method 'decode' of 'str' objects}
2 0.102 0.051 0.102 0.051 {scipy.signal.sigtools._linear_filter}
235 0.096 0.000 0.096 0.000 {method 'copy' of 'numpy.ndarray' objects}
835 0.091 0.000 0.091 0.000 {method 'reduce' of 'numpy.ufunc' objects}
182 0.088 0.000 0.088 0.000 {numpy.core.multiarray.putmask}
5 0.077 0.015 0.077 0.015 {pandas.algos.take_2d_axis1_float64_float64}
670 0.076 0.000 0.088 0.000 index.py:332(__getitem__)
272 0.072 0.000 0.168 0.001 series.py:709(_get_values)
120 0.070 0.001 0.070 0.001 {scipy.interpolate._fitpack._spl_}
128037 0.068 0.000 0.111 0.000 __init__.py:1343(isEnabledFor)
6 0.065 0.011 0.065 0.011 {pandas.algos.take_2d_axis0_float64_float64}
128150 0.061 0.000 0.183 0.000 utf_8.py:15(decode)
128037 0.057 0.000 0.168 0.000 __init__.py:1128(debug)
1 0.053 0.053 0.133 0.133 __init__.py:20(<module>)
2 0.048 0.024 0.054 0.027 indexing.py:100(_setitem_with_indexer)
13 0.044 0.003 0.044 0.003 {numpy.core.multiarray.concatenate}
128037 0.043 0.000 0.043 0.000 __init__.py:1329(getEffectiveLevel)
6 0.041 0.007 0.042 0.007 internals.py:2368(_stack_arrays)
90 0.039 0.000 0.072 0.001 series.py:719(where)
4 0.031 0.008 0.031 0.008 {pandas.algos.take_2d_axis0_bool_bool}
128106 0.029 0.000 0.029 0.000 {method 'rstrip' of 'str' objects}
61 0.028 0.000 0.028 0.000 {method 'write' of 'file' objects}
46 0.027 0.001 0.027 0.001 {method 'join' of 'str' objects}
1 0.025 0.025 0.035 0.035 table.py:13(<module>)
109 0.022 0.000 0.022 0.000 {method 'take' of 'numpy.ndarray' objects}
138790 0.020 0.000 0.020 0.000 {method 'append' of 'list' objects}
1 0.020 0.020 0.020 0.020 {pandas.algos.diff_2d_float64}
Pretty sure the slowdowns are all from the Octave code.
See issue #30 for details on the rtfilter.
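If the filtering can be done offline in Python rather than through Octave, the round trips disappear for that step too. A sketch, assuming the rtfilter is a low-pass Butterworth (the order and cutoff below are made up, and note filtfilt is zero-phase forward-backward filtering, not a causal real-time filter):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(data, cutoff_hz, sample_rate_hz, order=2):
    # Design a low-pass Butterworth and apply it forward and backward
    # (zero phase lag), which only suits offline processing.
    nyquist = sample_rate_hz / 2.0
    b, a = butter(order, cutoff_hz / nyquist)
    return filtfilt(b, a, data, axis=0)

# Illustrative usage: smooth a noisy 2 Hz sine sampled at ~100 Hz.
rng = np.random.RandomState(0)
time = np.linspace(0.0, 1.0, 100)
clean = np.sin(2.0 * np.pi * 2.0 * time)
noisy = clean + 0.1 * rng.randn(100)
smooth = lowpass(noisy, cutoff_hz=6.0, sample_rate_hz=100.0)
```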