
Speed up inverse dynamics (Python version)

moorepants opened this issue on Mar 27 '15 · 14 comments

Addresses issues #30 and some of #49.

The Octave file leg2d.m was quite slow. It takes 5 to 6 minutes on my machine to compute the inverse dynamics of an 8 minute (100 Hz) chunk of data. This has bothered me for a long time because it takes hours to process all of our data when it shouldn't. So I decided to finally fix it.

I decided not to spend time improving the performance of leg2d.m and its dependent functions because I'd rather purge the Octave stuff from this library (the Octave/Python bridge has too much overhead and complicates installation, especially on Windows).

So I've simply ported the leg2d.m file to a Python function in gait.py and also switched to using library filters (scipy.signal) and the finite difference functions in DynamicistToolKit for speed gains. From this simple port I'm seeing close to a 1000-fold performance increase. It now takes less than a second to process 8 minutes of data.
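Roughly, the approach looks like the sketch below (constant sampling rate assumed; the function name and default cutoff are made up for illustration, this is not the actual gait.py code):

```python
import numpy as np
from scipy.signal import butter, filtfilt


def filter_and_differentiate(time, signal, cutoff=6.0, order=2):
    """Zero-lag low pass filter a constant-rate signal and estimate its
    first two time derivatives with central differences (sketch only)."""
    sample_rate = 1.0 / np.mean(np.diff(time))
    b, a = butter(order, cutoff / (0.5 * sample_rate))  # normalized cutoff
    filtered = filtfilt(b, a, signal)  # forward-backward pass, zero lag
    velocity = np.gradient(filtered, time)  # central differences
    acceleration = np.gradient(velocity, time)
    return filtered, velocity, acceleration
```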

  • [x] Release new version of DTK.
  • [x] Address the filtering differences.
  • [ ] Break up lower_extremity_2d_inverse_dynamics() into several smaller functions or convert to a class.
  • [ ] Add change to the release notes.
  • [x] Fix tests that fail due to the changes in butterworth in dtk.

moorepants avatar Mar 27 '15 19:03 moorepants

So this all seems to work. It gives the same results as the Octave file on the test data but when I run it for the plot in our perturbed data paper I'm seeing some noise in the joint torques in the Python version that I don't see in the Octave version. I must have some slight differences in the filtering/differentiation code. Here is a screenshot: selection_066

@tvdbogert Can you comment on the filtering/differentiation? You have a custom filter/differentiator. I switched to using a 2nd order Butterworth filter with filtfilt and central differencing.

moorepants avatar Mar 27 '15 19:03 moorepants

I changed it so that I filter the marker positions, compute the derivative, filter the velocities, and then compute the derivative again (instead of filtering the marker positions, computing the derivative, and computing the derivative again), and I get results more similar to the Octave file.
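In sketch form (toy data, same scipy/numpy tools; variable names made up), the revised order is:

```python
import numpy as np
from scipy.signal import butter, filtfilt

time = np.arange(0.0, 10.0, 0.01)  # 100 Hz example time base
pos_raw = np.sin(2 * np.pi * time) + 0.001 * np.random.randn(time.size)

b, a = butter(2, 6.0 / (0.5 * 100.0))  # 2nd order, 6 Hz cutoff

pos = filtfilt(b, a, pos_raw)   # 1. filter the marker positions
vel = np.gradient(pos, time)    # 2. compute the derivative
vel = filtfilt(b, a, vel)       # 3. filter the velocities
acc = np.gradient(vel, time)    # 4. compute the derivative again
```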

FYI: the images are swapped here with respect to the ones above (screenshot: selection_069). But the hip joint torque is still more different than I'd hoped.

moorepants avatar Mar 27 '15 21:03 moorepants

Good to know that Octave performs so poorly on this code.

I looked in your speedup inverse dynamics branch but could not locate the Python code. Please give a link if possible.

I think I know what is going on. My rtfilter.m processes data with time stamps, one sample at a time. It can handle gaps in the data. If you use standard digital butterworth filters, you need a constant sampling rate. The spikes may be generated if you do the differentiation with the assumption of constant sampling rate. If a frame drops out, you then suddenly get only half of the true derivative. The second derivative then gets spikes. This may happen in a certain part of the gait cycle when the hip marker drops out sometimes. Filtering again after the differentiation will get rid of it, but that just hides the problem and in fact distorts the signal.
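A toy example of that effect (made-up data; numpy's gradient used for the finite differences):

```python
import numpy as np

dt = 0.01                                      # nominal 100 Hz
time = np.delete(np.arange(0.0, 1.0, dt), 50)  # one dropped frame
x = np.sin(2 * np.pi * time)                   # smooth test signal
acc_true = -(2 * np.pi) ** 2 * np.sin(2 * np.pi * time)

# second derivative assuming a constant sampling rate (ignores the gap)
acc_const = np.gradient(np.gradient(x, dt), dt)
# second derivative using the actual time stamps
acc_stamped = np.gradient(np.gradient(x, time), time)

print(np.abs(acc_const - acc_true).max())    # large spike near the gap
print(np.abs(acc_stamped - acc_true).max())  # much smaller error
```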

In fact, you should never filter more than once in this sequence of steps. Filtering and differentiation are both linear operations on the signal, and you can change their order. So filter only once, at the beginning or at the end. My filter has the advantage that the filtering and differentiation is combined into one step, which reduces the distortion from finite differencing.
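A quick numerical check of that commutation claim (toy data; the end points are excluded because both filtfilt and the finite differences have edge effects):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0
time = np.arange(0.0, 10.0, 1.0 / fs)
x = np.sin(2 * np.pi * 0.5 * time) + 0.01 * np.random.randn(time.size)

b, a = butter(2, 6.0 / (0.5 * fs))

diff_then_filter = filtfilt(b, a, np.gradient(x, time))
filter_then_diff = np.gradient(filtfilt(b, a, x), time)

# away from the end points the two orderings agree closely
print(np.abs(diff_then_filter - filter_then_diff)[50:-50].max())
```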

The for loop in myfiltfilt.m is probably the bottleneck for Octave. It may be possible to vectorize that, I would have to take a real good look at rtfilter.m. The problem is that the filter coefficients depend on the time between two samples. It may be possible to code that without loops. This code was written for real-time (I originally wrote it in C, Motek has that code in D-Flow) so it processes one sample at a time.

I would like to keep the Matlab code around, with the Python equivalent in a separate folder. In that case, the Python code should produce exactly the same output.

Here are the options as I see them:

  1. Translate the existing filter code into Python (myfiltfilt.m and rtfilter.m). It's not a lot of code. If that still runs too slow, consider the other options.
  2. Make a compiled version of myfiltfilt.m (which then also includes rtfilter.m), and call that from Python (or Matlab or Octave). I will e-mail you the C source. Not sure how much work that is.
  3. Give up on the filter that can handle variable sampling rate and gaps. Then we need to interpolate the gaps, use a standard Butterworth filter for a constant sampling rate, and then take finite difference derivatives (a sketch of the interpolation step follows below). Do this in both the Matlab and Python versions. To me that is clunky because I know better. If you have large gaps and do linear interpolation, you get zero accelerations during that time. Higher order polynomial interpolations become unreliable.
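A minimal sketch of the gap interpolation step in option 3 (the helper name is made up; the constant rate Butterworth filtering and finite differences would then follow as in the other sketches):

```python
import numpy as np


def fill_gaps(time, signal):
    """Linearly interpolate NaN gaps in a signal that is otherwise
    sampled at a constant rate (illustrative sketch only)."""
    time = np.asarray(time, dtype=float)
    signal = np.asarray(signal, dtype=float)
    missing = np.isnan(signal)
    filled = signal.copy()
    filled[missing] = np.interp(time[missing], time[~missing],
                                signal[~missing])
    return filled
```

This is also where the zero-acceleration artifact comes from: a linearly interpolated gap has constant velocity, so its second derivative is zero over that span.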

Whatever we do, the Matlab/Octave and Python code should give identical results.

tvdbogert avatar Mar 28 '15 22:03 tvdbogert

I looked in your speedup inverse dynamics branch but could not locate the Python code. Please give a link if possible.

inverse dynamics: https://github.com/csu-hmc/GaitAnalysisToolKit/pull/128/files#diff-5e1f4c39e8343cdd4b4dedb9eda0a0efR1084

mimics leg 2d: https://github.com/csu-hmc/GaitAnalysisToolKit/pull/128/files#diff-5e1f4c39e8343cdd4b4dedb9eda0a0efR149

numerical differences: https://github.com/moorepants/DynamicistToolKit/blob/master/dtk/process.py#L575

filter: https://github.com/moorepants/DynamicistToolKit/blob/master/dtk/process.py#L462

I think I know what is going on. My rtfilter.m processes data with time stamps, one sample at a time. It can handle gaps in the data. If you use standard digital butterworth filters, you need a constant sampling rate. The spikes may be generated if you do the differentiation with the assumption of constant sampling rate. If a frame drops out, you then suddenly get only half of the true derivative. The second derivative then gets spikes. This may happen in a certain part of the gait cycle when the hip marker drops out sometimes. Filtering again after the differentiation will get rid of it, but that just hides the problem and in fact distorts the signal.

That may be it, but the gaitanalysis.motek.DFlowData has always removed the missing markers (by interpolation) before the data ever got to leg2d.m.

In fact, you should never filter more than once in this sequence of steps.

I thought that too, but oddly filtering twice changes the result in this case. I must have a slight error somewhere.

The for loop in myfiltfilt.m is probably the bottleneck for Octave.

Yes, it definitely is. See the timings here: https://github.com/csu-hmc/GaitAnalysisToolKit/issues/30

That loop would ideally be in a low level language. I'm not sure if vectorizing it would help that much.

I would like to keep the Matlab code around, with the Python equivalent in a separate folder.

We can do that. I currently have a unit test that tests my code against yours:

https://github.com/csu-hmc/GaitAnalysisToolKit/pull/128/files#diff-46e213b86653d705c1774f2f0368b3f3R357

The results for the rawdata.txt file are virtually identical with the two methods, but not exact. I ended up using an R^2 for comparing the results due to slight differences at the end points. When I plotted them, the results were visually identical. I'm not sure why I get the "same" results with that data and not our perturbed data.
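The comparison metric can be sketched like this (a generic R^2; not necessarily the exact code in the unit test):

```python
import numpy as np


def coefficient_of_determination(reference, candidate):
    """R^2 of a candidate signal against a reference signal
    (illustrative sketch only)."""
    reference = np.asarray(reference, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    ss_res = np.sum((reference - candidate) ** 2)
    ss_tot = np.sum((reference - reference.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```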

Translate the existing filter code into Python (myfiltfilt.m and rtfilter.m). It's not a lot of code. If that still runs too slow, consider the other options.

Yes, this could be done. But I'm pretty sure that code needs to be written in C or something for speed purposes. A direct translation will likely have some speed gains relative to Octave, but it won't be the 1000x speedup I'm seeing using the constant sample rate filtering here.

Make a compiled version of myfiltfilt.m (which then also includes rtfilter.m), and call that from Python (or Matlab or Octave). I will e-mail you the C source. Not sure how much work that is.

This is likely the best solution.

Give up on the filter that can handle variable sampling rate and gaps. Then we need to interpolate the gaps, use a standard Butterworth filter for a constant sampling rate, and then take finite difference derivatives. Do this in both the Matlab and Python versions. To me that is clunky because I know better. If you have large gaps and do linear interpolation, you get zero accelerations during that time. Higher order polynomial interpolations become unreliable.

This is what is implemented now, in this PR.

Whatever we do, the Matlab/Octave and Python code should give identical results.

Agreed.

moorepants avatar Mar 30 '15 16:03 moorepants

See if my revised Matlab/Octave code helps at all. It eliminates the function call from the for loop.

If you already interpolated before the data gets to leg2d, my explanation for the spikes is not valid. My filter would have produced the same spikes on such data. Hope you can track that down.

Also, if you already interpolated, we can safely do a standard Butterworth filter (in Octave this would be "butter" with "filter" or "filtfilt") and the Python equivalent. It's not elegant, but for small gaps of a few frames this makes no difference really.

In leg2d you then need to do a check:

if max(diff(timestamps)) ~= min(diff(timestamps))
    error('sampling rate is not constant')
end

My test data in rawdata.txt has missing samples, so that would not be a good test anymore.

One thing you may not have noticed in myfiltfilt.m. If you run a 2nd order Butterworth filter twice (to eliminate lag), it needs to be designed with a slightly higher cutoff frequency than what you really want. The factor is 0.802. That way, the transfer function of the double Butterworth filter has a magnitude of 1/sqrt(2) at the cutoff frequency (and not 1/2).

A C version would be good to have.


tvdbogert avatar Mar 30 '15 16:03 tvdbogert

See if my revised Matlab/Octave code helps at all. It eliminates the function call from the for loop.

Testing now.

Hope you can track that down.

I'll investigate more.

My test data in rawdata.txt has missing samples, so that would not be a good test anymore.

For my unit test I pre-interpolate the missing samples and pass that into both the Python and Octave code to see if I get the same results.

One thing you may not have noticed in myfiltfilt.m. If you run a 2nd order Butterworth filter twice (to eliminate lag), it needs to be designed with a slightly higher cutoff frequency than what you really want. The factor is 0.802. That way, the transfer function of the double Butterworth filter has a magnitude of 1/sqrt(2) at the cutoff frequency (and not 1/2).

I didn't know this. Maybe that's the difference. I'll fix my code.

A C version would be good to have.

I can implement the C version if you are fine with including your method in this package. Just let me know.

moorepants avatar Mar 30 '15 17:03 moorepants

If you run a 2nd order Butterworth filter twice (to eliminate lag), it needs to be designed with a slightly higher cutoff frequency than what you really want. The factor is 0.802. That way, the transfer function of the double Butterworth filter has a magnitude of 1/sqrt(2) at the cutoff frequency (and not 1/2).

@tvdbogert Where do I find the theory for this? I've never heard of it and would like to implement a generalized version.

moorepants avatar Mar 31 '15 02:03 moorepants

I found some reference here for 2nd order filters:

http://nbviewer.ipython.org/github/demotu/BMC/blob/master/notebooks/DataFiltering.ipynb#Butterworth-filter-with-zero-phase-shift

Do you know if there is a general formula for any order Butterworth filter?

moorepants avatar Mar 31 '15 03:03 moorepants

In the code I cite the Winter book which explains this, but it's not widely known apparently.

If you apply a filter H(w) twice, the transfer function is H(w)^2.

The magnitude of the Nth order Butterworth transfer function is:

|H(w)| = 1/sqrt(1+(w/w0)^(2n))

w0 is the cutoff frequency, where transfer magnitude is 1/sqrt(2).

Applied twice, the combined magnitude is:

|H(w)|^2 = 1/(1+(w/w0)^(2n))

If you define the cutoff frequency as the frequency where the transfer magnitude is 1/sqrt(2), and you want that frequency to be w0, you need to solve:

1/(1+(w0/w0corrected)^(2n)) = 1/sqrt(2)

Where w0corrected is the design parameter you need to put in the original single BW filter to achieve this. The solution is:

w0corrected = (sqrt(2) - 1)^(1/2n) * w0 = 0.802 * w0 (when n=2)

If you don't do this, and you apply a 6 Hz lowpass filter twice, you end up with a filter that attenuates 6 Hz signals by a factor of 2, rather than 1/sqrt(2), which is what you intended.


tvdbogert avatar Mar 31 '15 13:03 tvdbogert

That's a very nice notebook. Good to see that he used Pezzack's benchmark data, which was one of the first things I put on the ISB website in 1996.

He gives a formula for any filter order n, but I get a slightly different one (see my github comment above). Both give the same answer for n=2, but I think only mine is correct for other filter orders.


tvdbogert avatar Mar 31 '15 14:03 tvdbogert

That's a very nice notebook. Good to see that he used Pezzack's benchmark data, which was one of the first things I put on the ISB website in 1996.

He's creating a whole note set (or textbook) here: https://github.com/demotu/BMC

He gives a formula for any filter order n, but I get a slightly different one (see my github comment above). Both give the same answer for n=2, but I think only mine is correct for other filter orders.

I've adjusted my butterworth function with the correction factor here: https://github.com/moorepants/DynamicistToolKit/pull/28

Note that your equation above is slightly incorrect; it should be:

w0corrected = w0 / (sqrt(2) - 1)^(1/2n) = w0 / 0.802 (when n=2)
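A quick numerical check of the corrected factor (example sampling and cutoff frequencies; scipy used for the filter design):

```python
import numpy as np
from scipy.signal import butter, freqz

fs = 100.0  # Hz, example sampling rate
fc = 6.0    # Hz, desired cutoff of the double (filtfilt) pass
n = 2       # filter order of a single pass

correction = 1.0 / (np.sqrt(2.0) - 1.0) ** (1.0 / (2 * n))  # ~1/0.802

b, a = butter(n, correction * fc / (0.5 * fs))
w, h = freqz(b, a, worN=[fc], fs=fs)

# filtfilt applies the filter twice, so the double-pass magnitude is |H|^2,
# which should be ~1/sqrt(2) at the desired cutoff
print(np.abs(h[0]) ** 2)  # ~0.707
```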

moorepants avatar Mar 31 '15 23:03 moorepants

Ok, I now get virtually identical results with these changes:

  • add in the filtfilt correction factor
  • differentiate markers before filtering

Left: Python, Right: Octave (screenshot: selection_072)

moorepants avatar Mar 31 '15 23:03 moorepants

That's good enough agreement in the results! I assume you're using the conventional digital Butterworth filter, which eliminates the need for a C version, as long as you're interpolating the raw data onto a fixed sampling rate, which is OK if the gaps are small enough.

And yes, I had my correction factor backwards, thanks for the correction.

tvdbogert avatar Apr 01 '15 01:04 tvdbogert

I assume you're using the conventional digital Butterworth filter

Yep, I'm traditional like that :)

which eliminates the need for a C version.

I'll implement the C version of your filter if I get an itch. It is a more elegant solution.

Which is OK if the gaps are small enough.

Your filter actually had issues with large marker gaps. The data in #132 exposed this. I assume there has to be a limit to the marker gap size beyond which either method fails.

moorepants avatar Apr 01 '15 01:04 moorepants