
Support conv(x,y,z)?

Open dlfivefifty opened this issue 7 years ago • 22 comments

I believe conv is associative:

julia> x = randn(3); y= randn(4); z=randn(5); conv(conv(x,y),z) ≈ conv(x,conv(y,z))
true

so conv(x,y,z) is well-defined. Any reason not to support this?

I have a small use for this in ApproxFun (determining the dimensions of polynomial degrees of tensor products of polynomial bases), so I could add it as a PR.

EDIT: I would also define conv(x) = x.

dlfivefifty avatar Aug 25 '18 12:08 dlfivefifty

You could "extend" DSP.conv yourself with:

import DSP.conv
conv(x, y, z...) = conv(conv(x, y), z...)
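
For example, a quick hypothetical check with that definition in place (and DSP loaded), reusing the vectors from the original post:

x = randn(3); y = randn(4); z = randn(5)
conv(x, y, z) ≈ conv(x, conv(y, z))   # should hold up to floating-point error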

rickhg12hs avatar Aug 25 '18 20:08 rickhg12hs

I know. That’s why it’s a “?”

dlfivefifty avatar Aug 25 '18 20:08 dlfivefifty

Ah, it's a policy question. OK.

Also, technically speaking, if this were taken to the extreme, a more numerically stable and computationally efficient algorithm could be used than the convenience function I gave.

rickhg12hs avatar Aug 25 '18 20:08 rickhg12hs

One possible issue is that we really should get rid of conv2 and instead just have conv do dispatch and work for N-D arrays (#202). In that case there might be a conflict between this proposal and the 3-arg conv2. I guess technically dispatch should be able to distinguish between these cases (conv(::Vector, ::Vector, ::Vector) vs conv(::Vector, ::Vector, ::Matrix)), but then you might get into a tricky spot where an Nx1 Matrix is treated substantially differently than a Vector.
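
A toy illustration of that concern (myconv is a made-up name standing in for the future conv; the string bodies are just markers):

myconv(u::AbstractVector, v::AbstractVector, w::AbstractVector) = "3-way 1-D convolution"
myconv(u::AbstractVector, v::AbstractVector, A::AbstractMatrix) = "conv2-style separable 2-D convolution"

myconv(randn(3), randn(4), randn(5))     # hits the 1-D method
myconv(randn(3), randn(4), randn(5, 1))  # hits the 2-D method, even though a 5×1 Matrix
                                         # holds the same data as a length-5 Vector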

I haven't thought about it much but wanted to raise a flag that this could be a potential issue.

ssfrr avatar Aug 27 '18 02:08 ssfrr

The natural replacement for the 3-arg conv2 would be a 3-arg conv taking a vector, a rowvector, and a matrix. That would also nicely disambiguate whether u or v of the 3-arg conv2 is taken rowwise (which also isn't clarified by the docstring it seems).

martinholters avatar Aug 27 '18 06:08 martinholters

Yeah, I like that. Representing a rank-1 matrix by its column/row factors (and making that explicit) makes a lot of sense.

ssfrr avatar Aug 27 '18 13:08 ssfrr

Just to be clear: the syntax would be conv(u, v', A), right? And it's consistent because conv(u, v') = u*v'

dlfivefifty avatar Aug 27 '18 14:08 dlfivefifty

the syntax would be conv(u, v', A), right?

Yes. But of course, thanks to commutativity, it could also be conv(u, A, v'), conv(A, v', u), ...

Here's a toy implementation in case someone would like to work on this:

using FFTW: fft, ifft
using LinearAlgebra: Adjoint, Transpose

# output size of a full convolution: lengths add minus one per dimension;
# missing trailing dimensions are treated as singletons
conv_size(::Tuple{}, ::Tuple{}) = ()
conv_size(s1, ::Tuple{}) = s1
conv_size(::Tuple{}, s2) = s2
conv_size(s1, s2) = (s1[1] + s2[1] - 1, conv_size(Base.tail(s1), Base.tail(s2))...)
conv_size(s1, s2, ss...) = conv_size(conv_size(s1, s2), ss...)
conv_size(s) = s

# zero-pad A to the output size along A's own dimensions;
# trailing dimensions are covered by broadcasting in conv below
function pad_to_size(A::AbstractArray{T, N}, s) where {T,N}
    sN = ntuple(i -> s[i], Val(N))
    B = zeros(T, sN)
    B[map(d -> 1:d, size(A))...] = A
    return B
end
# row vectors keep their 1×n shape and are only padded along the second dimension
pad_to_size(A::Adjoint{T, <:AbstractVector}, s) where {T} =
    [A zeros(T, s[2] - size(A, 2))']
pad_to_size(A::Transpose{T, <:AbstractVector}, s) where {T} =
    [A transpose(zeros(T, s[2] - size(A, 2)))]

# pad each input, transform, multiply the spectra (broadcasting over trailing
# dimensions), and transform back
function conv(xs...)
    s = conv_size(map(size, xs)...)
    ifft(broadcast(*, map(x -> fft(pad_to_size(x, s)), xs)...))
end

Of course, we shouldn't do the computation with fft unconditionally (small sizes and non-floating-point element types should be exempted), and we should also use rfft for real-valued inputs. But this should be the general idea.
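
As a quick hypothetical sanity check of the sketch (with FFTW and LinearAlgebra loaded as above), the column/row-factor form should agree with the materialized rank-1 kernel:

u = randn(3); v = randn(4); A = randn(5, 6)
size(conv(u, v', A))                # (7, 9)
conv(u, v', A) ≈ conv(u * v', A)    # should hold up to floating-point error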

martinholters avatar Aug 28 '18 07:08 martinholters

This came up in #269 but I figured it made more sense to continue this part of the conversation here: @HDictus:

The simplest solution to support that might actually be to do #224 for N arguments, such that one can provide all arrays in N dimensions (e.g. 1x3 and 3x1 vectors for a separable 3x3 kernel, or 3x3x1 and 1x1x3 for a separable 3x3x3 kernel, and so on).

I agree with this, and it is consistent with the idea above. If we have conv(x, y, z) = conv(conv(x, y), z), then an NxM separable kernel K = conv(u, v) (where e.g. u is Nx1 and v is 1xM) can be applied with conv(A, u, v) with no special handling. This also extends naturally to higher dimensions, where you might for instance have a 4D convolution expressed as two 2D ones.

I think we'd just want to document the associativity of the multi-argument conv so that people know where to put their separable kernels for efficiency, e.g. if we use the left-associative rule above then conv(A, u, v) would be faster than conv(u, v, A).

The other option would be to keep conv as a binary operator, and people could do foldl(conv, [A, u, v]) for left-associative or foldr(conv, [u, v, A]) if they want right-associativity, but I'm guessing there's some efficiency to be had by conv being able to see all the arguments at once, w.r.t. pre-allocating the output array, maybe more.
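
To sketch what that would look like for a user (hypothetical example; it assumes the left-associative rule and a 2-arg conv that already handles 2-D arrays, per #202):

using DSP

A = randn(100, 100)                 # a 100×100 "image"
u = randn(5, 1); v = randn(1, 5)    # column/row factors of a 5×5 separable kernel

# proposed: conv(A, u, v) would mean conv(conv(A, u), v), which matches applying the full kernel
conv(conv(A, u), v) ≈ conv(A, conv(u, v))   # should hold up to floating-point error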

ssfrr avatar Mar 31 '19 19:03 ssfrr

I like the former better; I think you're right about losing time re-planning FFTs and such. The advantage of right associativity is that the notation is more consistent with the current 3-argument conv2, but if left associativity is at all more natural or intuitive we should definitely make the switch.

HDictus avatar Apr 01 '19 05:04 HDictus

We'll need to add a deprecation message to the 3-arg conv2 anyway, guiding people on how to change their code, so having them swap the argument order shouldn't be a big deal.

ssfrr avatar Apr 01 '19 11:04 ssfrr

If no one else is working on this rn, I'll implement it.

HDictus avatar Apr 12 '19 19:04 HDictus

go for it!

ssfrr avatar Apr 12 '19 19:04 ssfrr

I think left associativity is more natural, personally.

galenlynch avatar Apr 12 '19 21:04 galenlynch

One comment before we go down this road too far: I was planning on merging the FIR filtering code paths with conv, or at least giving conv the overlap-save algorithm. How would that work with multiple-argument convolution?

galenlynch avatar Apr 12 '19 21:04 galenlynch

I'm not familiar with the overlap-save algorithm, but from a quick search it seems it breaks down a long convolution into a bunch of smaller ones? Does it work for any number of dimensions, or only 1?

The simplest solution is to have conv of 3+ arguments simply perform recursion, conv(A1, A2, Arrs...) = conv(conv(A1, A2), Arrs...), since that can be used regardless of the algorithm inside conv itself. This still gives us some of the advantage of separable convolutions, but as mentioned does not allow us to save extra time on planning FFTs, and requires us to perform N-1 inverse FFTs.

Otherwise, if the signal is long enough to want to break it up into parts, the result of the convolution probably is as well, so to do subsequent convolutions on it we can leave it split up and in Fourier form. We can use the same plan to fft the signal chunks and the other args, then multiply each chunk by all the subsequent Fourier representations. We calculate how much padding each output chunk accumulates over all these operations, ifft the resulting chunks, lop off the padding, and cat them back together (edit: er... I later realized you wouldn't just lop them off; I guess you'd normally sum the overlapping sections? The same should work here). I hope that makes sense and I haven't missed something important about overlap-save.

This assumes that the biggest array is the first argument, as is planned. (Edit: making overlap-save occur for large arrays later down the line might be more complicated, but extending conv to N arguments and enabling overlap-save for the first argument don't reduce one another's functionality, so there isn't a direct conflict.)

HDictus avatar Apr 13 '19 06:04 HDictus

I'll work on some other things for a while and get back to this after Galen finishes what he's doing with overlap-save, since it looks like implementing and then merging might take twice as much time as just implementing after the change.

HDictus avatar Apr 16 '19 07:04 HDictus

Whoops sorry! Busy weekend with work stuff, I'll try to reply soon.

galenlynch avatar Apr 16 '19 11:04 galenlynch

@HDictus it works for any number of dimensions.

I think what you described should work well for overlap-save. I put the code that I've been working on up as a PR (#286). It's pretty much complete, though there are indexing problems lurking in the code. As you can see, overlap-save transforms the smaller array and sets it aside, then convolves each chunk of the large array with the smaller array. I think it would be pretty simple to extend it to accepting multiple arrays, by transforming each of the smaller (trailing) arrays and finding the product of their Fourier transforms, and then convolving each chunk of the large (first) array with the saved product.

As for lopping / adding, overlap-save does indeed lop off the aliased portion of the ifft (and therefore avoids padding), while overlap-add does padding and adds the pieces together. It turns out that overlap-save is a little bit more efficient than overlap-add. Both of them are more efficient than straightforward FFT convolution if the second array is not small; e.g. see the computational complexity portion of https://www.allaboutcircuits.com/technical-articles/overlap-add-method-linear-filtering-based-on-the-discrete-fourier-transform/.

galenlynch avatar Apr 18 '19 12:04 galenlynch

For the moment I have an implementation where it only worries about overlap-save when given two inputs, and otherwise only does FFT convolution. Unfortunately, this is slower than the naive implementation in the cases I test for (which don't even have especially large arrays). I guess the speedup from overlap-saving in these cases is more than the speedup of having only one ifft.

Previously, I had it perform overlap-save on the first two inputs if they met the condition for it, and then recurse with _conv_fft!(output, (os_out, arrs...), (os_out_shape, arrs...), etc.), but this was even slower somehow.

I think to get the best results I'd need to do some major restructuring of the overlap-save portion of the code, so that we get the Fourier-transformed chunks out and only ifft and save them once they have been multiplied by everything else. Though I'd definitely learn a lot from doing this, it will take a while (I expect to be pretty busy for the coming months). This should also probably wait until the overlap-save branch itself is fully merged.

I'll make a PR using a naive implementation to serve for the time being, and worry about making it faster at a later date. edit: I've added some more test cases, and I'm now fairly certain that this approach is primarily slower when the types of the inputs differ, so maybe I can make the other version work.

HDictus avatar May 08 '19 17:05 HDictus

Sounds great!

I think extending overlap-save to more than two inputs should be fairly straightforward: the majority of the complexity of overlap-save is splitting up the large input (assumed to be the first) into chunks. As long as the first input is still the large one, it would be fairly straightforward to take the Fourier transform of all the trailing inputs and then multiply them together, to get a filter_fd that is the product of the Fourier transforms of all the trailing arrays. The rest of the convolution process would be identical.
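
A minimal sketch of that idea (the names pad_to, combined_filter_fd, and nffts are illustrative here, not DSP.jl internals): fold the trailing kernels into one frequency-domain filter up front, then run the usual overlap-save loop against it.

using FFTW

# zero-pad x up to the FFT segment size (sz is a tuple of dimensions)
pad_to(x, sz) = (y = zeros(float(eltype(x)), sz); y[axes(x)...] = x; y)

# product of the Fourier transforms of all trailing kernels at segment size nffts;
# this would replace the single kernel's transform in the overlap-save loop
function combined_filter_fd(nffts, kernels...)
    filter_fd = fft(pad_to(kernels[1], nffts))
    for k in kernels[2:end]
        filter_fd .*= fft(pad_to(k, nffts))
    end
    return filter_fd
end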

galenlynch avatar May 08 '19 18:05 galenlynch

Yeah, I was overthinking it; after a break I took another look and it's pretty straightforward, as you say. I'm just having a little trouble figuring out optimalfftfiltlength in one case for more than two inputs.

HDictus avatar May 12 '19 13:05 HDictus