
WIP Lightweight nodes

Open shashi opened this issue 11 years ago • 5 comments

Nodes now hold exactly the amount of information needed to represent their domain of signals. This removes the "Renderer" abstraction in favor of pulling samples from a node. A mostly inlinable sampleat function makes pulling fast, and pulling for PortAudio doesn't allocate much memory. Speed and memory usage are consistently better than previous iterations in most cases; the other cases have a lot of room for improvement.

One can now create signals of AudioNodes (signals as in Reactive.jl) very cheaply (a SinOsc takes 216 bytes), and lift such a signal to play a sound that changes in real time. Using Interact signals and the @manipulate macro, one can do things like:

    mix = @manipulate for f1=110:880, f2=110:880
        SinOsc(f1) + SinOsc(f2)
    end
    lift(play!, mix)

Removed play and stop in favour of play! and stop!. play! replaces what is currently being played. To stop a node, simply remove it from the tree and play the tree again.

This PR also adds SquareOsc and TriangleOsc, and shows how the sampleat abstraction makes creating new nodes really simple while remaining fast thanks to inlining.
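To illustrate the idea, here is a hypothetical sketch of what a new node might look like under a sampleat-style abstraction. The names (SawOsc, sampleat, pull!) and signatures are assumptions for illustration, written in current Julia syntax, and may not match the branch exactly:

```julia
# Hypothetical: a sawtooth oscillator in the style of the sampleat
# abstraction described above. Not the PR's actual code.
struct SawOsc
    freq::Float64
end

# sampleat returns the node's value at sample index t for a given sample
# rate. Marked @inline so pulling a buffer compiles to a tight loop.
@inline function sampleat(node::SawOsc, t::Integer, samplerate::Real)
    phase = mod(node.freq * t / samplerate, 1.0)
    2.0 * phase - 1.0          # map [0, 1) to [-1, 1)
end

# Pulling a block is then just a loop over sampleat.
function pull!(buf::Vector{Float64}, node, samplerate)
    for i in eachindex(buf)
        buf[i] = sampleat(node, i, samplerate)
    end
    buf
end
```

Because sampleat is a small pure function of the node and the sample index, adding a node type is just defining one method, and the per-sample loop can inline it away.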

I have tried to keep backward compatibility with the old API elsewhere, but it may be worth shortening the names in a breaking minor release: Sin, Square, Triangle, Mix, etc. There is currently not a lot of consistency in the names, I feel.

Anyway, this is not yet in a state to merge; I will work on some optimizations and get the tests passing, but would love comments at this point. :) I am also thinking of how this would extend to a sequencing API. Note that this removes end_condition; for sequencing it's better to rely on Reactive's timing functions. I really love this talk on the subject: https://www.youtube.com/watch?v=Mfsnlbd-4xQ

Some speed and memory allocation comparisons follow; the old nodes are on the left. The measurements are for rendering 44100 samples (1s). Note that this doesn't compare the time taken to instantiate nodes, where this branch easily wins. Timing script: https://gist.github.com/shashi/dbc4a44ede975ac7d8e5

Master                                                       This Branch
======                                                       ===========
NullNode()                                                   NullNode()
elapsed time: 0.000807172 seconds (366832 bytes allocated)   elapsed time: 0.000668067 seconds (190360 bytes allocated)
WhiteNoise()                                                 WhiteNoise()
elapsed time: 0.000464515 seconds (529552 bytes allocated)   elapsed time: 0.000385824 seconds (176560 bytes allocated)
LinRamp(0,1,0.75)                                            LinRamp(0,1,0.75)
elapsed time: 7.979e-6 seconds (216 bytes allocated)         elapsed time: 0.00022092 seconds (176560 bytes allocated)
SinOsc(220)                                                  SinOsc(220)
elapsed time: 0.002422144 seconds (697640 bytes allocated)   elapsed time: 0.001769491 seconds (176560 bytes allocated)
AudioMixer([SinOsc(220),LinRamp(0,1,0.75)])                  AudioMixer(SinOsc(220),LinRamp(0,1,0.75))
elapsed time: 0.002552697 seconds (697688 bytes allocated)   elapsed time: 0.001918074 seconds (353104 bytes allocated)
AudioMixer([SinOsc(220),WhiteNoise()])                       AudioMixer(SinOsc(220),SinOsc(343))
elapsed time: 0.003138189 seconds (1227088 bytes allocated)  elapsed time: 0.00332737 seconds (353104 bytes allocated)
AudioMixer([SinOsc(220),SinOsc(343)])                        AudioMixer(SinOsc(220),SinOsc(343),SinOsc(434))
elapsed time: 0.005157282 seconds (1395176 bytes allocated)  elapsed time: 0.005017761 seconds (529616 bytes allocated)
AudioMixer([SinOsc(220),SinOsc(343),SinOsc(434)])            AudioMixer(SquareOsc(220),SinOsc(343),SinOsc(434))
elapsed time: 0.007643713 seconds (2092664 bytes allocated)  elapsed time: 0.004874233 seconds (529616 bytes allocated)

cc ssfrr/AudioIO.jl#27

shashi avatar Sep 10 '14 20:09 shashi

Some more parsimony.

rendering 1s of samples at 44100Hz
NullNode()
elapsed time: 0.000597381 seconds (190376 bytes allocated)
WhiteNoise()
elapsed time: 0.000346362 seconds (176616 bytes allocated)
LinRamp(0,1,0.75)
elapsed time: 0.000188006 seconds (176560 bytes allocated)
SinOsc(220)
elapsed time: 0.000767337 seconds (176560 bytes allocated)
SinOsc(220) * SinOsc(4)
elapsed time: 0.00136778 seconds (353224 bytes allocated)
SinOsc(220) * 0.5
elapsed time: 0.000670517 seconds (176592 bytes allocated)
SinOsc(220) + 0.5
elapsed time: 0.000690283 seconds (176560 bytes allocated)
SinOsc(220) * 5 + 0.5
elapsed time: 0.00069793 seconds (176592 bytes allocated)
AudioMixer(SinOsc(220),LinRamp(0,1,0.75))
elapsed time: 0.000934716 seconds (353104 bytes allocated)
AudioMixer(SinOsc(220),SinOsc(343))
elapsed time: 0.001359113 seconds (353104 bytes allocated)
AudioMixer(SinOsc(220),SinOsc(343),SinOsc(434))
elapsed time: 0.002047908 seconds (353136 bytes allocated)
AudioMixer(SquareOsc(220),SinOsc(343),SinOsc(434))
elapsed time: 0.002593813 seconds (353136 bytes allocated)
AudioMixer(SquareOsc(220),TriangleOsc(343,pi / 2),SinOsc(434))
elapsed time: 0.00364791 seconds (353136 bytes allocated)

shashi avatar Sep 11 '14 15:09 shashi

This is super cool, I'm watching your commits closely. I'm going to merge into a "signals" branch in my repo so I can play around with it.

A couple thoughts:

  1. How do you feel about read instead of pull? One refactor I've been considering is a more IOStream-like API where nodes can be read from, though I suppose this is a less stateful abstraction so maybe it's not a good fit. What are the semantics of pull?
  2. To handle varying frequencies (whether from a Signal or another AudioNode), the oscillators need to keep track of their phase state, so how does that fit into the @manipulate example you gave above, which creates a new AudioNode each time the signal changes? In general I'm still concerned about the overhead of creating a new AudioNode every time the signal changes. In the case that the signal is changing continuously (e.g. tracking the mouse, or a sensor value, or the output of a pitch tracking algorithm on the audio input) it seems like a lot of allocation.
  3. One planned feature is that at some point non-AudioNode control signals should be interpolated in the audio domain so that there aren't discontinuities. For instance, when pull is called, and then pull is called again with a different control value, the calculated audio should interpolate between the two values over the span of the block. At some point we may even use the timestamps of the signal changes to allow events to happen within the block, but in the shorter term a 1-block resolution is probably fine (and is, for instance, what SuperCollider does).
  4. I suppose the case of using an AudioNode as a control value (e.g. frequency of SinOsc) would be handled similarly to how you have Gain defined now, where the AudioNode type is parametric on the control signal type, correct?
  5. The implementation of play! here replaces the root node, so only 1 node can be played at a time. It's important to keep supporting the simple API of being able to easily play sounds and arrays on top of each other.
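The per-block interpolation in point 3 can be sketched as a simple linear ramp from the previous control value to the new one. This is a minimal illustration, not AudioIO's API; the function name and shape are assumptions:

```julia
# Hypothetical: ramp a control parameter linearly from its previous value
# to the current one across a single block, avoiding a step discontinuity.
function interp_control(prev::Float64, curr::Float64, blocksize::Int)
    step = (curr - prev) / blocksize
    [prev + step * i for i in 1:blocksize]
end
```

Each block ends exactly at the new control value, so successive blocks chain without jumps; sample-accurate event timestamps within a block would be a refinement on top of this.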

ssfrr avatar Sep 11 '14 18:09 ssfrr

How do you feel about read instead of pull? One refactor I've been considering is a more IOStream-like API where nodes can be read from, though I suppose this is a less stateful abstraction so maybe it's not a good fit. What are the semantics of pull?

IOStream-like API sounds like a good fit. Less stateful doesn't prevent this. I'll look into what read entails.

To handle varying frequencies (whether from a Signal or another AudioNode), the oscillators need to keep track of their phase state, so how does that fit into the @manipulate example you gave above which creates a new AudioNode each time the signal changes?

You're right, this indeed seems to be a challenge. I created the TimeOffset node for solving this eventually, but I'm yet to figure out how to transparently do this.

In general I'm still concerned about the overhead of creating a new AudioNode every time the signal changes. In the case that the signal is changing continuously (e.g. tracking the mouse, or a sensor value, or the output of a pitch tracking algorithm on the audio input) it seems like a lot of allocation.

Hmm, it's a tradeoff between a declarative API and the implementation. Nodes are currently much cheaper to create. You should see what you can do with something like this and then think about optimizing it; the macro system allows for clever optimizations. It may be worth making nodes mutable, though.
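A mutable node would also address the phase-state concern from point 2: the frequency field can be updated in place while the phase accumulator survives. A hypothetical sketch (the type and pull! signature are assumptions, in current Julia syntax):

```julia
# Hypothetical: an oscillator that keeps its phase across pulls, so
# changing freq in place changes pitch without a discontinuity.
mutable struct StatefulSinOsc
    freq::Float64
    phase::Float64   # retained across pulls and across freq updates
end
StatefulSinOsc(freq) = StatefulSinOsc(freq, 0.0)

function pull!(buf::Vector{Float64}, osc::StatefulSinOsc, samplerate)
    dphase = 2pi * osc.freq / samplerate
    for i in eachindex(buf)
        buf[i] = sin(osc.phase)
        osc.phase += dphase
    end
    osc.phase = mod2pi(osc.phase)  # keep the accumulator bounded
    buf
end

# osc.freq = 440.0 between pulls retunes the oscillator mid-stream.
```

A Signal could then drive osc.freq from the outside while the node itself persists, instead of rebuilding the node graph on every update.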

One planned feature is that at some point non-AudioNode control signals should be interpolated in the audio domain so that there aren't discontinuities. For instance, when pull is called, and then pull is called again with a different control value, the calculated audio should interpolate between the two values over the span of the block. At some point we may even use the timestamps of the signal changes to allow events to happen within the block, but in the shorter term a 1-block resolution is probably fine (and is, for instance, what SuperCollider does).

This is a cool idea! :) I don't think this would also solve 2), would it?

I suppose the case of using an AudioNode as a control value (e.g. frequency of SinOsc) would be handled similarly to how you have Gain defined now, where the AudioNode type is parametric on the control signal type, correct?

Yes. I'll add them in time.

The implementation of play! here replaces the root node, so only 1 node can be played at a time. It's important to keep supporting the simple API of being able to easily play sounds and arrays on top of each other.

Okay, it's trivial to support this alongside, I'll add it back.

shashi avatar Sep 11 '14 18:09 shashi

Interpolation would not solve the issue with the statefulness of the Oscillators. In general I expect many AudioNodes to be stateful. Another example would be a variable delay line that stores data from its input and plays back a delayed version. It's a common effect to vary the time delay as a parameter, so we need a way to update the AudioNode's control parameters without losing their state.

I guess that's why my first inclination was to allow Signals as a Control type, so that whenever the value of the signal changed it could be handled by the AudioNode, but the AudioNode itself is persistent.
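The variable delay line mentioned above makes the statefulness concrete: the node must keep a history buffer of its input while the delay parameter changes underneath it. A hypothetical sketch (types and names are illustrative, not AudioIO's):

```julia
# Hypothetical: a variable delay line. The circular history buffer is the
# node's state; `delay` (in samples) can be updated in place at any time
# without discarding that state.
mutable struct VarDelay
    buffer::Vector{Float64}  # circular history of the input
    writepos::Int
    delay::Float64           # current delay in samples
end
VarDelay(maxdelay::Int) = VarDelay(zeros(maxdelay), 1, maxdelay / 2)

# Push one input sample, return the delayed output sample.
function process!(d::VarDelay, x::Float64)
    d.buffer[d.writepos] = x
    n = length(d.buffer)
    readpos = mod1(d.writepos - round(Int, d.delay), n)
    y = d.buffer[readpos]
    d.writepos = mod1(d.writepos + 1, n)
    y
end
```

Recreating this node whenever a control signal fired would wipe the history buffer, which is exactly why the control path needs to mutate a persistent node rather than rebuild it.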

Another option would be to implement setindex! for AudioNodes as an API for setting synth parameters, so you could do:

    s1 = SinOsc()
    s2 = SinOsc()

    @manipulate for f1=110:880, f2=110:880
        s1[:freq] = f1
        s2[:freq] = f2
    end
    play!(s1)
    play!(s2)

This still keeps the Signal stuff orthogonal to AudioIO, but gives a consistent way to modify synth parameters. It fundamentally has the same side-effect behavior you mentioned before with the direct access, but so far I haven't come up with a simpler solution.
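Wiring up the s[:freq] = f syntax above would mean defining Base.setindex! on the node type. A minimal hypothetical sketch, assuming a mutable node with a freq field (not the package's actual definitions):

```julia
# Hypothetical: a uniform parameter-setting API for nodes via setindex!,
# so osc[:freq] = 440.0 works. Assumes a mutable node type.
mutable struct ControlledOsc
    freq::Float64
    phase::Float64
end

function Base.setindex!(osc::ControlledOsc, value, param::Symbol)
    param === :freq || throw(KeyError(param))
    osc.freq = value
end
```

Dispatching on the Symbol keeps the surface syntax uniform across node types while each type decides which parameters it exposes.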

ssfrr avatar Sep 11 '14 22:09 ssfrr

I'm still thinking about this issue, but have been pulled away by a couple of other things.

Thought you might be interested in this slightly more complicated (and very fun) use case, a simple FM oscillator:

    modosc = SinOsc(2)
    baseosc = SinOsc(modosc * 50 + 220)

    @manipulate for basefreq=110.0:880, modfreq=0.1:0.1:1000, moddepth=0.0:0.001:1.0
        baseosc.renderer.freq.renderer.offset = basefreq
        modosc.renderer.freq = modfreq
        baseosc.renderer.freq.renderer.in_node.renderer.in2 = moddepth*basefreq
    end

    play(baseosc)

For me this is nice to think about in designing the API, because this is exactly the sort of interaction I want to enable. Obviously the current API is very bad for this, as you need to know a lot about the node internals to do any controlling.

ssfrr avatar Sep 19 '14 18:09 ssfrr