[Feature] Audio system refactor

Open timlyeee opened this issue 1 year ago • 0 comments

The audio engine on the native platform and the audio source on the TypeScript side cannot meet the need for 3D audio and audio effects, so the audio system is due for a refactor. This issue is for drafting and discussing the refactor; the first comment will be updated over time until it reaches the final version.

ver 1.1 - update 22.09.23

Inside history review

Public API design

The main pain point for developers is that audio is hard to play without components. This makes it difficult to add sound quickly in 2D scenes, especially for developers coming from Cocos2d-x. Building on the old design document, a new reactive audio sound can cover more use cases.

AudioClip optimize

OLD:

AudioClip is a resource which only indicates an audio file.

NEW:

AudioClip contains all information about the audio file and the expected output format settings. The developer can view and modify these settings. For efficient runtime playback, audio will be unified to a single format, such as Ogg, and a single sample rate, such as 44.1 kHz.

For example, soundA is an audio clip saved as "A.mp3" at 44.1 kHz stereo, and soundB is "B.wav" at 48 kHz mono. When the target platform is Android, whose standard playback rate is normally 44.1 kHz, the editor will resample and save the audio as "A.ogg" (44.1 kHz stereo) and "B.ogg" (44.1 kHz stereo).
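
The unification step can be pictured as a resample followed by a channel conversion. The sketch below is only an illustration using linear interpolation. `resampleLinear` and `monoToStereo` are hypothetical helper names, and a real editor pipeline would likely use a higher-quality resampler plus an Ogg encoder, neither of which is shown here.

```typescript
// Hypothetical helper: resample one channel of PCM with linear interpolation.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
    const outLength = Math.round(input.length * toRate / fromRate);
    const output = new Float32Array(outLength);
    const step = fromRate / toRate; // input samples advanced per output sample
    for (let i = 0; i < outLength; i++) {
        const pos = i * step;
        const i0 = Math.min(Math.floor(pos), input.length - 1);
        const i1 = Math.min(i0 + 1, input.length - 1);
        const frac = pos - i0;
        output[i] = (input[i0] ?? 0) * (1 - frac) + (input[i1] ?? 0) * frac;
    }
    return output;
}

// Hypothetical helper: duplicate a mono channel into interleaved stereo (L, R, L, R, ...).
function monoToStereo(mono: Float32Array): Float32Array {
    const stereo = new Float32Array(mono.length * 2);
    mono.forEach((sample, i) => {
        stereo[2 * i] = sample;
        stereo[2 * i + 1] = sample;
    });
    return stereo;
}

// "B.wav" from the example: one second of 48 kHz mono becomes 44.1 kHz stereo.
const soundB = new Float32Array(48000).fill(0.5);
const unifiedB = monoToStereo(resampleLinear(soundB, 48000, 44100));
```

One second of input yields 44100 frames, so `unifiedB` holds 88200 interleaved samples.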

The lifetime of an AudioClip is meant to be exposed to developers, who can load and unload a clip manually.

var audioClip = new AudioClip("open.ogg"); // open.ogg has already been resampled and configured.
audioClip.load();
// If the clip is still in use, unload() fails quietly: the clip is marked as pending unload
// and is deleted once its reference count reaches 0.
audioClip.unload();
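
The deferred unload described in the comment can be sketched with a reference count and a pending flag. This is a minimal illustration, not the proposed implementation: `ClipHandle`, `retain`, `release`, and `isLoaded` are hypothetical names standing in for whatever bookkeeping AudioClip ends up using.

```typescript
// Hypothetical stand-in for AudioClip, showing only the lifetime bookkeeping.
class ClipHandle {
    private refCount = 0;
    private pendingUnload = false;
    private loaded = false;

    load(): void {
        this.loaded = true;
        this.pendingUnload = false;
    }

    // Called when a sound or source starts using the clip.
    retain(): void {
        this.refCount++;
    }

    // Called when a user is done; frees the data if an unload was requested.
    release(): void {
        if (this.refCount > 0) this.refCount--;
        if (this.refCount === 0 && this.pendingUnload) this.free();
    }

    // Returns true if the data was freed immediately, false if the unload was deferred.
    unload(): boolean {
        if (this.refCount > 0) {
            this.pendingUnload = true; // quiet failure, completed later in release()
            return false;
        }
        this.free();
        return true;
    }

    get isLoaded(): boolean {
        return this.loaded;
    }

    private free(): void {
        this.loaded = false;
        this.pendingUnload = false;
    }
}

const clip = new ClipHandle();
clip.load();
clip.retain();                            // a sound starts playing the clip
const freedNow = clip.unload();           // false: still in use, only marked
const stillLoaded = clip.isLoaded;        // true
clip.release();                           // last user gone, data is freed here
const loadedAfterRelease = clip.isLoaded; // false
```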

AudioSound new

The new API adds a class named AudioSound, created for DSP, mixing, and event-driven audio. The developer can create a simple sound object from an AudioClip, and the sound object also stores its audio processing steps. A sound object can be played multiple times.

The following code shows how to play a sound as a one-shot.

// Create a sound with default setting
var sound = new AudioSound(audioClip);
audiox.playOneShot(sound);

// Create a sound with setting
var soundDescriptor = new SoundDescriptor(volume, loop, [
    SoundEffect.pitchShifter(1.5),
    SoundEffect.lowPassFilter(1) // filter strength
]);
var sound = new AudioSound(audioClip, soundDescriptor);
sound.addEffect(SoundEffect.highPassFilter(2));
audiox.playOneShot(sound);

The script above skips the steps that set a bus and its volume, so the sound is registered to the master bus by default. Only one master bus exists at any time, and its volume is the basic control over the overall output.

If you want to create a sound that can be played in 3D space, you can add a spatial sound effect. The spatial effect can be changed dynamically.

mySpatialEffect = new SoundEffect.spatialEffect(spatial.2D);
mySpatialEffect.position = new Position2D(...);
sound.addEffect(mySpatialEffect);
audiox.playOneShot(sound);

AudioSource optimize

When the developer needs to control a piece of audio across several states, the existing AudioSource API remains available, with some effects already added.

AudioSource will be classified by usage, such as 2D, 3D, or AMBIENCE. Each classification brings new functionality, including pitch shifting and panning.

var audioSource = new AudioSource(sound, 2D);
audioSource.node = this.node;
audioSource.play();
audioSource.pause();
audioSource.seek(time, MILLISECOND);
// audioSource.seek(frame, FRAMETIME);
audioSource.volume = VOLUME.DB(-8);
audioSource.pitch = 1.5;// Accelerate.
audioSource.stop();
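
`VOLUME.DB(-8)` above expresses volume in decibels. Assuming the usual amplitude convention, where gain equals 10 raised to dB/20, the conversion might look like the sketch below. `dbToGain` and `gainToDb` are hypothetical helper names and not part of the proposed API.

```typescript
// Convert decibels to a linear amplitude gain (amplitude convention: 20 * log10).
function dbToGain(db: number): number {
    return Math.pow(10, db / 20);
}

// Inverse conversion; a gain of 0 maps to -Infinity dB.
function gainToDb(gain: number): number {
    return 20 * Math.log10(gain);
}

const gainMinus8 = dbToGain(-8); // roughly 0.398
const unityGain = dbToGain(0);   // 1: 0 dB leaves the signal unchanged
```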

The audio source can exist on its own or be attached as a component.

EventSound new

When a sound designer wants to mix audio, a plain audio sound cannot fully meet the need. Because an audio sound is basically a resource and therefore easy to extend, we can link different sounds together to fill this gap. The sound designer can then manipulate the event sound to control the ambient strength and volume of its sounds.

var eventSound = audiox.load("myEventSound"); // Need discussion.
var eventSoundSource = new AudioSource(eventSound, EVENT_SOUND);
eventSoundSource.play();
eventSoundSource.sendEvent(SET_PARAMETER, "ambientLevel", 4);
eventSoundSource.stop();
eventSound.unload();

Total change

Before:

image

After:

image

Low-level architecture

Sound graph new

The whole process can only be controlled through a single graph that covers all sounds. A sound can be bound directly to the master bus, or you can create a bus and bind the sound to it yourself. A bus can add effects to all of its sounds or set their volume, so it is essentially a special kind of sound.

image

The volume of the master bus can be set directly.
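
One way to picture the bus hierarchy is that each sound or bus carries its own volume, and the gain actually heard is the product of the volumes on the path up to the master bus. The sketch below assumes exactly that. `GraphNode`, `bindTo`, and `effectiveGain` are hypothetical names, and effects are left out.

```typescript
// Hypothetical graph node: a sound or a bus, each with its own volume.
class GraphNode {
    volume = 1;
    readonly label: string;
    private parentNode: GraphNode | null = null;

    constructor(label: string) {
        this.label = label;
    }

    // Bind this node to a bus (the master bus or a user-created bus).
    bindTo(bus: GraphNode): void {
        this.parentNode = bus;
    }

    // Multiply the volumes along the path from this node up to the root.
    effectiveGain(): number {
        let gain = this.volume;
        let node: GraphNode | null = this.parentNode;
        while (node !== null) {
            gain *= node.volume;
            node = node.parentNode;
        }
        return gain;
    }
}

const masterBus = new GraphNode("master");
const musicBus = new GraphNode("music");
const theme = new GraphNode("theme");
const footstep = new GraphNode("footstep");

musicBus.bindTo(masterBus); // user-created bus
theme.bindTo(musicBus);     // bound through the music bus
footstep.bindTo(masterBus); // bound directly to the master bus

musicBus.volume = 0.5;
masterBus.volume = 0.5;
const themeGain = theme.effectiveGain();       // 1 * 0.5 * 0.5 = 0.25
const footstepGain = footstep.effectiveGain(); // 1 * 0.5 = 0.5
```

Lowering the master bus volume scales every sound, while lowering `musicBus` only affects the sounds bound to it.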

ISound new

On the C++ side, the sound graph represents the link logic, and all links are based on ISound. Unlike the old design, the new audio system is buffer-oriented: the PCM data is modified step by step and finally delivered to the output port.

class ISound {
public:
    virtual bool getBuffer(uint32_t framePosition, uint32_t frameToPlay, float* output);
    virtual bool resample(uint32_t sampleRate); // declared here so subclasses can override it
    void update(); // Update and feed all buffers to the bus.
private:
    std::vector<AudioPlayerBase*> players;
};
class Sound : public ISound {
public:
    bool getBuffer(uint32_t framePosition, uint32_t frameToPlay, float* output) override;
    bool resample(uint32_t sampleRate) override;
private:
    SoundBuffer* bufferSrc;
    std::vector<SoundEffect*> effects; // Some effects are instances
};
class EventSound : public ISound, public Eventify {
public:
    bool getBuffer(uint32_t framePosition, uint32_t frameToPlay, float* output) override;
    bool resample(uint32_t sampleRate) override;
private:
    std::vector<SoundBase*> soundMap;
    std::unordered_map<std::string, EventDispatcher> eventMap;
    std::vector<SoundEffect*> effects;
};


class ISoundEffect {
public:
    virtual void process(float* buffer, uint32_t sampleCount, uint32_t level);
};
class SoundEffectEQ : public ISoundEffect {
    ///
};

SoundPlayer/AudioPlayer new

A sound that can actually be played contains a list of players and periodically takes buffers from each of them.

image

For example, treat the red arc as a buffer taker: it takes the frames at 1m30ms the first time and at 3m the second time to fill the ring buffer. The sound then takes buffers from the ring buffer and sends them to the master bus.

All audio player operations act directly on the ring buffer.

struct RingBuffer {
    uint32_t framePosition;
    uint32_t frameToRead;
    float* buffer[ringLength];
};
class AudioPlayerBase {
    RingBuffer audioBuffer;
};
class OneShotAudioPlayer : public AudioPlayerBase {
    //
};
class ReactiveAudioPlayer : public AudioPlayerBase {
    AudioState state;
};

The size of the RingBuffer is decided dynamically.
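
To make the ring buffer behaviour concrete, here is a small sketch of a fixed-capacity ring of PCM samples with wrap-around reads and writes. It is an assumption-laden illustration: `PcmRingBuffer` and its members are hypothetical, it stores individual samples rather than the frame pointers in the struct above, and it ignores thread safety.

```typescript
// Hypothetical fixed-capacity ring buffer of PCM samples.
class PcmRingBuffer {
    private readonly data: Float32Array;
    private readPos = 0;
    private writePos = 0;
    private count = 0; // samples currently stored

    constructor(capacity: number) {
        this.data = new Float32Array(capacity);
    }

    get available(): number {
        return this.count;
    }

    // Writes as many samples as fit; returns how many were written.
    write(src: Float32Array): number {
        const n = Math.min(src.length, this.data.length - this.count);
        for (let i = 0; i < n; i++) {
            this.data[this.writePos] = src[i] ?? 0;
            this.writePos = (this.writePos + 1) % this.data.length;
        }
        this.count += n;
        return n;
    }

    // Reads up to dst.length samples; returns how many were read.
    read(dst: Float32Array): number {
        const n = Math.min(dst.length, this.count);
        for (let i = 0; i < n; i++) {
            dst[i] = this.data[this.readPos] ?? 0;
            this.readPos = (this.readPos + 1) % this.data.length;
        }
        this.count -= n;
        return n;
    }
}

const ring = new PcmRingBuffer(4);
const written = ring.write(new Float32Array([1, 2, 3]));    // 3 samples stored
const firstRead = new Float32Array(2);
ring.read(firstRead);                                        // takes 1, 2
const wrapped = ring.write(new Float32Array([4, 5, 6, 7])); // only 3 fit, 7 is dropped
const rest = new Float32Array(4);
const restCount = ring.read(rest);                           // takes 3, 4, 5, 6
```

The second write wraps past the end of the backing array, which is the property that lets a player keep filling the buffer while the sound drains it.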

Graph update new

The audio engine updates periodically and collects the frames to play. So although buffers travel bottom-up, from the individual sounds to the master bus, the handling logic actually runs top-down, starting from the master bus.

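class EventSound : public SoundBase {
    void update(float* buffer) override {
        // Update every child sound first (calling update(buffer) on this would recurse forever).
        for (auto* sound : children) {
            sound->update(buffer);
        }
        // Then apply this node's effects to the collected buffer.
        for (auto* effect : effects) {
            effect->process(buffer);
        }
    }
};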
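
The pull-based update can be tried out in a few lines. The sketch below is an assumption-based illustration, not the engine code: `SketchNode`, `SketchEffect`, `LeafSound`, and `GroupSound` are hypothetical names. It also deviates from the pseudocode in one respect, which is that each group mixes its children into a scratch buffer so its effects do not touch audio already written by siblings.

```typescript
// Hypothetical interfaces for the pull-based update.
interface SketchEffect { process(buffer: Float32Array): void; }
interface SketchNode { update(buffer: Float32Array): void; }

// Leaf: adds a constant level into the buffer (stands in for real PCM data).
class LeafSound implements SketchNode {
    private readonly level: number;
    constructor(level: number) {
        this.level = level;
    }
    update(buffer: Float32Array): void {
        buffer.forEach((v, i) => { buffer[i] = v + this.level; });
    }
}

// Group (a bus or an event sound): pulls its children, then applies its effects.
class GroupSound implements SketchNode {
    private readonly children: SketchNode[];
    private readonly effects: SketchEffect[];
    constructor(children: SketchNode[], effects: SketchEffect[]) {
        this.children = children;
        this.effects = effects;
    }
    update(buffer: Float32Array): void {
        // Scratch buffer (an assumption): keeps this group's effects local to its children.
        const scratch = new Float32Array(buffer.length);
        for (const child of this.children) child.update(scratch);
        for (const effect of this.effects) effect.process(scratch);
        scratch.forEach((v, i) => { buffer[i] = (buffer[i] ?? 0) + v; });
    }
}

// A simple gain effect that halves every sample.
const halve: SketchEffect = {
    process(buffer: Float32Array): void {
        buffer.forEach((v, i) => { buffer[i] = v * 0.5; });
    },
};

const group = new GroupSound([new LeafSound(0.25), new LeafSound(0.5)], [halve]);
const master = new GroupSound([group], []);
const mixed = new Float32Array(4);
master.update(mixed); // every sample: (0.25 + 0.5) * 0.5 = 0.375
```

The single call on `master` is what drives the whole graph. Allocating a scratch buffer on every update is only acceptable in a sketch; a real-time mixer would reuse preallocated buffers.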

Audio Engine and thread. new

The difference between the old audio engine and the new one is the audio thread and message queue.

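class AudioEngine {
    std::queue<AudioMessage> msgQ; // NOTE: access must be synchronized between threads
    std::thread td;
    void threadFunc();
    void sendMsg(AudioMessage msg); // called from the main thread
};
// Thread update function
void AudioEngine::threadFunc() {
    while (true) {
        // Drain pending messages (std::queue cannot be iterated with a range-for).
        while (!msgQ.empty()) {
            AudioMessage msg = msgQ.front();
            msgQ.pop();
            // run
        }
        // At the end of a frame, update and play.
        masterBus.update(buffer);
        play(buffer);
    }
}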
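
The ordering that the message queue enforces can be modelled without real threads. The sketch below is single-threaded and only shows that the game side enqueues while the audio side applies messages at the start of each iteration. `EngineMessage`, `EngineSketch`, and `tick` are hypothetical, since the actual AudioMessage layout is not specified in this proposal.

```typescript
// Hypothetical message type; the real AudioMessage layout is not specified.
type EngineMessage =
    | { kind: "play"; soundId: number }
    | { kind: "stop"; soundId: number }
    | { kind: "setVolume"; soundId: number; volume: number };

class EngineSketch {
    private queue: EngineMessage[] = [];
    readonly playing = new Map<number, number>(); // soundId -> volume

    // Called from the game (main) side: only enqueues, never touches audio state.
    sendMsg(msg: EngineMessage): void {
        this.queue.push(msg);
    }

    // One iteration of the audio loop: drain messages, then update.
    tick(): void {
        const pending = this.queue;
        this.queue = [];
        for (const msg of pending) {
            switch (msg.kind) {
                case "play":
                    this.playing.set(msg.soundId, 1);
                    break;
                case "stop":
                    this.playing.delete(msg.soundId);
                    break;
                case "setVolume":
                    if (this.playing.has(msg.soundId)) this.playing.set(msg.soundId, msg.volume);
                    break;
            }
        }
        // The graph update and output would follow here.
    }
}

const engine = new EngineSketch();
engine.sendMsg({ kind: "play", soundId: 1 });
engine.sendMsg({ kind: "setVolume", soundId: 1, volume: 0.5 });
const beforeTick = engine.playing.size;  // 0: nothing is applied until the audio side runs
engine.tick();
const afterTick = engine.playing.get(1); // 0.5
```

In the native engine the two sides run on different threads, so the queue itself would also need a lock or a lock-free structure, which this sketch leaves out.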

The lifetimes of audio clips and audio sources are managed here.

Resource and asset new

A sound object can be saved directly as a resource and loaded as an asset. An event sound in particular can be saved as a resource: it is really a file, such as JSON or something similar, that stores the logical links between audio clips and audio sounds, i.e. a meta file.
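
As a rough picture of what such a meta file might hold, the sketch below round-trips a small description through JSON. Everything in it is hypothetical: the `EventSoundMeta` shape, the field names, and the clip file names are placeholders for illustration, not a format defined by Cocos.

```typescript
// Hypothetical schema; field names are illustrative only.
interface EventSoundMeta {
    clips: string[];                                           // audio clip files referenced
    sounds: { clip: number; volume: number; loop: boolean }[]; // clip is an index into clips
    parameters: Record<string, number>;                        // e.g. "ambientLevel"
}

const meta: EventSoundMeta = {
    clips: ["wind.ogg", "rain.ogg"], // placeholder file names
    sounds: [
        { clip: 0, volume: 0.8, loop: true },
        { clip: 1, volume: 0.6, loop: true },
    ],
    parameters: { ambientLevel: 4 },
};

// Save as text, then load it back as an asset would be loaded.
const serialized = JSON.stringify(meta);
const restored = JSON.parse(serialized) as EventSoundMeta;
const restoredClip = restored.clips[restored.sounds[1]?.clip ?? 0]; // "rain.ogg"
```

Storing clip references as indices keeps the links between sounds and clips explicit, which is the part of the event sound that has to survive saving and loading.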

Total change

OLD:

image

NEW:

image

timlyeee, Sep 22 '22 08:09