[WIP] periph/i2s: Add I2S device peripheral interface
Contribution description
This is an initial draft of an I2S audio interface API.
The goal is to have an API that fits the peripherals on different platforms. Furthermore, due to the streaming nature of I2S data, it should be suitable for DMA usage and should allow chaining transfers so as not to cause glitches in the middle of a stream.
All properties of an I2S stream should be reconfigurable at runtime. This includes the sample width and the sample rate, and possibly also a switch between mono and stereo.
Testing procedure
It's only an API so far; I need an implementation or two to test it.
Issues/PRs references
None
Development
I don't have a lot of time at the moment to work on this, but let's use the branch here as a common development point. Feel free to open PRs for implementations on top of this branch or push fixes to the API directly to this branch.
A TODO for you or me would be to do a survey of the current APIs for this... On the list. Very cool though!
I'm willing to pick this one up again soonish for a personal project, and I'm still looking for feedback and ideas on the API here.
It would probably be handy to align this with the DAC DDS API (example app) so an app could seamlessly switch between an internal DAC and an external I2S-based DAC.
With the DAC DDS API you get a callback when the next audio frame can be queued for transmission - does this also make sense for I2S?
I suppose this could also come in handy as I2S can also be used for input, not just output.
With the DAC DDS API you get a callback when the next audio frame can be queued for transmission - does this also make sense for I2S?
The idea I have in mind is to submit buffers, or transactions, with audio samples to read from or write into the I2S peripheral, and to register a callback that is called when the next buffer should be submitted. For this it makes sense to me to always have two buffers per direction: one that is currently in use and one that is the next buffer to be used. The user would then get the callback as soon as the peripheral switches buffers (moving to the next transaction), and the thread would have quite some time to prepare the next transaction and submit it to the peripheral.
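A very rough sketch of how that could look; all names and signatures here are hypothetical, not the API in this PR:

```c
#include <stddef.h>

/* Hypothetical device handle, following the usual periph API convention. */
typedef unsigned i2s_t;

/* Sketch of the submit-and-callback idea described above. */
typedef struct i2s_transaction {
    struct i2s_transaction *next;   /* transactions can be chained */
    void *buf;                      /* sample memory for this transaction */
    size_t samples;                 /* number of samples in buf */
} i2s_transaction_t;

/* Called (typically from interrupt context) when the peripheral moves on to
 * the next transaction, handing back the finished one for refilling. */
typedef void (*i2s_cb_t)(void *arg, i2s_transaction_t *done);

/* Queue a prepared transaction behind the one currently in flight. */
void i2s_add_transaction(i2s_t dev, i2s_transaction_t *trans);
```

The application would keep two transactions queued per direction: while the peripheral consumes one, the callback hands back the completed one so the thread can refill and resubmit it.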
Looking now at the documentation of the DAC DDS API, this exactly matches what I put above.
That's exactly what dac_dds does too.
For a hobby project I picked this up again, pairing a tlv320aic3204 codec with two SPI/I2S peripherals on the stm32f446re. After some initial poking around I got a relatively clean sine wave out of the codec. Based on this I have some observations about this peripheral and the other available peripherals from different vendors. It seems that the stm32 SPI peripheral in I2S mode is pretty much the worst case in terms of effort per audio sample.
I've only looked at the nRF52840, the stm32 and the atsam peripherals. I have no idea how the esp32 operates its peripheral.
Peripheral modes
nRF52840
The nRF52840 has a single I2S peripheral with simultaneous transmit and receive of data. It supports both controller and target mode. The clock configuration in controller mode is limited to a simple divider from the 64 MHz clock.
stm32
The stm32 has both the SPI peripheral in I2S mode and the SAI blocks. Each SPI peripheral can operate in unidirectional mode, with two separate instances required for bidirectional mode. A SAI peripheral consists of two independent blocks, each capable of unidirectional operation. Both peripheral types can operate in controller and target mode, and depending on the exact MCU model, a PLL is available to generate the clock.
atsam
The same54 has an I2S peripheral consisting of a transmit and a receive block. It supports both controller and target mode.
Data format
nRF52840
The nRF52840 supports 8, 16 and 24 bit modes; when using 24 bit mode it expects sign-extended 32 bit words.
stm32
The SPI peripheral has a 16 bit data register. This is inconvenient when transmitting 24 and 32 bit samples: the most significant half word has to be loaded into the register first and the least significant half word next. Starting from 32 bit words, the data format can be fixed via a ROR #16 instruction.
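For illustration, the same half-word swap can be expressed in plain C when preparing a buffer for the DMA; this is only a sketch and the helper name is made up:

```c
#include <stdint.h>

/* Swap the half words of a 32 bit sample (equivalent to ROR #16, or the
 * CMSIS __ROR(sample, 16) intrinsic), so the most significant half word is
 * handed to the 16 bit data register first. Name is illustrative only. */
static inline uint32_t i2s_stm32_swap_halfwords(uint32_t sample)
{
    return (sample >> 16) | (sample << 16);
}
```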
atsam
The same54 has a flexible data format.
DMA
It is almost mandatory to couple the peripheral with a DMA stream to have some guarantees on the timely delivery of new samples to the peripheral.
nRF52840
The EasyDMA of the nRF52840 handles this automatically. Care has to be taken that it can only access RAM and not ROM memory addresses. The counter register of the EasyDMA RX and TX channels is shared, so the buffers for both directions must be equal in size. The pointer registers themselves are double buffered so that the next transaction can be prepared while the current transaction is busy. It is not clear whether the MAXCNT register can be updated between transactions; if this is not the case, all transactions must have an equal size.
stm32
The stm32 peripheral can be coupled with DMA streams. To get reliable performance, the double buffer mode (f2, f4 and f7) is almost mandatory; this way a new transaction can be prepared during the current transaction. Otherwise the DMA must be switched to the next transaction as soon as it is done, but before the peripheral needs the next sample. In practice this is not always reliable, even on the f4. The f1 series can use a single buffer in circular mode and trigger an interrupt on half and full DMA transfer completion. The I2S logic can then copy the next transfer into the other half of the buffer (using DMA?).
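As a sketch of the circular single-buffer variant (f1 style), assuming hypothetical helper names and buffer sizing:

```c
#include <stdint.h>
#include <stddef.h>

#define SAMPLES_PER_HALF 256    /* illustrative buffer sizing */

static int16_t tx_buf[2 * SAMPLES_PER_HALF];

/* fill_samples() stands in for whatever produces the next audio chunk. */
extern void fill_samples(int16_t *dst, size_t samples);

/* Half-transfer interrupt: the first half was just sent, refill it while
 * the DMA keeps streaming the second half. */
void dma_half_transfer_isr(void)
{
    fill_samples(&tx_buf[0], SAMPLES_PER_HALF);
}

/* Transfer-complete interrupt: the second half was sent, refill it before
 * the DMA wraps around to the first half again. */
void dma_transfer_complete_isr(void)
{
    fill_samples(&tx_buf[SAMPLES_PER_HALF], SAMPLES_PER_HALF);
}
```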
atsam
The atsam DMA uses in-RAM descriptors for the DMA transfers. These can be set up in a double buffer mode and updated while the other transfer is busy.
Conclusions
Fixed transaction buffers
A number of these peripherals put restrictions on when the number of items in a transaction can be updated. This restriction is imposed either explicitly or implicitly, by either the peripheral or the DMA.
Splitting the peripherals
The most flexible way to model the peripherals is to guarantee at least unidirectional mode for a peripheral and support bidirectional mode where possible (or necessary). This means that a single SAI peripheral can be exposed as two unidirectional peripherals, while the same54 and the nRF52840 are bidirectional peripherals.
Data Format
As the peripherals differ in what they expect, a conversion function is required. For RIOT it is most convenient to treat all samples as 8, 16 or 32 bit, which maps nicely onto the CMSIS-DSP data types. Conversion functions can be provided with the I2S peripheral to convert arrays from RIOT-native data types to a format suitable for the DMA and the peripheral. In the best case these are simple no-op functions; in the worst case they iterate over every sample and adjust the format.
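A sketch of what such conversion hooks could look like; the names are hypothetical and only illustrate the idea of converting between RIOT-native sample buffers and whatever layout the peripheral/DMA expects:

```c
#include <stddef.h>

/* Convert `samples` RIOT-native samples (8/16/32 bit, matching the
 * CMSIS-DSP q7_t/q15_t/q31_t types) into the peripheral's DMA layout.
 * Ideally this compiles down to a plain copy or even a no-op. */
void i2s_convert_to_periph(const void *src, void *dst, size_t samples);

/* Convert received raw peripheral data back into RIOT-native samples. */
void i2s_convert_from_periph(const void *src, void *dst, size_t samples);
```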
One more thing I noticed while developing on this:
Currently the architecture uses a linked list of transactions, each transaction containing a preconfigured number of samples. This makes for a flexible API where different chunks of memory can be chained to construct the audio stream (as long as they all have the same size). However I doubt whether this is really useful for the end user. I noticed for myself that I would usually allocate one slab of memory and divide that over a set of transactions that I would feed the peripheral. Depending on the origin of the data (static array of samples or USB audio stream), I would manually keep track of which transactions have finished and write the next chunk of samples to it and feed it back to the peripheral. See also the test application included here for an example.
What would greatly simplify the usage of the API is to include a memory region in the config struct, together with the number of equal-sized regions it should be divided into. The implementation would then just have to keep track of a read and a write pointer and signal in the callback when a chunk of the memory region has been fully consumed. The downside is that we lose some flexibility in where we get the memory regions from. On the other hand, it would simplify the DMA requirements, as these could in the simplest case run in circular mode and notify at the halfway points. The buffer write and read functions would still exist, but would write directly into the provided buffer without the intermediate transaction step.
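A sketch of how that could look in the configuration; field names are made up and only illustrate the single-slab idea:

```c
#include <stddef.h>

/* Hypothetical configuration for the simplified scheme described above:
 * the application provides one slab of memory, the driver splits it into
 * equal chunks and reports each fully consumed/produced chunk via the
 * callback. */
typedef struct {
    void *buf;              /* one contiguous memory region for the stream */
    size_t chunks;          /* number of equal-sized chunks buf is split into */
    size_t bytes_per_chunk; /* size of each chunk */
    void (*chunk_done)(void *arg, unsigned chunk_idx); /* called per finished chunk */
    void *arg;
} i2s_buf_config_t;
```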
The other design decision is how to treat peripherals split in two blocks such as the SAI on the NXP iMX6 and the I2S interface of the SAMD21 and SAMD5x. Both these have an I2S peripheral with a dedicated receive and transmit block. Each block has its own configuration including clock dividers:
We can expose these as two separate instances limited to transmit only and receive only, allowing full configuration of each block.
The other option is to expose the peripheral as a single instance and allow configuring it as I2S_DIRECTION_BOTH to use both data directions at the same time.
The main tradeoff is the flexibility of being able to configure both blocks as separate peripherals, but this pushes the constraint of having to select the peripheral that supports the correct data direction onto the API user. In the case of treating them as separate peripherals, clock synchronization could be provided by extending the i2s_mode_t enum to include an I2S_MODE_FOLLOW_OTHER_PERIPH mode (or something along those lines). If they are exposed as a single peripheral, they would always use the same 'Clock Unit 0'.
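For the separate-peripheral option, the extension could look roughly like this; only the FOLLOW_OTHER_PERIPH name is taken from the comment above, the other enumerators are assumptions:

```c
/* Illustrative sketch of the mode enum with the proposed extension. */
typedef enum {
    I2S_MODE_CONTROLLER,            /* this block generates BCLK/WS */
    I2S_MODE_TARGET,                /* clocks are provided externally */
    I2S_MODE_FOLLOW_OTHER_PERIPH,   /* reuse the clocks of the sibling block */
} i2s_mode_t;
```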
In my opinion both options are fine.
The other design decision is how to treat peripherals split in two blocks such as the SAI on the NXP iMX6 and the I2S interface of the SAMD21 and SAMD5x.
The ST SAI peripheral also consists of two blocks, but it doesn't have the issue described above, as the two blocks are fully symmetrical and can each run as transmit or receive.