paper_trail icon indicating copy to clipboard operation
paper_trail copied to clipboard

[hal] Support 16bit transfers for Spi

Open TomSaw opened this issue 3 years ago • 14 comments

While solving #666 i've templatized Spi::transfer(..).

<std::unsigned_integral T> now controls wether to send uint8_t or uint16_t. It was a small step to make it work in bit-resolution so i've changed the argument to <int Bits>.

When sending uint16_t data with 16-bit. most of these fromBigEndian(...) sanities, spreaded over modm::drivers become obsolete. Moreover there's kind of Spi-integrated 'endianess' by varying 8 and 16 bits:

Transfer same data with 8 and 16 bits. No DMA for clarity

union {
	uint8_t data8[2] = {0xDE, 0xAD};
	uint16_t data16;
};
RF_CALL(SpiMaster::transfer(data8, (uint8_t*)(nullptr), 2)); // outputs 0xDEAD
RF_CALL(SpiMaster::transfer(data16)); // outputs 0xADDE

0xDEAD_endianess_small

Same with DMA

0xDEAD_endianess+DMA_small

For driver::Touch2046 it's code beautyfication first hand but also good for performance. -> See 2nd commit

Sending color::Rgb565 (uint16_t underlying) with true 16-bit Spi and getting rid of all these endianess sanities is extremely healthy for driver::ili9341 performance -> Fixed in 3rd commit.

I've spotted lot's of endianess sanities in I2c driven devices. Gonna see if the recipe applies here too.

ToDo

  • [ ] Modernize Spi API but keep it backwards compatible
  • [x] Support direct uint16_t Spi transfers
    • [ ] original API: 'Spi::transfer16(uint16_t)' + derivates
    • [ ] new API 'Spi::transmit(T data)' + derivatives, T is deduced
  • [ ] implement non-Dma Spi-service with ISRs to prevent blocking
    • [x] AVR
    • [ ] STM32
  • [x] Update some drivers with improved API for reference.
    • [x] driver::Ili9341 - uses new repeated transfer of same data: 'transmit<>(T data, std::size_t repeat)': evaluation: local huge buffer is redundant, improved performance for any single pixel
    • [x] driver::Touch2046 - uses modernized API with STL style syntax: transmit(std::begin(tx16), std::end(tx16), std::begin(rx16)) evaluation: easy2read code, +improved performance

TomSaw avatar Sep 22 '21 17:09 TomSaw

I think half of the STM32 devices only have 8 or 16 bit transfers. Not sure the AVR has anything else but 8bit.

I would only have two transfer functions: transfer and transfer16, since the register access is the same, so the code is the same. The bit width should be set outside the transfer function (since we also set everything else like SPI Mode externally).

salkinium avatar Sep 22 '21 17:09 salkinium

There's also this->attachConfigurationHandler([]() { SPI::setTransferWidth(16); }); which is called only when the SpiDevice changes (ie. when you have multiple devices on the same SPI peripheral).

salkinium avatar Sep 22 '21 17:09 salkinium

There's also this->attachConfigurationHandler([]() { SPI::setTransferWidth(16); }); which is called only when the SpiDevice changes (ie. when you have multiple devices on the same SPI peripheral).

ili9341 requires both, 8bit and 16bit. There's no way to attach multiple Handlers as far as i know. EDIT: I'm switching the configurationHandler between 8 and 16bit in ili9341_spi.hpp. driver:touch2046 requires only 16bit configuration.

TomSaw avatar Sep 22 '21 17:09 TomSaw

I've implemented transfer(uint16_t data) for ATmega.

Moved Spi Bit configuration to ConfigurationHandler where they belong to.

But ...

TomSaw avatar Sep 26 '21 17:09 TomSaw

How about 16bit support for I2c?

In general, i like the idea of moving 16bit treatment from drivers to Hal. Dozens of I2c-drivers contain endian sanities (more or less any call of 'modm::BigEndian()' in src/modm/driver/** ) before or after talking to I2c.

👉 In addition to cleaner code we gain slightly shrinked compilate and slightly improved performance because data >> 8 in Hal is cheaper than modm::fromBigendian(data) in Drivers.

Numerous times in Drivers.

TomSaw avatar Sep 27 '21 09:09 TomSaw

How about 16bit support for I2c?

Nice feature... little more tricky than for Spi so i prefer another PR #690 and concentrate on graphics merge first.

TomSaw avatar Sep 29 '21 17:09 TomSaw

... for backwards compatibility and add transferBlocking16 and transfer16...

Please excuse my own mind but I really need these templatized Spi::methods for generalization of DisplayTransport / DisplayInterface. A Baseclass to all Transport classes that cares for moving the RAM->CGRAM with ColorType as template argument. It's preparation to implement DMA2D on top of it ;).

Please give feedback to the current modernized and also backwards compatibility solution @salkinium : I've renamed my stupid templatized transfer -> transmit and declared them protected. Defined transfer(..) and transfer16(..) as wrappers around transmit<{uint8_t/uint16_t}>. Looks solid and i can keep the compact generic implementations in spi_master_impl.hpp.

Code is also just a few lines apart from supporting any of uint8_t / uint16_t / uint32_t / uint64_t as T for template<T> transmit(...).

Why? For 24bpp transmissions: ''' // On supported hardware: Spi::setDataLenght(Bit24) // But each MCU should be able to... transmit<uint32_t> (Rgb888* tx, ...) '''

TomSaw avatar Oct 05 '21 07:10 TomSaw

I've discovered performance issues when using Ili9341 with RFs but without DMA-Spi.

-> The current non-ISR implementation of Spi requires babysitting for any byte. Running up and down the relatively deeply nested call-tree when drawing with RFs on Ili9341 results in lot's of inertia.

The setup Ili9341 with RFs + DMA-Spi is a performance pleasure, so how should i port this to AVR?:

Solutions in my mind

  1. In non-DMA setups, use transferBlocking(...) in ili9341_spi.hpp
  2. Implement Spi with ISR service

I've experimented with 2. to see if and how it works and it's quite good! (commited) -point: With ISR service it's a few pct slower rather than with blocking resumable "service". seems like the ISR context switches are more expensive than the blocking loop. +point: No special treatments for Drivers using Spi + Resumable functions required

TomSaw avatar Oct 05 '21 08:10 TomSaw

Much better, thanks for fixing the SPI ISR. There was an lbuild option once to enable blocking SPI transfers for fast speeds, since walking up and down the call chain takes so long. An ISR may be a little slower, but at least its non-blocking.

salkinium avatar Oct 05 '21 13:10 salkinium

So then I'll add this to STMs non-DMA SPI as well.

How do you like the use of MODM_FLAGS for the SPIs state variable @salkinium ? Found the answer on my own: It's wanted ✅

TomSaw avatar Oct 05 '21 13:10 TomSaw

Ahhh... dreamed another great, apparent idea for the modernized versions of Spi::transfer: STL algorithm style interface!

  1. Its intuitive to C++ people
  2. Is a safe bet on the future
  3. works with C-array & std::array (EDITED)

...I :heart: it, hope you too. The interfaces would be:

static modm::ResumableResult<T>
transmit(const T data);

static modm::ResumableResult<void>
transmit(const T data, std::size_t repeat);

// STL algorithm style
static modm::ResumableResult<void>
transmit(const T *tx_first, const T *tx_last, T *rx_first = nullptr);

TomSaw avatar Oct 06 '21 09:10 TomSaw

Ok, but let's focus on finishing this PR first before refactoring everything again.

Also keep in mind that the advantage of using stl iterators is that the Iterator is an actual separate type that overloads operator++ etc to implement any kind of data access. This is generally a good idea, where an non-sequential buffer access can be abstracted.

However, this only works generically with a template function, where this iterator type is known, and becomes difficult with an IRQ, which can be defined only once. You now need to use a function pointer or a virtual interface and then you are basically implementing the I2cTransaction that sucks a little.

I also don't see a significant usability advantage of using transmit(begin(), end(), begin()) vs the API that we have now.

salkinium avatar Oct 06 '21 11:10 salkinium

I've looked into a "stringly-typed" printf-style generic transfer interface, which would probably result in a significantly more compact implementation, since it explicitly exposes a state-machine interface, uses a shared parser between all peripherals and that can be translated at compile-time into a trivially DMA-friendly representation. Here is a prototype for SoftwareI2C that worked in hardware.

salkinium avatar Oct 06 '21 11:10 salkinium

Also keep in mind that the advantage of using stl iterators is that the Iterator is an actual separate type that overloads operator++ etc to implement any kind of data access. This is generally a good idea, where an non-sequential buffer access can be abstracted.

To continue: Containers with ContiguousIterator may return plain pointers for begin() and end(). There's also a concept contiguous_iterator Quote The contiguous_iterator concept refines random_access_iterator by providing a guarantee the denoted elements are stored contiguously in the memory.

These Containers come into question:

  • classic C-array
  • std::array
  • std::basic_string

TomSaw avatar Oct 09 '21 13:10 TomSaw