Adjust barriers in virtualizers/clients
https://github.com/au-ts/sddf/pull/348 fixed an issue where barriers were being used too often, but the barriers left over are still probably stronger than they need to be, which may have an impact on performance.
Additionally, there was a concern about clients needing to perform a memory barrier after writing to a DMA buffer and prior to enqueuing it to the virtualizer. This is to ensure that any writes are visible by the point the virtualizer completes its cache invalidation (in the case where the client and the virtualizer are on different cores).
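To make the ordering concrete, here is a minimal sketch of a client enqueue path with a release fence between filling the DMA buffer and publishing the descriptor. The ring/descriptor structures and names here are hypothetical, not the actual sDDF queue API; on AArch64 the release fence compiles down to a `dmb ish`, which is sufficient for ordering between cores over normal memory.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shared-ring structures, for illustration only. */
struct buff_desc { uintptr_t io_or_offset; uint16_t len; };
struct ring { struct buff_desc entries[512]; uint32_t tail; };

/* Client enqueue path: populate the DMA buffer, then fence, then
 * publish the descriptor. The release fence ensures the buffer
 * contents are visible before the virtualizer (possibly on another
 * core) observes the updated tail and performs its cache maintenance. */
static void client_enqueue(struct ring *ring, void *dma_buf,
                           const void *payload, uint16_t len,
                           uintptr_t io_addr)
{
    memcpy(dma_buf, payload, len);             /* 1. write the DMA buffer */
    atomic_thread_fence(memory_order_release); /* 2. barrier (dmb ish on AArch64) */
    uint32_t t = ring->tail;
    ring->entries[t % 512] = (struct buff_desc){ io_addr, len };
    ring->tail = t + 1;                        /* 3. publish to the virtualizer */
}
```

Note this only covers CPU-to-CPU visibility; if the virtualizer then hands the buffer to a non-coherent device, the cache-invalidation/clean step is a separate concern.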
Furthermore, the barrier here https://github.com/au-ts/sddf/blob/f3685158faa7f8f379fa918821186ae7667e0e87/drivers/network/imx/ethernet.c#L69-L72 is likely incorrect: either it is too weak (it compiles to a `dmb ish` on AArch64, but device interaction requires at least the outer shareable domain, and only a `dsb` provides completion guarantees), or it is unnecessary (the device memory is accessed with the `volatile` qualifier).
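The `dmb ish` vs `dsb sy` distinction can be sketched as below. The helper names, the `kick_device` flow, and the doorbell register are all illustrative assumptions (not the i.MX driver's actual code); the non-AArch64 fallback exists only so the sketch compiles portably.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Barrier helpers illustrating the distinction discussed above.
 * On AArch64, DMB only orders accesses; DSB additionally waits for
 * their completion, and the shareability domain must cover the device
 * (e.g. SY or OSH rather than ISH for device memory). */
#if defined(__aarch64__)
#define barrier_order_inner()   __asm__ volatile("dmb ish" ::: "memory")
#define barrier_complete_full() __asm__ volatile("dsb sy"  ::: "memory")
#else
/* Portable stand-ins so this sketch compiles on other architectures. */
#define barrier_order_inner()   atomic_thread_fence(memory_order_seq_cst)
#define barrier_complete_full() atomic_thread_fence(memory_order_seq_cst)
#endif

/* MMIO write: a volatile access to device memory. */
static inline void mmio_write32(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;
}

/* Hypothetical example: publish a descriptor in DMA memory, then ring
 * the device doorbell. The completion barrier ensures the descriptor
 * write has finished before the doorbell write reaches the device;
 * a dmb ish alone would give neither completion nor a wide enough
 * shareability domain. */
static void kick_device(uint32_t *descriptor, volatile uint32_t *doorbell)
{
    *descriptor = 1;          /* normal (DMA) memory write   */
    barrier_complete_full();  /* dsb sy: completion, full system */
    mmio_write32(doorbell, 1);
}
```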
We should also consider whether the driver or the virtualizer is responsible for executing certain barriers.
Why would it make sense for the driver to execute barriers?
For interaction with non-coherent devices. In Cheng's driver, for example, barriers ordering accesses to normal memory against accesses to device memory are likely missing.
For the SD card driver, the protocol requires DMA-ing some information into a normal memory region, which the driver then parses.
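A sketch of that driver-side pattern, assuming a hypothetical response layout and completion flag (not the actual SD card driver's structures): the driver must order its read of the "DMA done" indication before its reads of the DMA'd data. On a non-coherent platform the driver would additionally invalidate the cache lines covering the response region first; only the ordering side is shown here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical response layout DMA'd by the SD controller into a
 * normal-memory region; names are illustrative, not from sDDF. */
struct sd_response { uint32_t status; uint32_t block_count; };

/* Driver parse path: check the completion flag, then fence, then read
 * the response. The acquire fence prevents the data reads from being
 * reordered (by compiler or CPU) before the flag read, so the driver
 * never parses stale data. */
static bool parse_response(const volatile uint32_t *dma_done_flag,
                           const struct sd_response *resp,
                           uint32_t *blocks_out)
{
    if (*dma_done_flag == 0) {
        return false;                          /* DMA not complete yet */
    }
    atomic_thread_fence(memory_order_acquire); /* flag read before data reads */
    *blocks_out = resp->block_count;           /* driver parses DMA'd data */
    return true;
}
```

This is the case where it plausibly makes sense for the driver, rather than the virtualizer, to own the barrier: only the driver knows the device's protocol and when the DMA'd data is consumed.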