etl icon indicating copy to clipboard operation
etl copied to clipboard

Truncating access for `etl::bitset`

Open jaskij opened this issue 2 years ago • 12 comments

I find myself with a simple need: to extract a single byte from a fairly large etl::bitset. One big enough that I can not comfortably convert it to an integral variable.

While there are probably workarounds using a bitset while using uint8_t as the underlying buffer, and using etl::bitset::span(), I would prefer a more ergonomic API.

A possible option would be to have a value_truncated<typename T>(), which would return sizeof(T) * CHAR_BIT lowest bits of the bitset. Looking at the existing value<typename T> code, it would probably look something like this:

template <typename T>
ETL_CONSTEXPR14
typename etl::enable_if<etl::is_integral<T>::value, T>::type
  value_truncating(const_pointer pbuffer) const ETL_NOEXCEPT
{
  const size_t COUNT = sizeof(T) * CHAR_BIT;

  T v = T(0);
  uint_least8_t shift = 0U;

  for (size_t i = 0UL; i < COUNT; ++i)
  {
    v |= T(typename etl::make_unsigned<T>::type(pbuffer[i]) << shift);
    shift += uint_least8_t(Bits_Per_Element);
  }

  return v;
}

Combined with existing bitshift operators, this would allow nice, optimized, access to specific parts of a larger bitset.

jaskij avatar Oct 16 '23 10:10 jaskij

I'll look into that. Would you expect integrals to be extracted at byte offsets only, or arbitrary offsets into the bitset?

jwellbelove avatar Oct 17 '23 11:10 jwellbelove

e.g.

etl::bitset<256, uint32_t> bits;
int8_t c = bits.value<int8_t>(19); // Get an int8_t representation of the 8 bits from bit 19 to 26.

jwellbelove avatar Oct 17 '23 11:10 jwellbelove

Would you expect integrals to be extracted at byte offsets only, or arbitrary offsets into the bitset?

The use case it came up in was Modbus coil/discrete encoding/decoding, which actually needs arbitrary indices.

And would a byte boundary be meaningful? The one meaningful boundary I can think of is width of the interior storage type (so four-byte in your uint32_t storage example). When backing the bitset with uint32_t, an 8-bit boundary seems as meaningful as a 9-bit one.

Also: thank you for maintaining this library, and for reacting quickly to issues.

jaskij avatar Oct 18 '23 09:10 jaskij

Byte boundaries could result in more optimal code.

I'm going to be away for the next three weeks (climbing in Spain) so I may not be able to do much work on this until I get home again.

jwellbelove avatar Oct 18 '23 12:10 jwellbelove

Byte boundaries could result in more optimal code.

Since checking for byte boundaries is very cheap, maybe it would be worth the effort of writing both versions, and using one or the other at runtime? OTOH, that increases code size. Decisions, decisions.

I'm going to be away for the next three weeks (climbing in Spain) so I may not be able to do much work on this until I get home again.

Enjoy your vacation! That's more important.

jaskij avatar Oct 19 '23 07:10 jaskij

I'm experimenting with a solution for this. There are two new member functions.

template <typename T>
ETL_CONSTEXPR14
T extract(size_t position, size_t length) const

template <typename T, size_t Position, size_t Length>
ETL_CONSTEXPR14
T extract() const ETL_NOEXCEPT

T must be an integral type. position/Position is the index of the least significant bit and length/Length is the number of bits required.

e.g. A bitset containing the bit pattern 0x12345678 Extract an 8 bit uint8_t with the lsb at index 11.

00010010001101000101011001111000
.............^......^

extract<uint8_t>(11, 8) == 0x8A

jwellbelove avatar Nov 27 '23 12:11 jwellbelove

Looking at the API, would you consider defaulting length to sizeof(T) * CHAR_BIT? It seems like it would be a good default, at least for the use case I've mentioned.

Also, it would probably be better to actually express the integral constraint in code itself, although it would probably require adding a new macro.

jaskij avatar Nov 28 '23 08:11 jaskij

The default length is a good idea.

The integral constraint is built-in with an enable_if. I didn't show it for the sake of clarity.

jwellbelove avatar Nov 28 '23 08:11 jwellbelove

The API looks good to me then.

I thought a little about extraction to, say, std::output_iterator or something, but in the end it unnecessarily expands the API surface for no real gain - there isn't much difference between looping inside ETL and the user looping themselves.

jaskij avatar Nov 28 '23 10:11 jaskij

BTW rather than using the C style sizeof(T) * CHAR_BIT there is etl::integral_limits<T>::bits available.

jwellbelove avatar Nov 28 '23 10:11 jwellbelove

I now have a working version of this. The naïve implementation is pretty simple (build the value by iterating through the bitset, testing each bit individually), but I've spent some time creating a version that extracts the data optimally. There is currently a certain amount of duplicate code in the specialised classes, so the next step is to gather the common functionality into a common base class/static functions.

jwellbelove avatar Dec 05 '23 17:12 jwellbelove

Great! Thank you!

jaskij avatar Dec 06 '23 17:12 jaskij

Added 20.38.11

jwellbelove avatar Apr 21 '24 11:04 jwellbelove