[C++] Use SIMD to retrofit and optimize furycpp

Open pandalee99 opened this issue 11 months ago • 6 comments

Feature Request

Maybe we can try a portable SIMD library, such as https://github.com/xtensor-stack/xsimd or https://github.com/google/highway, instead of handwritten intrinsic calls.

xsimd is also widely used in Apache Arrow to speed up data processing, and it works very well there.

pandalee99 avatar Jan 19 '25 16:01 pandalee99

Then we can go on to use the simdutf project to improve the original logic. Related: #2002, #1732.

pandalee99 avatar Jan 19 '25 16:01 pandalee99

Image

As for simdutf, I used the single-header version and ran a simple test:

#include <cstddef>
#include <string>
#include "simdutf.h"  // single-header simdutf release

std::string utf16ToUtf8WithSIMDUTF(const std::u16string &utf16) {
  // Number of UTF-16 code units in the input
  const size_t utf16_length = utf16.length();
  // Number of bytes needed to hold the UTF-8 output
  const size_t utf8_length = simdutf::utf8_length_from_utf16le(utf16.data(), utf16_length);
  // Allocate the output buffer at its final size
  std::string utf8_result(utf8_length, '\0');
  // Convert; simdutf returns the number of bytes written, or 0 if the input is not valid UTF-16LE
  const size_t written_bytes = simdutf::convert_utf16le_to_utf8(utf16.data(), utf16_length, utf8_result.data());
  // Shrink to the bytes actually written (empty on invalid input)
  utf8_result.resize(written_bytes);
  return utf8_result;
}

In this simple test, the simdutf version was not the more efficient one.
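A comparison like this needs a scalar baseline, both to measure against and to check the converted output. Below is a minimal sketch of one using only the standard library. The name `utf16ToUtf8Scalar` is hypothetical and this is not furycpp's existing implementation. It assumes host-endian `char16_t` input and replaces unpaired surrogates with U+FFFD, whereas simdutf's `convert_utf16le_to_utf8` returns 0 on invalid input, so the two only agree on valid UTF-16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical scalar baseline: converts host-endian UTF-16 to UTF-8.
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8Scalar(const std::u16string &utf16) {
  std::string out;
  // Worst case is 3 UTF-8 bytes per UTF-16 code unit.
  out.reserve(utf16.size() * 3);
  for (std::size_t i = 0; i < utf16.size(); ++i) {
    std::uint32_t cp = utf16[i];
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: valid only when followed by a low surrogate.
      if (i + 1 < utf16.size() && utf16[i + 1] >= 0xDC00 && utf16[i + 1] <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (utf16[i + 1] - 0xDC00);
        ++i;  // The low surrogate has been consumed.
      } else {
        cp = 0xFFFD;
      }
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // Low surrogate without a preceding high surrogate.
    }
    if (cp < 0x80) {
      out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
      out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
  }
  // Sanity check on the reserve() bound above.
  assert(out.size() <= utf16.size() * 3);
  return out;
}
```

On valid input, comparing this function's result with that of `utf16ToUtf8WithSIMDUTF` is a quick way to confirm that both versions do the same work before timing them.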

pandalee99 avatar Jan 20 '25 09:01 pandalee99

cc @chaokunyang

pandalee99 avatar Jan 20 '25 09:01 pandalee99

Could you attach a benchmark? e.g. in https://quick-bench.com/.

PragmaTwice avatar Jan 20 '25 09:01 PragmaTwice

> Could you attach a benchmark? e.g. in https://quick-bench.com/.

Sure, I will implement it later.

pandalee99 avatar Jan 20 '25 10:01 pandalee99

I ran a more rigorous series of tests and arrived at this result.

Image

'BM_SIMD_UTF', i.e. the simdutf version, does seem to perform better. I apologize for the lack of rigor in the previous test.

Thank you very much for your guidance, @PragmaTwice @chaokunyang. I will implement a benchmark module in furycpp to facilitate later functional testing.
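For readers who want to reproduce a comparison like this without a benchmark framework, here is a minimal timing sketch built on `std::chrono`. It is not the harness used for the result above: the `BM_` prefix suggests a framework such as Google Benchmark, but that is an assumption. The helper name `averageNanosPerCall` is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: average wall-clock nanoseconds per call of fn.
template <typename Fn>
double averageNanosPerCall(Fn &&fn, std::size_t iterations) {
  assert(iterations > 0);
  fn();  // Warm-up call so one-time setup costs are not measured.
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    fn();
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}
```

The callable should consume its result, for example by accumulating the output size into a variable that is read afterwards, so that the compiler cannot remove the conversion being measured. Each candidate should be timed on the same input in an optimized build.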

pandalee99 avatar Jan 23 '25 16:01 pandalee99