
simd: add new SIMD support for JSON escaping

Open edsiper opened this issue 1 year ago • 1 comments

Where JSON encoding is needed and SIMD is available, this change brings a 30%-50% performance improvement.

PostgreSQL SIMD

The header file with SIMD functionality has been taken from the PostgreSQL project, stripped down, and adapted for our specific needs.

Notes on other improvements

  • The routine used to escape characters now uses a lookup table, which heavily improves performance when Fluent Bit is built in release mode (optimizations on). This brings performance improvements for all systems/architectures.

  • SIMD operations are available for architectures that implement SSE2 (Intel/AMD) or Neon (Arm) instructions. Note that AVX2 is not implemented, so there is still room for improvement.
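To illustrate the lookup-table idea from the first note, here is a minimal sketch and not the code from this PR. The names `esc_table`, `esc_table_init` and `json_escape` are hypothetical, and the sketch only covers the escapes that JSON requires (`"`, `\` and control characters below 0x20); it does no UTF-8 validation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical table, one entry per byte value:
 *   0   -> copy the byte unchanged
 *   'u' -> control character, emit the \u00XX form
 *   any other value -> emit a backslash followed by that value
 */
static unsigned char esc_table[256];

static void esc_table_init(void)
{
    int i;

    memset(esc_table, 0, sizeof(esc_table));
    for (i = 0; i < 0x20; i++) {
        esc_table[i] = 'u';
    }
    esc_table['"']  = '"';
    esc_table['\\'] = '\\';
    esc_table['\b'] = 'b';
    esc_table['\f'] = 'f';
    esc_table['\n'] = 'n';
    esc_table['\r'] = 'r';
    esc_table['\t'] = 't';
}

/*
 * Escape in[0..len) into out. Returns the number of bytes written,
 * or -1 if out_size is too small. The output is not NUL-terminated.
 */
static int json_escape(const char *in, size_t len, char *out, size_t out_size)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;
    size_t o = 0;

    assert(in != NULL && out != NULL);

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) in[i];
        unsigned char e = esc_table[c];   /* one table load replaces a chain of comparisons */

        if (e == 0) {
            if (o + 1 > out_size) {
                return -1;
            }
            out[o++] = (char) c;
        }
        else if (e == 'u') {
            if (o + 6 > out_size) {
                return -1;
            }
            out[o++] = '\\';
            out[o++] = 'u';
            out[o++] = '0';
            out[o++] = '0';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0f];
        }
        else {
            if (o + 2 > out_size) {
                return -1;
            }
            out[o++] = '\\';
            out[o++] = (char) e;
        }
    }

    return (int) o;
}
```

Call `esc_table_init()` once before the first `json_escape()` call. The per-byte decision is a single indexed load, which is the kind of code an optimizing release build handles well on any architecture.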


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

edsiper, Oct 18 '24 01:10

@pwhelan @leonardo-albertovich @cosmo0920 @pwhelan @patrick-stephens I need your help on this for workflows and overall testing:

  • I have introduced a new CMake option called FLB_SIMD (default: off)
  • SIMD operations are supported for x86_64, amd64 (SSE2) and aarch64 (Neon)
  • SIMD is enabled at build time and the backend selected per compiler definitions
  • A fallback mechanism exists if SIMD is not available or if it is disabled (-DFLB_SIMD=Off).
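The build-time selection described in the list above could look roughly like the following sketch. It is not the header adapted from PostgreSQL: the `DEMO_SIMD` macro (standing in for whatever definition the `FLB_SIMD` CMake option produces), `block_needs_escape` and `first_escape_pos` are hypothetical names. The function returns the offset of the first byte that needs JSON escaping, checking 16 bytes at a time when a SIMD backend was selected and falling back to a scalar loop otherwise.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef DEMO_SIMD
#define DEMO_SIMD 1   /* hypothetical switch; build with -DDEMO_SIMD=0 to force the fallback */
#endif

#if DEMO_SIMD && (defined(__SSE2__) || defined(_M_X64))
#include <emmintrin.h>   /* SSE2 intrinsics */
#define DEMO_SIMD_BACKEND "sse2"
#define DEMO_VECTOR_WIDTH 16

/* Non-zero if any of the 16 bytes at p is below 0x20, a quote or a backslash. */
static int block_needs_escape(const unsigned char *p)
{
    const __m128i v = _mm_loadu_si128((const __m128i *) p);
    /* unsigned "v <= 0x1f" expressed as min(v, 0x1f) == v */
    const __m128i ctrl = _mm_cmpeq_epi8(_mm_min_epu8(v, _mm_set1_epi8(0x1f)), v);
    const __m128i quote = _mm_cmpeq_epi8(v, _mm_set1_epi8('"'));
    const __m128i bslash = _mm_cmpeq_epi8(v, _mm_set1_epi8('\\'));

    return _mm_movemask_epi8(_mm_or_si128(ctrl, _mm_or_si128(quote, bslash))) != 0;
}

#elif DEMO_SIMD && defined(__aarch64__)
#include <arm_neon.h>    /* Neon intrinsics */
#define DEMO_SIMD_BACKEND "neon"
#define DEMO_VECTOR_WIDTH 16

static int block_needs_escape(const unsigned char *p)
{
    const uint8x16_t v = vld1q_u8(p);
    const uint8x16_t ctrl = vcltq_u8(v, vdupq_n_u8(0x20));
    const uint8x16_t quote = vceqq_u8(v, vdupq_n_u8('"'));
    const uint8x16_t bslash = vceqq_u8(v, vdupq_n_u8('\\'));

    return vmaxvq_u8(vorrq_u8(ctrl, vorrq_u8(quote, bslash))) != 0;
}

#else
#define DEMO_SIMD_BACKEND "none"
#define DEMO_VECTOR_WIDTH 0
#endif

static int byte_needs_escape(unsigned char c)
{
    return c < 0x20 || c == '"' || c == '\\';
}

/* Offset of the first byte in s[0..len) that needs escaping, or len if none does. */
static size_t first_escape_pos(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t i = 0;

#if DEMO_VECTOR_WIDTH > 0
    /* Skip whole blocks that are clean; stop at the first block with a hit. */
    while (i + DEMO_VECTOR_WIDTH <= len && !block_needs_escape(p + i)) {
        i += DEMO_VECTOR_WIDTH;
    }
#endif

    /* Scalar path: the tail, the block that contained a hit, or everything when SIMD is off. */
    for (; i < len; i++) {
        if (byte_needs_escape(p[i])) {
            return i;
        }
    }

    return len;
}
```

Because the backend is chosen by the preprocessor, the same call site works on every target, and a build without SIMD simply compiles the scalar loop.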

so:

  • Is there a chance that a particular architecture doesn't allow SIMD operations even though they are supported?
  • Would it be possible to ship this enabled by default on certain builds, such as containers, without introducing any potential breaking change?

Comments are welcome.

edsiper, Oct 20 '24 00:10

Is there a chance that a particular architecture doesn't allow SIMD operations even though they are supported?

One candidate is the RISC-V vector extension ("RVV"): https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc

cosmo0920, Oct 21 '24 03:10

  • Would it be possible to ship this enabled by default on certain builds, such as containers, without introducing any potential breaking change?

For safety, we might need two images for the PC architecture:

  • x86_64-generic: built without SIMD support
  • x86_64-simd: built with SIMD support

PC architectures are fragmented, so if we force SIMD on in our container images, illegal-instruction errors might occur on very old instances/boxes that lack the required SIMD support. However, SSE2 has been supported since the Intel Pentium 4 and the AMD Athlon 64, so it has a history of about 20 years.
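One way to reason about the illegal-instruction concern is a runtime check, sketched below. This is not something the PR does (the PR selects the backend at build time); `cpu_has_sse2` and `escape_backend_for_cpu` are hypothetical helpers, and the check relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, which this sketch only uses on x86 targets.

```c
#include <stddef.h>

/*
 * Returns 1 if the running CPU reports SSE2, 0 if it does not,
 * and -1 if this sketch cannot tell on the current compiler/target.
 */
static int cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0 ? 1 : 0;
#else
    return -1;
#endif
}

/* Hypothetical helper: name the escaping path that is safe to run on this CPU. */
static const char *escape_backend_for_cpu(void)
{
    return cpu_has_sse2() == 1 ? "sse2" : "scalar";
}
```

A binary could call such a check at startup and log or refuse to start instead of crashing on an old CPU. Since SSE2 is part of the x86_64 baseline instruction set, the check is expected to succeed on any 64-bit x86 machine; it matters mainly for 32-bit x86 builds.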

cosmo0920, Oct 21 '24 08:10

Thanks for the feedback. Merging it for now, since the feature needs to be enabled at build time.

edsiper, Oct 25 '24 20:10