Mark Reed

Results 45 comments of Mark Reed

Current benchmark results vs. native f16:

```
dot_bf16_serial_1536d/min_time:10.000/threads:12    1372 ns
cos_bf16_serial_1536d/min_time:10.000/threads:12    1485 ns
l2sq_bf16_serial_1536d/min_time:10.000/threads:12   1393 ns
kl_bf16_serial_1536d/min_time:10.000/threads:12     3352 ns
js_bf16_serial_1536d/min_time:10.000/threads:12     5069 ns
dot_f16_serial_1536d/min_time:10.000/threads:12      264 ns
cos_f16_serial_1536d/min_time:10.000/threads:12      264 ns
l2sq_f16_serial_1536d/min_time:10.000/threads:12     264...
```

I put the bf16, f16, and f32 versions of dot_serial() in Godbolt. You can add and remove flags (avx2, avx512fp16, etc.) to see what's going on. Without flags the bf16 is longer. Is...

Note that avx512_bf16 only supports conversion between bf16 and f32, plus a dot product. So I believe our SIMD-accelerated functions will be converting bf16 to f32, and running...

I added the conversion function for compilers that don't support `__bf16`:

```
SIMSIMD_PUBLIC simsimd_f32_t simsimd_uncompress_bf16(unsigned short x) {
    unsigned int tmp = x [...]
```

[...] f32 conversion

```
dot_bf16_serial_1536d/min_time:10.000/threads:12    183 ns
cos_bf16_serial_1536d/min_time:10.000/threads:12    202...
```
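
For readers unfamiliar with the format, here is a minimal sketch of how a software bf16-to-f32 widening and a serial dot product on top of it can look. It is not the code from the PR: `bf16_to_f32`, `f32_to_bf16_truncate`, and `dot_bf16_serial` are hypothetical names, and the sketch only assumes that bf16 occupies the upper 16 bits of an IEEE-754 binary32 value.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not SimSIMD's implementation): widen a bf16 bit pattern to f32.
// bf16 is the upper half of a binary32, so a 16-bit left shift restores the value exactly.
inline float bf16_to_f32(std::uint16_t x) {
    std::uint32_t bits = static_cast<std::uint32_t>(x) << 16;
    float result;
    std::memcpy(&result, &bits, sizeof(result)); // bit copy avoids strict-aliasing issues
    return result;
}

// Hypothetical helper: narrow f32 to bf16 by dropping the low 16 bits (truncation, no rounding).
inline std::uint16_t f32_to_bf16_truncate(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return static_cast<std::uint16_t>(bits >> 16);
}

// Serial dot product: widen every element to f32 and accumulate in f32.
inline float dot_bf16_serial(const std::uint16_t* a, const std::uint16_t* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i != n; ++i)
        sum += bf16_to_f32(a[i]) * bf16_to_f32(b[i]);
    return sum;
}
```

Because the widening is a single shift, a compiler has a much easier time vectorizing the loop than when it has to emulate a native `__bf16` type in software.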

This is fixed by the following PR: https://github.com/unum-cloud/ucall/pull/104

I also added a fix for never-ending attempts to close already-closed connections:

```
if (is_corrupted())
    if (connection.stage != stage_t::waiting_to_close_k)
        return close_gracefully();
```
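
The effect of that guard can be shown in isolation. The sketch below is a simplified stand-in, not ucall's code: `connection_t`, `handler_t`, `active_k`, `corrupted`, `close_requests`, and `step()` are hypothetical, and only `stage_t::waiting_to_close_k`, `is_corrupted()`, and `close_gracefully()` mirror names from the snippet above.

```cpp
// Hypothetical, simplified types; the real connection and stage definitions differ.
enum class stage_t { active_k, waiting_to_close_k };

struct connection_t {
    stage_t stage = stage_t::active_k;
    bool corrupted = false;
    int close_requests = 0; // counts how often a close was actually issued
};

struct handler_t {
    connection_t& connection;

    bool is_corrupted() const { return connection.corrupted; }

    void close_gracefully() {
        connection.stage = stage_t::waiting_to_close_k;
        ++connection.close_requests;
    }

    // Called on every event-loop iteration; the stage check makes the close idempotent.
    void step() {
        if (is_corrupted() && connection.stage != stage_t::waiting_to_close_k)
            close_gracefully();
    }
};
```

Without the stage check, every later `step()` on a corrupted connection would request another close; with it, the request is issued once and subsequent calls do nothing.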

I'm looking for alternatives to the pointer tagging. I also had a hack disabling auto-formatting on save in VS Code, so I'll fix the formatting in my next commit. The...

Updated the formatting and moved the tagging to the lower 2 bits.
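
As an illustration of tagging in the lower 2 bits, here is a minimal sketch. It is not the code from the PR: `node_t`, `tag_mask_k`, `tag_pointer`, `untag_pointer`, and `extract_tag` are hypothetical names, and the sketch assumes the pointee is aligned to at least 4 bytes, so the two lowest address bits are always zero and free to carry a tag.

```cpp
#include <cstdint>

// Hypothetical payload type; alignas(4) guarantees the two lowest address bits are zero.
struct alignas(4) node_t {
    int value;
};

constexpr std::uintptr_t tag_mask_k = 0x3; // lower 2 bits hold the tag

// Pack a 2-bit tag into the unused low bits of the address.
inline std::uintptr_t tag_pointer(node_t* ptr, unsigned tag) {
    return reinterpret_cast<std::uintptr_t>(ptr) | (tag & tag_mask_k);
}

// Clear the tag bits to recover the original pointer.
inline node_t* untag_pointer(std::uintptr_t tagged) {
    return reinterpret_cast<node_t*>(tagged & ~tag_mask_k);
}

// Read back the 2-bit tag.
inline unsigned extract_tag(std::uintptr_t tagged) {
    return static_cast<unsigned>(tagged & tag_mask_k);
}
```

Low-bit tags rely only on alignment, whereas tags stored in the high bits depend on how many address bits the platform actually uses.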

Going forward, I'll follow the commit message rules and squash the commits on my branch before opening the PR.

@lborcard The split works for me. Can you provide details on your machine, such as the OS?