c-blosc2 icon indicating copy to clipboard operation
c-blosc2 copied to clipboard

inline function `swap_store` in `blosc2.h`

Open DimitriPapadopoulos opened this issue 2 years ago • 12 comments

Describe the bug

Why is the inline function swap_store part of the API?

It is mostly used for int32_t and int64_t. In that case, calls to malloc and memcpy are overkill. Just use:

uint32_t swap32(uint32_t x) {
  return ((x & 0x000000FF) << 24) | ((x & 0x0000FF00) <<  8) |
         ((x & 0x00FF0000) >>  8) | ((x & 0xFF000000) >> 24);
}

uint64_t swap64(uint64_t x) {
  return ((x & 0x00000000000000FF) << 56) | ((x & 0x000000000000FF00) << 40) | ((x & 0x0000000000FF0000) << 24) | ((x & 0x00000000FF000000) <<  8) | 
         ((x & 0x000000FF00000000) >>  8) | ((x & 0x0000FF0000000000) >> 24) | ((x & 0x00FF000000000000) >> 40) | ((x & 0xFF00000000000000) >> 56);
}

Actually, you should use compiler intrinsics if they exist:

2 bytes 4 bytes 8 bytes
MSVC _byteswap_ushort _byteswap_ulong _byteswap_uint64
GCC __builtin_bswap16 __builtin_bswap32 __builtin_bswap64
Clang __builtin_bswap16 __builtin_bswap32 __builtin_bswap64

To Reproduce

https://github.com/Blosc/c-blosc2/blob/65a1a3034a27004337d23d11f44e529114c65d7d/include/blosc2.h#L2121-L2127

Expected behavior

Remove from the API.

Rewrite using compiler intrinsics and fall back to the above piece of code.

Additional context

See xxHash implementation:

/*!
 * @internal
 * @fn xxh_u32 XXH_swap32(xxh_u32 x)
 * @brief A 32-bit byteswap.
 *
 * @param x The 32-bit integer to byteswap.
 * @return @p x, byteswapped.
 */
#if defined(_MSC_VER)     /* Visual Studio */
#  define XXH_swap32 _byteswap_ulong
#elif XXH_GCC_VERSION >= 403
#  define XXH_swap32 __builtin_bswap32
#else
static xxh_u32 XXH_swap32 (xxh_u32 x)
{
    return  ((x << 24) & 0xff000000 ) |
            ((x <<  8) & 0x00ff0000 ) |
            ((x >>  8) & 0x0000ff00 ) |
            ((x >> 24) & 0x000000ff );
}
#endif

DimitriPapadopoulos avatar May 18 '23 07:05 DimitriPapadopoulos

Actually, you already have such functions: https://github.com/Blosc/c-blosc2/blob/65a1a3034a27004337d23d11f44e529114c65d7d/blosc/blosc-private.h#L136-L150

DimitriPapadopoulos avatar May 18 '23 09:05 DimitriPapadopoulos

You even have a similar endian_handler function: https://github.com/Blosc/c-blosc2/blob/65a1a3034a27004337d23d11f44e529114c65d7d/blosc/blosc-private.h#L49-L87

DimitriPapadopoulos avatar May 18 '23 09:05 DimitriPapadopoulos

IIRC we made this in order to avoid replicating the code in codecs, which do not have access to the blosc2 internals. See e.g. https://github.com/Blosc/c-blosc2/commit/8359e6d06ad4c95f546251e78cab0df253936c09

FrancescAlted avatar May 18 '23 11:05 FrancescAlted

I see. How about moving that code to plugins/plugin_utils.h?

DimitriPapadopoulos avatar May 18 '23 12:05 DimitriPapadopoulos

I see. How about moving that code to plugins/plugin_utils.h?

+1

FrancescAlted avatar May 18 '23 12:05 FrancescAlted

Also, swap_store is used only on 32- and 64-bits integers:

$ grep -R swap_store
blosc/b2nd.c:    swap_store(pmeta, shape + i, sizeof(int64_t));
blosc/b2nd.c:    swap_store(pmeta, chunkshape + i, sizeof(int32_t));
blosc/b2nd.c:    swap_store(pmeta, blockshape + i, sizeof(int32_t));
blosc/b2nd.c:  swap_store(pmeta, &dtype_len, sizeof(int32_t));
plugins/codecs/zfp/blosc2-zfp.c:          swap_store(blockmeta + i, pmeta, sizeof(int32_t));
plugins/plugin_utils.c:    swap_store(shape + i, pmeta, sizeof(int64_t));
plugins/plugin_utils.c:    swap_store(chunkshape + i, pmeta, sizeof(int32_t));
plugins/plugin_utils.c:    swap_store(blockshape + i, pmeta, sizeof(int32_t));
include/b2nd.h:    swap_store(shape + i, pmeta, sizeof(int64_t));
include/b2nd.h:    swap_store(chunkshape + i, pmeta, sizeof(int32_t));
include/b2nd.h:    swap_store(blockshape + i, pmeta, sizeof(int32_t));
include/b2nd.h:    swap_store(&dtype_len, pmeta, sizeof(int32_t));
include/blosc2.h:static inline void swap_store(void *dest, const void *pa, int size) {
$ 

I plan on adding specialised functions for those types only (and maybe 16-bits integers just in case).

DimitriPapadopoulos avatar May 18 '23 12:05 DimitriPapadopoulos

Actually, (some) codecs do access blosc2 internals – at the very least they include headers from blosc.

For example, while compiling plugins/codecs/zfp/blosc2-zfp.c, the compiler complains about redefinitions of the new swap functions in plugins/plugin_utils.h, with the previous definitions in blosc/blosc-private.h. That's because plugins/codecs/zfp/blosc2-zfp.c includes private blosc header files: https://github.com/Blosc/c-blosc2/blob/65a1a3034a27004337d23d11f44e529114c65d7d/plugins/codecs/zfp/blosc2-zfp.c#L8-L14

What do you mean exactly by “codecs [...] do not have access to the blosc2 internals”?

DimitriPapadopoulos avatar May 18 '23 13:05 DimitriPapadopoulos

I don't remember exactly how things went, but probably we've made swap_store public without thinking too much. Having said that, recently we have introduced the concept of dynamic plugins, and I'd say that swap_store in blosc2.h can be good for them.

Now that I look into the code, a possibility is to move these functions to include/blosc2/blosc2-common.h which is distributed in binary (devel) packages, and that would be good for plugin developers?

FrancescAlted avatar May 18 '23 15:05 FrancescAlted

So, it would be part of the API after all.

I will move/merge everything from include/blosc2.h and blosc/blosc-private.h to include/blosc2/blosc2-common.h.

DimitriPapadopoulos avatar May 21 '23 16:05 DimitriPapadopoulos

These two functions are (almost) identical:

  • swap_store() from include/blosc2.h
  • endian_handler() from blosc/blosc-private.h

These functions are really strange, it feels like they were written as an example not to follow:

  • they were written as if they could swap any number of bytes,
  • yet they swap only 1, 2, 4 or 8 bytes,
  • the fail to optimise these few supported cases with compiler intrinsics,
  • they allocate memory without reason as far as I can see,
  • when they fail, they leave rubbish behind and just output the message Unhandled nitems.

I'll see if we can get rid of them, and stick to functions specialised for 2, 4 or 8 bytes.

DimitriPapadopoulos avatar May 21 '23 16:05 DimitriPapadopoulos

Any progress on this one?

FrancescAlted avatar Jun 07 '24 16:06 FrancescAlted

I can't recall why this issue is stalled. I guess I'm not entirely certain how to fix this. Ideally, these functions should be removed from the API.

DimitriPapadopoulos avatar Jun 10 '24 06:06 DimitriPapadopoulos