maint: Folly Replacement Plan

Open jjerphan opened this issue 1 year ago • 0 comments

Motivation

Folly is admittedly:

This makes ArcticDB hard to port, and hard to package as a shared library for many platforms on conda-forge.

Test disablement

Tests are disabled on Windows for Folly components that ArcticDB uses, including but not limited to:

  • F14Set
  • ConcurrentHashMap
  • ThreadPoolExecutors
  • Future
  • ThreadName
  • FBString
  • Hash
  • ThreadCachedInt
  • ThreadLocal

folly's dependency graph

mamba repoquery depends -t -c conda-forge folly
folly[2023.10.30.00]
  ├─ jemalloc[4.4.0]
  ├─ libboost-headers[1.82.0]
  ├─ libgcc-ng[13.2.0]
  │  ├─ _openmp_mutex[4.5]
  │  │  ├─ _libgcc_mutex[0.1]
  │  │  └─ llvm-openmp[17.0.4]
  │  │     ├─ libzlib[1.2.13]
  │  │     └─ zstd[1.5.5]
  │  │        ├─ libzlib already visited
  │  │        └─ libstdcxx-ng[13.2.0]
  │  └─ _libgcc_mutex already visited
  ├─ libzlib already visited
  ├─ zstd already visited
  ├─ libstdcxx-ng already visited
  ├─ bzip2[1.0.8]
  │  └─ libgcc-ng already visited
  ├─ gflags[2.2.2]
  │  ├─ libgcc-ng already visited
  │  └─ libstdcxx-ng already visited
  ├─ lz4-c[1.9.4]
  │  ├─ libgcc-ng already visited
  │  └─ libstdcxx-ng already visited
  ├─ glog[0.6.0]
  │  ├─ libgcc-ng already visited
  │  ├─ libstdcxx-ng already visited
  │  └─ gflags already visited
  ├─ fmt[9.1.0]
  │  ├─ libgcc-ng already visited
  │  └─ libstdcxx-ng already visited
  ├─ xz[5.2.6]
  │  └─ libgcc-ng already visited
  ├─ libboost[1.82.0]
  │  ├─ libgcc-ng already visited
  │  ├─ libzlib already visited
  │  ├─ zstd already visited
  │  ├─ libstdcxx-ng already visited
  │  ├─ bzip2 already visited
  │  ├─ xz already visited
  │  └─ icu[73.2]
  │     ├─ libgcc-ng already visited
  │     └─ libstdcxx-ng already visited
  ├─ snappy[1.1.10]
  │  ├─ libgcc-ng already visited
  │  └─ libstdcxx-ng already visited
  ├─ libsodium[1.0.18]
  │  └─ libgcc-ng already visited
  ├─ double-conversion[3.3.0]
  │  ├─ libgcc-ng already visited
  │  └─ libstdcxx-ng already visited
  ├─ libevent[2.1.10]
  │  ├─ libgcc-ng already visited
  │  └─ openssl[1.1.1w]
  │     ├─ libgcc-ng already visited
  │     └─ ca-certificates[2016.2.28]
  ├─ libaio[0.3.113]
  │  └─ libgcc-ng already visited
  ├─ openssl[3.1.4]
  │  ├─ libgcc-ng already visited
  │  └─ ca-certificates already visited
  └─ libjemalloc[5.3.0]
     ├─ libgcc-ng already visited
     └─ libstdcxx-ng already visited

WIP Removal plan

Some elements have been removed with:

  • https://github.com/man-group/ArcticDB/pull/1144
  • https://github.com/man-group/ArcticDB/pull/1370

The lists below are based on the usage on master as of commit 4184a467d9eee90600ddcbf34d896c763e76f78f.

Smaller utilities

  • [ ] folly/gen/Base (12 includes): We also might need to rewrite that IMO.
  • [ ] folly/Range (16 includes): Replace with elements of std::ranges (C++20, extended in C++23)? e.g. std::ranges::iota_view
  • [ ] folly/Poly (6 includes): Will be rewritten (part of the SOW)
  • [x] folly/ClockGettimeWrappers (1 include): Fixed by #1448
  • [ ] folly/hash/Hash (1 include): Candidate fix #1506
  • [x] folly/portability/PThread (1 include): Fixed by #1447
  • [x] folly/portability/Time (1 include): Fixed by #1448
  • [x] folly/system/ThreadId (1 include): Fixed by #1417
  • [x] folly/system/ThreadName (2 includes): Fixed by #1446
  • [x] folly/ThreadCachedInt (1 include): Fixed by #1486
  • [ ] folly/concurrency/ConcurrentHashMap (4 includes): Fixed by #1580

Task scheduling system

  • [ ] folly/executors/CPUThreadPoolExecutor (4 includes)
  • [ ] folly/executors/FutureExecutor (4 includes)
  • [ ] folly/executors/IOThreadPoolExecutor (4 includes)
  • [ ] folly/Function (9 includes)
  • [ ] folly/futures/Future (23 includes)
  • [ ] folly/futures/FutureSplitter (2 includes)

Potential alternatives:

  • taskflow: Async Parallel Task for Modern C++ (mentioned by Thorsten)
    • no dependencies
    • supports all OSes
    • supports all compilers
    • only requires C++17
    • similar abstractions to folly's and more
    • profiler included
    • only 30 GitHub issues
  • libunifex: Mentioned by Joël
    • same company as folly but different team
    • one of the implementations closest to std::execution, which is developed by the same team and is targeted for C++26
    • used in Facebook mobile apps
    • still considered a bit experimental
    • requires C++17 or later; coroutine support is available when using C++20 or later
  • Intel oneAPI Threading Building Blocks (oneTBB):
    • a seasoned reference runtime for task execution
    • from my limited experience, navigating Intel's stack and documentation is relatively complex (much of it is unclear because of legacy material and too many duplicated websites)
  • OpenMP:
    • a super-stable specification
    • implemented in all compilers
    • supports all OSes
    • the thread pools of OpenMP runtime implementations can clash with those of other runtimes (e.g. if both have been built using pthread)
    • probably not as flexible as folly async executors
  • nanothread:
    • no dependencies
    • supports all OSes
    • supports all compilers
    • even more minimalistic (fewer features)
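
As a baseline for comparing the options above, the core of what the executor and future headers provide (submit a task to a pool, get a handle to its result) can be sketched with the C++17 standard library alone. This is only an illustration under stated assumptions: `SimpleThreadPool` is a hypothetical name, it is not ArcticDB's scheduler, and it is not a drop-in replacement for folly's executors.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical class for illustration only: a fixed-size pool with a FIFO queue.
class SimpleThreadPool {
public:
    explicit SimpleThreadPool(std::size_t thread_count) {
        assert(thread_count > 0);  // a pool without workers would never run tasks
        for (std::size_t i = 0; i < thread_count; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    SimpleThreadPool(const SimpleThreadPool&) = delete;
    SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

    // Lets the workers drain the queued tasks, then joins them.
    ~SimpleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& worker : workers_) {
            worker.join();
        }
    }

    // Queue a nullary callable and return a std::future for its result.
    template <typename F>
    auto submit(F&& task) -> std::future<std::invoke_result_t<std::decay_t<F>&>> {
        using Result = std::invoke_result_t<std::decay_t<F>&>;
        // std::function needs a copyable target, so the move-only
        // packaged_task is held through a shared_ptr.
        auto packaged =
            std::make_shared<std::packaged_task<Result()>>(std::forward<F>(task));
        std::future<Result> result = packaged->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([packaged] { (*packaged)(); });
        }
        wake_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) {
                    return;  // only reached once stopping_ is set
                }
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // exceptions are captured by the packaged_task
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

A caller would write something like `auto f = pool.submit([] { return 6 * 7; });` and later `f.get()`. Note what the sketch leaves out: std::future has no continuations, so chaining in the style of folly's `thenValue`, future splitting, and separate CPU and IO pools would all have to come from one of the libraries listed above or be written by hand.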

jjerphan, Mar 11 '24 13:03