ArcticDB
maint: Folly Replacement Plan
Motivation
Folly is admittedly:
- API- and ABI-unstable between commits, and not designed to be packaged as a shared library
- not supported on Windows, and untested on that platform
- dependent on many other libraries
This makes ArcticDB hard to port, and hard to package as a shared library on conda-forge for many platforms.
Test disablement
Folly's tests are disabled on Windows for components that ArcticDB uses, including but not limited to:
- F14Set
- ConcurrentHashMap
- ThreadPoolExecutors
- Future
- ThreadName
- FBString
- Hash
- ThreadCachedInt
- ThreadLocal
folly's dependency graph
mamba repoquery depends -t -c conda-forge folly
folly[2023.10.30.00]
├─ jemalloc[4.4.0]
├─ libboost-headers[1.82.0]
├─ libgcc-ng[13.2.0]
│  ├─ _openmp_mutex[4.5]
│  │  ├─ _libgcc_mutex[0.1]
│  │  └─ llvm-openmp[17.0.4]
│  │     ├─ libzlib[1.2.13]
│  │     └─ zstd[1.5.5]
│  │        ├─ libzlib already visited
│  │        └─ libstdcxx-ng[13.2.0]
│  └─ _libgcc_mutex already visited
├─ libzlib already visited
├─ zstd already visited
├─ libstdcxx-ng already visited
├─ bzip2[1.0.8]
│  └─ libgcc-ng already visited
├─ gflags[2.2.2]
│  ├─ libgcc-ng already visited
│  └─ libstdcxx-ng already visited
├─ lz4-c[1.9.4]
│  ├─ libgcc-ng already visited
│  └─ libstdcxx-ng already visited
├─ glog[0.6.0]
│  ├─ libgcc-ng already visited
│  ├─ libstdcxx-ng already visited
│  └─ gflags already visited
├─ fmt[9.1.0]
│  ├─ libgcc-ng already visited
│  └─ libstdcxx-ng already visited
├─ xz[5.2.6]
│  └─ libgcc-ng already visited
├─ libboost[1.82.0]
│  ├─ libgcc-ng already visited
│  ├─ libzlib already visited
│  ├─ zstd already visited
│  ├─ libstdcxx-ng already visited
│  ├─ bzip2 already visited
│  ├─ xz already visited
│  └─ icu[73.2]
│     ├─ libgcc-ng already visited
│     └─ libstdcxx-ng already visited
├─ snappy[1.1.10]
│  ├─ libgcc-ng already visited
│  └─ libstdcxx-ng already visited
├─ libsodium[1.0.18]
│  └─ libgcc-ng already visited
├─ double-conversion[3.3.0]
│  ├─ libgcc-ng already visited
│  └─ libstdcxx-ng already visited
├─ libevent[2.1.10]
│  ├─ libgcc-ng already visited
│  └─ openssl[1.1.1w]
│     ├─ libgcc-ng already visited
│     └─ ca-certificates[2016.2.28]
├─ libaio[0.3.113]
│  └─ libgcc-ng already visited
├─ openssl[3.1.4]
│  ├─ libgcc-ng already visited
│  └─ ca-certificates already visited
└─ libjemalloc[5.3.0]
   ├─ libgcc-ng already visited
   └─ libstdcxx-ng already visited
WIP Removal plan
Some elements have been removed with:
- https://github.com/man-group/ArcticDB/pull/1144
- https://github.com/man-group/ArcticDB/pull/1370
The lists below are based on the current usage on master as of 4184a467d9eee90600ddcbf34d896c763e76f78f.
Smaller utilities
- [ ] `folly/gen/Base` (12 includes): We also might need to rewrite that IMO.
- [ ] `folly/Range` (16 includes): Elements of `std::ranges` as of C++23? e.g. `std::ranges::iota_view`
- [ ] `folly/Poly` (6 includes): Will be rewritten (part of the SOW)
- [x] `folly/ClockGettimeWrappers` (1 include): Fixed by #1448
- [ ] `folly/hash/Hash` (1 include): Candidate fix #1506
- [x] `folly/portability/PThread` (1 include): Fixed by #1447
- [x] `folly/portability/Time` (1 include): Fixed by #1448
- [x] `folly/system/ThreadId` (1 include): Fixed by #1417
- [x] `folly/system/ThreadName` (2 includes): Fixed by #1446
- [x] `folly/ThreadCachedInt` (1 include): Fixed by #1486
- [ ] `folly/concurrency/ConcurrentHashMap` (4 includes): Fixed by #1580
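For small utilities such as the `folly/hash/Hash` item above, a standard-library-only replacement can be quite short. The following is a minimal sketch, not the content of #1506: it uses boost-style seed mixing over `std::hash`, and the names `hash_combine`, `hash_all`, `StreamKey` and `StreamKeyHash` are hypothetical, chosen only for illustration. It assumes C++17 (fold expressions).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: fold the hash of `value` into `seed` (boost-style mixing).
template <class T>
void hash_combine(std::size_t& seed, const T& value) {
    seed ^= std::hash<T>{}(value) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
}

// Hypothetical helper: hash several values into one std::size_t.
// The result depends on argument order because each value is mixed into the running seed.
template <class... Ts>
std::size_t hash_all(const Ts&... values) {
    std::size_t seed = 0;
    (hash_combine(seed, values), ...);  // C++17 fold expression, evaluated left to right
    return seed;
}

// Hypothetical composite key, only here to show how the helper would be used.
struct StreamKey {
    std::string symbol;
    int version;
};

struct StreamKeyHash {
    std::size_t operator()(const StreamKey& k) const {
        return hash_all(k.symbol, k.version);
    }
};

// Usage sketch: keys with equal fields must produce equal hashes.
inline void check_stream_key_hash() {
    StreamKeyHash h{};
    assert(h(StreamKey{"sym", 1}) == h(StreamKey{"sym", 1}));
}
```

The mixing quality of this scheme is modest; it is shown because it has no dependency beyond the standard library, which is the point of the removal plan.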
Task scheduling system
- [ ] `folly/executors/CPUThreadPoolExecutor` (4 includes)
- [ ] `folly/executors/FutureExecutor` (4 includes)
- [ ] `folly/executors/IOThreadPoolExecutor` (4 includes)
- [ ] `folly/Function` (9 includes)
- [ ] `folly/futures/Future` (23 includes)
- [ ] `folly/futures/FutureSplitter` (2 includes)
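To give a sense of the surface that any replacement has to cover, here is a minimal sketch of the executor-plus-future pattern using only the C++17 standard library. It is not ArcticDB's implementation and not a proposed replacement: `SimpleThreadPool` and `sum_of_squares` are hypothetical names, and the sketch deliberately leaves out things the folly types offer, such as continuation chaining on futures and separate CPU and IO pools.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: tasks go into a queue, results come back as std::future.
class SimpleThreadPool {
public:
    explicit SimpleThreadPool(std::size_t n_threads) {
        assert(n_threads > 0);
        for (std::size_t i = 0; i < n_threads; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    // Drains the remaining tasks, then joins every worker.
    ~SimpleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    SimpleThreadPool(const SimpleThreadPool&) = delete;
    SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

    // Enqueue a nullary callable and return a future for its result.
    template <class F>
    auto submit(F&& f) -> std::future<std::invoke_result_t<std::decay_t<F>>> {
        using R = std::invoke_result_t<std::decay_t<F>>;
        // std::function requires a copyable target, so the move-only
        // packaged_task is held through a shared_ptr.
        auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping was requested and the queue is drained
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage sketch: compute 0^2 + 1^2 + ... + (n-1)^2 as n independent tasks.
inline int sum_of_squares(int n) {
    SimpleThreadPool pool(4);
    std::vector<std::future<int>> futures;
    for (int i = 0; i < n; ++i) {
        futures.push_back(pool.submit([i] { return i * i; }));
    }
    int total = 0;
    for (auto& f : futures) total += f.get();
    return total;
}
```

Exceptions thrown inside a task are captured by `std::packaged_task` and rethrown from `get()`. What the sketch cannot express is composition of futures without blocking, which is where the libraries listed as alternatives differ most from each other.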
Potential alternatives:
- `taskflow`: Async Parallel Task for Modern C++ (mentioned by Thorsten)
  - no dependencies
  - supports all OSes
  - supports all compilers
  - only requires `C++17`
  - similar abstractions to folly's and more
  - profiler included
  - only 30 GitHub issues
- `libunifex`: Mentioned by Joël
  - same company as `folly` but different team
  - one of the implementations closest to `std::execution`, which is developed by the same team and is targeted for `C++26`
  - used in Facebook mobile apps
  - still considered a bit experimental
  - requires `C++17` or later, C++ coroutines support if using `C++20` or later
- Intel oneAPI Threading Building Blocks (`oneTBB`):
  - reference, seasoned runtime for task execution
  - from my limited experience, navigating Intel's stack and documentation is relatively complex (much is unclear because of legacy material and too many duplicated websites)
- OpenMP:
  - a super-stable specification
  - implemented in all compilers
  - supports all OSes
  - thread pools of different runtime implementations can clash with each other (e.g. if both have been built using `pthread`)
  - probably not as flexible as folly async executors
- `nanothread`:
  - no dependencies
  - supports all OSes
  - supports all compilers
  - even more minimalistic (fewer features)