
Refactor Bulk Insert to Use Memory<object> for Enhanced Flexibility and Efficiency

Open xwwwx opened this issue 8 months ago • 4 comments

🔧 Summary This PR refactors the ClickHouse bulk insert mechanism to leverage Memory<object> and IMemoryOwner<Memory<object>>, replacing the previous object[][]-based implementation.

🧾 Key Changes

  • Batch.Rows

    • Changed from object[][] to IMemoryOwner<Memory<object>>

    • Updated Dispose() to properly release owned memory

  • WriteToServerAsync

    • Added support for IEnumerable<Memory<object>>

    • Existing overloads (object[], DataTable, etc.) now internally convert to Memory<object>

  • IntoBatches

    • Refactored to use MemoryPool<T> instead of ArrayPool<T>

    • Returns IMemoryOwner<T> for improved memory safety

  • Row Serializers (IRowSerializer, RowBinarySerializer, RowBinaryWithDefaultsSerializer)

    • Refactored to use Span<object> instead of object[]

    • Enables zero-allocation serialization where possible
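The shape of the refactor described above can be sketched roughly as follows. This is an illustrative sketch only, not the PR's actual code; member names beyond `Rows`, `Dispose`, and `Serialize` are assumptions.

```csharp
using System;
using System.Buffers;

// Sketch of the reshaped Batch type: Rows is now an owned buffer of
// row slices instead of object[][], and Dispose returns it to the pool.
internal sealed class Batch : IDisposable
{
    public IMemoryOwner<Memory<object>> Rows { get; }
    public int Size { get; }

    public Batch(IMemoryOwner<Memory<object>> rows, int size)
    {
        Rows = rows;
        Size = size;
    }

    // Releases the pooled memory back to its owner.
    public void Dispose() => Rows.Dispose();
}

// Row serializers now accept a Span<object> per row instead of object[],
// so callers can pass slices of a pooled buffer without copying.
internal interface IRowSerializer
{
    void Serialize(Span<object> row /* , writer, column types, ... */);
}
```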

Benefits

  • Improved Flexibility: allows upstream systems to pass reusable or pooled Memory<object> buffers directly.

  • Better Memory Management: controlled lifetime via IMemoryOwner<T>, reduced GC pressure, and support for memory pooling.

  • Zero-Allocation Serialization: using Span<object> avoids intermediate heap allocations when writing rows.

  • Backward Compatibility Maintained: existing interfaces remain usable; new overloads provide better performance paths.

📌 Motivation This refactor lays the foundation for more scalable, high-performance bulk insert scenarios. It will enable cleaner integration with upstream systems that use buffer pooling or stream processing architectures.

xwwwx avatar Apr 22 '25 05:04 xwwwx

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 82.56%. Comparing base (0d4b060) to head (722904a).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #628      +/-   ##
==========================================
+ Coverage   82.18%   82.56%   +0.38%     
==========================================
  Files         103      103              
  Lines        2245     2243       -2     
  Branches      340      339       -1     
==========================================
+ Hits         1845     1852       +7     
+ Misses        275      264      -11     
- Partials      125      127       +2     


codecov[bot] avatar Apr 22 '25 05:04 codecov[bot]

Hi,

Thank you for your PR, I appreciate the intent to help with optimization. However, according to the benchmarks, it doesn't improve performance and in fact makes it slightly worse:

With ArrayPool:

| Method          | Count  | Mean     | Error    | StdDev   | Gen0     | Gen1     | Gen2     | Allocated |
|-----------------|--------|----------|----------|----------|----------|----------|----------|-----------|
| BulkInsertInt32 | 100000 | 32.02 ms | 0.503 ms | 0.618 ms | 812.5000 | 812.5000 | 812.5000 | 8.14 MB   |

With MemoryPool:

| Method          | Count  | Mean     | Error    | StdDev   | Gen0     | Gen1     | Gen2     | Allocated |
|-----------------|--------|----------|----------|----------|----------|----------|----------|-----------|
| BulkInsertInt32 | 100000 | 33.02 ms | 0.636 ms | 1.285 ms | 906.2500 | 906.2500 | 906.2500 | 8.27 MB   |

The primary cause for that seems to be that MemoryPool is actually backed by ArrayPool internally: https://stackoverflow.com/a/61859719/1732138
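To illustrate the point from the linked answer: the shared `MemoryPool<T>` delegates to `ArrayPool<T>` internally, but each `Rent` additionally allocates an `IMemoryOwner<T>` wrapper object, so it cannot be cheaper than using `ArrayPool<T>` directly. A minimal contrast of the two rent paths:

```csharp
using System;
using System.Buffers;

class PoolComparison
{
    static void Main()
    {
        // ArrayPool: hands back a raw array, no per-rent wrapper allocation.
        object[] array = ArrayPool<object>.Shared.Rent(1024);
        ArrayPool<object>.Shared.Return(array);

        // MemoryPool: same ArrayPool backing, plus a per-rent
        // IMemoryOwner<object> wrapper that itself is a heap allocation.
        using (IMemoryOwner<object> owner = MemoryPool<object>.Shared.Rent(1024))
        {
            Memory<object> memory = owner.Memory;
        }
    }
}
```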

But a good attempt nevertheless.

DarkWanderer avatar Apr 22 '25 18:04 DarkWanderer

Thank you for the feedback!

The main motivation behind this change is not to improve performance per se, but to improve flexibility for upstream callers.

In our use case, we rent a pooled object[] and slice it into variable-sized segments for bulk insert. Since the segment size isn't fixed, using Memory<object> lets us represent those slices safely without allocating new arrays, which helps us reuse memory buffers efficiently across multiple insert operations.
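A hedged sketch of the caller-side pattern just described (sizes and names are invented for illustration): rent one large `object[]` and hand out variable-width row slices as `Memory<object>` without per-row array allocations.

```csharp
using System;
using System.Buffers;

class SlicingExample
{
    static void Main()
    {
        // One large pooled buffer shared across many rows.
        object[] buffer = ArrayPool<object>.Shared.Rent(10_000);
        try
        {
            int offset = 0;
            foreach (int rowWidth in new[] { 3, 5, 4 }) // variable segment sizes
            {
                // A zero-copy view into the pooled buffer; no new array
                // is allocated per row.
                Memory<object> row = buffer.AsMemory(offset, rowWidth);
                offset += rowWidth;
                // Each row slice could then be fed to an overload such as
                // WriteToServerAsync(IEnumerable<Memory<object>>, ...).
            }
        }
        finally
        {
            ArrayPool<object>.Shared.Return(buffer);
        }
    }
}
```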

I agree that Memory<object> may introduce some overhead compared to object[][], and it's definitely not zero-allocation. However, for us the tradeoff between allocation and reusability/composability is worthwhile.

That said, I’m open to suggestions on how we can better support this kind of usage while minimizing the impact on performance and the library’s core design.

xwwwx avatar Apr 23 '25 03:04 xwwwx

I've reverted back to ArrayPool as suggested. Please help check if this improves the performance. 🙇‍♂️

xwwwx avatar Apr 23 '25 03:04 xwwwx