Yijie Shen

29 comments by Yijie Shen

@houqp Do you think datafusion-contrib is the right place to host this HDFS Rust repo, or should I create it under my own account?

@zijie0 let's collaborate here: https://github.com/datafusion-contrib/hdfs-native

Hi @mingruimingrui, I've run into a similar problem: a customized HDFS version like yours. To make matters worse, we also use HDFS with federation that isn't supported by...

Yes, that is true for ClickHouse. For now, our hosted ClickHouse cluster can only use a single HDFS NameNode; it lacks the capability to use federated HDFS.

I find the original arrow-rs has two methods for array size estimation, one for buffer size and the other for total physical memory consumption: ```rust /// Returns the total number...

More background on this: For this specific test in DataFusion: https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_plan/common.rs#L292-L310 If we use `get_buffer_memory_size`, the total size comes to 128 bytes, 64 bytes for each float array. And...
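
One plausible reading of the 64-bytes-per-array figure above, sketched in plain Rust without the arrow crate: if a buffer's capacity is rounded up to a multiple of 64 bytes, even a tiny float array reports 64 bytes. The helper names and the three-element array lengths below are assumptions made for illustration; they are not taken from the linked test or from the arrow-rs API.

```rust
/// Rounds `bytes` up to the next multiple of 64.
/// Hypothetical helper; it only mimics 64-byte capacity padding.
fn round_up_to_64(bytes: usize) -> usize {
    (bytes + 63) / 64 * 64
}

/// Estimated buffer footprint of a primitive array holding `len` values
/// of `width` bytes each, under the padding assumption above.
fn padded_buffer_size(len: usize, width: usize) -> usize {
    round_up_to_64(len * width)
}

fn main() {
    // Assumed shapes: one three-element f32 array and one three-element f64 array.
    let f32_size = padded_buffer_size(3, std::mem::size_of::<f32>());
    let f64_size = padded_buffer_size(3, std::mem::size_of::<f64>());
    println!(
        "f32: {} bytes, f64: {} bytes, total: {} bytes",
        f32_size,
        f64_size,
        f32_size + f64_size
    );
}
```

Under these assumptions the raw payloads are 12 and 24 bytes, yet each array is counted as 64 bytes, which would account for a total of 128.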

Thanks for the clarification. I agree that for large record batches, the implementation variance is negligible. But users could still create tiny record batches with likely...

Thanks @nevi-me for trying Blaze out. As a first step, I've created a fix that reports more meaningful info on rename mismatches. However, I expect more follow-up work will be needed before we are...

I've merged the log info first; let me know if you have a chance for further exploration.