Raghavendra M Dani

Results: 15 issues by Raghavendra M Dani

This commit does two things: 1. Fixes the BlockWritePathProvider implementation to be compatible with the latest Ray version, where that interface is deprecated. Ray Dataset code path only works...

This feature allows reading the DeltaCAT catalog (Iceberg, internal catalog, etc.) into a Daft distributed dataframe. It also provides hints regarding data layout to the Daft reader so that the reads...

enhancement
P1
iceberg

For rebase and backfill scenarios, we correctly calculate input stats. However, for incremental scenarios this represents only the incremental delta stats. Because of this, we have to hack around our...

Currently, we limit deltas in a compaction round based only on the total object store memory available in a cluster. When there is a very large delta that contains many manifest...

P3

Currently, the _execute_compaction_round method takes more than 220GB, while the rest of the nodes take up 70GB. The size always increases as it continues to the next step. This becomes a problem when...

P1

**Is your feature request related to a problem? Please describe.** Currently, S3 credentials specified in S3Config are static. It is possible that the data frame can actually perform reads long...

p1

**Is your feature request related to a problem? Please describe.** No, this feature allows us to write data using Daft to our internal catalog. This feature is similar to what...

p1

**Is your feature request related to a problem? Please describe.** Today, we do not write bloom filter metadata for each column in Parquet files, which makes reads inefficient. **Describe...

### What happened + What you expected to happen We run Ray jobs in production. Right after upgrading Ray from version 2.3.0 to 2.20.0, we saw a significant increase in...

bug
triage
performance
core-autoscaler
core

This REP introduces EC2 fleet support in the Ray autoscaler for the AWS provider.