embucket-labs
Concurrent INSERT operations result in data loss despite successful execution
Problem
Concurrent INSERT operations on the same table complete successfully but only the last operation's data persists to the table, resulting in silent data loss.
Reproduction
async fn test_concurrent_query_execution() {
    let session = create_df_session().await;

    // Create a test table
    let num_operations = 5;

    // Execute multiple parallel INSERT operations
    let mut handles = vec![];
    for i in 0..num_operations {
        let session_clone = session.clone();
        let handle = tokio::spawn(async move {
            let insert_query = format!("INSERT INTO concurrent_test VALUES ({})", i + 1);
            let mut query = session_clone.query(&insert_query, QueryContext::default());
            // The task returns the result of executing the INSERT
            query.execute().await
        });
        handles.push(handle);
    }

    // SELECT COUNT(*) FROM concurrent_test
}
- Metastore's update_table seems to fail to persist concurrent changes. Adding locks (per-key locking in SlateDBMetastore::update_table()) fixes the behaviour.
- The issue might be at a different level and require a metastore interface change (select_for_update?)
- Silent data loss
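The per-key locking idea from the first bullet can be sketched roughly as follows. This is a minimal illustration, not the actual SlateDBMetastore code: ToyMetastore, lock_for, read, write and run_concurrent_inserts are hypothetical names, the store is an in-memory map, and std threads stand in for tokio tasks so the example has no dependencies. The assumption is that update_table is a read-modify-write, so two unsynchronized callers can both read the same old state and the later write discards the earlier one.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a metastore: maps a table name to its committed rows.
#[derive(Default)]
struct ToyMetastore {
    tables: Mutex<HashMap<String, Vec<i64>>>,
    /// One lock per table key, created lazily on first use.
    key_locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl ToyMetastore {
    /// Returns the lock dedicated to `key`, creating it if needed.
    fn lock_for(&self, key: &str) -> Arc<Mutex<()>> {
        let mut locks = self.key_locks.lock().unwrap();
        locks.entry(key.to_string()).or_default().clone()
    }

    /// Reads the current state of a table (empty if it does not exist yet).
    fn read(&self, key: &str) -> Vec<i64> {
        self.tables.lock().unwrap().get(key).cloned().unwrap_or_default()
    }

    /// Overwrites the stored state of a table.
    fn write(&self, key: &str, rows: Vec<i64>) {
        self.tables.lock().unwrap().insert(key.to_string(), rows);
    }

    /// Read-modify-write, serialized per table by the per-key lock.
    /// Without `_guard`, two callers could read the same old state
    /// and the second write would silently drop the first one's row.
    fn update_table(&self, key: &str, new_row: i64) {
        let lock = self.lock_for(key);
        let _guard = lock.lock().unwrap();
        let mut rows = self.read(key);
        rows.push(new_row);
        self.write(key, rows);
    }
}

/// Mirrors the reproduction: N parallel inserts, then count the rows.
fn run_concurrent_inserts(num_operations: i64) -> usize {
    let store = Arc::new(ToyMetastore::default());
    let handles: Vec<_> = (0..num_operations)
        .map(|i| {
            let store = Arc::clone(&store);
            thread::spawn(move || store.update_table("concurrent_test", i + 1))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    store.read("concurrent_test").len()
}

fn main() {
    let count = run_concurrent_inserts(5);
    println!("rows after concurrent inserts: {}", count);
}
```

Updates to different tables do not block each other in this sketch, since each key has its own lock. An in-process lock only helps while a single process owns the metastore, which may be one reason the second bullet raises an interface-level change instead.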