
`compact_files` raises `OSError: LanceError(IO): Execution error: Row ids did not arrive in sorted order: integers are ordered up to the 0th element`

Open tonyf opened this issue 5 months ago • 9 comments

```
In [57]: dataset.optimize.compact_files(max_bytes_per_file=1024*1024*256, batch_size=1024, num_threads=100)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[57], line 1
----> 1 dataset.optimize.compact_files(max_bytes_per_file=1024*1024*256, batch_size=1024, num_threads=100)

File ~/anaconda3/lib/python3.10/site-packages/lance/dataset.py:2624, in DatasetOptimizer.compact_files(self, target_rows_per_fragment, max_rows_per_group, max_bytes_per_file, materialize_deletions, materialize_deletions_threshold, num_threads, batch_size)
   2562 """Compacts small files in the dataset, reducing total number of files.
   2563
   2564 This does a few things:
   (...)
   2613 lance.optimize.Compaction
   2614 """
   2615 opts = dict(
   2616     target_rows_per_fragment=target_rows_per_fragment,
   2617     max_rows_per_group=max_rows_per_group,
   (...)
   2622     batch_size=batch_size,
   2623 )
-> 2624 return Compaction.execute(self._dataset, opts)

OSError: LanceError(IO): Execution error: Row ids did not arrive in sorted order: integers are ordered up to the 0th element, /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/task/poll.rs:288:44
```
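
For readers unfamiliar with the wording of the message: it reads like the result of a sortedness check on a sequence of row ids that failed right at the start of the sequence. The snippet below is only a pure-Python illustration of that kind of check, not Lance's actual (Rust) implementation; the function name `first_unsorted_index` and the sample ids are hypothetical.

```python
from typing import Optional, Sequence


def first_unsorted_index(row_ids: Sequence[int]) -> Optional[int]:
    """Return the index of the first element that is smaller than its
    predecessor, or None if the sequence is in non-decreasing order.

    Hypothetical helper for illustration only; it does not reproduce
    Lance's internal check.
    """
    for i in range(1, len(row_ids)):
        if row_ids[i] < row_ids[i - 1]:
            return i
    return None


# Made-up ids: the order breaks at index 1, so only the element at
# index 0 forms a sorted prefix.
broken_at = first_unsorted_index([5, 3, 4])

# A fully sorted sequence has no break point.
no_break = first_unsorted_index([1, 2, 3])
```

Under this reading, "ordered up to the 0th element" would correspond to the order breaking immediately after the first id, though that interpretation is an assumption.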

Let me know what other debug information I can provide.

tonyf, Aug 29 '24 14:08