Results: 11 issues by tony

Do you have any examples of how to use PBR materials in plotoptix? I'm working on transitioning rendering of a trimesh scene from pyrender to plotoptix and trying to figure out...
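
Not part of the original question, just a minimal sketch of one way to wire a trimesh mesh into plotoptix with a predefined material, assuming the `setup_material`/`set_mesh` API and the `m_plastic` preset; the file path and material choice are placeholders.

```python
# Minimal sketch (assumptions: plotoptix's setup_material/set_mesh API and the
# m_plastic preset; the mesh path is a placeholder).
import trimesh
from plotoptix import TkOptiX
from plotoptix.materials import m_plastic

mesh = trimesh.load("scene.glb", force="mesh")  # placeholder path

rt = TkOptiX()
rt.setup_material("plastic", m_plastic)   # register a predefined material
rt.set_mesh(
    "model",
    pos=mesh.vertices,
    faces=mesh.faces,
    mat="plastic",                        # reference the registered material by name
)
rt.start()
```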

## Why are these changes needed? When processing large datasets, ray data doesn't support any sort of resumption scheme. This is an experimental progress tracker for reading & writing `webdataset`...

triage
stale
data
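
A rough sketch of the resumption idea the issue above describes, not the actual PR: track finished shards in a local JSON file and skip them on restart. `ray.data.read_webdataset` is the only real API used; the shard list, progress file, and the processing step are placeholders.

```python
# Rough sketch of shard-level resumption (not the PR's implementation).
# Assumptions: shards can be processed independently; the progress file path
# and shard URLs are placeholders.
import json
import os

import ray

PROGRESS_FILE = "webdataset_progress.json"

def load_done() -> set:
    if os.path.exists(PROGRESS_FILE):
        with open(PROGRESS_FILE) as f:
            return set(json.load(f))
    return set()

def mark_done(shard: str) -> None:
    done = load_done() | {shard}
    with open(PROGRESS_FILE, "w") as f:
        json.dump(sorted(done), f)

shards = [f"s3://bucket/data/shard-{i:05d}.tar" for i in range(1000)]  # placeholder

for shard in shards:
    if shard in load_done():
        continue  # already processed on a previous run
    ds = ray.data.read_webdataset(shard)
    # ... transform / write `ds` here (placeholder) ...
    mark_done(shard)  # record progress only after the shard fully succeeds
```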

Currently, `LanceFragmentWriter` supports a `storage_options` kwarg to support specifying s3 credentials/endpoints. However, `LanceCommitter` does not have the same option and raises `TypeError: object.__init__() takes exactly one argument (the instance to...

good first issue
python
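
For illustration only, the asymmetry the issue above describes written out as code. The import path and constructor arguments are assumptions rather than the exact API, and the credentials/URI are placeholders.

```python
# Illustration of the reported asymmetry (import path and constructor arguments
# are assumptions; credentials and URI are placeholders).
from lance.ray.sink import LanceFragmentWriter, LanceCommitter

uri = "s3://bucket/table.lance"
storage_options = {
    "access_key_id": "...",
    "secret_access_key": "...",
    "endpoint": "http://minio:9000",
}

# Accepted: the fragment writer takes storage_options.
writer = LanceFragmentWriter(uri, storage_options=storage_options)

# Rejected: the committer has no storage_options kwarg and raises
# TypeError: object.__init__() takes exactly one argument (the instance to initialize)
committer = LanceCommitter(uri, storage_options=storage_options)
```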

I have a remote dataset stored on s3. Without scalar indices, using the scanner API with a filter works fine. However, once a scalar index is added, I get a...
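
A repro-style sketch of the setup described above, assuming pylance's `create_scalar_index` and scanner `filter` APIs; the S3 URI and column name are placeholders.

```python
# Repro-style sketch (the S3 URI and column name are placeholders).
import lance

uri = "s3://bucket/table.lance"
ds = lance.dataset(uri)

# Works before any scalar index exists:
tbl = ds.scanner(filter="split = 'test'").to_table()

# Add a scalar index on the filtered column...
ds.create_scalar_index("split", index_type="BTREE")

# ...then the same filtered scan against the remote dataset reportedly errors.
ds = lance.dataset(uri)
tbl = ds.scanner(filter="split = 'test'").to_table()
```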

The default batch size for Lance may cause OOM, depending on row size and available system memory

enhancement
rust
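
A hedged sketch of the usual workaround for the issue above, assuming `to_batches` accepts a `batch_size` argument; the path, column, and batch size are placeholders chosen with wide rows (e.g. image blobs) in mind.

```python
# Sketch of the workaround (path, column, and batch size are placeholders):
# for wide rows, request smaller batches so peak memory stays bounded.
import lance

ds = lance.dataset("/data/table.lance")

total_rows = 0
for batch in ds.to_batches(batch_size=64, columns=["image"]):
    total_rows += batch.num_rows  # placeholder for real downstream work
```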

Running into s3 rate limits when trying to clean up a very large dataset with `dataset.cleanup_old_versions`. Can't seem to control this via `LANCE_IO_THREADS`

bug
rust
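
The call in question, sketched with placeholder values; `LANCE_IO_THREADS` is set only to show the knob that reportedly has no effect on cleanup concurrency.

```python
# Sketch with placeholder values; LANCE_IO_THREADS is set before importing lance
# only to show the knob that reportedly does not throttle cleanup requests.
import os
from datetime import timedelta

os.environ["LANCE_IO_THREADS"] = "8"

import lance

ds = lance.dataset("s3://bucket/very-large-table.lance")
stats = ds.cleanup_old_versions(older_than=timedelta(days=14))
print(stats)
```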

```
In [57]: dataset.optimize.compact_files(max_bytes_per_file=1024*1024*256, batch_size=1024, num_threads=100)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[57], line 1
----> 1 dataset.optimize.compact_files(max_bytes_per_file=1024*1024*256, batch_size=1024, num_threads=100)

File ~/anaconda3/lib/python3.10/site-packages/lance/dataset.py:2624, in DatasetOptimizer.compact_files(self, target_rows_per_fragment, max_rows_per_group, max_bytes_per_file, materialize_deletions,...
```

Looks like filter is broken? Repro:

```
import lance

ds = lance.dataset(path)
ds.filter("split = 'test'")

>>> Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/_dataset.pyx", line 796,...
```

It looks like `to_batches` isn't respecting the filter kwarg. Repro:

```
import lance

ds = lance.dataset(path)
fragments = ds.get_fragments()
for batch in fragments[0].to_batches(
    batch_size=1,
    filter="split == 'test'",
    columns=["image", "split"],
    with_row_id=True,...
```

bug
good first issue

Problem: the fragment sampler downloads fragments in a blocking fashion. This is a bottleneck if your batch size is larger than the number of rows in a fragment. Solution: read-ahead with...
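
Not the proposed patch, just a generic sketch of the read-ahead idea: keep a few fragment downloads in flight on a thread pool while the consumer drains the current one. `download_fragment` is a hypothetical stand-in for whatever does the blocking fetch today.

```python
# Generic read-ahead sketch (not the proposed patch): keep `prefetch` fragment
# downloads in flight on a thread pool while the consumer drains the current one.
# `download_fragment` is a hypothetical stand-in for the blocking fetch.
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def read_ahead(fragments, download_fragment, prefetch=4):
    with ThreadPoolExecutor(max_workers=prefetch) as pool:
        pending = deque()
        it = iter(fragments)
        # Prime the pipeline with the first `prefetch` downloads.
        for frag in it:
            pending.append(pool.submit(download_fragment, frag))
            if len(pending) >= prefetch:
                break
        while pending:
            done = pending.popleft()
            # Start the next download before blocking, so the pool stays busy.
            nxt = next(it, None)
            if nxt is not None:
                pending.append(pool.submit(download_fragment, nxt))
            yield done.result()
```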