Results: 11 issues by tony

Do you have any examples of how to use PBR materials in plotoptix? I'm working on transitioning rendering of a trimesh scene from pyrender to plotoptix and trying to figure out...
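
Not part of the original question, just a minimal sketch of one way to wire a trimesh mesh into plotoptix with a predefined material, assuming the `setup_material`/`set_mesh` API and the `m_plastic` preset; the file path and material choice are placeholders.

```python
# Minimal sketch (assumptions: plotoptix's setup_material/set_mesh API and the
# m_plastic preset; the mesh path is a placeholder).
import trimesh
from plotoptix import TkOptiX
from plotoptix.materials import m_plastic

mesh = trimesh.load("scene.glb", force="mesh")  # placeholder path

rt = TkOptiX()
rt.setup_material("plastic", m_plastic)   # register a predefined material
rt.set_mesh(
    "model",
    pos=mesh.vertices,
    faces=mesh.faces,
    mat="plastic",                        # reference the registered material by name
)
rt.start()
```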

## Why are these changes needed? When processing large datasets, ray data doesn't support any sort of resumption scheme. This is an experimental progress tracker for reading & writing `webdataset`...

triage
stale
data
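
A rough sketch of the resumption idea the issue above describes, not the actual PR: track finished shards in a local JSON file and skip them on restart. `ray.data.read_webdataset` is the only real API used; the shard list, progress file, and the processing step are placeholders.

```python
# Rough sketch of shard-level resumption (not the PR's implementation).
# Assumptions: shards can be processed independently; the progress file path
# and shard URLs are placeholders.
import json
import os

import ray

PROGRESS_FILE = "webdataset_progress.json"

def load_done() -> set:
    if os.path.exists(PROGRESS_FILE):
        with open(PROGRESS_FILE) as f:
            return set(json.load(f))
    return set()

def mark_done(shard: str) -> None:
    done = load_done() | {shard}
    with open(PROGRESS_FILE, "w") as f:
        json.dump(sorted(done), f)

shards = [f"s3://bucket/data/shard-{i:05d}.tar" for i in range(1000)]  # placeholder

for shard in shards:
    if shard in load_done():
        continue  # already processed on a previous run
    ds = ray.data.read_webdataset(shard)
    # ... transform / write `ds` here (placeholder) ...
    mark_done(shard)  # record progress only after the shard fully succeeds
```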

Currently, `LanceFragmentWriter` supports a `storage_options` kwarg to support specifying s3 credentials/endpoints. However, `LanceCommitter` does not have the same option and raises `TypeError: object.__init__() takes exactly one argument (the instance to...

good first issue
python
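
For illustration only, the asymmetry the issue above describes written out as code. The import path and constructor arguments are assumptions rather than the exact API, and the credentials/URI are placeholders.

```python
# Illustration of the reported asymmetry (import path and constructor arguments
# are assumptions; credentials and URI are placeholders).
from lance.ray.sink import LanceFragmentWriter, LanceCommitter

uri = "s3://bucket/table.lance"
storage_options = {
    "access_key_id": "...",
    "secret_access_key": "...",
    "endpoint": "http://minio:9000",
}

# Accepted: the fragment writer takes storage_options.
writer = LanceFragmentWriter(uri, storage_options=storage_options)

# Rejected: the committer has no storage_options kwarg and raises
# TypeError: object.__init__() takes exactly one argument (the instance to initialize)
committer = LanceCommitter(uri, storage_options=storage_options)
```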

I have a remote dataset stored on s3. Without scalar indices, using the scanner API with a filter works fine. However, once a scalar index is added, I get a...
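
A repro-style sketch of the setup described above, assuming pylance's `create_scalar_index` and scanner `filter` APIs; the S3 URI and column name are placeholders.

```python
# Repro-style sketch (the S3 URI and column name are placeholders).
import lance

uri = "s3://bucket/table.lance"
ds = lance.dataset(uri)

# Works before any scalar index exists:
tbl = ds.scanner(filter="split = 'test'").to_table()

# Add a scalar index on the filtered column...
ds.create_scalar_index("split", index_type="BTREE")

# ...then the same filtered scan against the remote dataset reportedly errors.
ds = lance.dataset(uri)
tbl = ds.scanner(filter="split = 'test'").to_table()
```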

The default batch size for Lance may cause OOM, depending on row size and available system memory

enhancement
rust
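
A hedged sketch of the usual workaround for the issue above, assuming `to_batches` accepts a `batch_size` argument; the path, column, and batch size are placeholders chosen with wide rows (e.g. image blobs) in mind.

```python
# Sketch of the workaround (path, column, and batch size are placeholders):
# for wide rows, request smaller batches so peak memory stays bounded.
import lance

ds = lance.dataset("/data/table.lance")

total_rows = 0
for batch in ds.to_batches(batch_size=64, columns=["image"]):
    total_rows += batch.num_rows  # placeholder for real downstream work
```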

Running into s3 rate limits when trying to clean up a very large dataset with `dataset.cleanup_old_versions`. Can't seem to control this via `LANCE_IO_THREADS`

bug
rust
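
The call in question, sketched with placeholder values; `LANCE_IO_THREADS` is set only to show the knob that reportedly has no effect on cleanup concurrency.

```python
# Sketch with placeholder values; LANCE_IO_THREADS is set before importing lance
# only to show the knob that reportedly does not throttle cleanup requests.
import os
from datetime import timedelta

os.environ["LANCE_IO_THREADS"] = "8"

import lance

ds = lance.dataset("s3://bucket/very-large-table.lance")
stats = ds.cleanup_old_versions(older_than=timedelta(days=14))
print(stats)
```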

```
In [57]: dataset.optimize.compact_files(max_bytes_per_file=1024*1024*256, batch_size=1024, num_threads=100)
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[57], line 1
----> 1 dataset.optimize.compact_files(max_bytes_per_file=1024*1024*256, batch_size=1024, num_threads=100)

File ~/anaconda3/lib/python3.10/site-packages/lance/dataset.py:2624, in DatasetOptimizer.compact_files(self, target_rows_per_fragment, max_rows_per_group, max_bytes_per_file, materialize_deletions,...
```

Looks like filter is broken? Repro:

```
import lance

ds = lance.dataset(path)
ds.filter("split = 'test'")

>>> Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/_dataset.pyx", line 796,...
```

It looks like `to_batches` isn't respecting the filter kwarg. Repro:

```
import lance

ds = lance.dataset(path)
fragments = ds.get_fragments()
for batch in fragments[0].to_batches(
    batch_size=1,
    filter="split == 'test'",
    columns=["image", "split"],
    with_row_id=True,...
```

bug
good first issue

Problem: the fragment sampler downloads fragments in a blocking fashion. This is a bottleneck if your batch size is larger than the number of rows in a fragment. Solution: read-ahead with...
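
Not the proposed patch, just a generic sketch of the read-ahead idea: keep a few fragment downloads in flight on a thread pool while the consumer drains the current one. `download_fragment` is a hypothetical stand-in for whatever does the blocking fetch today.

```python
# Generic read-ahead sketch (not the proposed patch): keep `prefetch` fragment
# downloads in flight on a thread pool while the consumer drains the current one.
# `download_fragment` is a hypothetical stand-in for the blocking fetch.
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def read_ahead(fragments, download_fragment, prefetch=4):
    with ThreadPoolExecutor(max_workers=prefetch) as pool:
        pending = deque()
        it = iter(fragments)
        # Prime the pipeline with the first `prefetch` downloads.
        for frag in it:
            pending.append(pool.submit(download_fragment, frag))
            if len(pending) >= prefetch:
                break
        while pending:
            done = pending.popleft()
            # Start the next download before blocking, so the pool stays busy.
            nxt = next(it, None)
            if nxt is not None:
                pending.append(pool.submit(download_fragment, nxt))
            yield done.result()
```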