litdata
Add support for TiffStreamingRawDataset
🚀 Feature
Notes from @tchaton
We could add support for async-tiff (https://developmentseed.org/async-tiff/latest) to the StreamingRawDataset, along these lines:
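import torch

from litdata import StreamingRawDataset
from litdata.raw.types import TIFF  # proposed type, part of this feature request


class TiffStreamingRawDataset(StreamingRawDataset):
    def setup(self, urls):
        # One TIFF descriptor per URL, read in tiles of 512 x 512 x 3
        return [TIFF(url, tile=(512, 512, 3), ....) for url in urls]

    def __getitem__(self, decoded_bytes: bytes):
        # `dtype` is keyword-only in torch.frombuffer
        return torch.frombuffer(decoded_bytes, dtype=torch.uint8)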
Examples: https://github.com/microsoft/pytorch-cloud-geotiff-optimization/blob/5fb6d1294163beff822441829dcd63a3791b7808/optimized_cog_streaming/datamodules.py#L89 and https://github.com/microsoft/pytorch-cloud-geotiff-optimization/blob/5fb6d1294163beff822441829dcd63a3791b7808/optimized_cog_streaming/datasets.py#L42
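To make the `tile=(512, 512, 3)` argument concrete, here is a rough sketch of how an image could be split into tile windows so that each window becomes one dataset item. It uses only the standard library. `TileWindow` and `iter_tile_windows` are hypothetical names made up for illustration; they are not part of litdata or async-tiff. The sketch also assumes edge windows are clipped to the image bounds, whereas tiled TIFF files typically store padded, full-size tiles.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class TileWindow:
    """Hypothetical helper: pixel window of one tile inside a larger image."""

    row_off: int
    col_off: int
    height: int
    width: int

    def nbytes(self, channels: int, bytes_per_sample: int = 1) -> int:
        # Size of the decoded, uncompressed window in bytes.
        return self.height * self.width * channels * bytes_per_sample


def iter_tile_windows(
    image_height: int, image_width: int, tile: Tuple[int, int, int]
) -> Iterator[TileWindow]:
    """Yield windows in row-major order; `tile` is (height, width, channels)."""
    tile_h, tile_w, _channels = tile
    if tile_h <= 0 or tile_w <= 0:
        raise ValueError("tile height and width must be positive")
    for row_off in range(0, image_height, tile_h):
        for col_off in range(0, image_width, tile_w):
            # Assumption: clip windows at the right and bottom edges.
            yield TileWindow(
                row_off=row_off,
                col_off=col_off,
                height=min(tile_h, image_height - row_off),
                width=min(tile_w, image_width - col_off),
            )


if __name__ == "__main__":
    windows = list(iter_tile_windows(1024, 1536, (512, 512, 3)))
    print(len(windows), windows[0].nbytes(channels=3))
```

With a layout like this, the dataset length would be the total number of windows across all URLs, and a full uint8 tile of 512 x 512 x 3 decodes to a fixed-size buffer that can be reshaped after `torch.frombuffer`.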
Motivation
Pitch
Alternatives
Additional context
Hey @bhimrazy I'd like to take a stab at this. Can you assign this issue to me? Thanks!