[Python] Add video extension type

Open rok opened this issue 2 years ago • 3 comments

After adding ImageURIArray, EncodedImageArray and FixedShapeImageTensorArray, it is straightforward to add analogous types for video, namely VideoURIArray, VideoEncodedArray and FixedShapeVideoTensorArray. For a decoder, see TF's decode_webp.

rok avatar Oct 11 '23 10:10 rok

@wjones127 not sure if this should be closed.

rok avatar Dec 21 '23 01:12 rok

Are there any docs on how to write these types to a lance dataset? Specifically I'm trying to create a video column that's some sort of image array type.

I'm experimenting with doing this instead of just storing the video as bytes to save on decoding time in my training loop.

tonyf avatar Jul 18 '24 18:07 tonyf

Hi @tonyf. In general, you can write an Apache Arrow extension array, and these can be written to and read from Lance. A good reference for this would be Rok's changes for the image extension types:

https://github.com/lancedb/lance/pull/1272/files

wjones127 avatar Jul 18 '24 18:07 wjones127

We are currently exploring enhancing Lance's multimodal capabilities and came across this issue. Could @rok @wjones127 share why it was not continued?

ddupg avatar Jul 25 '25 11:07 ddupg