Modify camera async_read/read API to return a dictionary instead of a tuple for better compatibility?

Open StoneT2000 opened this issue 9 months ago • 0 comments

Currently the Intel RealSense camera API returns either a single RGB image, or an RGB image and a depth image as a 2-tuple:

https://github.com/huggingface/lerobot/blob/3c0a209f9fac4d2a57617e686a7f2a2309144ba2/lerobot/common/robot_devices/cameras/intelrealsense.py#L440-L443

However, this is awkward to work with, since not every camera returns two values (the OpenCV one only returns RGB?). As a potentially better API, would it be possible to have the async_read / read functions always return a dictionary instead, with standard names and data types for each kind of image data returned?

e.g.

return dict(rgb=..., depth=...)
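
To make the idea concrete, here is a minimal sketch of an adapter that normalizes today's return values into that dictionary shape. The helper name `to_frame_dict` and the key names `"rgb"` and `"depth"` are assumptions for illustration, not part of lerobot, and the frames are stand-in nested lists rather than real image arrays.

```python
from typing import Any, Dict


def to_frame_dict(result: Any) -> Dict[str, Any]:
    """Hypothetical helper: wrap a camera read result in a dict keyed by modality.

    Assumes the result is either a bare RGB image or an (rgb, depth) 2-tuple,
    as described in this issue.
    """
    if isinstance(result, tuple):
        rgb, depth = result
        return {"rgb": rgb, "depth": depth}
    return {"rgb": result}


# Stand-in frames (real frames would be image arrays).
rgb_frame = [[(0, 0, 0)]]
depth_frame = [[0]]

rgb_only = to_frame_dict(rgb_frame)
rgbd = to_frame_dict((rgb_frame, depth_frame))
```

With something like this, every camera type hands back the same kind of object, and only the set of keys differs.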

This would also make it easier for me to check whether the returned data includes depth. The current solution is a bit complicated, as I need to check whether the camera is an IntelRealSenseCamera and whether its config has use_depth=True.
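
As a rough sketch of the caller side, assuming a dictionary return with hypothetical `"rgb"` and `"depth"` keys, the depth check no longer depends on the camera class or its config. The function name `split_frames` is made up for this example.

```python
from typing import Any, Dict, Optional, Tuple


def split_frames(frames: Dict[str, Any]) -> Tuple[Any, Optional[Any]]:
    """Hypothetical consumer: return (rgb, depth), with depth set to None if absent."""
    return frames["rgb"], frames.get("depth")


# Stand-in data in place of real camera output.
rgb, depth = split_frames({"rgb": "rgb-frame", "depth": "depth-frame"})
rgb2, depth2 = split_frames({"rgb": "rgb-frame"})

has_depth = depth is not None
has_depth2 = depth2 is not None
```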

Thanks!

Mar 13 '25 18:03 StoneT2000