Sidd Karamcheti issues

Results: 23 issues of Sidd Karamcheti

Configurable Help Text

This is an awesome project! One small enhancement - it would be really nice to support "configurable" population orders for help text beyond the default (look for docstring above argument,...

Add Voltron-X Cites

Updated venues for SIGGRAPH and added new cites. Leaving as a draft PR until #34 is merged.

Add Voltron RSS Cites

Additionally fixes TMLR venue formatting (typo and lack of a volume number, as it's a rolling publication cycle).

Clarification: SigLIP Image Transform

Thanks for open-sourcing the SigLIP models! Clarification question: in the demo IPython notebook, the image transform function has the form `pp_img = pp_builder.get_preprocess_fn(f'resize({RES})|value_range(-1, 1)')`. Looking at the code [here](https://github.com/google-research/big_vision/blob/main/big_vision/pp/ops_image.py#L64), this...
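For context on the preprocessing string quoted above, here is a minimal NumPy sketch of what a `value_range(-1, 1)` step is assumed to do: linearly rescale pixel values from [0, 255] to [-1, 1]. This is an illustration only, not the big_vision implementation; the function name, argument names, and defaults below are hypothetical.

```python
import numpy as np


def value_range(image, vmin=-1.0, vmax=1.0, in_min=0.0, in_max=255.0):
    """Linearly rescale values from [in_min, in_max] to [vmin, vmax].

    Hypothetical helper for illustration; the input range [0, 255] is an
    assumption about uint8 image data.
    """
    image = np.asarray(image, dtype=np.float32)
    # Map the input range to [0, 1], then stretch to the target range.
    unit = (image - in_min) / (in_max - in_min)
    return vmin + unit * (vmax - vmin)


# The darkest and brightest uint8 pixel values map to the two endpoints.
pixels = np.array([0, 255], dtype=np.uint8)
scaled = value_range(pixels)
```

Under these assumptions, the two endpoints of the uint8 range land on the two endpoints of the target range, and everything in between is interpolated linearly.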

Get Permission to Host/Script Downloads for Various Datasets

It's pretty inconvenient to have users manually download the various datasets (especially the bigger ones such as OCID-Ref and the Franka/Adroit Demonstrations); a proper interface/hosting solution would help....

enhancement

Upload to PyPI, host Dataset on HuggingFace Hub

Once finalized, it would be nice to treat this as a modular, general library akin to the main [Voltron Robotics](https://github.com/siddk/voltron-robotics) package. Ideally, `pip` would handle everything (all...

enhancement

question

[Roadmap] Consolidate V-Evaluation Harness into a General API / Runner

Once the Visuomotor Control & Intent Scoring tasks have been integrated properly, would be nice to consolidate the V-Evaluation Harness into a more general API that can be used for...

documentation

refactor

roadmap

[Roadmap] Acquire WHiRL Videos for Intent Scoring Example

Check with Shikhar + co. around sharing the video examples from the [WHiRL website](https://human2robot.github.io/) - possibly get more examples. Then:
- [ ] Upload videos to GDrive add simple `gdown`...

roadmap

[Roadmap] Refactor Visuomotor Control Evaluation

Currently, both the Franka Kitchen & Adroit Visuomotor Control Tasks are implemented on top of the [R3M evaluation code](https://github.com/facebookresearch/r3m/tree/eval/evaluation). While a fairly straightforward addition to that codebase, it would be...

enhancement

roadmap

Add Prismatic VLMs to Transformers

### Model description

Hi! I'm the author of ["Prismatic VLMs"](https://github.com/TRI-ML/prismatic-vlms), our upcoming ICML paper that introduces and ablates design choices of visually-conditioned language models that are similar to LLaVA or...

New model