[DRAFT] Initial LLM collector
Description
Add LLM Collector
Motivation and Context
#2872
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
- [ ] New feature (non-breaking change which adds core functionality)
Checklist
Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!
- [X] I have read the CONTRIBUTION guide (required)
- [ ] My change requires a change to the documentation.
- [ ] I have updated the tests accordingly (required for a bug fix or a new feature).
- [ ] I have updated the documentation accordingly.
Testing
python torchrl/collectors/llm_collector.py
Results in
Traceback (most recent call last):
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 429, in process_input_socket
request = decoder.decode(data_frame.buffer)
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/site-packages/vllm/v1/serial_utils.py", line 34, in decode
return self.decoder.decode(obj)
msgspec.ValidationError: Expected `int | null`, got `bool` - at `$[6].logprobs`
Currently, coming from L83
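For reference, the validation error says the `logprobs` field must be an integer or null, while a boolean was sent. Below is a minimal sketch of a guard that coerces a boolean flag before it reaches the engine. The helper name `normalize_logprobs` and the choice to map `True` to `1` are assumptions made for illustration; they are not part of torchrl or vLLM.

```python
from typing import Optional, Union


def normalize_logprobs(value: Union[bool, int, None]) -> Optional[int]:
    """Hypothetical helper: map a boolean flag onto the `int | None` shape."""
    # bool is a subclass of int in Python, so it has to be checked first;
    # otherwise True would slip through the int branch unchanged.
    if isinstance(value, bool):
        # Assumption: True means "return the log-prob of the sampled token".
        return 1 if value else None
    if value is None or isinstance(value, int):
        return value
    raise TypeError(
        f"logprobs must be bool, int or None, got {type(value).__name__}"
    )
```

With such a guard the value is always an `int` or `None`, which is the shape the error message asks for.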
:link: Helpful Links
:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2879
- :page_facing_up: Preview Python docs built from this PR
Note: Links to docs will display an error until the docs builds have been completed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Originally, this code ran into the error:
Traceback (most recent call last):
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 429, in process_input_socket
request = decoder.decode(data_frame.buffer)
File "/home/lucaskabela/.conda/envs/cabernet/lib/python3.10/site-packages/vllm/v1/serial_utils.py", line 34, in decode
return self.decoder.decode(obj)
msgspec.ValidationError: Expected `int | null`, got `bool` - at `$[6].logprobs`
This seems to have come from a versioning issue; downgrading torch and vllm with
pip install --force torch==2.5.1 vllm==0.7.3
Then
python setup.py clean && python setup.py develop
Did the trick and fixed this error :) Previously, I had torch==2.6.0 and vllm==0.8.2, so note that these versions may have caused this error.
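In case it helps anyone reproducing this, here is a small sketch that compares the installed versions against the pair that worked for me. The `check_versions` function and the `KNOWN_GOOD` table are hypothetical names; only torch==2.5.1 and vllm==0.7.3 come from this thread, and other combinations may work as well.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in this thread; nothing else was tested here.
KNOWN_GOOD = {"torch": "2.5.1", "vllm": "0.7.3"}


def check_versions(expected=KNOWN_GOOD):
    """Return {package: (installed, expected)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            have = None
        # Local build tags such as "2.5.1+cu124" still count as a match.
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (have, want)
    return mismatches
```

Calling `check_versions()` returns an empty dict when both packages match, and otherwise lists each package with the installed and expected version.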
Amazing! I did a bit of refactoring. Things that are missing:
- Tests in test/test_collectors.py. These should include the vLLMWrapper and the TransformersWrapper
- docstrings + register in the docs directory
- ultimately remove the if __name__ == "__main__": from the file
- use trust_policy=True as not doing so can trigger checks that we want to avoid
Thanks for the comments here; I think I incorporated the majority of this feedback and updated the test on my end; the only thing I am not sure about is how to register in the docs directory - could you provide more details on this?