Woosuk Kwon
`download_remote_dir` doesn't work. Printed error:

```python
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    list(storage.stores.values())[0].download_remote_dir('')
  File "/Users/woosuk/workspace/sky-proj/sky/sky/data/storage.py", line 178, in download_remote_dir
    iterator = self._remote_filepath_iterator()
AttributeError: 'S3Store' object has...
```
We've discussed using our local Docker backend to resolve setup difficulties prior to provisioning (see #670 and this [gist](https://gist.github.com/concretevitamin/51e82f8b210ed7b905de20eb95bc8d6d)). As a preliminary investigation, I tried to use...
BLOOM is an open-source LLM developed by BigScience. The BLOOM models rank highly in Hugging Face download counts, so it'd be great to have them in our catalog.
Currently, pip installing our package takes 5-10 minutes because our CUDA kernels are compiled on the user's machine. For better UX, we should include pre-built CUDA binaries in our PyPI...
As mentioned in https://github.com/WoosukKwon/cacheflow/pull/81#issuecomment-1546980281, the current PyTorch-based top-k and top-p implementation is memory-inefficient. This can be improved by introducing custom kernels.
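To make the issue concrete, here is a minimal pure-Python sketch of the top-k/top-p filtering logic that a custom kernel would replace. The function name and structure are hypothetical, for illustration only; the actual PyTorch implementation operates on GPU tensors, where the intermediate sorted/softmaxed copies are what cause the memory inefficiency.

```python
import math

def top_k_top_p_filter(logits, top_k=0, top_p=1.0):
    """Return the token ids that survive top-k, then top-p (nucleus) truncation.

    Hypothetical reference implementation over a plain list of logits;
    a real kernel would do this in place on device memory.
    """
    # Sort token ids by logit, descending, and apply the top-k cutoff.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    # Softmax over the surviving logits (subtract max for stability).
    m = max(logits[i] for i in order)
    exps = [math.exp(logits[i] - m) for i in order]
    total = sum(exps)
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cum = [], 0.0
    for tok, e in zip(order, exps):
        kept.append(tok)
        cum += e / total
        if cum >= top_p:
            break
    return kept
```

A fused kernel could compute the same cutoff without materializing the full sorted probability tensor, which is where the memory savings would come from.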
Currently we call `torch.distributed.init_process_group` even for a single GPU. This is redundant and causes errors when the `LLM` object is created multiple times.
I failed to build the system with the latest NVIDIA PyTorch Docker image. The reason is that the PyTorch installed by `pip` is built with CUDA 11.7, while the container uses CUDA...
We need tests for the models we support. The tests should ensure that the outputs of our models under greedy sampling match those of the corresponding HF models.
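The comparison logic can be sketched model-agnostically: greedy decoding is deterministic, so two implementations of the same model must produce identical token sequences. Below, `step_logits` is a hypothetical stand-in for a model forward pass (the real test would wrap our engine and the HF model); the harness shape is an assumption, not the actual test code.

```python
def greedy_decode(step_logits, prompt, max_new_tokens):
    """Greedily append the argmax token at each step.

    `step_logits(tokens)` returns a list of logits over the vocabulary;
    it stands in for a model forward pass in this sketch.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = step_logits(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

def assert_greedy_equivalent(model_a, model_b, prompt, max_new_tokens=8):
    """Fail loudly if the two implementations diverge under greedy sampling."""
    out_a = greedy_decode(model_a, prompt, max_new_tokens)
    out_b = greedy_decode(model_b, prompt, max_new_tokens)
    assert out_a == out_b, f"greedy outputs diverge: {out_a} vs {out_b}"
```

In the real suite, `model_a` would be our engine and `model_b` the HF `transformers` model, run over a small set of fixed prompts per supported architecture.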