Abulhair Saparov

15 comments by Abulhair Saparov

Not really, but I think the HuggingFace folks are trying to work around the issue, since it seems to be affecting a bunch of other people. See: https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/318 But they did...

@bri25yu I got it working after pulling some newer code from a branch. See: https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/318#issuecomment-1195958248

Thank you for the suggestion. Are you trying to reproduce the experiments in the paper? Or are you trying to run your own RL code using the JBW? You don't...

Our evaluation was quite simple, and we describe it in our paper: we just plot the "reward rate" over time, where the reward rate is defined as the total reward...

In our experiments, the reward depends on the experiment (i.e. the task). For example, if the task is Collect[Jellybean], the agent receives +1 reward whenever it collects a jellybean item....
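To make the Collect[Jellybean] example concrete, here is a minimal sketch of a per-step reward of that shape. It is not the JBW API: the function name `collect_reward`, the item-count dictionaries, and the item names are hypothetical, and it assumes the agent's collected-item counts are available before and after each step.

```python
from typing import Mapping


def collect_reward(target: str,
                   before: Mapping[str, int],
                   after: Mapping[str, int]) -> int:
    """Reward for one step of a hypothetical Collect[target] task.

    Assumption: `before` and `after` map item names to how many of each
    item the agent has collected before and after the step.
    """
    gained = after.get(target, 0) - before.get(target, 0)
    # +1 per newly collected target item; other items contribute nothing.
    return max(gained, 0)


# Hypothetical item counts around a single step.
before = {"jellybean": 2, "onion": 1}
after = {"jellybean": 3, "onion": 2}
print(collect_reward("jellybean", before, after))  # prints 1
```

Collecting the onion in the same step does not change the result, since only the task's target item is rewarded in this sketch.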

Yes, the Swift experiments require Swift for TensorFlow. The README lists Swift for TensorFlow 0.8 as a requirement under both the [Requirements](https://github.com/eaplatanios/jelly-bean-world#requirements) and [Using Swift](https://github.com/eaplatanios/jelly-bean-world#using-swift) sections. We...

@stas00 It seems to be working with `CUDA_LAUNCH_BLOCKING=1`! I'll test with `bigscience/bloom-1b3` next.
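For anyone who wants to try the same check from Python rather than by prefixing the shell command: `CUDA_LAUNCH_BLOCKING=1` makes CUDA kernel launches synchronous, and it has to be in the process environment before CUDA is initialized. The sketch below only shows how to pass the variable to a child process; the helper name `blocking_env` is hypothetical, and the child here is a stand-in that echoes the variable, not the actual inference script.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


# Stand-in child process: it just prints the variable it inherited.
# In practice the command would be whatever launches the real job.
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"],
    env=blocking_env(),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # prints 1
```

Copying the environment instead of mutating `os.environ` keeps the setting scoped to the child, so later runs without the flag are unaffected.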

@stas00 Actually, I just tested both `bigscience/bloom` and `bigscience/bloom-1b3` without `CUDA_LAUNCH_BLOCKING=1`, and they both work. This is probably because I pulled newer code from the `bloom-inference` branch of this repo...

@pai4451 I didn't change any code from this repo at all. I followed the installation instructions in the [readme](https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/main/README.md). I invoke the inference script using: `deepspeed --hostfile=$hostfile Megatron-DeepSpeed/scripts/inference/bloom-ds-inference.py --name bigscience/bloom`...