Motoki Wu comments

Results: 12 comments of Motoki Wu

error with spacy

It's probably this error, so I think updating spaCy should fix it: https://github.com/spacy-io/spaCy/issues/375

multiple values for keyword argument 'softmax_loss_function'

Not sure, but the code probably won't work on newer TensorFlow. It should work on 0.5.
> On May 6, 2016, at 12:18 AM, aewhatley wrote:
>
> I...

multiple values for keyword argument 'softmax_loss_function'

Cool, thanks. There have been a few changes since 0.5, but I don't have time to debug now. MT is tricky since it uses a softmax on a large vocab. The Shakespeare...

multiple values for keyword argument 'softmax_loss_function'

relevant issue in TensorFlow: https://github.com/tensorflow/tensorflow/issues/550
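For readers hitting the same message: in Python, "multiple values for keyword argument" typically appears when a call site written for an older signature passes one more positional argument than the newer signature accepts, so that the extra value lands on a parameter that is also passed by keyword. The sketch below reproduces that mechanism with two hypothetical functions, `loss_v1` and `loss_v2`. They are stand-ins invented for illustration, not the real TensorFlow API, and whether this is the exact cause in the linked issue is an assumption.

```python
# Hypothetical stand-ins for a library function whose signature changed
# between versions. Neither function is the real TensorFlow API.

def loss_v1(logits, targets, weights, num_symbols, softmax_loss_function=None):
    """Older signature (assumed): takes an extra positional `num_symbols`."""
    return "v1 ok"


def loss_v2(logits, targets, weights, softmax_loss_function=None):
    """Newer signature (assumed): `num_symbols` has been removed."""
    return "v2 ok"


def call_like_old_code(loss_fn):
    # Call site written against loss_v1: four positional args plus a keyword.
    # Against loss_v2, the fourth positional value (10000) binds to
    # `softmax_loss_function`, which is then passed again by keyword.
    return loss_fn([0.1], [1], [1.0], 10000, softmax_loss_function=None)


def try_call(loss_fn):
    """Return the call result, or the TypeError text if binding fails."""
    try:
        return call_like_old_code(loss_fn)
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_call(loss_v1))  # the old signature accepts the call
    print(try_call(loss_v2))  # the new signature reports multiple values
```

If this is the cause, the usual fix is either to pin the library version the code was written for or to update the call site to the new signature, passing the affected arguments by keyword.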

[BUG] Pythia (GPT-NeoX based) models degrade in generation quality using DeepSpeed Inference

Hi @satpalsr , I've updated to DeepSpeed 0.8.2, but I'm getting the same results: ``` Setting `pad_token_id` to `eos_token_id`:0 for open-end generation. [2023-03-17 06:03:09,070] [INFO] [logging.py:77:log_dist] [Rank -1] DeepSpeed info:...

[BUG] Pythia (GPT-NeoX based) models degrade in generation quality using DeepSpeed Inference

Looks like version 0.9.4 works :) Closing. Guessing llama support fixed the gpt-neox type models: https://github.com/microsoft/DeepSpeed/pull/3425

[BUG] Incorrect Model Outputs When Using Beam Search

Hi! It would be great if beam search worked with DeepSpeed. I'm guessing it's probably the most common decoding algorithm used in production. Are other generation strategies supported too? 1....

[BUG] Incorrect Model Outputs When Using Beam Search

@mallorbc I've done some benchmarks using `gpt2` with `fp16` precision on my own data (of course, YMMV).

System info
* CUDA version 11.7
* A10G instance 24G
* DeepSpeed 0.7.7...

[BUG] Incorrect Model Outputs When Using Beam Search

> @tokestermw Thanks so much for sharing your insights! I assume to get these results you did something like a string compare for results generated with and without DeepSpeed?

@mallorbc...
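The "string compare" mentioned in the quoted question could look roughly like the sketch below. It assumes the generations from the two runs have already been collected as two equal-length lists of strings in the same prompt order. The function names `exact_match_rate` and `mismatched_indices` are hypothetical, no DeepSpeed or model calls are involved, and the truncated reply above does not confirm that the benchmark was computed this way.

```python
from typing import List, Sequence


def mismatched_indices(baseline: Sequence[str], candidate: Sequence[str]) -> List[int]:
    """Return the positions where the two runs produced different text."""
    if len(baseline) != len(candidate):
        raise ValueError("both runs must cover the same prompts")
    return [i for i, (a, b) in enumerate(zip(baseline, candidate)) if a != b]


def exact_match_rate(baseline: Sequence[str], candidate: Sequence[str]) -> float:
    """Fraction of prompts whose generated strings are identical in both runs."""
    if not baseline:
        raise ValueError("need at least one generation to compare")
    differing = mismatched_indices(baseline, candidate)
    return 1.0 - len(differing) / len(baseline)


if __name__ == "__main__":
    # Placeholder strings, not real model outputs.
    without_accel = ["the cat sat", "hello world", "foo bar"]
    with_accel = ["the cat sat", "hello there", "foo bar"]
    print(exact_match_rate(without_accel, with_accel))
    print(mismatched_indices(without_accel, with_accel))
```

Exact string equality is strict: with `fp16`, small numerical differences can change a single token, so inspecting the mismatched indices is usually more informative than the rate alone.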

Is the Dependency (child sum tree) option working?

I tried `DEPENDENCY == True` with `FINE_GRAINED == False` and I turned off the relevant asserts in `data_utils` and `tree_rnn` and I got the following error: ``` tree_rnn $ python...