Joe Hoover

Results 7 comments of Joe Hoover

Hey @RezaYazdaniAminabadi, thanks for looking into this. Something seems weird...are you using the fp16 weights? The issue I described above aside, I'm getting consistently different results from you even...

@PanQiWei, could you share a reproducible code chunk? I'd like to see if we're getting the same outputs and if I can reproduce your observations. Also, just to confirm,...

@jeffra, sorry for the delay. I just confirmed that I am able to initialize the InferenceEngine. Thanks! Unfortunately, I'm now noticing strong divergences between Transformers GPT-J outputs and GPT-J...

@andreasjansson, +1 to this being an annoying roadblock when getting started with Cog. But glad I found this issue; it saved me some heartache! Fwiw, I ran into this on a Lambda...

Hey @dankolesnikov and @rlancemartin, sorry for the delay! @dankolesnikov, I was thinking the same thing; however, I just checked and we have the model set to always on. @rlancemartin, have...

Thanks @python273! This may be caused by an interaction between Llama's sentencepiece tokenizer and the way we process token streams (e.g. see [here](https://huggingface.co/docs/transformers/main/model_doc/llama2#:~:text=The%20LLaMA%20tokenizer%20is%20a%20BPE%20model%20based%20on%20sentencepiece.%20One%20quirk%20of%20sentencepiece%20is%20that%20when%20decoding%20a%20sequence%2C%20if%20the%20first%20token%20is%20the%20start%20of%20the%20word%20(e.g.%20%E2%80%9CBanana%E2%80%9D)%2C%20the%20tokenizer%20does%20not%20prepend%20the%20prefix%20space%20to%20the%20string.)).
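
For reference, a minimal sketch of the quirk the linked note describes. The checkpoint name is just an illustrative example, and the incremental-decode workaround is one common way a stream processor might sidestep it, not necessarily how we handle it:

```python
# Sketch of the sentencepiece decoding quirk (checkpoint name is illustrative;
# any Llama-family tokenizer should behave similarly).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

token_ids = tokenizer("The quick brown fox", add_special_tokens=False).input_ids

# Decoding each token in isolation drops the leading space that sentencepiece
# stores on word-initial pieces, so the concatenated stream runs words together.
piecewise = "".join(tokenizer.decode([t]) for t in token_ids)

# One workaround: re-decode the whole prefix at each step and emit only the
# newly appended suffix, which preserves the spaces.
streamed, prev = "", ""
for i in range(1, len(token_ids) + 1):
    full = tokenizer.decode(token_ids[:i])
    streamed += full[len(prev):]
    prev = full

print(piecewise)  # words run together, e.g. "Thequickbrownfox"
print(streamed)   # "The quick brown fox"
```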

I only have an anecdotal sample size, but I've found that pget works as-is for that model when run on an A100 instance. Download time with pget was between 21-24...