Jeff Kinnison

9 comments by Jeff Kinnison

Did we determine why variations of "prompt" were being ignored in the template (e.g., "Prompt", "prompT", "PrOmPt")?

Is this causing a crash? If so, could you send the traceback? In the past, I used the `.shadhorc` file to point to the Work Queue installation in `.shadho` so...

Hi @MarselScheer, thank you for raising this! I'll see about tracking this down.

Hi @MarselScheer, I was able to repro this. LightGBM GPU training seems to be unstable regardless of whether `skip_save_model` is used. I'm looking into whether pinning an earlier version of...

@ANarayan Let's assume that we can pass in the tokenizer and model into the protocol functions. That should simplify initial development, and we can revisit that design if needed. Also,...

Hi @NishaDeepak, while we dig into this, could you try the following: 1. Moving the `eos_token_id` and `pad_token_id` to the `generation` section of the config or 2. Removing `eos_token_id` and...

Hi @haoyuejudy, thanks for bringing this to our attention! I'm looking into this now and will let you know when it's resolved.

The tests are all passing, but locally I found an `IndexError` that occurs in embedding layers when using tied weights. Looking into that and will update here with a fix.

After some more testing, it looks like the `IndexError` is likely an issue with `sample_ratio` rather than tied weights. Setting `sample_ratio: 0.1` in an `agnews` config seems to cause the...