Jonathan Tow comments

Results: 34 comments of Jonathan Tow

Implement more layers that are available in Keras

I've attempted an implementation of an Embedding layer but am running into problems with the Layer protocol's input type requirements. Given that an Embedding layer consumes tensors of indices (UInt/Int)...

Implement more layers that are available in Keras

Hey @Shashi456. Yup. It just wouldn't compile as it relied on the `Raw.gather(params:, atIndices:)` function which requires a BinaryInteger for the second argument. Thanks @rxwei I'll give it a try.
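To illustrate why the index type matters here, the following is a minimal sketch of an embedding lookup expressed as a gather over integer indices. It is written in plain Python and is not the Swift for TensorFlow API; the `gather` helper and `embedding_table` are hypothetical names used only for this example.

```python
from typing import List, Sequence


def gather(params: Sequence[Sequence[float]], indices: Sequence[int]) -> List[List[float]]:
    """Select rows of `params` at the given integer positions (hypothetical helper)."""
    for i in indices:
        # Row positions only make sense as integers, which mirrors the
        # integer constraint on the second argument described above.
        if not isinstance(i, int) or isinstance(i, bool):
            raise TypeError(f"index must be an integer, got {type(i).__name__}")
    return [list(params[i]) for i in indices]


# Hypothetical embedding table: 3 tokens, embedding size 2.
embedding_table = [
    [0.0, 0.1],
    [1.0, 1.1],
    [2.0, 2.1],
]

# An embedding layer's forward pass is a row lookup by token index.
vectors = gather(embedding_table, [2, 0, 2])
print(vectors)
```

Passing a floating-point index such as `1.0` raises a `TypeError` in this sketch, which is the same kind of mismatch a layer protocol built around floating-point inputs runs into.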

Implement more layers that are available in Keras

Richard's advice resolved the compiler issues I had before regarding input types. Thanks for the suggestion @eaplatanios. The only issue left seems to be differentiating `gathering`. I'll keep an eye...

Add more optimizers and losses

@Shashi456 Categorical Cross-Entropy seems to already be implemented through Softmax Cross-Entropy with Logits. Maybe we can cross it off the list?
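The equivalence suggested above can be checked numerically. The sketch below, in plain Python rather than the library's own code, assumes one-hot labels and compares categorical cross-entropy applied to softmax probabilities against a cross-entropy computed directly from logits; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs: Sequence[float], one_hot: Sequence[float]) -> float:
    """Cross-entropy between predicted probabilities and one-hot labels."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)


def softmax_cross_entropy_with_logits(logits: Sequence[float], one_hot: Sequence[float]) -> float:
    """Cross-entropy computed from logits via a stable log-softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, one_hot))


logits = [2.0, 1.0, 0.1]
labels = [1.0, 0.0, 0.0]

loss_from_probs = categorical_cross_entropy(softmax(logits), labels)
loss_from_logits = softmax_cross_entropy_with_logits(logits, labels)
print(loss_from_probs, loss_from_logits)
```

The two values agree up to floating-point error. Working from logits avoids taking the logarithm of a probability that has underflowed to zero, which is one common reason to expose only the logits-based form.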

Add derivative tests for layers

**Layer Gradient Tests**
- [ ] Sequential
- [ ] Conv1D
- [x] Conv2D
- [x] Conv3D
- [x] DepthConv2D
- [ ] SeparableConv1D
- [x] SeparableConv2D
- [x] ZeroPadding1D...

Spanish prompts for bias-shades

Hi, @ArjunSubramonian! The Hindi version of the prompt also results in that same error. I'll merge the PR on the eval-harness side so that we can await the `promptsource`...

Are StableLMs Multilingual Causal Decoders?

These models were only intentionally trained on English data, but some sources within the dataset are known to contain text from other languages. Therefore, you may be able to interact...

initial commit for trlx LORA support

@ethankim00 This looks great! I've made a few changes based on some testing on our cluster. Here's the summary:
* Updates the `"gpt_neo"` model type name in the modifier map.
* ...

How to implement a conditional reward?

Hello, @James4Ever0! We do not plan on incorporating reward modeling into this repository. If you want to get a better idea of such fine-tuning (SFT + RMs) in practice,...

fix type errors and add mypy

Note: #43 removes the `flake8` F82 undefined-name check. This needs to be caught by `mypy`.