kadarakos

Results 29 comments of kadarakos

OK, I've added a legacy version in `remap_ids.py`. Is this what you meant?

> As a final step can you format the docs?

Sorry, I forgot: what are we using to format the docs?

Hey @naveenjafer, we have not implemented parametric `ReLU` functions, but we have added a bunch of activations since:

1. [`Swish`](https://thinc.ai/docs/api-layers#swish)
2. [`Gelu`](https://thinc.ai/docs/api-layers#gelu)
3. [`Dish`](https://github.com/explosion/thinc/blob/master/thinc/layers/dish.py) (this is our custom, more efficient Swish using...

Hey @amirj, sorry for the late reply! Currently we do not support the triplet loss, but you could try out the [`L2Distance`](https://github.com/explosion/thinc/blob/master/thinc/loss.py#L295) and [`CosineDistance`](https://github.com/explosion/thinc/blob/master/thinc/loss.py#L325) losses. Both losses are applicable...

> > This PR makes the `CategoricalCrossentropy` loss more strict, only allowing `guesses` and `truths` that represent exclusive classes
>
> Why can't / shouldn't it (also) support multi-label classification?...

> I don't think it's being used in spaCy right now, but it should. And when we do, I'm proposing we work via specific versions, e.g. > > ``` >...

> We discussed this a while back, but I think this PR would be a good opportunity to replace the overloaded `CategoricalCrossEntropy` class by splitting the non-legacy classes into two...

@danieldk I was looking at the new version of the cross-entropy that handles the `Ragged` as input. I think it makes a lot of sense, but I'm wondering if its...

Using the default PyTorch `reset_parameters()` for initialization actually fixed it!

Maybe implementing a more standard initialization method as default in the LSTMCell might be helpful for future users. I just have: ```python stdv = 1.0 / math.sqrt(self.hidden_size) for weight in...
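
For readers unfamiliar with this kind of initialization, here is a minimal, framework-free sketch of the idea: every parameter is drawn uniformly from `[-stdv, stdv]` with `stdv = 1 / sqrt(hidden_size)`. The helper names (`uniform_init`, `init_lstm_cell`), the parameter names, and the four-gate stacked layout are assumptions made for illustration; this is not the code from the comment above, which is truncated. In PyTorch itself, the same effect would presumably be achieved by calling `torch.nn.init.uniform_` on each parameter.

```python
import math
import random


def uniform_init(rows, cols, hidden_size, rng):
    """Return a rows x cols matrix drawn from U(-stdv, stdv), stdv = 1/sqrt(hidden_size)."""
    stdv = 1.0 / math.sqrt(hidden_size)
    return [[rng.uniform(-stdv, stdv) for _ in range(cols)] for _ in range(rows)]


def init_lstm_cell(input_size, hidden_size, seed=0):
    """Hypothetical helper: build LSTM-cell-shaped parameters with uniform init."""
    rng = random.Random(seed)
    gates = 4 * hidden_size  # assumed layout: four gates stacked along the rows
    return {
        "weight_ih": uniform_init(gates, input_size, hidden_size, rng),
        "weight_hh": uniform_init(gates, hidden_size, hidden_size, rng),
        "bias_ih": uniform_init(1, gates, hidden_size, rng)[0],
        "bias_hh": uniform_init(1, gates, hidden_size, rng)[0],
    }


params = init_lstm_cell(input_size=8, hidden_size=16)
```

The bound depends only on `hidden_size`, so weights and biases share the same range regardless of their own shapes.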