Matt Watson
Lemmatization and stemming are things we may need to add eventually, but they are kind of a whole can of worms. Ideally we could consider them separately. I also worry about...
Yeah, that's a good call re starting on deletion. I am worried about adding a ton of language-specific, rule-based code that we directly ship in our library. If...
This is open for contributions if anyone would like to pick it up!
@chenmoneygithub assigning you again as it sounds like you are actively working on this!
This is a pretty important feature, as it will unlock some important models and is widely used. However, there are some technical roadblocks here currently. We would like to keep...
@abheesht17 you are definitely welcome to help with this! This will require some diving into other libraries, to understand the support we have today.
I'm not sure we need to actually train a sentence piece model, though that might help understand things. Basically, the public API we can rely on that might give us...
To add a little more color for others finding this issue, you can train a BPE-style vocabulary with [sentencepiece](https://github.com/google/sentencepiece) today, and a sentencepiece model can be used with tensorflow text,...
I think an `examples/electra` directory is something we would love to have at some point! It is probably still a little early for us... The main issue here is we...
Sounds good. Yeah, it's really an open question to me whether there could be a good https://keras.io/examples/ sized demonstration of a simplified ELECTRA model. If we think so, that...