Aditya Malte comments

Results: 43 comments by Aditya Malte

Significant struggles with name identification

Hi, so do we now have a large dataset? It would be great if it were open-sourced.

Is there a way to infer topics on new data?

Hi @joewandy, any updates on this? It could be a crucial feature for document retrieval.

XGBoostJsonParser not working well with 'binary features'

@shah-sid-cutshort are you facing this same problem?

XGBoostJsonParser not working well with 'binary features'

Hey @o19s-admin, this seems to be an old issue. Do we have an update or fix for it? If not, it should at least be mentioned somewhere in the docs that...

[how-to-train] Link to a Google Colab version of the blogpost

@OP I’m working on it and will share when done. Thanks.

[how-to-train] Link to a Google Colab version of the blogpost

Check this out, a small example I have created: https://gist.github.com/aditya-malte/2d4f896f471be9c38eb4d723a710768b#file-smallberta_pretraining-ipynb

[how-to-train] Link to a Google Colab version of the blogpost

@julien-c, I have pruned the dataset to the first 200,000 samples so that the notebook runs quickly on Colab, as this is meant to be more like a...
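For anyone who wants to do similar pruning locally, here is a minimal sketch. It assumes the dataset is a plain-text file with one sample per line, which may not match how the notebook actually stores it; the function name `keep_first_n_lines` and the file names in the usage note are hypothetical.

```python
from itertools import islice
from pathlib import Path


def keep_first_n_lines(src: Path, dst: Path, n: int = 200_000) -> int:
    """Copy at most the first n lines of src to dst and return the count written.

    Assumes one sample per line (an assumption, not taken from the notebook).
    """
    written = 0
    with src.open("r", encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        # islice stops after n lines, so the rest of a large file is never read.
        for line in islice(fin, n):
            fout.write(line)
            written += 1
    return written
```

It could be called as keep_first_n_lines(Path("full.txt"), Path("small.txt")), where both paths are placeholders. Streaming the file line by line keeps memory use low even when the full dataset is large.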

[how-to-train] Link to a Google Colab version of the blogpost

Hi, the easiest solution (which I have also used in my Colab notebook) is just to rename the files using !mv. I know this is a hack but...

[how-to-train] Link to a Google Colab version of the blogpost

@julien-c, this is another issue that I wanted to point out. While renaming does work, it is a bit confusing for the programmer and takes some time to figure...

[how-to-train] Link to a Google Colab version of the blogpost

I’m not sure; I’ll have to see your code for that. It could just be an incorrect path.