allennlp-guide icon indicating copy to clipboard operation
allennlp-guide copied to clipboard

Adding the notebook for quick_start/train.py

Open ekdnam opened this issue 4 years ago • 10 comments

currently only the code has been added.

the information and content is yet to be added.

ekdnam avatar Feb 27 '21 13:02 ekdnam

The notebook can be viewed here (incase there are problems on github): nbviewer.jupyter.org

ekdnam avatar Feb 27 '21 13:02 ekdnam

(For https://github.com/allenai/allennlp/issues/5024)

ekdnam avatar Feb 27 '21 14:02 ekdnam

This looks really good in notebook format!

My one concern is that we'd have two copies of the same information (in the notebook vs in the original guide chapter), which means more difficulty in maintaining.

That said, we are striving hard to maintain backwards compatibility moving forward, so it's unlikely we'll have to update any of this anytime soon.

@matt-gardner any concerns?

epwalsh avatar Mar 01 '21 17:03 epwalsh

I don't have any particular objections, because I'm not the one maintaining it =). Seems pretty awkward to be making identical changes in two places if you want to change any content in the guide (assuming you do this for all of the chapters). It doesn't bother me one way or the other if you want to go this route, but finding a way to make one of these auto-generated from the other seems like a much more scalable solution. On the other hand, there are print statements and other binder-specific considerations that you might want to remove when the code goes into a notebook that you run on your own machine, or on colab or something.

matt-gardner avatar Mar 01 '21 17:03 matt-gardner

This looks really good in notebook format!

Thanks @epwalsh!

My one concern is that we'd have two copies of the same information (in the notebook vs in the original guide chapter), which means more difficulty in maintaining.

Once I began the work, I felt the same

ekdnam avatar Mar 01 '21 17:03 ekdnam

... but finding a way to make one of these auto-generated from the other seems like a much more scalable solution.

I agree @matt-gardner.

On the other hand, there are print statements and other binder-specific considerations that you might want to remove when the code goes into a notebook that you run on your own machine, or on colab or something.

Noted

ekdnam avatar Mar 01 '21 17:03 ekdnam

I will look into how the notebooks can be auto-generated. Thanks a lot for the suggestion @matt-gardner!

While writing the content for the colab, I noticed some discrepancies in the code, its explanation on the website, and the current code in allenai\allennlp-guide\quick_start.

For example, the tokenizer in ClassificationTsvReader is tokenizer or WhitespaceTokenizer(), but in the guide it is SpacyTokenizer(). The class in the guide has @DatasetReader.register('classification-tsv') , but not so in the actual code.

I am going to make a list of the discrepancies observed once the train.py notebook would be ready, and get your insights on it once it is done!

ekdnam avatar Mar 01 '21 17:03 ekdnam

@epwalsh can you please once go through your_first_model.ipynb?

ekdnam avatar Mar 06 '21 10:03 ekdnam

@epwalsh thanks a lot for your review! I am very sorry for the late reply. Our study workload at the college has increased, and I have not been able to give much attention to any work outside of college. Once it gets completed, I will begin working on your suggestions immediately.

ekdnam avatar Mar 19 '21 04:03 ekdnam

No rush @ekdnam!

epwalsh avatar Mar 19 '21 16:03 epwalsh