Need more easy-to-follow first tutorial and use of convenient Data APIs, etc.

Open yogeshhk opened this issue 3 years ago • 2 comments

In the tutorial https://www.tensorflow.org/swift/tutorials/model_training_walkthrough:

  • One section heading reads "Import data with the Epochs API". This is a nomenclature issue: data should be imported with a data API, so why "Epochs"? Data can be batched in the training loop, but that comes later.
  • Instead of loading the text file with NumPy, could pandas be used, as is common in the Python ML ecosystem? Calling loadtxt twice (once for the features and again for the labels) does not make sense either.
  • Batching should not be mentioned in a beginner-level tutorial like this, IMO, especially when the dataset is small. The section titled "Create a dataset using the Epochs API" looks unnecessarily complicated and irrelevant.
  • In summary, this tutorial should be as simple as the corresponding Keras tutorial https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/

yogeshhk avatar Oct 16 '20 03:10 yogeshhk

Thank you for the detailed feedback. Hearing specific points we can improve in the documentation is always really appreciated, because when you've been working on something for a while, it's difficult to have the perspective of someone coming to this as a new project.

My impression was that in the early, very experimental days of the project, the target audience of Swift for TensorFlow consisted of researchers or practitioners who had machine learning experience in other frameworks and languages. That colored the tutorials and documentation that were assembled at the time, including the Model Training Walkthrough. The goal in that walkthrough as written was to show how Swift for TensorFlow approached all the parts of this problem, not necessarily to introduce someone to machine learning via Swift.

Now that the project has matured, and we are seeing more people who come from a Swift background and want to start with machine learning, or people who want to start with both Swift and ML, I agree that more introductory material would be valuable.

When it comes to mentions of Epochs and batching in that tutorial, that might be an artifact of the migration from using TensorFlow Datasets to the newer Epochs API. In the original version of that tutorial, we had hidden helper functions that managed the dataset for you, and we referred only to these functions and how they loaded a dataset. When these Datasets were deprecated in favor of the Epochs API, we replaced mentions of them with Epochs and its parts. I can see how referring to that directly in a tutorial like this might present too many details too soon.
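For readers puzzled by the naming, here is a rough sketch in plain Swift of the idea the word "epochs" refers to: each epoch is one full pass over the data, delivered as freshly shuffled batches. This is not the Epochs API itself. `batchesForOneEpoch` and its parameters are hypothetical names made up for illustration, and the sample count and batch size are arbitrary.

```swift
/// Splits the indices 0..<sampleCount into shuffled batches for one epoch.
/// Hypothetical helper for illustration only; not part of Swift for TensorFlow.
func batchesForOneEpoch<G: RandomNumberGenerator>(
    sampleCount: Int, batchSize: Int, using generator: inout G
) -> [[Int]] {
    precondition(sampleCount >= 0, "sampleCount must not be negative")
    precondition(batchSize > 0, "batchSize must be positive")
    // Reshuffle on every call so each epoch visits the samples in a new order.
    let shuffled = Array(0..<sampleCount).shuffled(using: &generator)
    // Cut the shuffled indices into consecutive chunks; the last may be short.
    return stride(from: 0, to: sampleCount, by: batchSize).map { start in
        Array(shuffled[start..<min(start + batchSize, sampleCount)])
    }
}

var generator = SystemRandomNumberGenerator()
for epoch in 1...2 {
    let batches = batchesForOneEpoch(sampleCount: 10, batchSize: 4, using: &generator)
    print("epoch \(epoch): batch sizes \(batches.map { $0.count })")
}
```

Seen this way, batching is a detail of the training loop rather than of importing the data, which is the distinction raised in the original report.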

Likewise for the exposure of NumPy there, which was originally hidden in a separate Swift file and not exposed in the notebook. When we converted from Datasets to Epochs, it simplified the code enough that we could include the file loading directly in the notebook. That all could even be done without needing to call out to Python libraries, if we wanted to rework that code. There's no reason that pandas couldn't be used there, but NumPy was what had been chosen originally. We'd be open to having a simpler, cleaner implementation of that.
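As a rough illustration of what loading the file without Python libraries could look like, here is a minimal sketch using only Foundation. It reads features and labels in a single pass instead of two separate loads. It assumes a CSV layout with one header row, floating-point feature columns, and an integer label in the last column; `LabeledTable` and `parseLabeledCSV` are hypothetical names, and the sample rows are made up for the example. Converting the resulting arrays into tensors is left out.

```swift
import Foundation

/// Hypothetical container for parsed rows; not part of Swift for TensorFlow.
struct LabeledTable {
    var features: [[Float]] = []
    var labels: [Int32] = []
}

/// Parses CSV text whose last column is an integer label and whose
/// remaining columns are floating-point features. Rows that do not
/// parse cleanly are skipped.
func parseLabeledCSV(_ text: String, skipHeader: Bool = true) -> LabeledTable {
    var table = LabeledTable()
    let lines = text.split(whereSeparator: { $0.isNewline })
    for line in lines.dropFirst(skipHeader ? 1 : 0) {
        let fields = line.split(separator: ",").map {
            $0.trimmingCharacters(in: .whitespaces)
        }
        // The label is the last field; everything before it is a feature.
        guard fields.count >= 2, let label = Int32(fields[fields.count - 1]) else {
            continue
        }
        let values = fields.dropLast().compactMap { Float($0) }
        // Drop the row if any feature failed to convert.
        guard values.count == fields.count - 1 else { continue }
        table.features.append(values)
        table.labels.append(label)
    }
    return table
}

// Made-up sample data in the assumed layout.
let sample = """
sepal_length,sepal_width,petal_length,petal_width,species
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
"""
let table = parseLabeledCSV(sample)
print(table.features.count, table.labels)
```

Reading from disk would only add a call such as `String(contentsOfFile:encoding:)` before parsing.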

Again, thank you for the feedback. If you (or anyone else) would like to try to assemble a better introductory tutorial, pull requests would be welcomed.

BradLarson avatar Oct 21 '20 21:10 BradLarson

Thanks a lot for such a detailed response. As I get to know Swift for TensorFlow better, I plan to prepare introductory tutorials and will share them here.

yogeshhk avatar Oct 22 '20 01:10 yogeshhk