deeptrain
Can we have an example showing how to use in-memory object, e.g., TF Dataset
Thanks for creating this project. It will be very useful for giving TF users something like PyTorch Lightning.
I already have code that converts text files into a TF Dataset and want to give DeepTrain a try. Could you please provide a simple example showing how to use a Dataset in DataGenerator?
Glad you find it useful.
I'm not too familiar with tf.Dataset; I found some of its mechanisms needlessly complicated, and the performance gains from using TFRecords negligible. If you have any specific features in mind that tf.Dataset provides and DataGenerator doesn't, I can note them as feature requests (or explain how to achieve them if you're unsure). For example, there's much to gain from Dataset's train-load parallelism, for which I'm willing to consider built-in support (currently, there isn't any).
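For context, the train-load parallelism mentioned above can be sketched in plain Python: a background thread fills a bounded queue with batches while the consumer trains, which is the idea behind tf.data's prefetching. This is an illustrative sketch, not DeepTrain or tf.data code; all names here are made up.

```python
import queue
import threading

def batches():
    """Stand-in for a slow data loader (e.g. reading files from disk)."""
    for i in range(4):
        yield [i] * 3  # pretend this is a batch

def prefetch(generator, buffer_size=2):
    """Yield batches from `generator`, loading them in a background thread.

    While the consumer processes batch N, the producer thread is already
    reading batch N+1 into the queue, overlapping loading with training.
    """
    q = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for batch in generator:
            q.put(batch)
        q.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = q.get()
        if batch is done:
            return
        yield batch

for batch in prefetch(batches()):
    print(batch)
```

With a loader that actually blocks on I/O, the queue lets loading and training proceed concurrently instead of alternating.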
You have two options: (1) convert your data pipeline from Dataset to a DataGenerator equivalent; (2) make Dataset work with DataGenerator. I can't say much without seeing your specific Dataset configuration, but I'll show where to look: DataGenerator and DataLoader here, and the DataLoader docs. For (2), you'll need to set up Dataset so that it takes set_num as input and returns data as output. Each DataGenerator has two DataLoaders, one for the data x and the other for the labels y (see here).
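As a rough illustration of the set_num-in, data-out interface described above, here is a minimal sketch using an in-memory NumPy array; set_num comes from the quoted description, but the function names and batching scheme are hypothetical, not DeepTrain's actual DataLoader API.

```python
import numpy as np

# Hypothetical in-memory dataset: 100 samples of 8 features, binary labels.
x_all = np.arange(100 * 8, dtype=float).reshape(100, 8)
y_all = (np.arange(100) % 2).reshape(100, 1)

batch_size = 10
set_nums = list(range(len(x_all) // batch_size))  # one "set" per batch

def load_x(set_num):
    """Return the x batch for a given set_num (illustrative name)."""
    start = set_num * batch_size
    return x_all[start:start + batch_size]

def load_y(set_num):
    """Return the y batch for a given set_num (illustrative name)."""
    start = set_num * batch_size
    return y_all[start:start + batch_size]
```

The point is only the shape of the contract: given a set_num, each loader deterministically returns the corresponding slice of x or y, which is what the two DataLoaders need from your Dataset-backed pipeline.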
I'm guessing others will have similar concerns, so I'm willing to take a closer look at your case if you share sufficient code.