Parametric UMAP run time performance on GPU
I am currently testing Parametric UMAP on a GPU, and I was expecting a performance lift compared to running on CPU.
However, the run time does not improve at all; it is actually worse.
Here is my setup:

- Instance type: g5.16xlarge
- TF version: 2.8
- Training data set size: 817,614 rows with 107 columns (362 columns after one-hot encoding), with the intention of running on a roughly 10x larger set later; the full data set is 80MM+ rows

Parametric UMAP parameters:

```python
import tensorflow as tf
from umap.parametric_umap import ParametricUMAP

keras_fit_kwargs = {
    "callbacks": [
        tf.keras.callbacks.EarlyStopping(
            monitor="loss",
            min_delta=10**-2,
            patience=10,
            verbose=1,
        )
    ]
}

embedder = ParametricUMAP(
    verbose=True,
    batch_size=512,
    keras_fit_kwargs=keras_fit_kwargs,
    n_training_epochs=10,
)

gpus = tf.config.list_logical_devices("GPU")
with tf.device(gpus[0].name):  # '/device:GPU:0'
    umap_features = embedder.fit_transform(df_train_transformed)
```
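For reference, here is the quick sanity check I run beforehand (a minimal sketch, independent of the UMAP code above) to confirm that TensorFlow actually sees the GPU and that ops are placed on it:

```python
import tensorflow as tf

# Confirm TF can see the GPU at all (empty list means CPU-only build or driver issue)
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)

# Log which device each op runs on (noisy; disable after debugging)
tf.debugging.set_log_device_placement(True)

# A small matmul to verify placement; with a GPU visible, b.device
# should report /device:GPU:0
a = tf.random.normal((1024, 1024))
b = tf.matmul(a, a)
print("Result device:", b.device)
```

If the GPU shows up here but training is still slower than CPU, my understanding is the bottleneck may be data transfer or small batch sizes rather than compute.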
Neither the GPU nor the CPU run ever got close to finishing, but based on the estimated per-epoch time: CPU estimated 7 minutes versus GPU estimated 13 minutes.
Any tutorial/best practice for speeding up Parametric UMAP (on either GPU or CPU) would be highly appreciated. Thanks a lot!