# Training is extremely slow on GluonTS [Torch]
## Description
I am quite frustrated because training my model is very slow on an RTX 3080. I am training on 500 CSV files. Any help would be appreciated.
## To Reproduce
```python
import os

import pandas as pd

from gluonts.dataset.common import ListDataset
from gluonts.torch import DeepAREstimator

# Base directory where the data folders are located
base_dir = '/media/cvpr/CM_1/coremax_cpu_usage/coremax_cpu/rnd'

# List of folder names
folders = ['2013-7', '2013-8', '2013-9']

# Initialize an empty DataFrame to collect all data
all_data = pd.DataFrame()

# Iterate over each folder and read every CSV file
for folder in folders:
    folder_path = os.path.join(base_dir, folder)
    for file in os.listdir(folder_path):
        if file.endswith('.csv'):
            file_path = os.path.join(folder_path, file)
            temp_df = pd.read_csv(file_path, delimiter=';')
            temp_df.columns = temp_df.columns.str.strip()  # Strip whitespace from column names
            all_data = pd.concat([all_data, temp_df], ignore_index=True)

print(all_data)

# Convert the millisecond timestamps to datetime and set them as the index
all_data['Timestamp'] = pd.to_datetime(all_data['Timestamp [ms]'], unit='ms')
all_data.set_index('Timestamp', inplace=True)

# Prepare the dataset for GluonTS: one series with six dynamic real features
training_data = ListDataset([{
    "start": all_data.index[0],
    "target": all_data['CPU usage [MHZ]'].values,
    "feat_dynamic_real": all_data[
        ['CPU cores', 'Memory usage [KB]', 'Disk read throughput [KB/s]', 'Disk write throughput [KB/s]',
         'Network received throughput [KB/s]', 'Network transmitted throughput [KB/s]']].values.T
}], freq="1min")  # Change "1min" to the actual frequency of your data

# Define the DeepAR estimator and train it
estimator = DeepAREstimator(
    freq="1min",              # Change to your data's frequency
    prediction_length=12,     # Adjust based on how far you want to predict
    context_length=24,        # Should be at least as long as prediction_length
    num_feat_dynamic_real=6,  # Needed so the six dynamic features above are actually used
    batch_size=64,
    trainer_kwargs={"max_epochs": 1, "accelerator": "gpu"},
)
predictor = estimator.train(training_data=training_data)
```
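Worth noting: the loop above concatenates all 500 files into one long target array, so GluonTS sees a single very long series rather than 500 separate ones. If the files are in fact independent series, a minimal sketch like the following (assuming each file has the same column layout as above) would build one dataset entry per file instead:

```python
# Sketch: one ListDataset entry per CSV file (assumes each file is an
# independent series with the same columns as in the snippet above).
entries = []
for folder in folders:
    folder_path = os.path.join(base_dir, folder)
    for file in os.listdir(folder_path):
        if not file.endswith('.csv'):
            continue
        df = pd.read_csv(os.path.join(folder_path, file), delimiter=';')
        df.columns = df.columns.str.strip()
        df['Timestamp'] = pd.to_datetime(df['Timestamp [ms]'], unit='ms')
        df.set_index('Timestamp', inplace=True)
        entries.append({
            "start": df.index[0],
            "target": df['CPU usage [MHZ]'].values,
        })

training_data = ListDataset(entries, freq="1min")
```

Many shorter series also let the training loader sample windows across series, which is usually friendlier to batching than one concatenated series.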
## Error message or code output
```
Epoch 0: | | 3/? [08:49<00:00, 0.01it/s, v_num=22]
```
## Environment
- Operating system: Ubuntu 20.04
- Python version: 3.8.18
- GluonTS version: 0.14.3
- MXNet version: N/A (using the PyTorch backend)
@khawar-islam what is the performance when running on CPU?
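For a quick comparison, the same estimator can be trained on the CPU by switching the Lightning accelerator; a minimal sketch, reusing the `training_data` from above:

```python
# Sketch: identical estimator, trained on the CPU for a timing comparison.
cpu_estimator = DeepAREstimator(
    freq="1min",
    prediction_length=12,
    context_length=24,
    batch_size=64,
    trainer_kwargs={"max_epochs": 1, "accelerator": "cpu"},
)
cpu_predictor = cpu_estimator.train(training_data=training_data)
```

If the data pipeline rather than the model is the bottleneck, passing `cache_data=True` to `train(...)` caches the transformed dataset in memory (available in recent GluonTS versions, if I recall correctly), which can also be worth timing.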
I'm not sure you can expect great performance from a DeepAR model (at least with default hyperparameters), since it is based on a recurrent neural network: its operations are sequential across time steps and cannot be parallelized, so GPU utilization will be extremely low.
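To illustrate the point, a non-recurrent model processes all time steps of a batch in parallel and tends to keep the GPU much busier. A minimal sketch with GluonTS's feed-forward estimator (hyperparameters are illustrative, not tuned):

```python
from gluonts.torch import SimpleFeedForwardEstimator

# Sketch: a non-recurrent baseline whose operations parallelize well on GPU.
ff_estimator = SimpleFeedForwardEstimator(
    prediction_length=12,
    context_length=24,
    batch_size=64,
    trainer_kwargs={"max_epochs": 1, "accelerator": "gpu"},
)
ff_predictor = ff_estimator.train(training_data=training_data)
```

Comparing its epoch time against DeepAR's on the same data should make the recurrent bottleneck visible.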