predict methods extremely slow

Open QhenryQ opened this issue 1 year ago • 7 comments

[Screenshot: Screen Shot 2024-01-17 at 12 04 45 PM]

QhenryQ avatar Jan 17 '24 17:01 QhenryQ

Hi, just curious what objects the sequences contain. I understand the sequence length, but how big or small is each element in a sequence?

catchthemonster avatar Jan 17 '24 17:01 catchthemonster

The training sequences look like this, all of the same length: [['soy', 'soy', 'soy', 'corn', 'rice', 'cotton'], ..... ]

QhenryQ avatar Jan 17 '24 17:01 QhenryQ

I am working with huge amounts of ints... I will run something similar to what you did to check performance. Just for fun: I have a Threadripper 3970X (32 cores, 64 threads), and I can tell you that with previous versions of cpt I could not overclock the mobo/CPU because the machine would overheat and lock up. When cpt runs, all cores are 100% utilized. I even set up CPU frequency scaling in Unix to try to lower the frequency mid-run, but it would still heat the mobo/CPU to 104 C; it just took a bit more time. Finally I gave up, and I am running without any overclocking, which keeps the mobo/CPU in check ... I will let you know what comes up when I run my models with your configurations ...

catchthemonster avatar Jan 17 '24 17:01 catchthemonster

I appreciate the help. This is what I ended up doing. It is still slow: fewer than 1000 rows per minute. I will just leave it running. If I use predict on the entire set of sequences, it just gets stuck. [Screenshot: Screen Shot 2024-01-17 at 12 47 22 PM]

QhenryQ avatar Jan 17 '24 17:01 QhenryQ

Hello @QhenryQ, indeed 1k rows / min seems a bit slow.

This usually comes from several factors that we can tune to speed up the predictions a bit, probably (but not necessarily) at the cost of accuracy.

MBR is the number of similar sequences that must be found in the training set before a prediction is made. If the dataset is noisy, a higher MBR can increase prediction time considerably. You can try leaving it at 0 and see how that affects the prediction time and your KPI.

The noise_ratio is supposed to speed up the predictions a bit, but there might be a bug I am not aware of. Tuning this meta parameter can also help a bit.
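One way to compare settings is to time predict on a small sample for each (MBR, noise_ratio) pair. The sketch below is only an illustration, not the library's own tooling: `sweep_predict_time` and the `make_model` factory are hypothetical names, and the only thing assumed about the model is that it exposes `fit` and `predict`.

```python
import itertools
import time


def sweep_predict_time(make_model, train, sample, mbr_values, noise_values):
    """Time predict() on `sample` for every (MBR, noise_ratio) pair.

    `make_model(mbr, noise_ratio)` is a hypothetical factory supplied by the
    caller; it must return an object with fit() and predict() methods.
    """
    results = []
    for mbr, noise in itertools.product(mbr_values, noise_values):
        model = make_model(mbr, noise)
        model.fit(train)
        # Only the prediction step is timed; training time is excluded.
        start = time.perf_counter()
        predictions = model.predict(sample)
        elapsed = time.perf_counter() - start
        results.append({
            "MBR": mbr,
            "noise_ratio": noise,
            "seconds": elapsed,
            "predictions": predictions,
        })
    # Fastest configuration first.
    return sorted(results, key=lambda row: row["seconds"])
```

With cpt, the factory might be something like `lambda mbr, noise: Cpt(MBR=mbr, noise_ratio=noise)`, assuming the constructor accepts those keyword names (worth checking against the library documentation). Keeping the predictions in each row makes it possible to compute the KPI per setting alongside the timing.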

Can you check the CPU usage of your cores, to confirm that all the threads are indeed being used correctly?
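Besides watching a system monitor, a rough check from inside Python is to compare CPU time with wall-clock time around the call. This is a minimal sketch using only the standard library; `cpu_parallelism` is a hypothetical helper name. It relies on `time.process_time()` counting CPU time for the whole process, so a ratio close to 1 suggests a single busy core, while a ratio close to the core count suggests the threads are all working.

```python
import os
import time


def cpu_parallelism(func, *args, **kwargs):
    """Run func and report how many CPU seconds were used per wall second."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    result = func(*args, **kwargs)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    ratio = cpu_used / wall_used if wall_used > 0 else 0.0
    stats = {
        "wall_seconds": wall_used,
        "cpu_seconds": cpu_used,
        "busy_cores": ratio,               # approximate number of busy cores
        "logical_cores": os.cpu_count(),   # for comparison
    }
    return result, stats
```

For example, `predictions, stats = cpu_parallelism(model.predict, sequences)` followed by printing `stats["busy_cores"]` next to `stats["logical_cores"]` gives a quick indication, though the figure is approximate and short runs will be noisy.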

Don't hesitate to reply here if the predictions are still slow after playing with the meta parameters.

bluesheeptoken avatar Jan 20 '24 15:01 bluesheeptoken

Sounds good. I will test these approaches and keep you posted. Thank you for getting back to me.

QhenryQ avatar Jan 20 '24 15:01 QhenryQ

Hey :wave:

Were you able to make it work faster?

bluesheeptoken avatar Feb 27 '24 10:02 bluesheeptoken