
STC and PSAR performance improvements

Open Rossco8 opened this issue 11 months ago • 0 comments

This PR contains two updates that address the next-slowest indicators, plus a proposed logic change to PSAR (see below).

STC - All I was able to do was combine the two for loops into one. Total profiled time drops from 0.113 to 0.047 seconds, so the new version is a little more than twice as fast as the previous one.
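
To illustrate the kind of change described for STC, here is a minimal sketch of fusing two consecutive loops into one. It is not the code from stc.py: the function names `smooth_two_loops` and `smooth_one_loop` and the simple recursive smoothing are assumptions chosen only to show the pattern, which applies when the second loop needs nothing from the first beyond values at or before the current index.

```python
import numpy as np


def smooth_two_loops(x, factor):
    """Two passes: first smooth x into a, then smooth a into b."""
    n = len(x)
    a = np.zeros(n)
    b = np.zeros(n)
    for i in range(1, n):
        a[i] = a[i - 1] + factor * (x[i] - a[i - 1])
    for i in range(1, n):
        b[i] = b[i - 1] + factor * (a[i] - b[i - 1])
    return b


def smooth_one_loop(x, factor):
    """Same arithmetic in a single pass: b[i] only needs a[i] and b[i - 1]."""
    n = len(x)
    a = np.zeros(n)
    b = np.zeros(n)
    for i in range(1, n):
        a[i] = a[i - 1] + factor * (x[i] - a[i - 1])
        b[i] = b[i - 1] + factor * (a[i] - b[i - 1])
    return b
```

Both functions return the same values; the fused version simply walks the data once instead of twice.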

ORIG STC

132884 function calls (132735 primitive calls) in 0.113 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8441    0.031    0.000    0.067    0.000 series.py:1086(__getitem__)
     8441    0.017    0.000    0.029    0.000 series.py:1211(_get_value)
        1    0.013    0.013    0.089    0.089 stc.py:300(schaff_tc)
     8481    0.008    0.000    0.009    0.000 series.py:827(_values)
        1    0.005    0.005    0.005    0.005 {built-in method _imp.create_dynamic}
     2516    0.005    0.000    0.005    0.000 {built-in method builtins.round}
     8444    0.003    0.000    0.005    0.000 indexing.py:2765(check_dict_or_set_indexers)
        1    0.003    0.003    0.005    0.005 {built-in method _imp.exec_dynamic}
43050/43030    0.002    0.000    0.002    0.000 {built-in method builtins.isinstance}
     8441    0.002    0.000    0.003    0.000 range.py:408(get_loc)
        1    0.002    0.002    0.015    0.015 __init__.py:2(<module>)
     8481    0.001    0.000    0.001    0.000 managers.py:2002(internal_values)
     8444    0.001    0.000    0.001    0.000 common.py:372(apply_if_callable)
      316    0.001    0.000    0.002    0.000 __init__.py:24(_wrapper)

NEW STC

64301 function calls (64186 primitive calls) in 0.047 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     3433    0.009    0.000    0.021    0.000 series.py:1086(__getitem__)
     3433    0.005    0.000    0.009    0.000 series.py:1211(_get_value)
        1    0.005    0.005    0.033    0.033 stc.py:158(schaff_tc)
     2516    0.005    0.000    0.005    0.000 {built-in method builtins.round}
        1    0.004    0.004    0.004    0.004 {built-in method _imp.create_dynamic}
     3451    0.002    0.000    0.003    0.000 series.py:827(_values)
        1    0.002    0.002    0.003    0.003 {built-in method _imp.exec_dynamic}
     3436    0.001    0.000    0.002    0.000 indexing.py:2765(check_dict_or_set_indexers)
     3433    0.001    0.000    0.001    0.000 range.py:408(get_loc)
17810/17792    0.001    0.000    0.001    0.000 {built-in method builtins.isinstance}
        2    0.001    0.000    0.001    0.000 rolling.py:601(calc)
        1    0.001    0.001    0.001    0.001 base.py:2313(is_unique)
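
The listings here are in the format printed by Python's cProfile and pstats modules, sorted by internal time. For anyone wanting to reproduce similar numbers, a minimal sketch follows. The helper name `profile_call` and its `top` parameter are hypothetical, and this is not necessarily how the figures above were produced.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args, **kwargs)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" is the sort key that pstats reports as "internal time".
    stats.sort_stats("tottime").print_stats(top)
    return result, buffer.getvalue()
```

Calling `profile_call(some_indicator, df["close"])` with a hypothetical indicator function would return the indicator result together with a report like the ones shown.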

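PSAR - Using NumPy arrays within the for loop, instead of reading and writing pandas objects element by element, gives a large improvement: total profiled time drops from 0.188 to 0.009 seconds.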
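
The pattern behind this kind of speed-up can be sketched as follows. This is not the PSAR code itself: `running_max_series` and `running_max_numpy` are hypothetical functions computing a simple running maximum, chosen only to contrast element-wise access on a pandas Series with the same loop over NumPy arrays. Both assume a non-empty numeric input.

```python
import numpy as np
import pandas as pd


def running_max_series(high: pd.Series) -> pd.Series:
    """Element-wise reads and writes on a Series (slow inside a loop)."""
    out = pd.Series(np.nan, index=high.index, dtype=float)
    out.iloc[0] = high.iloc[0]
    for i in range(1, len(high)):
        out.iloc[i] = max(out.iloc[i - 1], high.iloc[i])
    return out


def running_max_numpy(high: pd.Series) -> pd.Series:
    """Same loop, but over plain NumPy arrays; wrap in a Series at the end."""
    values = high.to_numpy(dtype=float)
    out = np.empty(len(values))
    out[0] = values[0]
    for i in range(1, len(values)):
        out[i] = max(out[i - 1], values[i])
    return pd.Series(out, index=high.index)
```

The results are identical; the second version avoids the per-element indexing overhead that dominates the ORIG PSAR profile (the `__setitem__` and `setitem` entries).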

ORIG PSAR

265070 function calls (264994 primitive calls) in 0.188 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
3773/3772    0.024    0.000    0.050    0.000 blocks.py:1373(setitem)
     3771    0.016    0.000    0.149    0.000 indexing.py:2529(__setitem__)
     3774    0.015    0.000    0.073    0.000 managers.py:317(apply)
        1    0.014    0.014    0.188    0.188 psar.py:10(psar)
     5028    0.013    0.000    0.022    0.000 indexing.py:2518(__getitem__)
     3771    0.011    0.000    0.123    0.000 series.py:1406(_set_values)
     3772    0.011    0.000    0.085    0.000 managers.py:372(setitem)
     3775    0.009    0.000    0.012    0.000 cast.py:1760(np_can_hold_element)
     3772    0.006    0.000    0.025    0.000 series.py:1486(_maybe_update_cacher)
     3772    0.006    0.000    0.013    0.000 generic.py:3992(_maybe_update_cacher)
     3772    0.005    0.000    0.006    0.000 generic.py:4399(_check_setitem_copy)
     3774    0.004    0.000    0.005    0.000 managers.py:1848(from_blocks)
11486/11482    0.004    0.000    0.006    0.000 {built-in method builtins.getattr}

NEW PSAR

4491 function calls (4418 primitive calls) in 0.009 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.002    0.002    0.004    0.004 psar.py:164(psar_new)
       11    0.000    0.000    0.001    0.000 series.py:389(__init__)
     1340    0.000    0.000    0.000    0.000 {built-in method builtins.min}
  636/617    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
       12    0.000    0.000    0.000    0.000 construction.py:517(sanitize_array)
       89    0.000    0.000    0.000    0.000 generic.py:42(_instancecheck)
       13    0.000    0.000    0.000    0.000 generic.py:6233(__finalize__)
       29    0.000    0.000    0.000    0.000 generic.py:6298(__setattr__)
  183/133    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.004    0.004 {built-in method builtins.exec}
      454    0.000    0.000    0.000    0.000 {built-in method builtins.max}
       18    0.000    0.000    0.000    0.000 generic.py:278(__init__)
       11    0.000    0.000    0.000    0.000 managers.py:1861(from_array)
        6    0.000    0.000    0.000    0.000 series.py:6192(_construct_result)
        3    0.000    0.000    0.000    0.000 base.py:1371(_arith_method)
       11    0.000    0.000    0.000    0.000 generic.py:806(_set_axis)
       20    0.000    0.000    0.000    0.000 {built-in method builtins.all}
       12    0.000    0.000    0.000    0.000 config.py:127(_get_single_key)

Proposed logic change for PSAR - The original version did not include row 1 when calculating the long and short values. I don't know whether that is intentional. The updated code starts the for loop at row 1, with `for row in range(1, m):` rather than `for row in range(2, m):`.
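
The effect of the loop start on row 1 can be seen in a toy example. The function `trailing_level` below is hypothetical and far simpler than PSAR; it only shows that an output array initialised to NaN keeps NaN in row 1 when the loop starts at row 2.

```python
import numpy as np


def trailing_level(close, start):
    """Fill out[row] for row in range(start, m); earlier rows stay NaN."""
    m = len(close)
    out = np.full(m, np.nan)
    level = close[0]
    for row in range(start, m):
        level = max(level, close[row - 1])
        out[row] = level
    return out


close = np.array([10.0, 12.0, 11.0, 13.0])
old = trailing_level(close, start=2)  # row 1 is never written
new = trailing_level(close, start=1)  # row 1 is included
```

Here `old[1]` is NaN while `new[1]` is 10.0. In this toy the later rows happen to match; whether that holds for PSAR depends on how its state is carried between rows.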

The results of the old and the new versions compare like this:

[Screenshot 2024-03-11 at 2 46 34 pm]

[Screenshot 2024-03-11 at 2 46 54 pm]

Rossco8 · Mar 11 '24 04:03