vectorbt How to skip nan

Hi, when a stock is suspended its price is NaN, and I need to skip those values. How can I do this in vectorbt?

import vectorbt as vbt
import numpy as np
import pandas as pd
import talib

# test price
price = np.array([1,2,3,4,5,6,7,8,9], dtype=float)
print(talib.SMA(price, timeperiod=2))

# price with nan
price = np.array([1,2,3,4,5,np.nan, np.nan, 6,7,8,9], dtype=float)
print(talib.SMA(price, timeperiod=2))

# use vectorbt
SMA = vbt.IndicatorFactory.from_talib('SMA')
print(SMA.run(price, timeperiod=2).real.values)


# my way to skip nan in talib
output = np.full_like(price, np.nan)
notnan = ~np.isnan(price)
output[notnan] = talib.SMA(price[notnan], timeperiod=2)
print(output)

output:

[nan 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5]
[nan 1.5 2.5 3.5 4.5 nan nan nan nan nan nan]
[nan 1.5 2.5 3.5 4.5 nan nan nan nan nan nan]
[nan 1.5 2.5 3.5 4.5 nan nan 5.5 6.5 7.5 8.5]

Jun 27 '21 09:06 wukan1986

TA-Lib doesn't play well with NaN; use the built-in MA instead:

vbt.MA.run(np.array([1, 2, 3, 4, 5, np.nan, np.nan, 6, 7, 8, 9], dtype=float), 2).ma
0     NaN
1     1.5
2     2.5
3     3.5
4     4.5
5     NaN
6     NaN
7     NaN
8     6.5
9     7.5
10    8.5
Name: 2, dtype: float64

Jun 27 '21 09:06 polakowo

What about indicators that are not built in?

Jun 27 '21 09:06 wukan1986

Just forward-fill the price before running an indicator.

Jun 27 '21 09:06 polakowo
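
To see why forward-filling and skipping can disagree, here is a minimal sketch on the price series from the question. It assumes a plain pandas rolling mean as a stand-in for the indicator and a hypothetical window of 3 (the question uses 2, where the two approaches happen to agree on the bars after the gap).

```python
import numpy as np
import pandas as pd

price = np.array([1, 2, 3, 4, 5, np.nan, np.nan, 6, 7, 8, 9], dtype=float)
window = 3  # hypothetical window, chosen so that the difference is visible

# Option 1: forward-fill, then compute. The last traded price (5) is
# repeated across the gap and enters the averages that follow it.
sma_ffill = pd.Series(price).ffill().rolling(window).mean().to_numpy()

# Option 2: the masking approach from the question. Compute on valid
# prices only and write the results back to their original positions.
notnan = ~np.isnan(price)
sma_skip = np.full_like(price, np.nan)
sma_skip[notnan] = pd.Series(price[notnan]).rolling(window).mean().to_numpy()

# First bar after the gap (index 7):
#   forward-fill averages 5, 5, 6; skipping averages 4, 5, 6
print(sma_ffill[7], sma_skip[7])
```

Which of the two is preferable depends on whether the suspended days should count as bars with an unchanged price or should not count as bars at all.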

But forward-filling the price does not give the result I want.

Jun 27 '21 10:06 wukan1986

There is no easy way of filling NaN values after an indicator has been run. You cannot just take all non-NaN values, run an indicator, and overwrite the NaN values with the result.

Jun 27 '21 10:06 polakowo

Read this

Jun 27 '21 10:06 polakowo

This is also an issue that I constantly encounter when working with portfolios that consist of securities with different trading calendars. When you align the calendars in a pd.DataFrame, you introduce np.nan in various places of the individual time series (columns).

That becomes an issue once you want to compute indicators (such as moving averages, for instance) for all securities. In my view, the correct way is to compute the indicator column-wise. You have to remove all np.nan's in order to clean the time series before actually computing the indicator. After finishing the computation, I usually reindex back to the original datetime index, which gives you an indicator time series with np.nan's at the correct position. You can then decide if you want to propagate the indicator values.

Is it fair to say that I can't use vbt.MA if I want this behavior?

Dec 14 '21 11:12 andreas-vester
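
The column-wise approach described in the post above (drop NaNs, compute, reindex back) can be sketched roughly as follows. This is only an illustration: `rolling_mean_skipna` is a hypothetical helper name, the sample data is made up, and a pandas rolling mean stands in for whatever indicator is actually needed.

```python
import numpy as np
import pandas as pd

def rolling_mean_skipna(df: pd.DataFrame, window: int) -> pd.DataFrame:
    """Hypothetical helper: compute a rolling mean per column on valid rows only."""
    out = {}
    for name, col in df.items():
        clean = col.dropna()                 # remove NaNs from this column
        ind = clean.rolling(window).mean()   # stand-in for any indicator
        out[name] = ind.reindex(df.index)    # NaNs return at the original dates
    return pd.DataFrame(out, index=df.index)

# Two made-up securities with different trading calendars
idx = pd.date_range("2021-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {"A": [1, 2, np.nan, 3, 4, 5], "B": [10, np.nan, np.nan, 20, 30, 40]},
    index=idx,
    dtype=float,
)

result = rolling_mean_skipna(df, window=2)
print(result)

# Optional: propagate the last indicator value across the gaps
propagated = result.ffill()
```

Looping over columns is what costs the (small) performance hit compared with a single vectorized call.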

@andreas-vester I'm using the same approach. No, none of the built-in indicators do this on a per-column basis. But it's fairly easy to create your own indicator that splits the columns, runs the underlying indicator on each of them, and merges the results. Maybe I could even integrate it into IndicatorFactory, but most likely into the pro version, which is in development. The only drawback of this approach is a (small) performance hit.

Dec 14 '21 12:12 polakowo

I found a way to push the valid values to the bottom of each column (so the NaNs end up at the top) and then use talib:

https://stackoverflow.com/questions/32062157/move-non-empty-cells-to-the-left-in-pandas-dataframe

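def pushna(arr):
    # Sort each column so that NaNs move to the top; idx records the source row of each value
    idx = (~np.isnan(arr)).argsort(axis=0)
    col = np.arange(arr.shape[1])[None]
    return arr[idx, col], idx, col

def pullna(arr, row, col):
    # Scatter the values back to the rows they originally came from
    tmp = np.empty_like(arr)
    tmp[row, col] = arr
    return tmp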

# pushna needs a 2D NumPy array, so convert the DataFrame first
a, row, col = pushna(np.asarray(df, dtype=float))

# call SMA
b = SMA(a)

print(pullna(b, row, col))

Dec 15 '21 01:12 wukan1986
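
For reference, a self-contained run of the push/pull idea on a small made-up array. Two things here are assumptions rather than part of the post: a pandas rolling mean is used as a stand-in for the talib call, and `kind="stable"` is passed to `argsort` so that valid values are guaranteed to keep their order.

```python
import numpy as np
import pandas as pd

def pushna(arr):
    # False (NaN) sorts before True (valid), so NaNs move to the top of each column
    idx = (~np.isnan(arr)).argsort(axis=0, kind="stable")
    col = np.arange(arr.shape[1])[None]
    return arr[idx, col], idx, col

def pullna(arr, row, col):
    # Write each value back to the row it originally came from
    tmp = np.empty_like(arr)
    tmp[row, col] = arr
    return tmp

# Made-up prices for two securities; NaN marks a day without trading
prices = np.array(
    [[1, 10], [2, np.nan], [np.nan, np.nan], [3, 20], [4, 30], [5, 40]],
    dtype=float,
)

a, row, col = pushna(prices)

# Stand-in for the indicator: a 2-bar mean that, like the SMA in the
# question, yields NaN while its window still contains a NaN
b = pd.DataFrame(a).rolling(2).mean().to_numpy()

restored = pullna(b, row, col)
print(restored)
```

The round trip relies on the indicator returning NaN for the leading NaN rows, since those outputs are what end up at the original NaN positions.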

I like this solution.

I found that I need to pass kind="stable" to argsort to preserve the original row order for larger time series. Instead of

idx = (~np.isnan(arr)).argsort(axis=0)

use

idx = (~np.isnan(arr)).argsort(axis=0, kind="stable")

Dec 27 '21 11:12 andreas-vester
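
A quick way to check the ordering guarantee is sketched below, on synthetic data with a fixed seed. NumPy's default sort kind makes no stability promise, whereas `kind="stable"` does, so the sketch only asserts properties of the stable variant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# One column of strictly increasing values, so any reordering is easy to detect
values = np.arange(n, dtype=float).reshape(-1, 1)
values[rng.random(n) < 0.2, 0] = np.nan   # knock out roughly 20% of the rows

mask = ~np.isnan(values)
idx = mask.argsort(axis=0, kind="stable")
pushed = values[idx, np.arange(values.shape[1])[None]]

n_nan = int(np.isnan(values).sum())
nans_on_top = bool(np.isnan(pushed[:n_nan, 0]).all())
order_kept = bool(np.all(np.diff(pushed[n_nan:, 0]) > 0))
print(nans_on_top, order_kept)
```

If `order_kept` were False, any indicator run on the pushed array would see the prices out of chronological order.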
