deepdow
Increase in Columns
After running raw_to_Xy on multi-indexed data, how do I amend the code in the getting started Python notebook so that it takes the change in matrix size into account? Sorry, I am still trying to figure it out and am quite new to this.
Hey @isaactyj
Do you think you could share a minimal reproducible example of your issue? Without leaking any private information, feel free to create a minimal dataset and upload it if necessary.
The data I use is just from Yahoo Finance, so I can just upload the code that builds the multi-index data:
import pandas as pd
import yfinance as yf
from deepdow.utils import raw_to_Xy

stocks = ['AAPL', 'GOOGL', 'MSFT', 'AMZN', 'TSLA']
start_date = '2013-01-01'
end_date = '2019-01-01'

# Download one ticker first to get the dates used as the shared index
# (assumption: `dates` was not defined in the original snippet)
data = yf.download("AAPL", start=start_date, end=end_date)
dates = data.index

columns_multi_index = pd.MultiIndex.from_product([stocks, ['Close', 'Volume']], names=['Stock', 'Metric'])
all_data = pd.DataFrame(columns=columns_multi_index, index=dates)

# Fill in the Close and Volume columns for each stock
for i in stocks:
    print(i)
    data = yf.download(i, start=start_date, end=end_date)
    close = data['Close']
    volume = data['Volume']
    all_data[(i, 'Close')] = close.tolist()
    all_data[(i, 'Volume')] = volume.tolist()
n_timesteps = len(all_data)
n_channels = len(all_data.columns.levels[0])  # size of column level 0 ('Stock')
n_assets = len(all_data.columns.levels[1])  # size of column level 1 ('Metric')
lookback, gap, horizon = 5, 2, 4
n_samples = n_timesteps - lookback - horizon - gap + 1

X, timestamps, y, asset_names, indicators = raw_to_Xy(all_data,
                                                      lookback=lookback,
                                                      gap=gap,
                                                      freq="B",
                                                      horizon=horizon,
                                                      use_log=True)

# Both of these assertions fail
assert X.shape == (n_samples, n_channels, lookback, n_assets)
assert timestamps[0] == all_data.index[lookback]
I instantly get an assertion error for both assertions. I am actually using 18 stocks, but for simplicity I'll just change it to 5 stocks.
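Two things may be worth checking here, though neither is confirmed in this thread. First, with columns built as `from_product([stocks, metrics])`, column level 0 holds the stocks and level 1 holds the metrics, so the two counts in the snippet look swapped. Second, assuming raw_to_Xy reindexes the data onto a regular calendar for the given `freq`, a business-day calendar can contain more rows than the exchange's trading days, which would change the expected number of samples. The sketch below uses only pandas and a small hypothetical trading calendar to illustrate both points; it does not call deepdow.

```python
import pandas as pd

stocks = ["AAPL", "GOOGL", "MSFT", "AMZN", "TSLA"]
columns = pd.MultiIndex.from_product(
    [stocks, ["Close", "Volume"]], names=["Stock", "Metric"]
)

# Levels follow the order of the lists passed to from_product:
# level 0 is "Stock", level 1 is "Metric".
n_assets = len(columns.levels[0])
n_channels = len(columns.levels[1])
print(n_assets, n_channels)  # 5 2

# Hypothetical trading calendar for illustration: Monday 2018-12-24,
# then Wednesday to Friday (25 December is a market holiday).
trading_days = pd.DatetimeIndex(
    ["2018-12-24", "2018-12-26", "2018-12-27", "2018-12-28"]
)

# A business-day ("B") range over the same span also includes the holiday.
business_days = pd.date_range(trading_days[0], trading_days[-1], freq="B")
print(len(trading_days), len(business_days))  # 4 5
```

If this applies, computing the expected shape from the shape of the returned `X`, or from a business-day range over the same dates, would be a more reliable check than `len(all_data)`.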