vectorbt
Consider resampling the CCXT or Binance data
Thank you for your amazing project.
I am learning this project now.
But when I try to use BinanceData or CCXTData,
I find that there are only 9900 rows.
However, there should be 7 * 24 * 60 = 10080 rows.
That means some data is missing from CCXT or from Binance.
Time series analysis should start from data on a regular frequency, so how about adding a resampling option for the downloaded data?
Thank you again for your great work.
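For reference, the row-count check above can be turned into a list of the missing timestamps with plain pandas. This is only a sketch: the index below is made-up data standing in for a downloaded 1-minute series, and the 1-minute frequency is an assumption taken from the 7 * 24 * 60 arithmetic.

```python
import pandas as pd

# Hypothetical 1-minute index standing in for downloaded data:
# 10 minutes in total, with the rows at minutes 3, 4 and 7 missing.
full = pd.date_range("2021-01-01 00:00", periods=10, freq="1min", tz="UTC")
index = full.delete([3, 4, 7])

# Every timestamp we would expect between the first and last row
# at the assumed 1-minute frequency.
expected = pd.date_range(index[0], index[-1], freq="1min")

# Timestamps that are expected but absent from the data.
missing = expected.difference(index)

print(f"expected={len(expected)}, actual={len(index)}, missing={len(missing)}")
print(missing)
```

Note that this only finds gaps between the first and last downloaded row; rows missing at either end of the requested period would need the requested start and end passed to `pd.date_range` instead.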
@XieXiaonan there are gaps when Binance is down. Resampling means changing the frequency of the data. But it sounds like you would rather add the missing data points as NaN? Please elaborate.
I mean resampling at the same frequency and filling the missing data with the previous data. For example, suppose the data between 01:00 and 01:20 is lost due to Binance. We can fill all the candlesticks in that interval with Open=High=Low=Close=Close['00:59']. Of course we can do this without vectorbt, but I believe it is a necessary step for everyone; otherwise time series analysis makes no sense when the data frequency keeps changing. So I think it could be added to vbt.
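A minimal pandas sketch of the fill described here, outside of vectorbt. The frame, its column names, and the 1-minute frequency are assumptions made for illustration; the gap is shortened to two candles (01:00 and 01:01) to keep the example small.

```python
import pandas as pd

# Hypothetical 1-minute OHLC data from 00:58 to 01:02.
index = pd.date_range("2021-01-01 00:58", periods=5, freq="1min")
ohlc = pd.DataFrame(
    {
        "Open": [10.0, 11.0, 12.0, 13.0, 14.0],
        "High": [10.5, 11.5, 12.5, 13.5, 14.5],
        "Low": [9.5, 10.5, 11.5, 12.5, 13.5],
        "Close": [11.0, 12.0, 13.0, 14.0, 15.0],
    },
    index=index,
)

# Simulate an outage: the candles at 01:00 and 01:01 are missing.
ohlc = ohlc.drop(index[[2, 3]])

# Put the data back on a regular 1-minute grid; missing candles become NaN rows.
filled = ohlc.asfreq("1min")

# Remember which rows are gaps and what the last known close was at each row.
is_gap = filled["Close"].isna()
prev_close = filled["Close"].ffill()

# Inside gaps, set Open = High = Low = Close = previous close.
for col in ["Open", "High", "Low", "Close"]:
    filled[col] = filled[col].where(~is_gap, prev_close)

print(filled)
```

Here the two restored candles all take the value of the 00:59 close. A Volume column is not covered by the rule above, so how to fill it is left open.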
Forward filling missing data isn't always the best approach. Having hundreds of data points with the same price and volume can impact your model severely (since it doesn't know that your data is missing, it just thinks that everything stays the same and can easily overfit on those data). I would rather fill those gaps with nan so your algorithm knows there is missing data, and you as a user can then forward fill those nans if you want.
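As a rough illustration of the NaN-first approach with plain pandas (not a vectorbt API; the series and the 1-minute frequency are made up for the example), the gaps can be exposed by reindexing onto a complete grid, and forward filling stays an explicit, optional second step:

```python
import pandas as pd

# Hypothetical 1-minute close prices with two rows lost to an outage.
index = pd.date_range("2021-01-01 00:00", periods=6, freq="1min")
close = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0], index=index, name="Close")
close = close.drop(index[[2, 3]])

# Step 1: reindex onto the complete grid so missing points show up as NaN.
grid = pd.date_range(close.index[0], close.index[-1], freq="1min")
with_nan = close.reindex(grid)

# The mask keeps the information that these points were never observed.
was_missing = with_nan.isna()

# Step 2 (optional, user's choice): forward fill the NaNs afterwards.
ffilled = with_nan.ffill()

print(with_nan)
print(ffilled)
```

Keeping `was_missing` around lets a model or a filter distinguish genuinely flat prices from filled gaps even after forward filling.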
Cool, filling with nan is also what I want