pandas-ta VWAP - Base Anchor Multiples (5M, 2H, et al) with regex

First, most of the END-of-period anchors work as expected, except the 6-month one: anchor="6M" produces exactly the same values as the 1-month anchor ("M").

df_D['VWAP_Y'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="Y")
df_D['VWAP_6M'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="6M") ###NOT WORKING
df_D['VWAP_Q'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="Q")
df_D['VWAP_M'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="M")
df_D['VWAP_W'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="W")
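For illustration, here is a minimal standard-library sketch of what a true six-month anchor would have to do: reset the running sums whenever the calendar half-year changes. The names half_year_key and anchored_vwap are hypothetical helpers and are not part of Pandas TA. The sketch assumes rows sorted by date with non-zero volume.

```python
from datetime import date


def half_year_key(d):
    """Hypothetical helper: (year, half), half is 1 for Jan-Jun, 2 for Jul-Dec."""
    return d.year, (d.month - 1) // 6 + 1


def anchored_vwap(rows):
    """Running VWAP that resets at the start of each calendar half-year.

    rows: iterable of (date, high, low, close, volume), sorted by date.
    Assumes volume is non-zero for the first row of every half-year.
    """
    out = []
    current_key = None
    cum_pv = 0.0  # cumulative typical price * volume within the bucket
    cum_v = 0.0   # cumulative volume within the bucket
    for d, high, low, close, volume in rows:
        key = half_year_key(d)
        if key != current_key:
            # New half-year: restart the accumulation.
            current_key, cum_pv, cum_v = key, 0.0, 0.0
        cum_pv += (high + low + close) / 3 * volume
        cum_v += volume
        out.append(cum_pv / cum_v)
    return out


rows = [
    (date(2021, 5, 31), 10, 10, 10, 100),
    (date(2021, 6, 30), 20, 20, 20, 100),
    (date(2021, 7, 1), 30, 30, 30, 50),  # first bar of the second half-year
]
result = anchored_vwap(rows)
```

With a monthly anchor the June bar would start a new bucket; with the half-year key it keeps accumulating until July.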

Second, the START-of-period anchors don't work at all.

df_D['VWAP_Y'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="SY")
df_D['VWAP_6SM'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="6SM")
df_D['VWAP_Q'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="SQ")
df_D['VWAP_M'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="SM")

I get the following errors, one for each line above:

'pandas._libs.tslibs.offsets.YearBegin' object has no attribute '_period_dtype_code'
'pandas._libs.tslibs.offsets.MonthBegin' object has no attribute '_period_dtype_code'
'pandas._libs.tslibs.offsets.QuarterBegin' object has no attribute '_period_dtype_code'
'pandas._libs.tslibs.offsets.MonthBegin' object has no attribute '_period_dtype_code'

Note: I'm following the offset aliases guide linked from help(ta.vwap):

https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases

Dec 08 '21 03:12 psychoMLM

Hello @psychoMLM,

Yeah, those are Pandas errors. It's also nice when people properly fill out bug reports so I don't have to ask the same questions, over and over, such as:

What version of Pandas TA are you running?
Do you have TA Lib installed?
Do you have a sample csv of the data and code (which you partially included) to share so the errors can be replicated?

Unfortunately, there are numerous other bugs and indicator requests queued up. If you want this to be resolved sooner and know how to fix it, I recommend forking, editing and submitting a PR.

Kind Regards, KJ

Dec 08 '21 04:12 twopirllc

I'm running into the same issue. I'm using Pandas TA v0.3.14b and I have TA-Lib installed. I've also included the dataset used in the code below. The dataset is from a YouTube tutorial found here

I'm going to presume that the 'anchor' parameter is based on the 'freq' parameter in pandas. Both are based on the same time series offset aliases.

If I use the base values (D, H, T, S), everything works fine, but the moment I use a multiple of an offset, like 6H, only the base offset H is applied.

df_D['VWAP_H'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="H")
df_D['VWAP_6H'] = ta.vwap(df_D.High, df_D.Low, df_D.Close, df_D.Volume, anchor="6H") ### Returns the same as anchor="H"

Furthermore, if it's based on the pandas 'freq' parameter, multiples work fine there, so I don't believe it's a pandas issue.

data = data.resample("1T").ffill()
data_5m = data.resample("5T").last()

I'm no coder, but I managed to slap something together that works. Maybe it could help. Sorry if posting something this long is inappropriate for this forum; I'm new to this and kept it to the minimum. This is my first ever post.

import os
import numpy as np
import pandas as pd
import datetime as dt
from datetime import timedelta
import re

# region Setting pandas display settings
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
# endregion


# region DateTime Extension
class datetime(dt.datetime):
    def __divmod__(self, delta):
        seconds = int((self - dt.datetime.min).total_seconds())
        remainder = dt.timedelta(
            seconds=seconds % delta.total_seconds(),
            microseconds=self.microsecond,
        )
        quotient = self - remainder
        return quotient, remainder

    def __floordiv__(self, delta):
        return divmod(self, delta)[0]

    def __mod__(self, delta):
        return divmod(self, delta)[1]
# endregion


# region Regex Methods
def extract_numbers(text):
    regex = "[0-9]+"
    return re.findall(regex, str(text))


def extract_text(text):
    regex = "[a-zA-Z]+"
    return re.findall(regex, str(text))


def split_anchor(text):
    # Split an anchor such as '2H' into (2, 'H'); the multiple defaults to 1.
    try:
        multiple = int(extract_numbers(text)[0])
    except IndexError:
        multiple = 1
    timeframe = extract_text(text)[0]
    return multiple, timeframe
# endregion


# region Implementation of the DateTime Extension
def get_mod(date, anchor='D'):
    t, b = split_anchor(anchor)

    # TODO Implement a delta for Years and Months
    # years = ['Y', 'y', 'Year', 'year', 'Years', 'years']
    # months = ['M', 'm', 'Mon', 'mon', 'Months', 'months']
    days = ['D', 'd', 'Day', 'day', 'Days', 'days']
    hours = ['H', 'h', 'hour', 'hours']
    # Note: in pandas offset aliases 'M' means month end; here it is treated as minutes.
    minutes = ['T', 't', 'M', 'm', 'min']
    seconds = ['S', 's', 'Sec', 'sec']

    if b in set(minutes):
        delta = datetime.strptime(date, "%Y-%m-%d %H:%M:%S") % timedelta(minutes=t)
    elif b in set(hours):
        delta = datetime.strptime(date, "%Y-%m-%d %H:%M:%S") % timedelta(hours=t)
    elif b in set(seconds):
        delta = datetime.strptime(date, "%Y-%m-%d %H:%M:%S") % timedelta(seconds=t)
    elif b in set(days):
        delta = datetime.strptime(date, "%Y-%m-%d %H:%M:%S") % timedelta(days=t)
    else:
        delta = datetime.strptime(date, "%Y-%m-%d %H:%M:%S") % timedelta(days=1)

    if delta == timedelta(0.0):
        return 1
    else:
        return 0
# endregion


def vwap(data, anchor='D'):
    data['typical_price (hlc3)'] = (data['High'] + data['Low'] + data['Close']) / 3
    data['pw'] = data['typical_price (hlc3)'] * data['Volume']
    data['group'] = (data.apply(lambda x: get_mod(str(x.name), anchor=anchor), axis=1)).cumsum()
    data['vwap'] = data.groupby(['group'])['pw'].cumsum() / data.groupby(['group'])['Volume'].cumsum()
    print(data[0:20])


if __name__ == "__main__":
    # region Preprocessing
    root = r'..\Data'  # raw string so the backslash is not read as an escape
    file_name = 'BTCUSD_Candlestick_15_M_ASK_05.08.2019-29.04.2022'
    path = os.path.join(root, f'{file_name}.csv')
    df = pd.read_csv(path)

    df['Gmt time'] = (df['Gmt time'].apply(
        lambda x: datetime.strptime(str(x), "%d.%m.%Y %H:%M:%S.%f"))
    )
    df = df.set_index('Gmt time')
    # endregion

    vwap(df, anchor='2H')

BTCUSD_Candlestick_15_M_ASK_05.08.2019-29.04.2022.csv
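The row-by-row apply above can probably be replaced by a vectorized grouping. The following is a minimal sketch, assuming a DataFrame with a DatetimeIndex and High, Low, Close and Volume columns as in the script above. The name vwap_floor is hypothetical. It only covers fixed-width anchors (seconds, minutes, hours, days), not months or years, and it uses the "min" alias because the spelling of the hour alias differs between pandas releases.

```python
import pandas as pd


def vwap_floor(df, anchor="120min"):
    """Hypothetical helper: VWAP that resets at every multiple of a fixed-width anchor."""
    typical_price = (df["High"] + df["Low"] + df["Close"]) / 3
    # Bars that floor to the same timestamp belong to the same bucket.
    group = df.index.floor(anchor)
    cum_pv = (typical_price * df["Volume"]).groupby(group).cumsum()
    cum_v = df["Volume"].groupby(group).cumsum()
    return cum_pv / cum_v


# Tiny synthetic example (made-up numbers, not the BTCUSD dataset).
idx = pd.date_range("2022-01-01 00:00", periods=4, freq="60min")
sample = pd.DataFrame(
    {
        "High": [10.0, 20.0, 30.0, 40.0],
        "Low": [10.0, 20.0, 30.0, 40.0],
        "Close": [10.0, 20.0, 30.0, 40.0],
        "Volume": [1.0, 1.0, 1.0, 1.0],
    },
    index=idx,
)
sample_vwap = vwap_floor(sample, anchor="120min")
```

Here the first two hourly bars share the 00:00 bucket and the last two share the 02:00 bucket, so the running VWAP restarts at the third bar.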

Jul 13 '22 00:07 FunckyMonkey

Hello @FunckyMonkey,

Thanks for providing a detailed comment with sample code. It helps a lot. 😎

Furthermore, if it's based on the pandas 'freq' parameter, multiples work fine there, so I don't believe it's a pandas issue.

Correct. Pandas works as intended. The VWAP anchor only handles base anchors (H, M, T, ...) and not their multiples (2H, 5M, 10T, ...). 🤔 I'm surprised that Pandas does not have a built-in anchor string parser like the one you have provided... or I am not looking in the right place.

I'll check out the Anchor parser when I get a chance, but off the top of my head it looks legit. 😎
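On the built-in parser question, pandas' to_offset function may be the right place to look: it turns an alias string into an offset object whose n attribute holds the multiple. A minimal sketch follows. The wrapper name split_anchor_pandas is hypothetical, and whether this fits the VWAP code is an assumption that has not been tested against Pandas TA.

```python
from pandas.tseries.frequencies import to_offset


def split_anchor_pandas(anchor):
    """Hypothetical wrapper: return (multiple, offset class name) for an alias string."""
    offset = to_offset(anchor)
    return offset.n, type(offset).__name__


minute_anchor = split_anchor_pandas("15min")
day_anchor = split_anchor_pandas("2D")
```

The class name is used instead of the alias string because alias spellings changed across pandas releases (for example "H" in older versions and "h" in newer ones).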

Kind Regards, KJ

Jul 17 '22 18:07 twopirllc
