mplfinance icon indicating copy to clipboard operation
mplfinance copied to clipboard

Dataframe index used for plots

Open Philip030601 opened this issue 2 years ago • 3 comments

I have a pandas dataframe with roughly 45 stocks that is indexed with numbers (0 to 2705) with columns: Dates, Open, High, Low, Close, etc.. Data is from dates between 2022-06-15 to 2022-08-15. Dates are in a Date column in the dataframe and not used as the index and type pandas.core.series. I would like to plot a chart for each stock for dates between 2022-06-15 to 2022-08-15. I looked at the examples and all the examples have the Date column as the index. Please advise how I can use mplfinance to plot all as stated. Much thanks in advance,

image

Philip030601 avatar Aug 16 '22 01:08 Philip030601

@Philip030601 -- How did you get the DataFrame in the first place? I'm asking only because the simplest thing to do is to create the datetime index at the same time that you are initially reading in the DataFrame. I realize that sometimes this is not possible, in which case I will shortly post alternative solutions here.

DanielGoldfarb avatar Aug 16 '22 18:08 DanielGoldfarb

In the future, if you have a question about manipulating data, please provide the data file somewhere (not just an image of some of the data). I created my own data for this example (just to be sure it would work). You can see my sample data here.

My DataFrame looks similar to yours:

	Date	Adj Close	Close	High	Low	Open	Volume	Ticker
0	2022-06-13	98.930000	98.930000	105.330002	98.360001	102.849998	9587300  ABNB
1	2022-06-14	98.870003	98.870003	101.349998	97.200996	99.540001	5915300  ABNB
2	2022-06-15	101.470001	101.470001	103.650002	98.919998	100.019997	6260400  ABNB
3	2022-06-16	93.260002	93.260002	97.500000	92.089996	96.190002	9435700  ABNB
4	2022-06-17	99.489998	99.489998	100.610001	93.529999	93.529999	11288700  ABNB
...	...	...	...	...	...	...	...	...
130	2022-08-10	419.989990	419.989990	420.140015	416.720001	418.779999	68665700  SPY
131	2022-08-11	419.989990	419.989990	424.950012	419.209991	422.989990	59489700  SPY
132	2022-08-12	427.100006	427.100006	427.209991	421.029999	422.029999	61644600  SPY
133	2022-08-15	428.859985	428.859985	429.410004	424.709991	424.769989	54048300  SPY
134	2022-08-16	429.700012	429.700012	431.730011	426.880005	427.730011	57589116  SPY
135 rows × 8 columns

There are 3 tickers in it:

print(df['Ticker'].unique())
['ABNB', 'PSHG', 'SPY']

The solution to your problem is as follows:

  1. The 'Date' column values are most likely strings (if they are already datetime objects then you can skip this step), so you need to convert them to datetimes, so that the .set_index() method will turn them into a DatetimeIndex.
  2. Use the DataFrame.set_index() method to create a MultiIndex from the 'Ticker' and 'Date' Columns. This way you can easily reference (select) each Ticker separately and each such reference will be a DataFrame with a DatetimeIndex for that Ticker:
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index(['Ticker','Date'],inplace=True)

Now the DataFrame looks like this:

dfs

Now you can access, and plot, the data for each ticker individually:

for ticker in ['ABNB', 'PSHG', 'SPY']:
    mpf.plot(df.loc[ticker])

DanielGoldfarb avatar Aug 16 '22 18:08 DanielGoldfarb

Much thanks for the reply and the elegant solution. I will try it out and get back with you .

Philip030601 avatar Aug 17 '22 05:08 Philip030601