pandas-datareader
Parse FamaFrenchReader 'DESCR'
This is a feature request.
Current behaviour:
from pandas_datareader.famafrench import FamaFrenchReader
description = FamaFrenchReader("F-F_Research_Data_Factors").read()["DESCR"]
This returns the description as a single free-form string holding several pieces of information, rather than something more easily accessible in Python such as a dictionary.
Suggestion: Parse "DESCR" to return something akin to
{
    "title": "F-F Research Data Factors",
    "info": "This file bla bla",
    "names": {
        "0": "(59 rows x 4 cols)",
        "1": "Annual Factors: January-December (5 rows x 4 cols)",
    },
}
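Toy example that works as intended on "F-F_Research_Data_Factors" but assumes a very specific structure for 'DESCR' (a title block, an info block, and a block of "index : name" lines, each separated by an empty row):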
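from itertools import groupby


def parse_descr(descr):
    """
    Parse the currently returned 'DESCR' string into a dictionary.

    Parameters
    ----------
    descr : str

    Returns
    -------
    dict
    """
    # Split the string into lines, then group consecutive non-empty lines;
    # empty rows act as separators between the blocks.
    descr = [i.strip() for i in descr.split("\n")]
    data = [list(s) for e, s in groupby(descr, key=bool) if e]
    # Assumes exactly this block order: title, info, names.
    keys = ["title", "info", "names"]
    description = {}
    for d, k in zip(data, keys):
        if k == "names":
            # Split on the first ":" only, since a name may itself contain
            # a colon (e.g. "Annual Factors: January-December ...").
            pairs = (i.split(":", 1) for i in d)
            description[k] = {key.strip(): value.strip() for key, value in pairs}
        else:
            # Join wrapped lines with a space and drop the dashed underline.
            description[k] = " ".join(d).strip("- ")
    return description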
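A slightly more defensive variant could look like the sketch below. It does not rely on there being exactly three blocks: any line of the form "<integer> : <text>" is treated as a table name and everything else after the title is collected as info. The function name parse_descr_blocks and the sample string are hypothetical; the sample only mirrors the layout assumed in this issue and is not real output from FamaFrenchReader.

```python
from itertools import groupby


def parse_descr_blocks(descr):
    """Hypothetical, more defensive variant of the toy parser in this issue."""
    lines = [line.strip() for line in descr.splitlines()]
    # Consecutive non-empty lines form one block; blank lines separate blocks.
    blocks = [list(group) for non_empty, group in groupby(lines, key=bool) if non_empty]
    if not blocks:
        return {"title": "", "info": "", "names": {}}

    # First block: the title, possibly followed by a dashed underline.
    title = " ".join(line for line in blocks[0] if line.strip("-")).strip()

    info_lines, names = [], {}
    for block in blocks[1:]:
        for line in block:
            key, sep, value = line.partition(":")
            # Assumption: "<integer> : <text>" lines name the returned tables;
            # every other line is treated as free-form info.
            if sep and key.strip().isdigit():
                names[key.strip()] = value.strip()
            else:
                info_lines.append(line)
    return {"title": title, "info": " ".join(info_lines), "names": names}


# Made-up sample mirroring the layout assumed in this issue, not real output.
sample = """F-F Research Data Factors
-------------------------

This file bla bla

  0 : (59 rows x 4 cols)
  1 : Annual Factors: January-December (5 rows x 4 cols)
"""

parsed = parse_descr_blocks(sample)
print(parsed["names"]["1"])
```

Using str.partition instead of indexing into split(":") keeps a name such as "Annual Factors: January-December ..." intact and avoids an exception on lines without a colon. Whether other Fama-French datasets follow the same layout is untested here.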