data-pipelines-course
Cannot fetch Google or Yahoo data with pandas-datareader
Hi Kjam,
I have an issue when trying to fetch the finance data.
It seems like some authorization issue; I'm not sure if you have encountered it and got it resolved.
Best regards,
YX
Hi YX,
Thanks for reaching out. What was the error that you received?
Thanks! -kj
Hi, I had the same issue. Here is an error message when using 'yahoo' as a source:
Traceback (most recent call last):
  File "C:/Users/Administrator/Dropbox/Projects/drafts/yahoo.py", line 18, in <module>
    print(get_stock_info('FB', datetime(2017, 1, 1), datetime(2017, 3, 1)))
  File "C:/Users/Administrator/Dropbox/Projects/drafts/yahoo.py", line 6, in get_stock_info
    df = data.DataReader(stock, source, start, end)
  File "C:\Program Files\Python36\lib\site-packages\pandas_datareader\data.py", line 291, in DataReader
    raise ImmediateDeprecationError(DEP_ERROR_MSG.format('Yahoo Daily'))
pandas_datareader.exceptions.ImmediateDeprecationError: Yahoo Daily has been immediately deprecated due to large breaks in the API without the introduction of a stable replacement. Pull Requests to re-enable these data connectors are welcome. See https://github.com/pydata/pandas-datareader/issues
Here is my workaround. Based on the documentation (https://pandas-datareader.readthedocs.io/en/latest/remote_data.html), I picked the 'morningstar' source (because it sounds cool) instead of 'yahoo'.
The only issue is that it doesn't return an 'Adj Close' column, so I had to remove that key from the aggregation part.
Here is the solution that works for me:
from pandas_datareader import data

def get_stock_info(stock, start, end, source='morningstar'):
    df = data.DataReader(stock, source, start, end)
    df['Stock'] = stock
    # Aggregate summary stats per stock; Morningstar has no 'Adj Close',
    # so that column is left out of the aggregation.
    agg = df.groupby('Stock').agg({
        'Open': ['min', 'max', 'mean', 'median'],
        'Close': ['min', 'max', 'mean', 'median'],
        'High': ['min', 'max', 'mean', 'median'],
        'Low': ['min', 'max', 'mean', 'median'],
    })
    # Flatten the MultiIndex columns, e.g. ('Open', 'min') -> 'Open min'.
    agg.columns = [' '.join(col).strip() for col in agg.columns.values]
    return agg.to_json()
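You can then call it the same way as in the traceback above, for example:

from datetime import datetime

print(get_stock_info('FB', datetime(2017, 1, 1), datetime(2017, 3, 1)))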
Hope that helps.
Thanks
Morningstar is now discontinued too, per: https://github.com/pydata/pandas-datareader/issues/557
You can still go to Morningstar's site and sign up for API access, but as far as I can see pandas_datareader has no way to pass in a Morningstar key.
The IEX source seems to work (in general; I haven't tested this course's code with it yet): use 'iex'.
'iex' works if you change the column names to lower case and 'Adj Close' to 'close'.
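To make that concrete, here is a sketch of the earlier function adapted for 'iex', assuming it returns the lower-case 'open'/'high'/'low'/'close' columns described above (I haven't run this against the rest of the course code):

from pandas_datareader import data

def get_stock_info_iex(stock, start, end, source='iex'):
    # IEX uses lower-case column names and has no 'Adj Close' column.
    df = data.DataReader(stock, source, start, end)
    df['stock'] = stock
    agg = df.groupby('stock').agg({
        'open': ['min', 'max', 'mean', 'median'],
        'close': ['min', 'max', 'mean', 'median'],
        'high': ['min', 'max', 'mean', 'median'],
        'low': ['min', 'max', 'mean', 'median'],
    })
    agg.columns = [' '.join(col).strip() for col in agg.columns.values]
    return agg.to_json()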
Hi, I am having problems following the tutorial. I keep getting this error while importing get_stock_info from tasks:
In [1]: from tasks import get_stock_info
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/anaconda3/envs/test/lib/python3.6/configparser.py in _unify_values(self, section, vars)
   1137         try:
-> 1138             sectiondict = self._sections[section]
   1139         except KeyError:

KeyError: 'celery'

During handling of the above exception, another exception occurred:

NoSectionError                            Traceback (most recent call last)
<ipython-input-1-...> in <module>()
----> 1 from tasks import get_stock_info

~/Downloads/data-pipelines-course-master/celery_app/tasks.py in <module>()
      1 ''' Task module for showing celery functionality. '''
      2 from pandas_datareader import data
----> 3 from celeryapp import app
      4 from urllib.error import HTTPError, URLError
      5 import pandas as pd

~/Downloads/data-pipelines-course-master/celery_app/celeryapp.py in <module>()
     16 config.read(os.path.join(current_dir, 'config/dev.cfg'))
     17
---> 18 app = Celery('tasks', broker=config.get('celery', 'broker_url'))
     19
     20 CELERY_CONFIG = {

/anaconda3/envs/test/lib/python3.6/configparser.py in get(self, section, option, raw, vars, fallback)
    779         """
    780         try:
--> 781             d = self._unify_values(section, vars)
    782         except NoSectionError:
    783             if fallback is _UNSET:

/anaconda3/envs/test/lib/python3.6/configparser.py in _unify_values(self, section, vars)
   1139         except KeyError:
   1140             if section != self.default_section:
-> 1141                 raise NoSectionError(section)
   1142         # Update with the entry specific variables
   1143         vardict = {}
NoSectionError: No section: 'celery'
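In case it helps: that NoSectionError just means configparser could not find a [celery] section, which usually means celery_app/config/dev.cfg is missing or doesn't define one. A minimal sketch of the file, assuming a local Redis broker (the URL is a placeholder; point it at whichever broker you actually run):

[celery]
broker_url = redis://localhost:6379/0

The section name 'celery' and the key 'broker_url' match what celeryapp.py reads in the traceback above.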
I'm having the same issue as @ajatau https://github.com/kjam/data-pipelines-course/issues/5#issuecomment-465202197