Chapter 08: Using Yahoo historic_Data fails

Open gfilios opened this issue 7 years ago • 4 comments

Hi,

Your demo code from chapter 8, which accesses Yahoo's historical data, fails. The error is "yahoo_finance.YQLResponseMalformedError: Response malformed." and it is raised after calling: prices = get_prices('MSFT', '1992-07-22', '2016-07-22')

This failure is described in many articles, for example: http://www.financial-hacker.com/bye-yahoo-and-thank-you-for-the-fish/#more-2443

gfilios avatar Nov 03 '17 20:11 gfilios

Same problem here.

It appears the Yahoo API has been discontinued, so this code no longer works, not even for those who bought the book from Manning.

There is a workaround that scrapes Yahoo Finance data using Beautiful Soup, but it does not fit the code in this book.

Fortunately, there is a fix that I have found to work well.

It is described in "Yahoo Finance API / URL not working: Python fix for Pandas DataReader". I followed the steps in https://pypi.python.org/pypi/fix-yahoo-finance to install:

$ pip install fix_yahoo_finance --upgrade --no-cache-dir
$ pip install pandas_datareader

Then do the following:

from pandas_datareader import data as pdr
import fix_yahoo_finance
from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf
import random


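def get_prices(share_symbol, start_date, end_date,
               cache_filename='stock_prices.npy'):
	# Download daily quotes for the requested symbol and date range
	# (the arguments were previously ignored in favour of hardcoded values)
	data = pdr.get_data_yahoo(share_symbol, start_date, end_date)
	cols = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']
	data = data.reindex(columns=cols)
	data.reset_index(inplace=True, drop=False)

	# Keep only the opening prices and save them to the cache file
	stock_prices = data['Open'].values
	np.save(cache_filename, stock_prices)

	return stock_prices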

def plot_prices(prices):

	plt.title('Opening stock prices')
	plt.xlabel('day')
	plt.ylabel('price ($)')
	plt.plot(prices)
	plt.savefig('prices.png')
	plt.show()



if __name__ == '__main__':
  prices = get_prices('MSFT', '1992-07-22', '2016-07-22')
  plot_prices(prices)

This should work.
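One caveat: the get_prices above downloads the data on every run and only writes the cache file, it never reads it back. A minimal sketch of a cache-first wrapper follows, assuming the prices fit in a NumPy array. The names cached_prices and fetch are hypothetical and do not come from the book or from pandas_datareader.

```python
import os

import numpy as np


def cached_prices(fetch, cache_filename='stock_prices.npy'):
    """Return prices from cache_filename if it exists, else call fetch() and cache the result.

    fetch is any zero-argument callable returning a sequence of prices
    (hypothetical helper, not part of the book's code).
    """
    if os.path.exists(cache_filename):
        # Cache hit: no network access needed
        return np.load(cache_filename)
    # Cache miss: download once, then store for later runs
    prices = np.asarray(fetch())
    np.save(cache_filename, prices)
    return prices
```

It could be used as cached_prices(lambda: get_prices('MSFT', '1992-07-22', '2016-07-22')), so the download only happens when the cache file is missing. Delete the .npy file to force a refresh.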

patalanov avatar Nov 08 '17 03:11 patalanov

My solution is to switch the provider. Here is the code in case you want to use Alpha Vantage (alphavantage.co). Replace YOUR_API_CODE_GOES_HERE_ with your own API key.

from io import StringIO

import numpy as np
import requests
from matplotlib import pyplot as plt



def get_csv_from_alpha(share_symbol):
    # https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&apikey=YOUR_API_CODE_GOES_HERE_&datatype=csv&outputsize=full&symbol=MSFT
    base_url = "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&apikey=YOUR_API_CODE_GOES_HERE_&datatype=csv&outputsize=full"
    symbol = "&symbol=" + share_symbol
    final_url = base_url + symbol
    response = requests.get(final_url)
    response.raise_for_status()  # stop here on an HTTP error instead of parsing an error page
    return response.text


def csv2stock(csv_text):
    # timestamp,open,high,low,close,volume
    # 2017-11-03,174.0000,174.2600,171.1200,172.5000,58683826
    c = StringIO(csv_text)
    stocks = np.loadtxt(c, skiprows=1, delimiter=',', usecols=[1])
    return stocks[::-1]


def get_stock_prices(share_symbol):
    csv_text = get_csv_from_alpha(share_symbol)
    stock_prices = csv2stock(csv_text)
    return stock_prices


def get_prices(share_symbol, cache_filename='stockprices.npy'):
    try:
        stock_prices = np.load(cache_filename)
    except IOError:
        stock_prices = get_stock_prices(share_symbol)
        np.save(cache_filename, stock_prices)
    return stock_prices


def plot_prices(prices):
    plt.title('Opening Stock Prices')
    plt.xlabel('day')
    plt.ylabel('price($)')
    plt.plot(prices)
    plt.savefig('prices.png')



if __name__ == '__main__':
    prices = get_prices('MSFT')
    plot_prices(prices)
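The original get_prices took a start and end date, while the version above returns the whole history. A minimal sketch of a date filter on the same CSV layout (timestamp,open,high,low,close,volume) follows. The name csv2stock_range is hypothetical, and in the sample only the 2017-11-03 opening price comes from the comment in csv2stock; the other values are made up for illustration.

```python
import csv
from io import StringIO


def csv2stock_range(csv_text, start_date, end_date):
    """Return opening prices with start_date <= timestamp <= end_date, oldest first."""
    rows = csv.DictReader(StringIO(csv_text))
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings
    selected = [(row['timestamp'], float(row['open']))
                for row in rows
                if start_date <= row['timestamp'] <= end_date]
    selected.sort()  # chronological order, whatever the order in the file
    return [price for _, price in selected]


# Made-up sample rows (newest first, like the example row in csv2stock)
sample_csv = (
    "timestamp,open,high,low,close,volume\n"
    "2017-11-03,174.0,175.0,171.0,172.5,100\n"
    "2017-11-02,20.0,21.0,19.0,20.5,100\n"
    "2017-11-01,10.0,11.0,9.0,10.5,100\n"
)

print(csv2stock_range(sample_csv, '2017-11-02', '2017-11-03'))
```

Wrapping the result in np.array(...) gives the same kind of array that csv2stock returns.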

gfilios avatar Nov 09 '17 19:11 gfilios

I used pandas_datareader with 'iex' as the provider, which worked nicely:

import datetime

import pandas as pd
import requests_cache

# Compatibility shim: some pandas_datareader releases still import
# is_list_like from pandas.core.common, so it must be set before the import below
pd.core.common.is_list_like = pd.api.types.is_list_like
from pandas_datareader import data

def get_prices(provider, symbol, start_date, end_date, cache_filename='stock_prices.npy'):
    # Cache HTTP responses in a local sqlite file for three days
    expire_after = datetime.timedelta(days=3)
    session = requests_cache.CachedSession(cache_name='cache', backend='sqlite', expire_after=expire_after)

    stock_hist = data.DataReader(symbol, provider, start_date, end_date, session=session)
    close_prices = stock_hist['close']
    return [close_prices.values.tolist()]

nisbus avatar Jun 13 '18 15:06 nisbus

The fix_yahoo_finance workaround posted by patalanov above worked for me. Saved my day.

MUmarAmanat avatar Nov 06 '19 15:11 MUmarAmanat