
Decoding stores that were recently encrypted by Yahoo! Finance

Open raphi6 opened this issue 2 years ago • 29 comments

Sorry for any inconvenience; I am new to working with Git in such a professional manner, so expect errors with this pull request.

Changes:

In pandas_datareader/yahoo/daily.py:

I have added the function decrypt_cryptojs_aes() to decode the stores. Since Yahoo! Finance's recent change, the encrypted stores had been preventing any stock data from being accessed.

Additionally, I changed _read_one_data() so that it reads the decoded stores and passes on the stock data correctly.
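For readers unfamiliar with the format: when CryptoJS encrypts with a passphrase, it emits OpenSSL-compatible output (the bytes `Salted__`, an 8-byte salt, then the ciphertext) and derives the AES key and IV with an MD5-based scheme equivalent to OpenSSL's EVP_BytesToKey. The sketch below shows only that derivation step with the standard library. It assumes the stores follow this format and is an illustration, not the code in this PR; the names `split_salted` and `evp_bytes_to_key` and the sample input are hypothetical.

```python
import hashlib
from base64 import b64decode, b64encode


def split_salted(blob):
    """Split an OpenSSL-style salted blob into (salt, ciphertext)."""
    if blob[:8] != b"Salted__":
        raise ValueError("missing 'Salted__' header")
    return blob[8:16], blob[16:]


def evp_bytes_to_key(password, salt, key_len=32, iv_len=16):
    """Single-iteration MD5 key/IV derivation, as in OpenSSL's EVP_BytesToKey."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each round hashes the previous digest plus password and salt.
        block = hashlib.md5(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


# Made-up input, only to show the shapes involved.
encoded = b64encode(b"Salted__" + b"12345678" + b"\x00" * 32)
salt, ciphertext = split_salted(b64decode(encoded))
key, iv = evp_bytes_to_key(b"example-password", salt)
print(len(key), len(iv), len(ciphertext))
```

The key and IV would then go to AES in CBC mode (for example pycryptodome's `AES.new(key, AES.MODE_CBC, iv=iv)`), followed by the `unpad(plaintext, 16, style="pkcs7")` call.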

I have tested this on a limited number of stocks in my personal project and it works well. I ran test_yahoo.py: 16 tests passed and 4 failed. That is still more passing tests than the current version on GitHub, which is broken by Yahoo! Finance's change (I don't think any Yahoo! stocks work at the moment). I am unsure why those tests are failing, so some help would be great. I am sure this can at least be used as a temporary fix!

I also don't know how to run the 3rd and 4th bullet points below.

  • [x] closes #952
  • [x] tests added / passed
  • [ ] passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • [x] passes black --check pandas_datareader
  • [x] added entry to docs/source/whatsnew/vLATEST.txt

raphi6 avatar Dec 18 '22 23:12 raphi6

Please make this PR a priority. The Yahoo API is entirely bricked across related packages.

CKDarling avatar Dec 21 '22 15:12 CKDarling

I think we need to update requirements.txt with packaging and pycryptodome. Some folks also mentioned pycryptodomex but I didn't need this package as far as my testing.

satoshi avatar Dec 23 '22 22:12 satoshi

Thanks, just updated it now.

raphi6 avatar Dec 25 '22 00:12 raphi6

It seems your main failures in the Azure DevOps logs are the linter (flake8) and the formatter (black) checks.

To pass the formatter, in your project root directory, run: black pandas_datareader then run black --check pandas_datareader to check that worked and commit your changes.

To pass the linter, in your project root directory, run git diff upstream/master -u -- "*.py" | flake8 --diff Then you will have to fix the issues manually. Once you do, run it again to confirm you didn't miss anything.

Finally, commit your changes! Good luck :)

mariamragab avatar Dec 28 '22 20:12 mariamragab

Can't wait till this request gets committed! I use a data reader a lot and this error is causing a lot of problems in my code. thank you for your work @raphi6

sangar3 avatar Dec 28 '22 21:12 sangar3

Seems your main failures in the Azure DevOps logs are you are failing both the linter (flake8) and the formatter (black) tests.

To pass the formatter, in your project root directory, run: black pandas_datareader then run black --check pandas_datareader to check that worked and commit your changes.

To pass the linter, in your project root directory, run git diff upstream/master -u -- "*.py" | flake8 --diff Then you will have to fix the issues manually. Once you do, run it again to confirm you didn't miss anything.

Finally, commit your changes! Good luck :)

I meant to say I changed 3 files in the above commit. I also noticed that it got rid of some 'u's from pandas_datareader/tests/io/test_jsdmx.py and pandas_datareader/tests/yahoo/test_options.py, and I have no idea what that does or whether it breaks anything.

But now I am struggling with the second command, the one using flake8. I installed it with pip, tried to run the above, and got the error message: fatal: bad revision 'upstream/master' usage: flake8 [options] file file ... flake8: error: unrecognized arguments: --diff

I have been trying to understand what the problem is, but I'm completely unfamiliar with git diff and flake8.

raphi6 avatar Dec 28 '22 23:12 raphi6

Can't wait till this request gets committed! I use a data reader a lot and this error is causing a lot of problems in my code. thank you for your work @raphi6

Me too, my dissertation uses this library and it won't work until this gets accepted :) I have already submitted

raphi6 avatar Dec 28 '22 23:12 raphi6

Has anyone reached out to get this merged? And does it work on Windows?

spot92 avatar Jan 05 '23 03:01 spot92

Asking the same question here. I'm trying to use Tiingo instead but that doesn't seem to be working either?

robliou avatar Jan 05 '23 04:01 robliou

Has anyone reached out to get this merged? And does it work on Windows?

I've emailed @bashtage a couple of times with no luck; I believe he is the only one who can merge. Also, I'm on Windows 10 and it works. Maybe you guys can try contacting him as well, @robliou?

raphi6 avatar Jan 05 '23 12:01 raphi6

If you've already emailed (or contacted on github or whatever) him/her, then all there is to do is wait

spot92 avatar Jan 05 '23 14:01 spot92

Sometimes I get the following error : If anyone could help out that would be great

"""
Traceback (most recent call last):
  File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 406, in <module>
    Backtest().range_of_days()
  File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 392, in range_of_days
    var = VaR(stock_list, temp_start, temp_end, weights, alpha).historical_var() * np.sqrt(t)
  File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 38, in __init__
    yahoo_data = pandasdr.get_data_yahoo(s, end=end, start=start)['Close']
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\data.py", line 80, in get_data_yahoo
    return YahooDailyReader(*args, **kwargs).read()
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 258, in read
    df = self._dl_mult_symbols(self.symbols)
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 268, in _dl_mult_symbols
    stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
  File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\yahoo\daily.py", line 238, in _read_one_data
    data = new_j["HistoricalPriceStore"]
UnboundLocalError: local variable 'new_j' referenced before assignment

Process finished with exit code 1
"""

raphi6 avatar Jan 12 '23 20:01 raphi6

https://github.com/ranaroussi/yfinance/issues/1291#issuecomment-1382278565 This works for me, but keeping the unpad block size at 16:

         encrypted_stores = data['context']['dispatcher']['stores']
-        _cs = data["_cs"]
-        _cr = data["_cr"]
-
-        _cr = b"".join(int.to_bytes(i, length=4, byteorder="big", signed=True) for i in json.loads(_cr)["words"])
-        password = hashlib.pbkdf2_hmac("sha1", _cs.encode("utf8"), _cr, 1, dklen=32).hex()
+        password_key = next(key for key in data.keys() if key not in ["context", "plugins"])
+        password = data[password_key]

         encrypted_stores = b64decode(encrypted_stores)
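The lookup in the diff above can be sketched in isolation. This is a minimal illustration on a made-up payload, assuming (as the diff does) that the password is stored under the only top-level key other than "context" and "plugins"; the helper name `find_password` and the sample values are hypothetical.

```python
import json

# Hypothetical response shape; real payloads are much larger, and the name of
# the password key is assumed to be unstable, hence the dynamic lookup.
raw = json.dumps(
    {
        "context": {"dispatcher": {"stores": "made-up-base64"}},
        "plugins": {},
        "a1b2c3d4": "not-a-real-password",
    }
)


def find_password(data):
    """Return the value of the single top-level key that is not a known one."""
    candidates = [key for key in data if key not in ("context", "plugins")]
    if len(candidates) != 1:
        # Stricter than next(...): fail loudly if the layout changes again.
        raise KeyError(f"expected exactly one password key, found {candidates}")
    return data[candidates[0]]


data = json.loads(raw)
print(find_password(data))
```

Unlike `next(...)`, which silently takes the first match, this variant raises when zero or several candidate keys are present, which may make a future format change easier to spot.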

Fconel avatar Jan 14 '23 02:01 Fconel

Hi @raphi6 !

First of all, thank you very much for trying to fix this. I had just started playing with this library and now it is broken :cry:

This error you are seeing happens because the response for the particular stock you are requesting does not have the keys _cr and _cs. The variable new_j is created ONLY if that condition is met, but the rest of your code relies on it.

I have tried 'AAPL', 'GOOGL' and 'AMZN', and none of them return the keys _cs and _cr, so I was wondering if you could share an example of a stock that returns those keys.

I also want to ask how you found the _cs and _cr keys. Is there some documentation for the API we are consuming? (Sorry if it's a dumb question, but I am not able to find it.)

NOTE: Also, it feels weird that they are encrypting something and sharing the key along with it. That's pointless (or I might be missing something :sweat_smile: )
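A minimal sketch of that failure mode, independent of pandas-datareader. The function names and the payload layout are hypothetical, and the "decryption" is a stand-in; only the control flow mirrors the situation described above.

```python
def read_store_unsafe(payload):
    # new_j is bound only when both keys are present ...
    if "_cs" in payload and "_cr" in payload:
        new_j = {"HistoricalPriceStore": payload["stores"]}  # stand-in for decryption
    # ... but it is used unconditionally, so this line can fail.
    return new_j["HistoricalPriceStore"]


def read_store_safe(payload):
    # Handle the missing-key case explicitly instead of falling through.
    if "_cs" in payload and "_cr" in payload:
        return payload["stores"]
    raise ValueError("response has no _cs/_cr keys; the page format may have changed")


try:
    read_store_unsafe({"stores": {}})
except UnboundLocalError as exc:
    print("unsafe:", type(exc).__name__)

try:
    read_store_safe({"stores": {}})
except ValueError as exc:
    print("safe:", exc)
```

Raising a descriptive error in the fallback branch does not fix the data access, but it would at least point users at the changed response format rather than at a variable name.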

CarlosEspinoTimon avatar Jan 14 '23 09:01 CarlosEspinoTimon

Hi @raphi6,

Thanks for all your work! In my opinion this isn't an error that occurs sometimes or for specific stocks but something must have been changed on Yahoo's side again. Your solution worked for me until yesterday and since yesterday I get the same error (and I am only requesting one specific fund all the time).

bneumayer avatar Jan 14 '23 12:01 bneumayer

https://github.com/hellc/pandas-datareader/commit/87dda3f297df8f4b3253c6f2d5006b5ac43a9150 fixed

hellc avatar Jan 15 '23 20:01 hellc

If you can't wait for the merge, use this: pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

hellc avatar Jan 15 '23 20:01 hellc

Encryption genius! Thank you so much Ivan! I will test this out later tonight hopefully.

raphi6 avatar Jan 15 '23 20:01 raphi6

Do I have to do any merging? Sorry I am quite new to Git

raphi6 avatar Jan 15 '23 20:01 raphi6

Accept this PR into your branch and you will be fine: https://github.com/raphi6/pandas-datareader/pull/1

hellc avatar Jan 15 '23 20:01 hellc

@CharliesAngel1 What does it mean that these are approved? Do we still have to wait for a merge?

raphi6 avatar Jan 18 '23 14:01 raphi6

pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

you are a legend bro

thanks again

sangar3 avatar Jan 20 '23 15:01 sangar3

All looks good.

Just, to be sure - is this solution working for everybody? @hellc's solution does not work for me (out of the box) as it requires packaging version 22 or higher, which is not available for my setting, but it seems the changes were accepted into @raphi6's solution, right? I saw the update but I still get only errors for my request. If these solutions work for everyone else, it's obviously me. Which is OK, I just want to make sure :)

bneumayer avatar Jan 23 '23 08:01 bneumayer

It's still working for BTC pairs, which was enough for me. Let us know which pairs are not working for you; maybe we can figure out what else they have changed and fix that too.

hellc avatar Jan 24 '23 04:01 hellc

I am getting this error for all the pairs now:

  File "rsi.py", line 14, in <module>
    data = web.DataReader(stock, 'yahoo', start, end)
  File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\data.py", line 370, in DataReader
    return YahooDailyReader(
  File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\base.py", line 253, in read
    df = self._read_one_data(self.url, params=self._get_params(self.symbols))
  File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\yahoo\daily.py", line 227, in _read_one_data
    new_j = decrypt_cryptojs_aes(
  File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\yahoo\daily.py", line 81, in decrypt_cryptojs_aes
    plaintext = unpad(plaintext, 16, style="pkcs7")
  File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\Crypto\Util\Padding.py", line 92, in unpad
    raise ValueError("Padding is incorrect.")
ValueError: Padding is incorrect.

sangar3 avatar Jan 25 '23 19:01 sangar3

Can confirm. When I try:

    import pandas_datareader.data as web
    symbol = '0P0001ICNW.F'
    res_yahoo = web.DataReader(symbol, 'yahoo')

the result is the same for me: ValueError: Padding is incorrect.

bneumayer avatar Jan 28 '23 15:01 bneumayer

Also going to bump this ValueError: Padding is incorrect. issue.

I installed pandas-datareader with pip install git+https://github.com/raphi6/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150 and got the following error:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): finance.yahoo.com:443
DEBUG:urllib3.connectionpool:https://finance.yahoo.com:443 "GET /quote/ATEN/history?period1=1674947342&period2=1675587599&interval=1d&frequency=1d&filter=history HTTP/1.1" 200 None
DEBUG:root:################################################################
ERROR:root:Padding is incorrect.
Traceback (most recent call last):
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\libs\stock_data.py", line 66, in _get_ticker_data
    __get_recent_price(ticker)
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\libs\stock_data.py", line 51, in __get_recent_price
    data = __get_data(ticker).tail(1)
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\libs\stock_data.py", line 41, in __get_data
    data = web.DataReader(ticker, data_source='yahoo', start=start, end=dt.datetime.today())
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\data.py", line 379, in DataReader
    ).read()
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\base.py", line 253, in read
    df = self._read_one_data(self.url, params=self._get_params(self.symbols))
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\yahoo\daily.py", line 227, in _read_one_data
    new_j = decrypt_cryptojs_aes(
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\yahoo\daily.py", line 81, in decrypt_cryptojs_aes
    plaintext = unpad(plaintext, 16, style="pkcs7")
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\Crypto\Util\Padding.py", line 92, in unpad
    raise ValueError("Padding is incorrect.")
ValueError: Padding is incorrect.

I went into pycryptodome to figure out where this error is coming from: https://github.com/Legrandin/pycryptodome/blob/8bba4a056fb6b5cb7cc9616da3d36893f759efe8/lib/Crypto/Util/Padding.py#L92

In daily.py, line 81, plaintext = unpad(plaintext, 16, style="pkcs7") raises this error because the following check inside unpad is triggered:

        padding_len = bord(padded_data[-1])
        if padding_len<1 or padding_len>min(block_size, pdata_len):
            raise ValueError("Padding is incorrect.")

I'm not familiar with cryptography or whatever is happening here, but I did this investigation to hopefully get someone on the right track. Hopefully someone fixes this soon.
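To make that check concrete, here is a simplified pure-Python version of PKCS#7 unpadding. It is a sketch written for illustration, not pycryptodome's implementation, and `pkcs7_unpad` is a hypothetical name. My assumption is that decrypting with a wrong key or IV yields random-looking bytes whose last byte is rarely a valid pad length, which would explain the error if Yahoo changed how the key is derived.

```python
def pkcs7_unpad(data, block_size=16):
    """Strip PKCS#7 padding, raising ValueError when it is malformed."""
    if not data or len(data) % block_size:
        raise ValueError("Input data is not padded")
    padding_len = data[-1]  # last byte states how many pad bytes were added
    if padding_len < 1 or padding_len > block_size:
        raise ValueError("Padding is incorrect.")
    if data[-padding_len:] != bytes([padding_len]) * padding_len:
        raise ValueError("PKCS#7 padding is incorrect.")
    return data[:-padding_len]


# Correctly padded block: 11 data bytes followed by five bytes of value 5.
good = b"hello world" + bytes([5]) * 5
print(pkcs7_unpad(good))

# Stand-in for a plaintext decrypted with the wrong key: the last byte is 115,
# which is larger than the block size, so the first check rejects it.
garbage = bytes(range(100, 116))
try:
    pkcs7_unpad(garbage)
except ValueError as exc:
    print("rejected:", exc)
```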

VoxLight avatar Feb 04 '23 19:02 VoxLight

Haven't heard anything in a month. Any status on fixing issue 953/952? Will pandas-datareader ever work again to scrape price data from Yahoo?

uad1098 avatar Mar 11 '23 22:03 uad1098

More than likely, Pandas-datareader will eventually become functional again. If you need to access market data right now, I recommend checking out the yfinance library. It's clearly possible to get data from Yahoo! Finance, the question is just when it will be supported inside of pandas again.

VoxLight avatar Mar 22 '23 12:03 VoxLight