node-yahoo-finance2
Some values are wrong with large dataset when concurrency is larger than 1
Bug Report
Describe the bug
I've noticed a weird bug: when I query a large dataset (more than approximately 1000 stocks) in my script, some fields return wrong values that do not match the Yahoo Finance website. For example, financialData.totalRevenue comes back with a seemingly random value, and incomeStatementHistory.incomeStatementHistory[0] returns 2021 instead of the latest year, 2022.
When I set concurrency in the configuration to 1, it executes much slower but the values are correct. With the concurrency set to 2 the issue appears again. Pretty weird stuff. Have you ever noticed such behavior? I have noticed it for the AAPL stock.
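To make the concurrency setting concrete, here is a minimal sketch of what a concurrency limit does. `mapWithConcurrency` is a hypothetical helper written for illustration only; it is not part of yahoo-finance2 and does not claim to reproduce how the library's own queue works.

```javascript
// Hypothetical helper (not part of yahoo-finance2): run `fn` over `items`
// with at most `limit` calls in flight, keeping results in input order.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to claim

  // Each worker keeps claiming the next unclaimed index until none remain.
  async function worker() {
    while (next < items.length) {
      const i = next++; // no race: JavaScript runs this synchronously
      results[i] = await fn(items[i], i);
    }
  }

  // Start at most `limit` workers, and at least one.
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

With a limit of 1 this degenerates to a strictly sequential loop, e.g. `await mapWithConcurrency(stocks, 1, (stock) => yahooFinance.quote(stock))`, which matches the slow-but-correct case described above.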
Minimal Reproduction
await Promise.all(stocks.map((stock) => yahooFinance.quote(stock)))
where stocks.length > 1000
Replicated here: https://replit.com/join/uyyxlrdydi-yuridrabik
Environment
Browser or Node: Node.js
Node version (if applicable): 18
Npm version: 8
Library version (e.g. 1.10.1): 2.3.10 (latest)
Additional Context
I've managed to replicate this bug in the Replit linked above. Please take a look. Just run the script and wait a moment for the output.
Hey @yurist38! Thanks for the report and most especially for the great reproduction.
Umm... from the way you describe it, I'm going to say that this is likely a Yahoo issue. Unfortunately we do quite commonly see inconsistent behaviour from their API. When I tried your reproduction now, I got validation errors with the big array, but not with a smaller one, even without changing the concurrency value.
My guess is that it's a caching issue. With higher concurrency, you're more likely to hit different nodes, some of which might have a stale cache and hence return the wrong results. I have seen cases where the Yahoo website returns wrong results too; it really can vary.
Unfortunately there's not much we can do about it. The best defence is all the strict validation we do, which does manage to catch a lot of these cases (case in point, my test now), but obviously if we still get the results back in the correct format, it's hard to know if there's a problem with them.
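One way a caller could probe for this kind of silent inconsistency is to fetch the same symbol more than once and compare the fields they care about. The sketch below is only an illustration: `findUnstableFields` and its parameters are hypothetical names, not library features, and `fetchFn` stands in for whatever call you actually make.

```javascript
// Hypothetical helper: call `fetchFn(symbol)` `attempts` times, one after
// another, and report which of the given fields disagree between responses.
async function findUnstableFields(fetchFn, symbol, fields, attempts = 2) {
  const samples = [];
  for (let i = 0; i < attempts; i++) {
    samples.push(await fetchFn(symbol)); // sequential on purpose
  }

  // Read a dotted path such as "financialData.totalRevenue" from an object.
  const get = (obj, path) =>
    path.split(".").reduce((o, key) => (o == null ? undefined : o[key]), obj);

  const unstable = [];
  for (const field of fields) {
    // Serialize so that nested objects and dates can be compared by value.
    const values = samples.map((s) => JSON.stringify(get(s, field)));
    if (new Set(values).size > 1) unstable.push({ field, values });
  }
  return unstable;
}
```

For the fields in this report, `fetchFn` could be something like `(s) => yahooFinance.quoteSummary(s, { modules: ["financialData"] })` with `fields` set to `["financialData.totalRevenue"]`. Note the limits of the approach: it doubles the number of requests, and two matching responses do not prove the value is right, since both could come from the same stale cache.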
Sorry I can't be of any more help, and thanks for all your time in trying to diagnose this issue... please do let me know if you figure out anything else and I'll be happy to dig into it further :pray:
Hi @gadicc! Thank you for reviewing this and sharing your experience with the Yahoo API! I also have a feeling that this issue is on their side. Your guess sounds plausible, indeed.
Shall we keep this issue open for now? I will keep digging into it. If I discover anything, I'll share my findings here.
Thanks again!
My pleasure, @yurist38, and thanks for your efforts here.
Yes, let's leave this open... and let us know if you figure out anything else.
I'm also leaving this open so I can add something to https://github.com/gadicc/node-yahoo-finance2/blob/devel/docs/modules/quote.md#quirks.
I should probably also add something to the README in general about inconsistencies... we currently have "quirk" notes per module, but I think a general note about inconsistencies would be very helpful.