stockdex

LRU Cache Buys Nothing if you are bulk loading and saving statements

Open wittling opened this issue 4 months ago • 0 comments

If someone were to use this Ticker object to save off a statement, and then wanted to hold onto it and use it later, then this cache makes some sense.

But if people are going to poll for refreshed statement data en masse (as a bulk job, which I think is probably the prevailing use case) by looping through thousands of tickers and downloading annual and/or quarterly statements for each ticker before moving on to the next, then a Least-Recently-Used cache adds unneeded overhead and complexity. The caching doesn't buy you anything because you are fetching and saving, never revisiting that ticker or its cached statement data again.
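To illustrate the point, here is a minimal sketch using Python's standard `functools.lru_cache`. It is not stockdex's actual code: `fetch_statement` and `bulk_download` are hypothetical stand-ins, and the network call is replaced by a stub. In a one-pass bulk job every lookup is a miss, so the cache only fills up and evicts.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fetch_statement(symbol: str) -> dict:
    # Hypothetical stand-in for a network fetch of one statement.
    return {"symbol": symbol, "rows": []}


def bulk_download(symbols):
    # Fetch-and-save loop: each ticker is visited exactly once.
    for symbol in symbols:
        statement = fetch_statement(symbol)
        # ... save `statement` to disk or a db here ...
    return fetch_statement.cache_info()


info = bulk_download([f"SYM{i}" for i in range(1000)])
print(info.hits, info.misses, info.currsize)
```

With 1000 distinct symbols the hit count stays at zero, while the cache still holds 128 entries that nothing will ever read again.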

What does need to be cached, from a Macrotrends perspective, are two things:

  1. Delisted symbols
  2. Symbol mnemonics

Delisted symbols: I have noticed that in some cases, when you fetch a statement, you might get a 503, a 429, or some other HTTP code, but in the lower-url of the response you will see a message that contains "delisted". Obviously you don't want to keep fetching delisted symbols continuously. But unless you persist these Ticker objects and cache-refresh them with a flag (delisted), the only way to deal with delisted symbols is to save them to a local (i.e. on-disk) file or db cache and then check that cache before making API calls on that symbol.
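A minimal sketch of such an on-disk delisted cache, assuming a plain JSON file is good enough. The names `DelistedCache` and `looks_delisted` are hypothetical, and exactly where the "delisted" marker appears in the response is an assumption; here the check simply takes whatever text you pull out of the response.

```python
import json
import tempfile
from pathlib import Path


class DelistedCache:
    """Set of delisted symbols persisted to a JSON file (hypothetical helper)."""

    def __init__(self, path):
        self.path = Path(path)
        self.symbols = set()
        if self.path.exists():
            self.symbols = set(json.loads(self.path.read_text()))

    def __contains__(self, symbol):
        return symbol.upper() in self.symbols

    def add(self, symbol):
        self.symbols.add(symbol.upper())
        # Rewrite the whole file; fine for a few thousand symbols.
        self.path.write_text(json.dumps(sorted(self.symbols)))


def looks_delisted(response_text: str) -> bool:
    # Assumption: the marker is the word "delisted" somewhere in the text.
    return "delisted" in response_text.lower()


# Usage: consult the cache before calling out, record new delistings after.
cache_file = Path(tempfile.mkdtemp()) / "delisted.json"
cache = DelistedCache(cache_file)
if "XYZ" not in cache and looks_delisted("/stocks/delisted/XYZ"):
    cache.add("XYZ")

reloaded = DelistedCache(cache_file)  # simulates the next run of the job
print("XYZ" in reloaded)
```

A SQLite table would work just as well; the point is only that the flag survives between runs, which an in-memory LRU cache does not.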

Symbol mnemonics: Macrotrends likes to have the "name" of the symbol, in addition to its trading ticker, in the URL. In previous code, because this was unknown, it was stubbed with "TBD", which seemed to work (in fact, it worked only with annual statements). But if you stub in TBD and try to fetch a quarterly statement, you will get an annual statement back. So through testing, I learned that the mnemonic is critically important. One challenge is that the mnemonics used by Macrotrends do not necessarily match the SEC name of the company. So you have to "discover" the mnemonics as a first step, save them, and then pull them out and use them when you subsequently want to fetch statements. This discovery step could effectively double the calls, so using a cache for these mnemonics is quite important. For this reason, I suggest adding an attribute called "mnemonic", so that one can look up the symbol in the mnemonic cache - or make the call northbound to get it - and then add it to the object, where it can be pulled out and used to fetch as many as 6 statements (3 per frequency).
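The mnemonic lookup could be sketched roughly as follows. Everything here is illustrative: `MnemonicCache`, `statement_urls`, and the `discover` callable are hypothetical names, and the URL layout and `freq` parameter are assumptions about how Macrotrends addresses statements, not something taken from the stockdex code.

```python
from typing import Callable, Dict, List


class MnemonicCache:
    """Remembers symbol -> mnemonic so discovery runs once per symbol."""

    def __init__(self, discover: Callable[[str], str]):
        self._discover = discover  # the "northbound" call, supplied by caller
        self._store: Dict[str, str] = {}
        self.discoveries = 0

    def get(self, symbol: str) -> str:
        key = symbol.upper()
        if key not in self._store:
            self.discoveries += 1
            self._store[key] = self._discover(key)
        return self._store[key]


def statement_urls(symbol: str, mnemonic: str) -> List[str]:
    # Assumed URL layout; 3 statement pages x 2 frequencies = 6 URLs.
    base = f"https://www.macrotrends.net/stocks/charts/{symbol}/{mnemonic}"
    pages = ("income-statement", "balance-sheet", "cash-flow-statement")
    return [f"{base}/{page}?freq={freq}" for freq in ("A", "Q") for page in pages]


# Usage with a fake discovery function standing in for the real lookup.
cache = MnemonicCache(discover=lambda sym: {"AAPL": "apple"}[sym])
urls = statement_urls("AAPL", cache.get("AAPL"))
cache.get("aapl")  # second lookup is served from the cache
print(len(urls), cache.discoveries)
```

One discovery call then serves all six statement fetches for that symbol. Persisting `_store` to disk, as with the delisted symbols, would also spare the discovery call on later runs.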

You may have a better idea or suggestion on how to handle these two scenarios I ran into. I did fix them locally on my end, and submitted a pull request. There is some redundant code in the statement handling that could be refactored and streamlined, but it has worked fine in my testing so far.

wittling, Sep 04 '25 19:09