finam-export

support of pkl output format and compression

Open Evgeny7777 opened this issue 5 years ago • 4 comments

Tests to be added, but what do you think in general? Does it make sense?

Evgeny7777 avatar Oct 24 '20 13:10 Evgeny7777

Hey @Evgeny7777

Thanks for posting your suggestion. What's the use case that makes you want it built into finam-export?

It could be solved trivially with an external tool, applied in the following way:

set -e
finam-export.py ... --destdir=some_dir --ext=txt
find some_dir -name '*.txt' | xargs convert_script

where convert_script might convert the data into whatever is wanted, not only pickle or gzip.
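
To make the idea concrete, here is a minimal sketch of what such a convert_script could look like in Python. Everything in it is an assumption for illustration: the function name `convert`, the `.pkl.gz` target suffix, and the guess that the exported files are comma-delimited text. It is not part of finam-export.

```python
#!/usr/bin/env python3
"""Hypothetical convert_script: turn exported text files into gzip-compressed pickles."""
import csv
import gzip
import pickle
import sys
from pathlib import Path


def convert(path: Path) -> Path:
    """Read a delimited text file and store its rows as a gzip-compressed pickle."""
    # Assumption: the export is comma-delimited text; adjust `delimiter` if not.
    with path.open(newline="") as src:
        rows = list(csv.reader(src, delimiter=","))
    target = path.with_suffix(".pkl.gz")
    with gzip.open(target, "wb") as dst:
        pickle.dump(rows, dst, protocol=pickle.HIGHEST_PROTOCOL)
    return target


def main(argv):
    # xargs appends the file names found by `find` as positional arguments.
    for name in argv:
        print(convert(Path(name)))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Swapping the body of `convert` is enough to target another format, which is the flexibility the external-tool approach is meant to give.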

ffeast avatar Oct 24 '20 14:10 ffeast

Hey @ffeast ,

I have several reasons:

  1. The pd.DataFrame --> csv --> pd.DataFrame path may easily introduce conversion problems. pickle dumps and restores the object without conversions.
  2. I'm thinking about maintaining a local data lake, so size matters. It makes no sense to keep csv.
  3. It could be done via a pipe or by making my own version of the script, but I thought it might be useful for others.

Another feature I'm going to add anyway is delta-loading. The script would open the file if it exists and request only the missing data. Do you think it would be useful for others?
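
Roughly, delta-loading could work as sketched below. This is only an illustration under stated assumptions: the stored file is CSV whose first column is an ISO date, and `fetch(start, end)` is a hypothetical callable standing in for whatever actually downloads the rows. Neither name comes from finam-export.

```python
import csv
import datetime as dt
from pathlib import Path


def last_stored_date(path: Path):
    """Return the newest date already in the file, or None if there is no data."""
    if not path.exists():
        return None
    with path.open(newline="") as f:
        # Assumption: the first column holds an ISO date such as "2020-10-23".
        dates = [dt.date.fromisoformat(row[0]) for row in csv.reader(f) if row]
    return max(dates, default=None)


def delta_load(path: Path, fetch, start: dt.date, end: dt.date) -> int:
    """Download only the rows missing from `path`, append them, return their count."""
    last = last_stored_date(path)
    if last is not None:
        # Resume from the day after the newest stored row.
        start = max(start, last + dt.timedelta(days=1))
    if start > end:
        return 0  # already up to date, nothing is requested
    rows = list(fetch(start, end))  # `fetch` is a hypothetical downloader
    with path.open("a", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Calling `delta_load` twice with the same range requests nothing the second time, since the start date moves past the end date.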

Evgeny7777 avatar Oct 26 '20 14:10 Evgeny7777

Hey @Evgeny7777

  1. What kind of conversions do you mean?
  2. Sounds reasonable
  3. That's definitely a useful feature from both user and operational standpoints!
  • for users it would mean faster downloads, as I believe a typical scenario is to update historical data regularly, so that 99% of the downloaded data is already stored locally
  • it would decrease the load on finam's services

Let's just move it to a separate issue.

ffeast avatar Oct 26 '20 19:10 ffeast

@ffeast,

  1. I mean the conversion of the DataFrame object to csv (on saving) and back (on loading), because further on you will most probably need to work with a DataFrame again. At least this is my use case. The interim csv state may introduce weird type/precision problems.
  2. Then I will eventually cover it with tests 👍
  3. Will make an issue then
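
The type problem in point 1 can be shown with a small sketch. It uses only the standard library rather than pandas to stay dependency-free, and the sample row is made up for illustration. With pandas the details differ, since `read_csv` infers dtypes on load, but date columns still come back as text unless `parse_dates` is given.

```python
import csv
import datetime as dt
import io
import pickle

# One hypothetical candle: (timestamp, close price, volume).
rows = [(dt.datetime(2020, 10, 23, 10, 0), 0.1 + 0.2, 1500)]

# CSV stores text only, so every value comes back as a string.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
buf.seek(0)
from_csv = [tuple(r) for r in csv.reader(buf)]

# pickle serializes the objects themselves, so types and values survive.
from_pickle = pickle.loads(pickle.dumps(rows))

print([type(v).__name__ for v in from_csv[0]])     # ['str', 'str', 'str']
print([type(v).__name__ for v in from_pickle[0]])  # ['datetime', 'float', 'int']
print(from_pickle == rows)                         # True
```

After the csv round trip the caller has to re-parse every column, and each re-parse is a place where a type or precision mismatch can creep in.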

Cheers

Evgeny7777 avatar Oct 27 '20 09:10 Evgeny7777