GRAMEX-144 ⁃ Avoid HDF5 for MLHandler storage

Open sanand0 opened this issue 3 years ago • 5 comments

MLHandler is the only Gramex component that internally requires HDF5. (UploadHandler used to do this, but we migrated away from that.)

So PyTables is a required dependency of Gramex. But since we can't pip install pytables, I'd like to make it optional.

Can we use Excel storage for MLHandler Frames?
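To make the "optional PyTables" idea concrete, here is a minimal sketch of how a storage format could be chosen at runtime. It is not taken from the Gramex codebase: `pick_store_format` and `store_path` are hypothetical helper names, and the `.h5` / `.xlsx` extensions are assumptions for illustration. The only library fact relied on is that PyTables is imported as `tables`.

```python
import importlib.util


def pick_store_format(hdf5_module: str = "tables") -> str:
    """Return "hdf5" if the HDF5 backend is importable, else "excel".

    Hypothetical helper, not part of Gramex. PyTables is imported as `tables`.
    """
    # find_spec returns None for a top-level module that is not installed,
    # and it does not import the module while checking.
    if importlib.util.find_spec(hdf5_module) is not None:
        return "hdf5"
    return "excel"


def store_path(basename: str, hdf5_module: str = "tables") -> str:
    """Build a storage file name whose extension matches the chosen format."""
    # Assumed extensions, for illustration only.
    extensions = {"hdf5": ".h5", "excel": ".xlsx"}
    return basename + extensions[pick_store_format(hdf5_module)]
```

With something like this, HDF5 would only be used when PyTables happens to be installed, and Excel would be the fallback otherwise.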

┆Issue is synchronized with this Jira Bug

sanand0 avatar Jan 23 '22 05:01 sanand0

@sanand0, we have three options:

  1. Move h5py / pytables to conda - this is already happening, almost no change required
  2. Remove data storage in MLHandler completely - this might need some refactoring, but will greatly simplify the API. Users will have to POST data on every train / retrain.
  3. Use Excel - this will make MLHandler slow for larger datasets

I would pick option 2. Keeps things simple and clean. What would you pick?

jaidevd avatar Jan 24 '22 05:01 jaidevd

#2. Remove data storage in MLHandler completely.


From: Jaidev Deshpande
Sent: Monday, January 24, 2022 10:49 AM
To: gramener/gramex
Subject: Re: [gramener/gramex] GRAMEX-144 ⁃ Avoid HDF5 for MLHandler storage (Issue #491)


bhatsandeep avatar Jan 24 '22 05:01 bhatsandeep

I'd like to try out Excel first, please. It's hopefully low effort. We can then evaluate the impact of removing data storage completely.

sanand0 avatar Jan 24 '22 05:01 sanand0

Noted, I'll send a PR today.

jaidevd avatar Jan 24 '22 05:01 jaidevd

Meanwhile, FWIW, if we pick option 1, MLHandler works fine on Python 3.7, 3.8 and 3.9

jaidevd avatar Jan 24 '22 07:01 jaidevd