Support compression on file import and export

Open matthiaskoenig opened this issue 6 years ago • 7 comments

In a discussion with @Midnighter it came up that there should be generic support for compressed files across the io modules, i.e., cobrapy should support reading and writing compressed files in the various formats (JSON, MAT, SBML, YAML). To avoid code duplication, there should be a single implementation of the compression support.

Compression matters for genome-scale models, which are very large when uncompressed.

Just as a note: The new SBML parser supports reading compressed files from paths, but not yet from file handles. But support for compression on writing is missing.

matthiaskoenig avatar Mar 05 '19 09:03 matthiaskoenig

> Just as a note: The new SBML parser supports reading compressed files from paths, but not yet from file handles. But support for compression on writing is missing.

I suppose libsbml doesn't support writing to a Python file stream, i.e., it can only either take a filename or create a string?

Midnighter avatar Mar 05 '19 09:03 Midnighter

Unfortunately not. libsbml only supports reading SBML from, or writing it to, strings or paths. So the support for file handles and file streams will be implemented in Python (via temporary files and reading the SBML strings via read()).

matthiaskoenig avatar Mar 05 '19 09:03 matthiaskoenig

Is this still valid? I'm asking because cobra/sbml.py seems to work well with gz and bz2, both reading and writing.

akaviaLab avatar Jul 12 '22 13:07 akaviaLab

I implemented the compression support via libsbml for SBML. But there is no compression support for the other input formats.

matthiaskoenig avatar Jul 13 '22 07:07 matthiaskoenig

Hi, would you like me to use the Python modules gzip and bz2 to implement reading/writing of compressed files for the other formats? It would be slightly different from libsbml, which handles compression internally, but then the other formats could use them as well. I'm not sure it makes sense for MATLAB, but I might implement it for completeness.

akaviaLab avatar Jul 13 '22 11:07 akaviaLab

Yes, this was basically the idea. Especially the JSON can be compressed very efficiently.
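
As a rough illustration of why JSON compresses well, here is a minimal sketch using only the standard library. The `model` dictionary is a made-up stand-in with repeated keys and bounds, not an actual cobrapy model or its JSON schema, so the exact ratio will differ for real models.

```python
import gzip
import json

# Hypothetical stand-in for a model: many entries sharing the same keys
# and similar values, which is what makes the JSON text so redundant.
model = {
    "reactions": [
        {
            "id": f"R{i}",
            "lower_bound": -1000.0,
            "upper_bound": 1000.0,
            "metabolites": {f"M{i}": -1.0, f"M{i + 1}": 1.0},
        }
        for i in range(2000)
    ]
}

raw = json.dumps(model).encode("utf-8")  # uncompressed JSON bytes
packed = gzip.compress(raw)              # gzip-compressed bytes
ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes raw, {len(packed)} bytes gzipped, {ratio:.1f}x smaller")

# Round trip: decompressing and parsing gives back the same data.
restored = json.loads(gzip.decompress(packed).decode("utf-8"))
```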

matthiaskoenig avatar Jul 13 '22 15:07 matthiaskoenig

Okay, if you can review and hopefully merge #1245, which allows I/O to use Paths, I can build on that to get all formats to deal with compressed files. Probably a helper function/file in the io directory that all the formats will call.
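
To make the helper idea concrete, a minimal sketch of what such a shared function might look like, assuming the compression type is chosen from the file suffix. The names `open_maybe_compressed`, `save_json` and `load_json` are hypothetical and not part of cobrapy; only text modes ("rt"/"wt") are handled here.

```python
import bz2
import gzip
import json
from pathlib import Path
from typing import IO, Union

# Hypothetical mapping from file suffix to an open()-compatible function.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def open_maybe_compressed(path: Union[str, Path], mode: str = "rt") -> IO:
    """Open a text file, transparently (de)compressing based on its suffix."""
    path = Path(path)
    # Fall back to the built-in open() for uncompressed files.
    opener = _OPENERS.get(path.suffix.lower(), open)
    return opener(path, mode, encoding="utf-8")


def save_json(obj, path: Union[str, Path]) -> None:
    """Write obj as JSON; compress if the path ends in .gz or .bz2."""
    with open_maybe_compressed(path, "wt") as handle:
        json.dump(obj, handle)


def load_json(path: Union[str, Path]):
    """Read JSON from a plain, gzip or bz2 file."""
    with open_maybe_compressed(path, "rt") as handle:
        return json.load(handle)
```

With something like this, each text-based format would only call the shared opener instead of handling compression itself, e.g. `save_json(data, "model.json.gz")` followed by `load_json("model.json.gz")`. Binary formats such as MAT would need a separate binary-mode path.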

akaviaLab avatar Jul 13 '22 16:07 akaviaLab