hdmf
hdmf copied to clipboard
Fix conversion of Zarr string dataset to HDF5
Motivation
Fix https://github.com/hdmf-dev/hdmf-zarr/issues/211
Zarr does not natively support variable length strings, instead the data is stored as objects with an encoder. When converting from Zarr to HDF5 the ObjectMapper tries to change the dtype in some cases for text datasets to unicode, which in turn can cause reading the data from Zarr to fail. This PR changes this behavior to: 1) not change the dtype for Zarr string datasets, and 2) detect the correct dtype to use in HDF5 in the backend.
Checklist
- [x] Did you update
CHANGELOG.mdwith your changes? - [X] Does the PR clearly describe the problem and the solution?
- [X] Have you reviewed our Contributing Guide?
- [X] Does the PR use "Fix #XXX" notation to tell GitHub to close the relevant issue numbered XXX when the PR is merged?
Codecov Report
:x: Patch coverage is 83.33333% with 4 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 92.29%. Comparing base (704e99d) to head (a8b807c).
:warning: Report is 1 commits behind head on dev.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/hdmf/backends/hdf5/h5tools.py | 33.33% | 1 Missing and 1 partial :warning: |
| src/hdmf/build/objectmapper.py | 85.71% | 1 Missing and 1 partial :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## dev #1171 +/- ##
==========================================
- Coverage 92.29% 92.29% -0.01%
==========================================
Files 42 42
Lines 9685 9703 +18
Branches 1962 1968 +6
==========================================
+ Hits 8939 8955 +16
- Misses 470 471 +1
- Partials 276 277 +1
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@rly could you please take a look to see if this is right way to address this issue
This PR also resolves this issue: https://github.com/hdmf-dev/hdmf-zarr/issues/301
@oruebel could you please review?
@rly thanks for updating the PR. This looks good to me.