convert_matlab73_hdf5

Smarter string heuristic

Open pao opened this issue 13 years ago • 5 comments

Just a drive-by; I'm reverse engineering the MATLAB HDF5 myself. The uint16 character strings all appear to have the HDF5 attribute MATLAB_class = char, so you can probably key on that.

pao avatar Oct 24 '12 19:10 pao
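
A minimal sketch of the check pao suggests, assuming the file is read with h5py (the thread does not name a library at this point). The helper names `matlab_class` and `is_matlab_char` are hypothetical. The functions accept any mapping of attributes, so they should work with an h5py dataset's `.attrs` as well as with a plain dict.

```python
def matlab_class(attrs):
    """Return the MATLAB_class attribute as a str, or None if it is absent.

    `attrs` is any mapping of HDF5 attributes, e.g. `dataset.attrs` from
    h5py (an assumption; the thread does not say which reader is used).
    """
    value = attrs.get("MATLAB_class")
    if value is None:
        return None
    # h5py usually returns fixed-length string attributes as bytes.
    if isinstance(value, bytes):
        return value.decode("ascii")
    return str(value)


def is_matlab_char(attrs):
    """True if the attributes mark the dataset as a MATLAB char array."""
    return matlab_class(attrs) == "char"
```

With h5py this might be called as `is_matlab_char(f["some_variable"].attrs)`, where `some_variable` is a placeholder name.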

Hi Pao,

Sorry for the late reply, and thank you for the hint. Please let me know if you make progress on the reverse engineering. As far as I can see, the problem is being robust enough to handle the various uses (and misuses? ;)) of that format by users.

Was this preliminary attempt of mine of any use for your case?

emanuele avatar Nov 03 '12 18:11 emanuele

Unfortunately no on both counts. I really couldn't justify spending any more time on it, and there's some crazy stuff going on with object serialization. The data is all there, but I couldn't connect the dots from the entry point in the data hierarchy.

pao avatar Nov 03 '12 19:11 pao

Hi Emanuele and Pao,

I wrote up a tool to do what this project does before I was aware it existed, and I believe I have a solution for the string heuristic that works for the mat files I've tested it on:

If you check f.attrs.keys() for the key 'MATLAB_int_decode', that will tell you whether the variable is a string or not. If the variable is a plain uint16, this key is not present at all.

Best,

Jim.

jim-rafferty avatar Apr 14 '14 09:04 jim-rafferty
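
A rough sketch of the heuristic Jim describes: decode the uint16 codes as text only when the 'MATLAB_int_decode' key is present. The function names are hypothetical, and this is not the code from Jim's tool or from the pull request. It assumes the caller has already flattened the dataset into a sequence of integer character codes. The thread does not establish whether the key alone is sufficient for every MAT file.

```python
def is_string_variable(attrs):
    """Heuristic from the thread: the key is present only for strings."""
    return "MATLAB_int_decode" in attrs


def decode_chars(codes):
    """Turn a flat sequence of integer character codes into a str."""
    return "".join(chr(int(c)) for c in codes)


def read_value(codes, attrs):
    """Return a str for string variables, else the codes as a list."""
    codes = list(codes)
    if is_string_variable(attrs):
        return decode_chars(codes)
    return codes
```

For example, `read_value([72, 105], {"MATLAB_int_decode": 2})` yields a string, while the same codes with an empty attribute mapping come back as a list of integers.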

Hi Jim,

Thanks a lot for the hint! I haven't worked on this project for a while. You are welcome to submit a pull request to fix the issue. If you don't, I'll have a look at the issue and try to fix it myself. By the way, do you have code to share from your tool?

Best,

Emanuele

emanuele avatar Apr 14 '14 09:04 emanuele

No problem :) I've tried submitting a pull request, but I'm not 100% certain I've done it correctly as I am new to GitHub. I've committed the proposed change to a fork of your repo in any case.

J.

jim-rafferty avatar Apr 14 '14 10:04 jim-rafferty