stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

Loadtxt real format update to list directed

Open chuckyvt opened this issue 2 years ago • 5 comments

This is a minor change to the loadtxt function to be more robust based on my experience in 'real world' use. Currently loadtxt specifies a format for the real read function, which makes it 'brittle'. If the text format is too far off, the read fails. Changing to list directed read is more robust in my use over the last few months in the version I build and use. The existing test for loadtxt is a good functional test but doesn't do a good job of simulating actual use, as by definition the text input and read format are defined to be identical.

chuckyvt avatar Apr 25 '24 02:04 chuckyvt

this format has been introduced by @milancurcic with this commit. This PR would be a change towards a previous implementation. Personally, I am fine with this change.

@milancurcic what was the reason of introducing this format? What do you think of this PR?

jvdp1 avatar Apr 25 '24 07:04 jvdp1

Thanks for tagging me. It was a long time ago, but I think we added the explicit edit descriptors to allow savetxt and loadtxt to do a bit-for-bit roundtrip of data. That is important for some cases, and obviously it's limiting in some others, as @chuckyvt points out. I think it's a design choice in the end, and stdlib should support a broad range of use cases. How about adding an optional argument? Not sure what's the best solution.

milancurcic avatar Apr 25 '24 08:04 milancurcic

FYI I see the loadtxt test sometimes fails in the CI, see i.e. https://github.com/fortran-lang/stdlib/actions/runs/8844659963/job/24286968620?pr=801:

152/301 Test #152: loadtxt ................................***Failed    0.01 sec
At line 159 of file /Users/runner/work/stdlib/stdlib/build/src/stdlib_io.f90
Fortran runtime error: Bad value during floating point read

Error termination. Backtrace:
#0  0x104945a8e
#1  0x104946735
#2  0x10494731b
#3  0x104b66528
#4  0x104b69f19
#5  0x104b6b5cf
#6  0x104b683dd
#7  0x104513351
#8  0x104509db6
#9  0x104509df2

perazz avatar Apr 26 '24 07:04 perazz

How about adding an optional argument? Not sure what's the best solution.

thanks @milancurcic . I like this idea of optional argument. What about something like:

call loadtxt (filename, array [, skiprows] [, max_rows], [, format])

similar to to_string?

The default format would be the current one.

jvdp1 avatar Apr 26 '24 07:04 jvdp1

I looked at the original pull request referenced above. It added format specifier to both write and read, which were both list directed at the time. Specifying a write format makes sense, and no changes to that in this pull request. But as far as I know a list directed read would not introduce additional rounding error or loss of precision, and is the simplest and best approach for loadtxt imho.

With that said, I updated the pull request to add an optional 'fmt' field to see how that would look and work. There currently isn't a way to specify list directed via character variable, so it has to be handled via if statements. This update uses '*' to specify list directed, but could change to anything. The thread below has some discussion on this and proposes "(LD)" for example.

https://fortran-lang.discourse.group/t/format-string-corresponding-to-list-directed-i-o/522/10

chuckyvt avatar Apr 28 '24 03:04 chuckyvt

Thank you @chuckyvt for the explanations. Please, before merging it, could you:

  • rebase your branch to solve the CI issue?
  • update the specs?

jvdp1 avatar May 07 '24 06:05 jvdp1

Ok, this should be complete. Also added a line to the loadtxt example.

chuckyvt avatar May 09 '24 12:05 chuckyvt

Thank you @chuckyvt . I will merge it.

jvdp1 avatar May 13 '24 09:05 jvdp1