stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

WIP: Addition of a subroutine get_other_scalar in stdlib_hashmap_wrappers

Open jvdp1 opened this issue 3 years ago • 11 comments

Addition of the subroutine get_other_scalar( other, value [, exists]) to extract the content of other_type into the scalar variable value of a kind provided by the module stdlib_kinds.

The aim of this subroutine is to help the user to get the content of other_type when it is only a scalar value.

jvdp1 avatar Jun 22 '22 20:06 jvdp1

I am on vacation with many things going on so it will take some time to review this. FWIW gfortan11 on a MacBook with an M1 processor is unable to compile the code because the math modules in stdlib require quad precision and there is no quad precision library for gfortran11 on the M1 processor. I will see if ifort can compile this code or I can download a gfortran12 for this MacBook.

wclodius2 avatar Jun 23 '22 19:06 wclodius2

I have been using CMake to compile and link the code. Should I now be using fpm, and if so how do I download it and its documentation?

wclodius2 avatar Jun 23 '22 19:06 wclodius2

I am on vacation with many things going on so it will take some time to review this. FWIW gfortan11 on a MacBook with an M1 processor is unable to compile the code because the math modules in stdlib require quad precision and there is no quad precision library for gfortran11 on the M1 processor. I will see if ifort can compile this code or I can download a gfortran12 for this MacBook.

Strange, it passed the test of the CI on MacOS with gfortran 11. CMake checks if these are supported and defines the variable WITH_QP.

jvdp1 avatar Jun 23 '22 20:06 jvdp1

I have been using CMake to compile and link the code. Should I now be using fpm, and if so how do I download it and its documentation?

CMake should be fine. To compile with fpm, you must first run the script ci/fpm-deployment, and then run fpm build inside the directory stdlib-fpm.

jvdp1 avatar Jun 23 '22 20:06 jvdp1

@awvwgk I modified the API by generating specific procedures for each type. I think this is more in-line with what you propose.

jvdp1 avatar Aug 04 '22 18:08 jvdp1

Seems to fail with Intel due to an ambiguous interface

/home/runner/work/stdlib/stdlib/build/src/stdlib_hashmap_wrappers.f90(283): error #5286: Ambiguous generic interface GET: previously declared specific procedure GET_OTHER is not distinguishable from this declaration. [GET_OTHER_SCALAR_CHAR]
    subroutine get_other_scalar_char(other, value, exists)
---------------^

awvwgk avatar Aug 04 '22 19:08 awvwgk

Seems to fail with Intel due to an ambiguous interface

/home/runner/work/stdlib/stdlib/build/src/stdlib_hashmap_wrappers.f90(283): error #5286: Ambiguous generic interface GET: previously declared specific procedure GET_OTHER is not distinguishable from this declaration. [GET_OTHER_SCALAR_CHAR]
    subroutine get_other_scalar_char(other, value, exists)
---------------^

Yes, I was afraid for this issue, and I was a bit surprised that gfortran didn't complain about it. However, I don't know which one is right. Any idea on how to solve it?

jvdp1 avatar Aug 04 '22 19:08 jvdp1

I think the optional attribute of the exists dummy argument is the issue here. I generally dislike the exists flag because it doesn't carry any information about the cause of error (type mismatch, value not present, ...).

In TOML Fortran the get function is split, there is a type-bound version which only retrieves pointers using their association status to communicate success and failure and a generic interface get_value (but could be named get as well) which provides a convenience interface built from the type-bound procedure. The advantage in TOML Fortran is that I can now the possible data I have to store in advance as the data types are limited, this option is not available for the hash map implementation here.

awvwgk avatar Aug 04 '22 20:08 awvwgk

I think the optional attribute of the exists dummy argument is the issue here.

Making this argument not optional will definitively solve the issue. But do we want it not optional? Personnaly, I would always provide such an argument.

I generally dislike the exists flag because it doesn't carry any information about the cause of error (type mismatch, value not present, ...).

I totally agree with you. But exists could be replaced by an integer variable e.g., info or stat, which could take different values (e.g., succes, not_present, type_mismatch).

In TOML Fortran the get function is split, there is a type-bound version which only retrieves pointers using their association status to communicate success and failure and a generic interface get_value (but could be named get as well) which provides a convenience interface built from the type-bound procedure. The advantage in TOML Fortran is that I can now the possible data I have to store in advance as the data types are limited, this option is not available for the hash map implementation here.

I need to think a bit more on this approach. I don't understand directly how this approach will solve this issue.

jvdp1 avatar Aug 04 '22 21:08 jvdp1

I need to think a bit more on this approach. I don't understand directly how this approach will solve this issue.

I use a zero copy approach for the data structure, you can either obtain a pointer to the object in the data structure (%get) or remove the allocation from the data structure (%pop). If working with built-in types a copy is usually preferred and can be implemented via first obtaining the pointer and than creating a copy by assignment, which is provided for the user in a convenience wrapper.

This map implementation is designed differently, specially get_other only gives you a copy via a sourced allocation. Depending on the data structure you are copying this can be a lot of data and not all data structures are safe for sourced allocations (pointer components usually break in strange ways, also user defined assignment is circumvented by this). Since the returned value is unlimited polymorphic, I don't see why we would need an additional specialized API, as the user side can do the dispatch via select type or create a view from a pointer with the correct type.

I agree that a specialized API is needed because an unlimited polymorphic value is inconvenient to work with, but I can't see a good way to fit it in the current structure.

awvwgk avatar Aug 04 '22 22:08 awvwgk

I use a zero copy approach for the data structure, you can either obtain a pointer to the object in the data structure (%get) or remove the allocation from the data structure (%pop). If working with built-in types a copy is usually preferred and can be implemented via first obtaining the pointer and than creating a copy by assignment, which is provided for the user in a convenience wrapper.

Thank you @awvwgk for yoour explanations. Really useful.

This map implementation is designed differently, specially get_other only gives you a copy via a sourced allocation. Depending on the data structure you are copying this can be a lot of data and not all data structures are safe for sourced allocations (pointer components usually break in strange ways, also user defined assignment is circumvented by this).

Could the use of c_ptr not solve this issue?

Since the returned value is unlimited polymorphic, I don't see why we would need an additional specialized API, as the user side can do the dispatch via select type or create a view from a pointer with the correct type.

I agree with you. My main aim was to provide a simple API for most scalar types to users of stdlib. The user can still write his own code for other specific types.

Maybe a small example in the docs with such a code using select type would be enough.

  subroutine get_other_scalar_int32(other, value, exists)
!! Version: Experimental
!!
!! Gets the content of the other as a scalar of a kind provided by stdlib_kinds
        class(other_type), intent(in) :: other
        integer(int32), intent(out) :: value
        logical, intent(out), optional :: exists

        logical :: exists_

        exists_ = .false.

        if (.not.allocated(other % value)) then
            if (present(exists)) exists = exists_
            return
        end if

        select type(d => other % value)
            type is ( integer(int32) )
                value = d
                exists_ = .true.
        end select

        if (present(exists)) exists = exists_

    end subroutine

This topic would be also a good candidate for a tutorial.

I agree that a specialized API is needed because an unlimited polymorphic value is inconvenient to work with, but I can't see a good way to fit it in the current structure.

Let wait and see the experience of other users.

jvdp1 avatar Aug 05 '22 19:08 jvdp1