Embind vectors, sets or maps to be converted to JS Arrays, Sets and Maps

Open skonstant opened this issue 4 years ago • 12 comments

Why do std::vectors need to be registered in embind, and why do they then appear in JS as "vector" types rather than as arrays? This makes them non-iterable in for loops and in Angular templates.

Similarly, std::map and std::unordered_map could become a JS Map.

And std::set and std::unordered_set could become a JS Set.

This would come in really handy, as they could be used in JS without knowing they come from C++ and without changing the JS code to adapt to C++ constructs (iteration is probably the first one that comes to mind).

skonstant avatar May 02 '20 22:05 skonstant

I'd like to see this too. As a workaround I allow both JS arrays and std::vector classes as input.

/**
 * Determines whether a JS value is of the specified type.
 */
inline bool is_type(emscripten::val value, const std::string &type) {
    return value.typeOf().as<std::string>() == type;
}

inline bool is_vector(emscripten::val value) {
    return is_type(value["size"], "function");
}

/**
 * Converts an input array to a vector.
 */
template <typename T>
std::vector<T> to_vector(emscripten::val v) {
    std::vector<T> rv;

    if (v.isArray()) {
        rv = emscripten::vecFromJSArray<T>(v);
    } else if (is_vector(v)) {
        rv = v.as<std::vector<T>>();
    } else {
        // Allow single values as well
        rv = {v.as<T>()};
    }

    return rv;
}

EMSCRIPTEN_BINDINGS(my_module) {
    // Register vector bindings
    register_vector<int>("VectorInt");
    register_vector<double>("VectorDouble");
    register_vector<std::string>("VectorString");
    
    function("inputIntVector", optional_override([](emscripten::val value) {
                std::vector<int> vector = to_vector<int>(value);

                // ...
            }));
    function("inputDoubleVector", optional_override([](emscripten::val value) {
                std::vector<double> vector = to_vector<double>(value);

                // ...
            }));
    function("inputStringVector", optional_override([](emscripten::val value) {
                std::vector<std::string> vector = to_vector<std::string>(value);

                // ...
            }));
}

Perhaps std::vector should also appear as a JS array when returned from C++? If that were supported, conversions such as this one would be unnecessary:

function* vector_values(vector) {
    try {
        for (let i = 0; i < vector.size(); i++)
            yield vector.get(i);
    } finally {
        // Also runs if the consumer stops early (e.g. break in a for-of loop).
        vector.delete();
    }
}

const ints = [...vector_values(emval_test_return_vector())];
console.log(ints); // (3) [10, 20, 30]

https://github.com/emscripten-core/emscripten/blob/0ea8070948c030dd681a524162d420d501b95e9a/tests/embind/embind_test.cpp#L1216-L1219

kleisauke avatar May 05 '20 11:05 kleisauke

For the std::vector output, it can be converted in JavaScript with this line, for example:

new Array(moves.size()).fill(0).map((_, id) => moves.get(id))

And then you're free to iterate this array.
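
A slightly more compact variant of the same idea, as a sketch: Array.from with a length and a mapping callback avoids the intermediate fill(0). FakeVector and vectorToArray are hypothetical names; FakeVector only stands in for a wrapper created by register_vector so that the snippet runs outside Emscripten.

```javascript
// Hypothetical stand-in for an Embind vector wrapper (assumption: real
// wrappers from register_vector expose size(), get(index) and delete()).
class FakeVector {
    constructor(items) {
        this.items = items;
        this.deleted = false;
    }
    size() { return this.items.length; }
    get(index) { return this.items[index]; }
    delete() { this.deleted = true; }
}

// Copy a vector-like wrapper into a plain JS array.
function vectorToArray(vector) {
    return Array.from({ length: vector.size() }, (_, i) => vector.get(i));
}

const moves = new FakeVector(["e4", "e5", "Nf3"]);
const movesArray = vectorToArray(moves);
moves.delete(); // the copy stays usable after the wrapper is released

for (const move of movesArray) {
    console.log(move);
}
```

The result is an ordinary array, so for-of, map, filter and spread all work on it.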

octopoulos avatar Jul 16 '20 05:07 octopoulos

You should be able to define implicit bindings for any std::vector using custom marshalling. Something like this:

namespace emscripten {
namespace internal {

template <typename T, typename Allocator>
struct BindingType<std::vector<T, Allocator>> {
    using ValBinding = BindingType<val>;
    using WireType = ValBinding::WireType;

    static WireType toWireType(const std::vector<T, Allocator> &vec) {
        return ValBinding::toWireType(val::array(vec));
    }

    static std::vector<T, Allocator> fromWireType(WireType value) {
        return vecFromJSArray<T>(ValBinding::fromWireType(value));
    }
};

template <typename T>
struct TypeID<T,
              typename std::enable_if_t<std::is_same<
                  typename Canonicalized<T>::type,
                  std::vector<typename Canonicalized<T>::type::value_type,
                              typename Canonicalized<T>::type::allocator_type>>::value>> {
    static constexpr TYPEID get() { return TypeID<val>::get(); }
};

}  // namespace internal
}  // namespace emscripten

This will automatically convert a JS array to a std::vector (for C++ function parameters) and a std::vector to a JS array (for C++ return values) without having to mess with register_vector as long as the T type in std::vector<T> has bindings defined.

struct NumWrapper {
    double num;
};

std::vector<NumWrapper> sort(std::vector<NumWrapper> nums) {
    std::sort(nums.begin(), nums.end(), [](const NumWrapper &a, const NumWrapper &b) {
        return a.num < b.num;
    });

    return nums;
}

EMSCRIPTEN_BINDINGS(some_module) {
    value_object<NumWrapper>("NumWrapper").field("num", &NumWrapper::num);
    function("sort", &sort);

    // `register_vector<NumWrapper>` isn't needed; vectors are implicitly converted to and from JS arrays.
}
Module.sort([{num: 2}, {num: 1}, {num: 3}]);
    => [{num: 1}, {num: 2}, {num: 3}]

Similar marshalling could be added for converting std::map and std::set to and from the JS analogs. Maybe this would be a helpful addition to Embind as an alternative to register_vector and register_map?
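
Until something like that exists, a map can be copied on the JS side. The following is only a sketch: it assumes the map wrapper exposes keys(), get(key) and size() the way register_map wrappers do in recent Emscripten versions, and that keys() returns a vector wrapper that must be deleted. MockKeyVector, MockMap and mapToJsMap are hypothetical names used so the snippet runs without Emscripten.

```javascript
// Hypothetical stand-ins for wrappers created by register_vector/register_map.
class MockKeyVector {
    constructor(items) { this.items = items; this.deleted = false; }
    size() { return this.items.length; }
    get(index) { return this.items[index]; }
    delete() { this.deleted = true; }
}

class MockMap {
    constructor(entries) { this.entries = new Map(entries); this.lastKeys = null; }
    size() { return this.entries.size; }
    get(key) { return this.entries.get(key); }
    keys() {
        this.lastKeys = new MockKeyVector([...this.entries.keys()]);
        return this.lastKeys;
    }
}

// Copy a map-like wrapper into a native JS Map.
function mapToJsMap(cppMap) {
    const keys = cppMap.keys();
    const result = new Map();
    try {
        for (let i = 0; i < keys.size(); i++) {
            const key = keys.get(i);
            result.set(key, cppMap.get(key));
        }
    } finally {
        keys.delete(); // the key vector is a wrapper of its own
    }
    return result;
}

const scores = new MockMap([["alice", 3], ["bob", 5]]);
const jsScores = mapToJsMap(scores);
for (const [name, score] of jsScores) {
    console.log(name, score);
}
```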

mattbradley avatar Oct 28 '20 03:10 mattbradley

Be aware that if you copy a vector or map into a JavaScript Array or Map, the new JavaScript objects will not forward modifications of their contents to the original C++ objects -- this would be a change in capabilities as well as an API change!
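
A small illustration of that difference, using a hypothetical MockVector in place of a real register_vector wrapper (size(), get() and push_back() are assumed to behave like the Embind methods of the same names):

```javascript
// Hypothetical stand-in for an Embind vector wrapper.
class MockVector {
    constructor(items) { this.items = [...items]; }
    size() { return this.items.length; }
    get(index) { return this.items[index]; }
    push_back(value) { this.items.push(value); }
}

const original = new MockVector([1, 2, 3]);

// Copying produces an independent snapshot of the contents.
const snapshot = Array.from({ length: original.size() }, (_, i) => original.get(i));

snapshot.push(4);        // changes only the JS copy
original.push_back(99);  // changes only the wrapped container

console.log(snapshot[3], original.get(3));
```

After the two writes, the copy and the wrapped container hold different fourth elements: neither side sees the other's modification.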

It sounds like what might be useful, though, is making the JS object wrappers for vectors and maps support native JS iteration and property indexing to make them easier to use on the JS side?

Iteration (for for-of loops and anything else that takes an iterable) should be straightforward: define a method under the Symbol.iterator key on the wrapper's prototype.
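
A minimal sketch of the iterator idea. MockVectorInt is a hypothetical stand-in for a class produced by register_vector<int>; with a real module the same assignment would presumably target the registered class's prototype (for example Module.VectorInt.prototype), which is an assumption here.

```javascript
// Hypothetical stand-in for an Embind vector wrapper class.
class MockVectorInt {
    constructor(items) { this.items = [...items]; }
    size() { return this.items.length; }
    get(index) { return this.items[index]; }
}

// Install the iterator once on the prototype. The generator reads elements
// lazily through size()/get(), so no copy of the container is made.
MockVectorInt.prototype[Symbol.iterator] = function* () {
    for (let i = 0; i < this.size(); i++) {
        yield this.get(i);
    }
};

const vec = new MockVectorInt([10, 20, 30]);

const collected = [];
for (const value of vec) {
    collected.push(value);
}
console.log(collected);

// Anything that accepts an iterable works as well.
const spread = [...vec];
const largest = Math.max(...vec);
console.log(spread, largest);
```

The wrapper still has to be released with delete() when it is no longer needed; making it iterable does not change its lifetime.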

I'm less certain whether property indexing (accessing as vector[index] instead of vector.get(index)) is doable without using a Proxy, which might have performance implications.
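
For reference, a Proxy-based version could look roughly like the sketch below. MockVectorDouble and indexable are hypothetical names; the mock assumes get(index) and set(index, value) methods like those on register_vector wrappers. Every property access goes through the traps, which is where the performance concern comes from.

```javascript
// Hypothetical stand-in for an Embind vector wrapper.
class MockVectorDouble {
    constructor(items) { this.items = [...items]; }
    size() { return this.items.length; }
    get(index) { return this.items[index]; }
    set(index, value) { this.items[index] = value; return true; }
}

// Route integer-like property keys to get()/set(); everything else falls
// through to the wrapped object.
function indexable(vector) {
    const isIndex = (key) => typeof key === "string" && /^(0|[1-9]\d*)$/.test(key);
    return new Proxy(vector, {
        get(target, key) {
            if (isIndex(key)) return target.get(Number(key));
            const value = Reflect.get(target, key);
            // Bind methods to the real target so they do not run on the proxy.
            return typeof value === "function" ? value.bind(target) : value;
        },
        set(target, key, value) {
            if (isIndex(key)) return target.set(Number(key), value) !== false;
            return Reflect.set(target, key, value);
        },
    });
}

const v = indexable(new MockVectorDouble([1.5, 2.5, 3.5]));
v[1] = 9;
console.log(v[0], v[1], v.size());
```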

Note that neither of these features would work in Internet Explorer or other very old JS engines.

bvibber avatar Oct 28 '20 17:10 bvibber

From a performance perspective, would an array-bound type be faster than having to iterate the vector?

I'm assuming there's probably some performance trade-off here that depends on the vector size?

benjamind avatar Feb 04 '21 01:02 benjamind

@mattbradley I'm experiencing a problem with the vector conversion code you have supplied above.

  • If the vector contains non-trivial objects that were bound with normal Embind bindings, their internal ptr becomes undefined in JS.
  • Rewriting the first function to this seems to fix it:

static WireType toWireType(const std::vector<T, Allocator> &vec) {
    std::vector<val> valVec(vec.begin(), vec.end());
    return BindingType<val>::toWireType(val::array(valVec));
}

mmarczell-graphisoft avatar Mar 03 '22 15:03 mmarczell-graphisoft

I tried Embind's value_array for passing a JS number array to a C++ struct, but the arrays in the legacy JS code are not fixed-length, so I switched to std::vector. That results in a weird error: BindingError: Cannot pass "0,0,0,1" as a vector.

So what I want is a variable-length std::vector that behaves like value_array...

chenzx avatar Apr 08 '22 06:04 chenzx

The custom marshalling code above can map a JS array to a C++ std::vector, but it cannot map a std::vector return value back to a JS array: I get a compile error...

chenzx avatar Apr 18 '22 13:04 chenzx

It'd be really nice to consider @mattbradley's custom marshalling approach to ease the use of std::vector, with some heads-up from @brion.

Or, from a performance point of view, is it better to pass a pointer and an array size, or to use a shared buffer? @brion

ZheyangSong avatar Jan 15 '23 00:01 ZheyangSong

The issue with the custom marshalling approach is that it breaks TypeScript generation, because the vector is now recognised as an 'any' type. It would be great to expose C++ containers as typed TypeScript arrays: this would allow seamless interoperability.

Has anyone succeeded in generating such typed arrays?

MatthieuMv avatar Apr 19 '24 20:04 MatthieuMv

@MatthieuMv

Before Emscripten added TypeScript support, I wrote my own d.ts generator, which handles that case:

https://github.com/marczellm/emscripdtsgen

mmarczell-graphisoft avatar Apr 22 '24 08:04 mmarczell-graphisoft

Thank you @mmarczell-graphisoft. I ended up declaring an emscripten::val type for each container I have, using EMSCRIPTEN_DECLARE_VAL_TYPE. Then I registered them with emscripten::register_type<ObjectList>("Object []");. Because I wanted to use my containers in interfaces and objects, I used emscripten::internal::BindingType to convert from the C++ container to the type declared with EMSCRIPTEN_DECLARE_VAL_TYPE.

This allows me to register and use these containers as properties and as function parameters.

MatthieuMv avatar Apr 26 '24 09:04 MatthieuMv