multik
How can I parse a JSON array into a multik ndarray directly, without going through a list of lists?
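import com.google.gson.Gson
import com.google.gson.reflect.TypeToken
// D1Array comes from multik; its import path depends on the multik version in use.

fun parseJson() {
    val json = "[1,2,3]"
    val gson = Gson()
    // Parsing into a List<Int> works fine.
    val listType = object : TypeToken<List<Int>>() {}.type
    val list = gson.fromJson<List<Int>>(json, listType)
    // Parsing into D1Array<Int> fails: Gson has no adapter for it and falls back to reflection.
    val d1Type = object : TypeToken<D1Array<Int>>() {}.type
    val d1array = gson.fromJson<D1Array<Int>>(json, d1Type) // throws IllegalStateException
}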
I get an error saying it expected an object, not an array:
Caused by: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was BEGIN_ARRAY at line 1 column 2 path $
at com.google.gson.stream.JsonReader.beginObject(JsonReader.java:385)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:215)
... 54 more
Currently I have to parse to a list first and then convert it to an ndarray. That seems like an unnecessary copy of the data, and I need to define two copies of the model class for a normal use case. Thanks.
Use kotlinx.serialization:
Json.decodeFromString<D1Array<Int>>("[1,2,3]")
I got kotlinx.serialization.SerializationException: Serializer for class 'D1' is not found.
Is there work planned to add serializers for popular JSON parsing libraries?
Right now there is no direct parsing from JSON to an ndarray.
Yes, we plan to add JSON support, but I can't say exactly when.
In this case, you can get a primitive array from the JSON and create an ndarray based on it. Then there will be no unnecessary data copying.
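As a rough illustration of the "primitive array first" route, the sketch below turns the flat JSON text from the question into an IntArray rather than a boxed List<Int>. The parseFlatIntArray helper is hypothetical and only handles a flat array of integers. In practice a JSON library would do this step (Gson, for example, can deserialize into IntArray::class.java), and the resulting array would then be passed to a multik factory function to build the ndarray.

```kotlin
// Hypothetical helper (not part of multik or Gson): parses a flat JSON array
// of integers such as "[1,2,3]" into a primitive IntArray.
// It is not a general JSON parser; nested arrays and non-integers are unsupported.
fun parseFlatIntArray(json: String): IntArray {
    val body = json.trim().removePrefix("[").removeSuffix("]").trim()
    if (body.isEmpty()) return IntArray(0)
    val parts = body.split(',')
    // Fill the primitive array element by element, without boxing to List<Int>.
    return IntArray(parts.size) { i -> parts[i].trim().toInt() }
}

fun main() {
    val ints = parseFlatIntArray("[1,2,3]")
    println(ints.joinToString()) // prints "1, 2, 3"
}
```

The only point here is that the final container is a primitive array, which is what the reply above suggests building the ndarray from.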
I reported the issue using a simplified example, but I am actually dealing with 3D arrays in JSON with shape 72x3x33. So I think there will still be a data copy even if I use a primitive array?
Unfortunately, with dimensions greater than 1, data will be copied.
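To make the copy concrete: a nested JSON array usually deserializes into nested lists or arrays, which then have to be copied into one flat buffer before an ndarray can be built from it. The sketch below shows that copy in plain Kotlin, assuming row-major order and Double elements (the thread does not state the element type). The flatten3D helper is hypothetical and is not multik API.

```kotlin
// Hypothetical helper: copies a nested list of shape d1 x d2 x d3 into one
// flat DoubleArray in row-major order (the last index varies fastest),
// and returns the flat data together with the shape.
fun flatten3D(nested: List<List<List<Double>>>): Pair<DoubleArray, IntArray> {
    val d1 = nested.size
    val d2 = if (d1 == 0) 0 else nested[0].size
    val d3 = if (d2 == 0) 0 else nested[0][0].size
    val flat = DoubleArray(d1 * d2 * d3)
    for (i in 0 until d1) {
        require(nested[i].size == d2) { "ragged array at [$i]" }
        for (j in 0 until d2) {
            require(nested[i][j].size == d3) { "ragged array at [$i][$j]" }
            for (k in 0 until d3) {
                flat[(i * d2 + j) * d3 + k] = nested[i][j][k]
            }
        }
    }
    return flat to intArrayOf(d1, d2, d3)
}

fun main() {
    // Small 2 x 2 x 3 stand-in for the 72 x 3 x 33 data mentioned in the thread.
    val nested = listOf(
        listOf(listOf(0.0, 1.0, 2.0), listOf(3.0, 4.0, 5.0)),
        listOf(listOf(6.0, 7.0, 8.0), listOf(9.0, 10.0, 11.0))
    )
    val (flat, shape) = flatten3D(nested)
    println(shape.joinToString("x"))    // prints "2x2x3"
    println(flat[(1 * 2 + 0) * 3 + 2])  // element [1][0][2], prints "8.0"
}
```

The flat array and the shape are the two pieces an ndarray constructor would need. Avoiding this copy would require a parser that writes into the flat buffer while reading the JSON.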
Thanks. Is JSON parsing planned for 0.0.2? I'd like to contribute if you can point me to where to start and what to consider in the implementation.
I did not plan to include JSON support in this release. In the next release, I plan to support csv and npy. The cornerstone for reading data from files is the type provider. For a quick implementation of reading, you can use a third-party library, but this will be far from optimal in terms of memory.
At the moment I'm stuck in multik-native with building and linking static libraries. To implement JSON support, multik-api is required (the name may change in the future), and it is now ready. For the implementation, you can use any multiplatform library; you will then need to write your own type provider. An approach to this can be found in dataframe.
csv/npy would be great too. I get the data from my in-house Python program, so I guess I can save the data as csv as well, with minor modifications. Will npy support use https://github.com/JetBrains-Research/npy ?
CSV in a simple form is already supported.
Will npy support use https://github.com/JetBrains-Research/npy ?
Yes, I discussed this with the maintainer of the library. To begin with, it will be necessary to make it multiplatform.