gopickle
How do I read a .pickle file that was written by Python's pandas.to_pickle()? When I use pickle.Load, this error occurs: REDUCE requires a Callable object: &types.GenericClass{Module:"numpy.core.multiarray", Name:"_reconstruct"}
You need to implement some custom types to handle these additional types. The example shows u.FindClass = makePickleFindClass(u.FindClass).
The np.core.multiarray._reconstruct() function takes three arguments. See: http://pyopengl.sourceforge.net/pydoc/numpy.core.multiarray.html#-_reconstruct. Argument 1 specifies the return type; in my case it was an np.ndarray.
The great thing is that you can add these classes in your own functions/package while using this code as is. Re-implementing the structure of all the classes along the way can be a long journey, though.
(I haven't tried pandas pickles, but there seems to be some overlap because of the multiarray/ndarray etc. classes.)
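A minimal sketch of the wrapping pattern behind u.FindClass = makePickleFindClass(u.FindClass): the new resolver handles the names it knows and delegates everything else to the previous one. The FindClassFunc signature and the ndarrayReconstructor type are assumptions made for illustration, not gopickle's actual API, and the sketch uses only the standard library so that it runs on its own.

```go
package main

import "fmt"

// FindClassFunc mirrors the shape of the FindClass hook as it is used in this
// thread. The exact signature is an assumption made for this sketch.
type FindClassFunc func(module, name string) (interface{}, error)

// ndarrayReconstructor is a hypothetical placeholder for the object that
// stands in for numpy.core.multiarray._reconstruct.
type ndarrayReconstructor struct{}

// makePickleFindClass wraps a fallback resolver: names known here are
// resolved directly, everything else is passed to the fallback unchanged.
func makePickleFindClass(fallback FindClassFunc) FindClassFunc {
	return func(module, name string) (interface{}, error) {
		if module == "numpy.core.multiarray" && name == "_reconstruct" {
			return &ndarrayReconstructor{}, nil
		}
		if fallback == nil {
			return nil, fmt.Errorf("class not found: %s %s", module, name)
		}
		return fallback(module, name)
	}
}

func main() {
	find := makePickleFindClass(nil)

	obj, err := find("numpy.core.multiarray", "_reconstruct")
	fmt.Printf("%T %v\n", obj, err)

	_, err = find("collections", "OrderedDict")
	fmt.Println(err)
}
```

Keeping the fallback means each newly supported class is one more case in the wrapper, and anything not yet covered still reports which module and name to implement next.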
Hi,
sorry for being so late.
Thanks @Konstanty for your reply: a custom FindClass is exactly the way to go in this case.
By default, when a custom/unknown class/object/function is encountered during the unpickling process, a GenericClass is used. The fields Module and Name are provided exactly for the purpose of manual inspection.
In extremely simple cases, a GenericClass is enough, but more often the unpickler itself is instructed to instantiate specific classes and/or call specific methods. In this case, you can define a custom FindClass implementation which intercepts numpy.core.multiarray._reconstruct and returns a custom object that you have to define (e.g. a struct). That object should then implement the Callable interface, i.e. it should provide the Call method.
For the actual implementation, my advice is to rely on the original Python code, already mentioned in the response above. I suggest starting with the minimum set of types and functionality, then expanding it as required based on the next error reported by the unpickler, and so on until all objects and functions are covered.
That was the approach we adopted for the Python-derived types in the pytorch package. See for example storage.go.
Have you been able to make any progress yet? Since numpy is so widespread, we might be able to help you directly by implementing some types, maybe also adding some new features to the project.
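To illustrate why a GenericClass triggers the REDUCE error and how a type with a Call method avoids it, here is a self-contained sketch. Callable, GenericClass, NDArray, Reconstruct and reduce are local stand-ins written for this example; the method signature of Callable is an assumption and should be checked against gopickle's types package. The concrete Go types of the three arguments depend on the unpickler, so the sketch only checks their count and stores them as-is.

```go
package main

import "fmt"

// Callable mirrors the interface named in the reply above. The exact method
// signature is an assumption made for this sketch.
type Callable interface {
	Call(args ...interface{}) (interface{}, error)
}

// GenericClass imitates the default placeholder used for unknown classes.
type GenericClass struct {
	Module, Name string
}

// NDArray is a hypothetical, deliberately minimal stand-in for numpy.ndarray.
type NDArray struct {
	Subtype, Shape, DType interface{}
}

// Reconstruct stands in for numpy.core.multiarray._reconstruct.
type Reconstruct struct{}

// Call receives the three arguments of _reconstruct (return type, shape,
// dtype) and returns an array object without any data in it.
func (Reconstruct) Call(args ...interface{}) (interface{}, error) {
	if len(args) != 3 {
		return nil, fmt.Errorf("_reconstruct: want 3 arguments, got %d", len(args))
	}
	return &NDArray{Subtype: args[0], Shape: args[1], DType: args[2]}, nil
}

// reduce imitates the REDUCE step: the resolved object must be callable.
func reduce(callee interface{}, args ...interface{}) (interface{}, error) {
	c, ok := callee.(Callable)
	if !ok {
		return nil, fmt.Errorf("REDUCE requires a Callable object: %#v", callee)
	}
	return c.Call(args...)
}

func main() {
	// A GenericClass has no Call method, so REDUCE fails.
	_, err := reduce(&GenericClass{Module: "numpy.core.multiarray", Name: "_reconstruct"})
	fmt.Println(err)

	// The custom type satisfies Callable, so REDUCE succeeds.
	arr, err := reduce(Reconstruct{}, "ndarray", []int{0}, "b")
	fmt.Printf("%+v %v\n", arr, err)
}
```

In a real pickle the array contents are typically applied in a later step, after _reconstruct has returned, so the next error from the unpickler would probably concern setting state on the returned object. That matches the incremental approach described above.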
Thank you for your detailed answers, I will try it, but I have not made any progress so far. In the end I used Python to write the data to a CSV file and read the CSV file instead. I'd be excited if you implemented some of the types directly.
Thank you for your reply, and sorry for answering so late. In the end I used Python to write the data to a CSV file and read the CSV file instead.
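For anyone taking the same CSV detour, the Go side might look roughly like the sketch below. It assumes the file was exported on the Python side with a header row and no index column (for example with pandas.read_pickle followed by DataFrame.to_csv(path, index=False)); the function names readCSV and readCSVString and the sample data are made up for illustration. All values come back as strings and still need to be converted to numbers where required.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
)

// readCSV reads all records and splits them into the header row and the
// data rows. For a file on disk, pass the *os.File returned by os.Open.
func readCSV(r io.Reader) (header []string, rows [][]string, err error) {
	records, err := csv.NewReader(r).ReadAll()
	if err != nil {
		return nil, nil, err
	}
	if len(records) == 0 {
		return nil, nil, errors.New("empty CSV input")
	}
	return records[0], records[1:], nil
}

// readCSVString is a small convenience wrapper for in-memory data.
func readCSVString(s string) ([]string, [][]string, error) {
	return readCSV(strings.NewReader(s))
}

func main() {
	// Made-up sample standing in for the exported DataFrame.
	header, rows, err := readCSVString("name,score\nalice,1.5\nbob,2\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(header, rows)
}
```

This avoids unpickling altogether, at the cost of losing dtype information and nested Python objects.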
I ran into the same problem. Is there any new feature to support numpy pickles?
I have the same problem. So the alternative solution is to convert the pickle file to a CSV file and read that instead?
Will a solution to this problem be provided?
package main

import (
	"github.com/nlpodyssey/gopickle/pytorch"
)

func main() {
	if _, err := pytorch.Load("model.pt"); err != nil {
		panic(err.Error())
	}
}
panic: class not found: numpy.core.multiarray _reconstruct
FYI, I've implemented something along these lines.
https://github.com/sbinet/npyio/pull/22 seems to be able to read numpy.ndarrays that have been pickled, at least these kinds of arrays:
import pickle
import numpy as np
arr = np.array([[1], [2, "3"], [4, 5, 6]], dtype="object")
with open("foo.pkl", "wb") as f:  # the file is closed once the dump completes
    pickle.dump(arr, f)
On the Go side, github.com/sbinet/npyio/npy exports func ClassLoader(module, name string) (any, error), which registers the needed bits to read back npy.Array and npy.ArrayDescr.
HTH, -s
If nobody shouts out, I'll consider this as fixed (in sbinet/npyio) and close this issue by the end of the week.
fixed by https://github.com/sbinet/npyio/pull/22