membership-inference
How to convert input_var to a string matrix if train_feat_file contains strings instead of floats?
Hi csong, I am new to Tensor. I wanted to try your code, but my dataset contains string data instead of floating-point values. How should I modify the code in my case? Could you please help? Thanks
Hi SMJT01,
Your string features can be treated as categorical data. There are quite a few ways of encoding these string values as numbers that the ML model can then interpret.
The following article gives quick descriptions of several of these methods: https://towardsdatascience.com/smarter-ways-to-encode-categorical-data-for-machine-learning-part-1-of-3-6dca2f71b159
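To make the idea concrete, here is a minimal hand-rolled sketch of two common encodings, ordinal and one-hot, in plain Python. The helper names (`build_vocab`, `ordinal_encode`, `one_hot_encode`) and the `colors` column are made up for illustration and are not part of the membership-inference code.

```python
def build_vocab(column):
    """Map each distinct string to an integer index, in sorted order."""
    return {value: index for index, value in enumerate(sorted(set(column)))}


def ordinal_encode(column, vocab):
    """Replace each string with its integer index, stored as a float."""
    return [float(vocab[value]) for value in column]


def one_hot_encode(column, vocab):
    """Replace each string with a 0/1 vector that has a single 1."""
    rows = []
    for value in column:
        row = [0.0] * len(vocab)
        row[vocab[value]] = 1.0
        rows.append(row)
    return rows


# Hypothetical feature column containing strings.
colors = ["red", "blue", "green", "red"]

vocab = build_vocab(colors)               # {"blue": 0, "green": 1, "red": 2}
ordinal = ordinal_encode(colors, vocab)   # one float per record
one_hot = one_hot_encode(colors, vocab)   # one 0/1 vector per record
```

Note that the ordinal version implies an order between categories (blue < green < red) that may not exist in the data, which is why one-hot is often the safer default for unordered categories, at the cost of more columns.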
Two of these encoders are also implemented in the official scikit-learn library:
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OrdinalEncoder.html
Other implementations can be found here too: http://contrib.scikit-learn.org/categorical-encoding/
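As a rough sketch of how the scikit-learn encoder could be used here: fit it on the training features, then reuse the same fitted encoder on the test features so both end up with identical columns. The arrays `train_raw` and `test_raw` are hypothetical stand-ins for whatever your feature files contain, and I am assuming the model expects a float32 matrix.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical string features; each row is a record, each column a feature.
train_raw = np.array(
    [["red", "small"], ["blue", "large"], ["red", "large"]], dtype=object
)
test_raw = np.array([["blue", "small"], ["green", "large"]], dtype=object)

# handle_unknown="ignore" encodes categories not seen during fit
# (here "green") as all zeros instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")

# fit_transform returns a sparse matrix by default, so densify it
# and cast to float32 (assumed dtype for the model input).
train_feat = encoder.fit_transform(train_raw).toarray().astype(np.float32)

# Reuse the fitted encoder so the test columns line up with the training columns.
test_feat = encoder.transform(test_raw).toarray().astype(np.float32)
```

The resulting `train_feat` and `test_feat` are ordinary numeric matrices, so they could be written back out and loaded in place of the original string features.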
Hi Imathatguy, thank you very much for your reply. So far I have solved this issue by encoding the strings into a numerical dataset. But of course, if I could use categorical attributes directly, that would save me a lot of time. As far as I can tell, Theano tensors do not have anything for a string datatype. I'll try the scikit-learn packages and I'll let you know whether they work well on the data or not. Thanks