lingfeat
How to apply function .preprocess and others to Pandas df?
Greetings all,
I have a large corpus zipped into a Pandas dataframe, and I'd like to iterate over the text column and record the results of individual functions in separate columns. As far as I can tell, extractor only accepts str. I am trying to merge the scores with the metadata included in the dataframe.
For instance, my dataframe looks as follows.
df.head()
docid_field ... text_field
0 BGSU1001 ... <ICLE-BG-SUN-0001.1> \nIt is time, that our s...
1 BGSU1002 ... <ICLE-BG-SUN-0002.1> \nNowadays there is a gr...
2 BGSU1003 ... <ICLE-BG-SUN-0003.1> \nOnce upon a time there...
3 BGSU1004 ... <ICLE-BG-SUN-0004.1> \nOur educational system...
4 BGSU1005 ... <ICLE-BG-SUN-0005.1> \nScience, technology an...
Is there a way to apply a LingFeat function to df['text_field'] and record the scores (let's say from LingFeat.EnDF_()) as tuples in another column? I tried
df['LingFeat'] = df['text_field'].apply(lambda x: extractor.pass_text(x))
and the result is
0 <lingfeat.extractor.pass_text object at 0x0000...
1 <lingfeat.extractor.pass_text object at 0x0000...
2 <lingfeat.extractor.pass_text object at 0x0000...
3 <lingfeat.extractor.pass_text object at 0x0000...
4 <lingfeat.extractor.pass_text object at 0x0000...
923 <lingfeat.extractor.pass_text object at 0x0000...
924 <lingfeat.extractor.pass_text object at 0x0000...
925 <lingfeat.extractor.pass_text object at 0x0000...
926 <lingfeat.extractor.pass_text object at 0x0000...
927 <lingfeat.extractor.pass_text object at 0x0000...
Name: LingFeat, Length: 928, dtype: object
I couldn't get any further. How should I do this, if it is possible at all?
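The output above suggests that extractor.pass_text(x) returns an extractor object rather than scores, so .apply stores the object itself in each row. Below is a minimal sketch of one possible way around this. It assumes that calling .preprocess() and then a feature method such as .EnDF_() on that object returns a dict of scores; that return type is an assumption and is not confirmed in this thread. So that the sketch runs without LingFeat installed, it uses a hypothetical stand-in class (FakeExtractor) with made-up features (n_tokens, n_chars), and the two short texts are placeholders, not the real corpus.

```python
import pandas as pd


class FakeExtractor:
    """Hypothetical stand-in for the object returned by extractor.pass_text(text)."""

    def __init__(self, text):
        self.text = text
        self.tokens = []

    def preprocess(self):
        # Stand-in for the real preprocessing step: plain whitespace split.
        self.tokens = self.text.split()

    def EnDF_(self):
        # ASSUMPTION: the real method returns a dict of feature scores.
        # These two features are invented for illustration only.
        return {"n_tokens": len(self.tokens), "n_chars": len(self.text)}


def pass_text(text):
    # With LingFeat installed, this would be extractor.pass_text(text).
    return FakeExtractor(text)


def score_text(text):
    """Run the full pipeline on one string and return the scores, not the object."""
    doc = pass_text(text)
    doc.preprocess()
    return doc.EnDF_()


# Placeholder data: real docids from the question, invented short texts.
df = pd.DataFrame(
    {
        "docid_field": ["BGSU1001", "BGSU1002"],
        "text_field": ["It is time", "Nowadays there is a great"],
    }
)

# One dict of scores per row.
scores = df["text_field"].apply(score_text)

# Option 1: expand the dicts into one column per feature, next to the metadata.
features = pd.DataFrame(scores.tolist(), index=df.index)
result = df.join(features)

# Option 2: keep the scores as a tuple in a single column.
df["LingFeat"] = scores.apply(lambda d: tuple(d.values()))

print(result)
```

Expanding the dicts into separate columns (option 1) usually makes later filtering and merging easier than storing tuples, but either form keeps the scores aligned with docid_field because the index is preserved.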