Associating non_feature data with feature_key

Open robinsonkwame opened this issue 3 years ago • 1 comments

There is often metadata associated with the feature data; for example, text comes from particular documents. After labeling the raw data, it's often useful to merge the labels with the metadata for other data science tasks. For example, some sets of documents or locations might not contain labels that you would otherwise expect them to, or you might want to aggregate counts by document or location.

Is there a way for SupervisableTextDataset to include non_feature data? non_feature data could store this kind of metadata. The subset row order differs from the raw data frame, so you can't just match indices.

robinsonkwame, Aug 26 '22 16:08

Yes, the dataset can have any extra columns as long as their names don't conflict with the columns that hover uses.

Most of the time you simply won't hit a conflict: just load your full CSV into a DataFrame and pass it to SupervisableTextDataset.from_pandas().

That said, we should make it more obvious which columns will conflict and suggest that the user rename them.
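
Until such a check exists, here is a minimal sketch of what it might look like on the user side. It only works on a list of column names, so it needs nothing beyond the standard library. The set `ASSUMED_RESERVED`, the helper `rename_conflicts`, and the `meta_` prefix are all hypothetical and only for illustration; the placeholder names are not hover's actual list of reserved columns, which you would need to look up in the source.

```python
# Placeholder set: NOT hover's actual list of reserved column names.
ASSUMED_RESERVED = {"SUBSET", "embed_2d_0", "embed_2d_1"}


def rename_conflicts(columns, reserved=ASSUMED_RESERVED, prefix="meta_"):
    """Return an {old_name: new_name} mapping for columns that clash with reserved names."""
    taken = set(columns) | set(reserved)
    mapping = {}
    for name in columns:
        if name in reserved:
            new_name = prefix + name
            # Keep prefixing until the new name collides with nothing else.
            while new_name in taken:
                new_name = prefix + new_name
            taken.add(new_name)
            mapping[name] = new_name
    return mapping


# Example: metadata columns alongside the feature column "text".
columns = ["text", "doc_id", "location", "SUBSET"]
mapping = rename_conflicts(columns)

# With pandas, the mapping could then be applied before building the dataset:
#   df = df.rename(columns=mapping)
#   dataset = SupervisableTextDataset.from_pandas(df)
```

Because the metadata columns stay on the same rows as the feature, there is no need to match the labeled subset back to the raw data frame by index afterwards.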

phurwicz, Aug 27 '22 01:08

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot], May 27 '23 12:05

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Jun 11 '23 12:06