sentimentr icon indicating copy to clipboard operation
sentimentr copied to clipboard

Don't sort by sentiment in sentiment_by

Open vspinu opened this issue 8 years ago • 3 comments

Usually grouping operators preserve the order of the groups. In this sense the behavior of sentiment_by is non-intuitive. Especially with data.table by would better have the same semantics as in data.table.

While sorting by sentiment is potentially useful in some contexts, there is virtually no advantage for it with current plotting functionality. Plotting by grouping values is probably more meaningful in majority of the cases.

If they so desire, users can always sort by sentiment with a simple setkey. Sorting by keys is more verbose and results in unnecessary groups names replication.

For these reasons I would propose to remove this "feature".

vspinu avatar Apr 13 '16 22:04 vspinu

Thanks for the PR and interest in sentimentr...

Still debating on this. I use sentimentr for analysis and usually care most about the order of the output. Just like with plots, tables shine when they are ordered by the variable of interest. I think you have a specific reason of why it's screwing something up for you but am unsure what that is. I would appreciate a stronger argument for changing the behavior. I'd be more inclined to add this as an argument but retain the default behavior.

trinker avatar May 07 '16 21:05 trinker

I don't have a "stronger" argument than the one I have already said. If I do processing by a factor variable G, then I expect my output to be ordered by original order of data or by the order of the grouped factor. That's what all group_by processor do in R. The fact that the output is sorted by something else is unexpected unexpected. Groups are almost certainly have a meaningful order so, I think, preserving that order is desirable independently of the context.

In my case the order is given by time. I have thousands sentences which are grouped by messages which in turn are ordered in time. Sorting it by sentiment doesn't make much sense AFAIC.

I'd be more inclined to add this as an argument but retain the default behavior.

It's up to you to decide of course, but I think sorting is an 'extra" feature. It adds up to the default computation, so it should an extra optional option. Otherwise users would need to revert that sorting computation.

vspinu avatar May 08 '16 13:05 vspinu

I have been bitten by this twice this week. You're right.

trinker avatar Sep 14 '16 02:09 trinker