If I Loved Natural Language Processing Less, I Might Be Able to Talk About It More | Julia Silge

Open utterances-bot opened this issue 3 years ago • 4 comments

In my last post, I did some natural language processing and sentiment analysis for Jane Austen’s most well-known novel, Pride and Prejudice. It was just so much fun that I wanted to extend some of that work and compare across her body of writing.

https://juliasilge.com/blog/if-i-loved-nlp-less/

utterances-bot, Mar 15 '21 22:03

Hello Julia,

Thanks so much for this -- I really enjoyed your work and now I'm trying to do a similar analysis. My dataset has two columns: the feedback text and the date the feedback was given. I was able to plot the feedback line by line and aggregate by sentiment. However, I'd love the x-axis to be the date rather than the line number, so I can look at it monthly, quarterly, or yearly.

Is there a way to group these lines by day?

I appreciate it!

Oz

ozengnr, Mar 15 '21 22:03

Yep, absolutely! Take a look at this section of our book, and group_by() your date variable instead of index or line number.
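
As a minimal sketch of that suggestion, assuming a data frame called `feedback` with columns `date` and `text` (hypothetical names and made-up example rows), the words can be scored with the Bing lexicon and then counted by a date-derived column instead of a line index. Here `lubridate::floor_date()` is used to roll each date back to the first of its month; this is one possible approach, not necessarily the one in the book.

```r
library(dplyr)
library(tidyr)
library(tidytext)
library(lubridate)

# Hypothetical example data: one row per piece of feedback
feedback <- tibble(
  date = as.Date(c("2021-01-05", "2021-01-20", "2021-02-03", "2021-02-14")),
  text = c(
    "I love this product, it is wonderful",
    "Terrible support and a bad experience",
    "Great service, very happy",
    "Good value but slow delivery"
  )
)

monthly_sentiment <- feedback %>%
  # One row per word, keeping the date column alongside each word
  unnest_tokens(word, text) %>%
  # Keep only words that appear in the Bing sentiment lexicon
  inner_join(get_sentiments("bing"), by = "word") %>%
  # Roll each date back to the first day of its month
  mutate(month = floor_date(date, unit = "month")) %>%
  # Count positive and negative words per month (instead of per line index)
  count(month, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  # Net sentiment: positive word count minus negative word count
  mutate(sentiment = positive - negative)

monthly_sentiment
```

Changing `unit = "month"` to `"quarter"` or `"year"` would give quarterly or yearly totals, and counting by `date` directly would give daily ones. The result can then be plotted with the `month` column on the x-axis.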

juliasilge, Mar 16 '21 15:03

Hello Julia, I am currently participating in a binary text classification competition and have observed a 70% similarity in text between the training and testing datasets. How can I leverage this similarity to my advantage?

mohamedelhilaltek, Jan 10 '24 13:01

@mohamedelhilaltek I recommend that you take a look at Supervised Machine Learning for Text Analysis in R for best practices in binary classification with text.

juliasilge, Jan 10 '24 17:01