personality_detection

Predict personality scores of a new text

Open FatemeFathii opened this issue 4 years ago • 21 comments

Hi. How can I give a review (a new text) to this model and extract the reviewer's personality?

FatemeFathii avatar Feb 06 '21 09:02 FatemeFathii

Hi Fateme, I think the best approach is to feed the full essays to the model (skip the psycholinguistics features) and then pass your new test sample to the trained model.

amirmohammadkz avatar Feb 12 '21 01:02 amirmohammadkz

Hi,

I'm currently trying to achieve what Fateme was aiming at, but I'm not sure of the process. Which of the Python scripts should I use to pass my sample text files (reviews from several users) to the trained model to extract their personality? In short, what are the required steps to infer the personality of multiple users? Thank you for your help! (And a really interesting program, by the way.)

Arkhemis avatar Apr 12 '21 17:04 Arkhemis

Hi,

We added a new Python file, predictor.py, which predicts personality traits for new text given as input. (If you want to pass a file instead, you can easily modify this script.)

saminfatehi avatar Apr 15 '21 21:04 saminfatehi

Hi @saminfatehir, thanks for uploading predictor.py, it will be very helpful! I still need to follow all the steps in the README before running predictor.py, right? Thanks!

di-press avatar Apr 19 '21 02:04 di-press

You need to follow steps 1, 2, and 3 in README. Then you can run predictor.py to predict the personality traits for new input.

saminfatehi avatar Apr 20 '21 09:04 saminfatehi

Thank you for the predictor.py file! However, I think there is something wrong with the classifying phase, at least on my side. For instance, classifying a random short text from Reddit, such as "We can only fix this in two ways: either we also waive environmental regulations to produce REMs (something I, as an environmentalist, am not wholly against if done responsibly, as we need them to manufacture green technology), or we invest in space mining. We have all the technology we need to mine in space, we just refuse to invest in it because the cost is literally astronomical", is taking forever (around 1 hour). Is this a normal waiting time?

Pompiia avatar Apr 23 '21 12:04 Pompiia

Hi @Pompiia, it takes time to build the SVM models from the essays dataset. Once the models are saved, you can reuse them on later runs. I have now modified predictor.py to save the models the first time it runs.

Note that you should run the BERT server in another terminal, as in step 2 of the README (it is needed to get your text embedding).
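For readers following along, the save-once, load-afterwards pattern described here could look roughly like the sketch below. This is not the code in predictor.py: `load_or_train`, the cache path, and the `train_fn` callback are hypothetical names, and any picklable object can stand in for a fitted SVM.

```python
import pickle
from pathlib import Path


def load_or_train(cache_path, train_fn):
    """Return a cached model if one exists; otherwise train, save, and return it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # A previous run already paid the training cost.
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    model = train_fn()  # the slow step, e.g. fitting an SVM on the essays dataset
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(model, fh)
    return model
```

With something like this, only the first call for a given path is slow; later calls just read the file. Pickle files should only be loaded if you created them yourself.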

saminfatehi avatar Apr 27 '21 11:04 saminfatehi

It's amazing, thank you very much!

However, I'm encountering the following issue:

Traceback (most recent call last):
  File "predictor.py", line 177, in <module>
    predictions = predict(x)
  File "predictor.py", line 171, in predict
    results = predict_with_model(embeddings)
  File "predictor.py", line 153, in predict_with_model
    predicts = classifier.predict(embeddings4layers)
  File "C:\Users\alexa\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\svm\base.py", line 567, in predict
    y = super(BaseSVC, self).predict(X)
  File "C:\Users\alexa\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\svm\base.py", line 325, in predict
    X = self._validate_for_predict(X)
  File "C:\Users\alexa\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\svm\base.py", line 478, in _validate_for_predict
    (n_features, self.shape_fit_[1]))
ValueError: X.shape[1] = 3072 should be equal to 3156, the number of features at training time.

Could you help me on that matter?
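The numbers in this error can be read as follows: 3072 equals 4 x 768, which is consistent with four concatenated layers of a 768-wide BERT model, and the saved classifier expects 84 more columns than that. One possible explanation, not confirmed anywhere in this thread, is that the classifier was fit on embeddings with extra non-BERT features appended. A minimal sketch of a more readable check (`explain_width_mismatch` is a hypothetical helper, not part of predictor.py):

```python
HIDDEN_SIZE = 768  # hidden width implied by the uncased_L-12_H-768_A-12 model name


def explain_width_mismatch(sample_width, trained_width, hidden_size=HIDDEN_SIZE):
    """Describe how a sample's feature count differs from what the model was fit on."""
    if sample_width == trained_width:
        return "ok"
    layers, remainder = divmod(sample_width, hidden_size)
    extra = trained_width - sample_width
    return (
        f"sample has {sample_width} features "
        f"({layers} x {hidden_size} + {remainder}), "
        f"model was fit on {trained_width}: difference of {extra}"
    )


print(explain_width_mismatch(3072, 3156))
# sample has 3072 features (4 x 768 + 0), model was fit on 3156: difference of 84
```

If the difference does come from extra features at training time, the saved models would need to be rebuilt from BERT embeddings alone before they can be applied to new text.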

Pompiia avatar Apr 27 '21 16:04 Pompiia

@saminfatehir Hi,

I ran the newly added predictor.py with my own text (a 72.8 KB txt file) on my lab's computer. I also ran the BERT server in a terminal as in step 2, and all workers are ready. But the only output is the text printed back, and it has been running for around 2 hours. Is the model running correctly?

Thanks in advance for your help!

kkkkangx avatar Apr 30 '21 10:04 kkkkangx

Here are my screenshots

image

image

kkkkangx avatar Apr 30 '21 10:04 kkkkangx

Hi @kkkkangx, the script expects the text as user input. When you run the original predictor.py you'll see this: image

Then you should enter your custom text like this and press Enter: image

saminfatehi avatar Apr 30 '21 10:04 saminfatehi

@saminfatehir Hi,

Thank you so much, I have tried and it worked.

I have another question: how can I save the trained model? I have over 200 texts to predict, so I wonder if I can save the trained model and reuse it next time, which would save a lot of time.

Thanks in advance!

kkkkangx avatar Apr 30 '21 14:04 kkkkangx

predictor.py creates and saves the SVM models the first time it runs and reuses them on later runs.

saminfatehi avatar May 01 '21 05:05 saminfatehi

@saminfatehir Hi,

It worked well. Thank you :)

If I understand correctly, the final output of the SVM models is a binary variable indicating whether the corresponding personality trait is displayed in the text. Is my understanding of the final output correct?

In my project, I want to get the estimated continuous probability that a personality trait is displayed in my text. So I changed line 150 of predictor.py to predicts = classifier.predict_proba(embeddings4layers)[:,1]. Will this give me the output I want (i.e. the probability)?

Looking forward to your reply.
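On the probability question, one point worth checking: in scikit-learn, `SVC.predict_proba` is only available when the classifier was created with `probability=True` before fitting, and column `[:, 1]` holds the probability of `classifier.classes_[1]`. Whether the models saved by predictor.py were fit that way is not stated in this thread. A toy sketch, with random data standing in for the embeddings:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy stand-in for embeddings: two well-separated clusters, 20 samples each.
X = np.vstack([rng.normal(-2.0, 1.0, (20, 5)), rng.normal(2.0, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)

# probability=True must be set before fit(); otherwise predict_proba is unavailable.
clf = SVC(probability=True, random_state=0)
clf.fit(X, y)

proba = clf.predict_proba(X)  # shape (n_samples, n_classes), rows sum to 1
positive = proba[:, 1]        # probability of clf.classes_[1]
```

These probabilities come from an internal calibration step, so for borderline samples they can disagree with what `predict` returns.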

kkkkangx avatar May 02 '21 16:05 kkkkangx

Capture

Hi @saminfatehir,

Continuing our conversation on this issue as this is what my current hiccup pertains to.

Continuing from our previous conversation:

I wasn't aware that I had to open a second terminal to complete step 3. I was under the impression I would wait for a process to finish so I could begin step 3 in the exact same terminal.

However, you are completely correct. After opening a second terminal and starting step 3, it ran to completion with no issues.

For this issue:

I ran predictor.py, inputted some text and received the attached error.

I attempted to re-run the script as an administrator to no avail. Really appreciate all your help thus far.

LifeofLucidity avatar May 11 '21 18:05 LifeofLucidity

Hello,

I'm having trouble with this project. I've been trying to get it working for a long time, but I haven't managed to yet.

I followed steps 1, 2, and 3. Then I ran predictor.py.

I got an error message. son1 son2 son3

6wom9 avatar Nov 04 '21 10:11 6wom9

Hi @6wom9 !

I couldn't run the code on my computer, but I could run it on Google Colaboratory, through a Jupyter Notebook (that is a .ipynb file). Here you can find the notebook:

https://github.com/di-press/algorithms_personality/blob/master/text_personality_extractor.ipynb

It is exactly the same code as the authors', only adapted to run on Google Colab. Note that you need to set a GPU in the Colab notebook environment. If you are a little familiar with notebooks or Colab, I think it won't be hard to execute. So, before executing the code, go to:

Edit > Notebook Settings > Hardware Accelerator (select GPU) > Save

Now you have a GPU and are able to run the code. Then do:

Runtime > Run all

And wait. It generally takes more than 1 hour. Finally, you have to enter the text in the terminal when you see this message:

"Enter a new text:"

Then paste (or type) the text that you want analysed into the terminal. If you need to analyse a lot of text, instead of copying each text into the terminal one by one, you will need to change predictor.py to parse your own files (this is not a complicated task if you are familiar with Python).
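A rough sketch of what such a change could look like, assuming the prediction logic can be called as a function that takes one string (the traceback earlier in this thread shows a `predict(x)` call in predictor.py, but its exact signature is an assumption). `predict_folder` is a hypothetical helper name:

```python
from pathlib import Path


def predict_folder(folder, predict_fn, pattern="*.txt"):
    """Run predict_fn on every matching text file and return {filename: prediction}."""
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files
            results[path.name] = predict_fn(text)
    return results
```

It would be called along the lines of `predict_folder("reviews", predict)`, where `reviews` is a folder of .txt files, one per user.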

Obs: The last time I ran this code was around 6 months ago, but I think it is still working. If you have any problems, let me know, please.

di-press avatar Nov 04 '21 16:11 di-press

Hello, I would like to ask if the output result through predictor.py is [0.0, 1.0, 0.0, 0.0, 1.0](EXT, NEU, AGR, CON, OPN), does this mean that the personality represented by this message is NEU and OPN?

SmallZhangZhang avatar Jan 23 '22 06:01 SmallZhangZhang

I think the output represents both "Neu" and "Opn", not only one. How do I run this code? I still cannot run it.
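If that reading is right, each position in the output is an independent yes/no flag for one trait. A small sketch that maps the list to labels, using the trait order quoted in the question (`positive_traits` is a hypothetical helper, not part of predictor.py):

```python
TRAITS = ("EXT", "NEU", "AGR", "CON", "OPN")  # order as quoted in this thread


def positive_traits(output, traits=TRAITS):
    """Return the labels of every trait whose flag is 1.0."""
    if len(output) != len(traits):
        raise ValueError(f"expected {len(traits)} values, got {len(output)}")
    return [trait for trait, flag in zip(traits, output) if flag == 1.0]


print(positive_traits([0.0, 1.0, 0.0, 0.0, 1.0]))  # ['NEU', 'OPN']
```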

6wom9 avatar Mar 23 '22 07:03 6wom9

I think the output represents both "Neu" and "Opn", not only one. How to run this code? I still can not run this code.

I did what the author said: "You need to follow steps 1, 2, and 3 in README. Then you can run predictor.py to predict the personality traits for new input." I changed one parameter in step 2: I changed "-num_worker=4" to "-num_worker=1".

In fact, I couldn't run the code at first, but after a lot of trial and error, it worked.

bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=NONE -show_tokens_to_client -pooling_layer -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1

SmallZhangZhang avatar Mar 24 '22 09:03 SmallZhangZhang

Hello di-press

I think the code you modified does not work anymore. I could not run it.

from bert_serving.client import BertClient
bc = BertClient()
#print (bc.encode(['First do it', 'then do it right', 'then do it better']))

The code works well up to this part. Then I get the following error.

KeyboardInterrupt                         Traceback (most recent call last)
in ()
      1 from bert_serving.client import BertClient
----> 2 bc = BertClient()
      3 #print (bc.encode(['First do it', 'then do it right', 'then do it better']))

5 frames
zmq/backend/cython/socket.pyx in zmq.backend.cython.socket.Socket.recv()
zmq/backend/cython/socket.pyx in zmq.backend.cython.socket.Socket.recv()
zmq/backend/cython/socket.pyx in zmq.backend.cython.socket._recv_copy()
/usr/local/lib/python3.7/dist-packages/zmq/backend/cython/checkrc.pxd in zmq.backend.cython.checkrc._check_rc()

KeyboardInterrupt:
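The traceback shows the call waiting inside a ZeroMQ recv() until it was interrupted, which is what one would expect if the client cannot reach a running BERT server. A minimal sketch for checking this before creating the client, using only the standard library; it assumes the server listens on localhost port 5555, which as far as I know is the bert-serving default, and `server_reachable` is a hypothetical helper:

```python
import socket


def server_reachable(host="localhost", port=5555, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if not server_reachable():
    print("No server on localhost:5555; start bert-serving-start first.")
```

This only tests that the port is open, not that the model has finished loading, so a True result does not guarantee that BertClient() will return quickly.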

6wom9 avatar Apr 13 '22 06:04 6wom9