personality_detection
Predict personality scores of a new text
Hi. How can I feed a review (a new text) to this model and extract the reviewer's personality?
Hi Fateme, I think the best approach is to feed the full essays to the model (skip the psycholinguistic features) and then pass your new test sample to the trained model.
Hi,
I'm currently trying to achieve what Fateme was aiming at, but I'm not exactly sure of the process. Which of the Python scripts should I use to pass my sample text files (reviews of several users) to the trained model to extract their personalities? In short, what are the required steps to infer the personality of multiple users? Thank you for your help! (And really interesting program, by the way.)
Hi,
We added a new Python file, predictor.py, to predict personality traits for new input text. (If you want to pass a file instead, you can easily modify the code.)
Hi @saminfatehir, thanks for uploading predictor.py, it will be very helpful! Do I still need to follow all the steps in the README before running predictor.py? Thanks!
You need to follow steps 1, 2, and 3 in the README. Then you can run predictor.py to predict the personality traits for new input.
Thank you for the predictor.py file! However, I think there is something wrong with the classification phase, at least on my side. For instance, asking it to classify a random short text from Reddit, such as "We can only fix this in two ways: either we also waive environmental regulations to produce REMs (something I, as an environmentalist, am not wholly against if done responsibly, as we need them to manufacture green technology), or we invest in space mining. We have all the technology we need to mine in space, we just refuse to invest in it because the cost is literally astronomical", takes forever (around an hour). Is that a normal waiting time?
Hi @Pompiia, it takes time to build the SVM models from the essays dataset. If you save the models once, you can reuse them later. I have now modified predictor.py to save the models the first time it runs.
Note that you should run the BERT server in another terminal, as in step 2 of the README (to get your text embedding).
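The caching idea can be sketched like this; the file name, dimensions, and toy data below are illustrative placeholders, not the repo's exact code:

```python
# Hedged sketch of model caching: fit the SVM once, persist it with joblib,
# and reload it on later runs. File name and toy data are placeholders.
import os

import numpy as np
from joblib import dump, load
from sklearn.svm import SVC

MODEL_PATH = "svm_trait.joblib"  # the real script keeps one model per trait

def get_classifier(X_train, y_train):
    """Load a cached SVM if one exists; otherwise fit and save it."""
    if os.path.exists(MODEL_PATH):
        return load(MODEL_PATH)           # fast path on later runs
    clf = SVC(kernel="rbf", gamma="scale")
    clf.fit(X_train, y_train)             # slow path, runs only once
    dump(clf, MODEL_PATH)
    return clf

# toy stand-in for the BERT essay embeddings and binary trait labels
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))
y = np.array([0, 1] * 10)
clf = get_classifier(X, y)
```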
It's amazing, thank you very much!
However, I'm encountering the following issue:
Traceback (most recent call last):
  File "predictor.py", line 177, in <module>
    predictions = predict(x)
  File "predictor.py", line 171, in predict
    results = predict_with_model(embeddings)
  File "predictor.py", line 153, in predict_with_model
    predicts = classifier.predict(embeddings4layers)
  File "C:\Users\alexa\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\svm\base.py", line 567, in predict
    y = super(BaseSVC, self).predict(X)
  File "C:\Users\alexa\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\svm\base.py", line 325, in predict
    X = self._validate_for_predict(X)
  File "C:\Users\alexa\AppData\Local\Programs\Python\Python37\lib\site-packages\sklearn\svm\base.py", line 478, in _validate_for_predict
    (n_features, self.shape_fit_[1]))
ValueError: X.shape[1] = 3072 should be equal to 3156, the number of features at training time
Could you help me on that matter?
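For what it's worth, 3156 − 3072 = 84, so my guess is that the cached models were trained on the BERT embedding concatenated with 84 extra feature columns (plausibly psycholinguistic features), while predictor.py supplies only the embedding. A toy illustration, with all names hypothetical:

```python
# Toy illustration of the shape mismatch: a pooled 4-layer BERT embedding is
# 4 x 768 = 3072 dims, but the trained model expects 3156, leaving 84 extra
# feature columns unaccounted for. Purely hypothetical reconstruction.
import numpy as np

bert_dims = 4 * 768              # 3072: what predictor.py passes in
trained_dims = 3156              # what the cached SVM was fitted on
extra_dims = trained_dims - bert_dims
print(extra_dims)                # -> 84

embedding = np.zeros((1, bert_dims))
extra_features = np.zeros((1, extra_dims))   # placeholder values
combined = np.hstack([embedding, extra_features])
assert combined.shape == (1, trained_dims)   # now matches training time
```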
@saminfatehir Hi,
I ran the newly added predictor.py with my own text (a 72.8 KB txt file) on my lab's computer. I also ran the BERT server in the terminal as in step 2, and all workers are ready. But the output only prints the text again, and it has been running for around 2 hours. Is the model running correctly?
Thanks for helping in advance!
Here are my screenshots
Hi @kkkkangx,
The script reads the text from user input. When you run the original predictor.py you'll see this:
Then you should enter your custom text like this and press Enter:
@saminfatehir Hi,
Thank you so much, I have tried and it worked.
I have another question: how can I save the trained model? I have over 200 texts to predict, so I wonder if I can save the trained model and just reuse it next time, which would save a lot of time.
Thanks in advance!
predictor.py creates and saves the SVM models the first time it runs, and reuses them afterwards.
@saminfatehir Hi,
It worked well. Thank you :)
If I understand correctly, the final output of the SVM models is a binary variable indicating whether the corresponding personality trait is displayed in the text. Are there any problems with my understanding of the final output?
In my project, I want an estimated continuous probability that a personality trait is displayed in my text, so I changed line 150 of predictor.py to predicts = classifier.predict_proba(embeddings4layers)[:,1]. I am not sure whether this gives the output I want (i.e. the probability)?
Looking forward to your reply.
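Here is a toy check of the predict_proba idea (not the repo's code): scikit-learn's SVC only exposes predict_proba when it was fitted with probability=True, so the saved models may need to be retrained with that flag.

```python
# Toy check: SVC.predict_proba works only if the model was fitted with
# probability=True (Platt scaling); otherwise it raises AttributeError.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# two well-separated toy classes standing in for the essay embeddings
X = np.vstack([rng.normal(loc=-2, size=(20, 5)),
               rng.normal(loc=2, size=(20, 5))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="rbf", gamma="scale", probability=True)  # note the flag
clf.fit(X, y)

proba = clf.predict_proba(X[:3])[:, 1]   # P(trait displayed) per sample
```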
Hi @saminfatehir,
Continuing our conversation on this issue as this is what my current hiccup pertains to.
Continuing from our previous conversation:
I wasn't aware that I had to open a second terminal to complete step 3. I was under the impression that I should wait for the process to finish and then begin step 3 in the same terminal.
However, you are completely correct. After opening a second terminal and starting step 3, it ran to completion with no issues.
For this issue:
I ran predictor.py, inputted some text and received the attached error.
I attempted to re-run the script as an administrator, to no avail. I really appreciate all your help thus far.
Hello,
I'm having trouble with this project. I've been trying to complete it for a long time, but I haven't managed to yet.
I followed steps 1, 2, and 3, then ran predictor.py.
I got an error message.
Hi @6wom9 !
I couldn't run the code on my computer, but I could run it on Google Colaboratory through a Jupyter Notebook (an .ipynb file). You can find the notebook here:
https://github.com/di-press/algorithms_personality/blob/master/text_personality_extractor.ipynb
It is exactly the same code the authors wrote, only adapted to run on Google Colab. Note that you need to enable a GPU in the Colab notebook environment. If you are a little familiar with notebooks or Colab, it won't be hard to execute. Before executing the code, go to:
Edit > Notebook Settings > Hardware Accelerator (select GPU) > Save
Now you have a GPU and are able to run the code. Then do:
Runtime > Run all
And wait; it usually takes more than an hour. Finally, you input the text through the terminal when you see the message:
"Enter a new text:"
Then you paste (or type) into the terminal the text that you want analysed. If you need to analyse a lot of texts, instead of copying each one into the terminal exhaustively, you can change predictor.py to parse your own files (this is not a complicated task if you are familiar with Python).
Note: the last time I ran this code was around 6 months ago, but I think it still works. If you have any problems, please let me know.
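If it helps, the batch approach could be sketched like this; `predict` and `predict_folder` are hypothetical names standing in for whatever prediction routine you factor out of predictor.py:

```python
# Hypothetical sketch of batch prediction over .txt files, instead of pasting
# each text at the "Enter a new text:" prompt. `predict` is a stand-in for
# the real routine factored out of predictor.py.
from pathlib import Path

def predict(text):
    # placeholder so the sketch runs standalone; the real function returns
    # five binary scores (EXT, NEU, AGR, CON, OPN) from the SVM models
    return [0.0, 0.0, 0.0, 0.0, 0.0]

def predict_folder(folder):
    """Return {file name: trait scores} for every .txt file in `folder`."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        results[path.name] = predict(path.read_text(encoding="utf-8"))
    return results
```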
Hello, I would like to ask: if the output of predictor.py is [0.0, 1.0, 0.0, 0.0, 1.0] (EXT, NEU, AGR, CON, OPN), does this mean that the personality represented by this message is NEU and OPN?
I think the output represents both NEU and OPN, not only one. How do you run this code? I still cannot run it. I did what the maintainer said: "You need to follow steps 1, 2, and 3 in README. Then you can run predictor.py to predict the personality traits for new input." I changed a parameter in step 2, from "-num_worker=4" to "-num_worker=1".
In fact, I couldn't run the code at first, but after a lot of trial and error, it worked.
bert-serving-start -model_dir uncased_L-12_H-768_A-12/ -num_worker=4 -max_seq_len=NONE -show_tokens_to_client -pooling_layer -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1
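On the earlier question about reading the output vector, here is a minimal sketch of the interpretation (assuming, as discussed above, that 1.0 means the trait was detected):

```python
# Minimal sketch: a 1.0 in a slot means the corresponding Big Five trait was
# detected in the text, so [0.0, 1.0, 0.0, 0.0, 1.0] reads as NEU and OPN.
TRAITS = ["EXT", "NEU", "AGR", "CON", "OPN"]

def detected_traits(output):
    return [t for t, v in zip(TRAITS, output) if v == 1.0]

print(detected_traits([0.0, 1.0, 0.0, 0.0, 1.0]))  # -> ['NEU', 'OPN']
```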
Hello @di-press,
The code you modified does not work anymore, I think. I could not run it.
from bert_serving.client import BertClient
bc = BertClient()
#print (bc.encode(['First do it', 'then do it right', 'then do it better']))
The code works well up to this point; then I get the following error:
KeyboardInterrupt Traceback (most recent call last)
5 frames
zmq/backend/cython/socket.pyx in zmq.backend.cython.socket.Socket.recv()
zmq/backend/cython/socket.pyx in zmq.backend.cython.socket.Socket.recv()
zmq/backend/cython/socket.pyx in zmq.backend.cython.socket._recv_copy()
/usr/local/lib/python3.7/dist-packages/zmq/backend/cython/checkrc.pxd in zmq.backend.cython.checkrc._check_rc()
KeyboardInterrupt: