ABSC
What is the format of aspect_id_new.txt? Can you share an example for the Twitter dataset?
I cannot find the format of aspect_id_new.txt documented anywhere. I need to create one for the Twitter dataset, but there is no example in the repo. Could you please share one? I have listed the aspects I am looking for in a tweet in order to find the sentiment towards them.
[jalal@goku ASC]$ grep -irn "aspect_id_new.txt"
model/ram.py:33:tf.app.flags.DEFINE_string('aspect_id_file_path', 'data/restaurant/aspect_id_new.txt', 'word-id mapping file')
model/at_lstm.py:31:tf.app.flags.DEFINE_string('aspect_id_file_path', 'data/restaurant/aspect_id_new.txt', 'word-id mapping file')
newbie_nn/config.py:38:tf.app.flags.DEFINE_string('aspect_id_file_path', 'data/restaurant/aspect_id_new.txt', 'word-id mapping file')
Binary file newbie_nn/config.pyc matches
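Judging only from the flag help string above ('word-id mapping file'), the file presumably maps each aspect phrase to an integer id, one entry per line. The layout below is a guess based on that description, not something confirmed by the repo docs:

```python
# Hypothetical layout for aspect_id_new.txt -- an assumption, not documented.
# Guess: one "<aspect_phrase> <id>" pair per line, ids starting at 0.
# Multi-word aspects may need joining (e.g. with underscores) if the
# loader splits each line on whitespace.
aspects = ["hillary_clinton", "ted_cruz", "bernie_sanders"]

lines = ["{} {}".format(a, i) for i, a in enumerate(aspects)]
print("\n".join(lines))
# hillary_clinton 0
# ted_cruz 1
# bernie_sanders 2
```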
[jalal@goku ASC]$ python model/lcr.py --train_file_path data/absa/twitter/1train_new.txt --test_file_path data/absa/twitter/test.txt --embedding_file_path data/absa/twitter/twitter_word_embedding_partial_300_42b.txt --learning_rate 0.1 --batch_size 25 --n_iter 50 --random_base 0.1 --l2_reg 0.00001 --keep_prob1 0.5 --keep_prob2 0.5 --word_id_file_path data/twitter/aspect_id_new.txt
Parameters:
aspect_id_file_path=data/restaurant/aspect_id_new.txt
batch_size=25
display_step=4
embedding_dim=300
embedding_file_path=data/absa/twitter/twitter_word_embedding_partial_300_42b.txt
is_r=1
keep_prob1=0.5
keep_prob2=0.5
l2_reg=1e-05
learning_rate=0.1
max_doc_len=20
max_sentence_len=80
max_target_len=10
method=AE
model_num=100
n_class=3
n_hidden=300
n_iter=50
n_layer=3
prob_file=prob1.txt
random_base=0.1
saver_file=prob1.txt
t1=last
t2=last
test_file_path=data/absa/twitter/test.txt
test_file_path_r=data/restaurant/rest_2014_lstm_test_new.txt
train_file_path=data/absa/twitter/1train_new.txt
train_file_path_r=data/restaurant/rest_2014_lstm_train_new.txt
validate_file_path=data/restaurant/rest_2014_lstm_test_new.txt
validate_file_path_r=data/restaurant/rest_2014_lstm_test_new.txt
word_id_file_path=data/twitter/aspect_id_new.txt
a bad word embedding: 10213
(10215, 300)
10215 10215
I am lcr_rot.
2018-04-20 02:14:33.667016: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-20 02:14:33.667051: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-20 02:14:33.667060: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-04-20 02:14:33.667068: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-20 02:14:33.667075: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
load word-to-id done!
Traceback (most recent call last):
  File "model/lcr.py", line 255, in <module>
    tf.app.run()
  File "/scratch/sjn-p2/anaconda/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "model/lcr.py", line 143, in main
    FLAGS.max_target_len
  File "/scratch2/debate_tweets/sentiment/ASC/data_prepare/utils.py", line 135, in load_inputs_twitter
    y.append(lines[i + 2].strip().split()[0])
IndexError: list index out of range
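The IndexError comes from utils.py indexing `lines[i + 2]`, which suggests load_inputs_twitter reads three lines per example and the file passed in does not follow that layout. A minimal sketch of the parser that indexing implies (the sentence / target / label triple layout, with `$T$` marking the target position, is an assumption inferred from the traceback, not repo documentation):

```python
def load_triples(text):
    """Parse an assumed 3-lines-per-example layout:
    line 1: sentence with the target replaced by $T$,
    line 2: the target phrase,
    line 3: the polarity label (e.g. -1, 0 or 1).
    Mirrors what utils.load_inputs_twitter appears to index."""
    lines = text.strip().split("\n")
    examples = []
    for i in range(0, len(lines), 3):
        sentence = lines[i].strip()
        target = lines[i + 1].strip()
        # Same expression that raised the IndexError in utils.py line 135:
        label = lines[i + 2].strip().split()[0]
        examples.append((sentence, target, label))
    return examples

sample = "i love $T$ so much\nbernie sanders\n1\n"
print(load_triples(sample))
# [('i love $T$ so much', 'bernie sanders', '1')]
```

If the file has a different number of lines per example (as a bare list of aspect names would), `lines[i + 2]` runs off the end, which matches the error above.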
[jalal@goku ASC]$ cat data
data/ data_prepare/
[jalal@goku ASC]$ cat data/twitter/aspect_id_new.txt
Hillary Clinton
Ted Cruz
Bernie Sanders
Sanders
Bernie
[jalal@goku ASC]$
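The file above contains only aspect phrases with no ids. Assuming the file is meant to map each phrase to an integer id (per the 'word-id mapping file' flag description — again an assumption, not documented), a one-off conversion could look like:

```python
# Convert a plain list of aspect phrases into "<phrase> <id>" lines.
# The target format is an assumption based on the flag description.
def build_mapping(phrases):
    seen = []
    for p in (s.strip() for s in phrases):
        if p and p not in seen:  # skip blanks and duplicates
            seen.append(p)
    # Note: if the loader splits on whitespace, multi-word phrases
    # may additionally need joining, e.g. "Hillary_Clinton".
    return ["{} {}".format(p, i) for i, p in enumerate(seen)]

raw = ["Hillary Clinton", "Ted Cruz", "Bernie Sanders", "Sanders", "Bernie"]
for line in build_mapping(raw):
    print(line)
# Hillary Clinton 0
# Ted Cruz 1
# Bernie Sanders 2
# Sanders 3
# Bernie 4
```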
Have you found a solution to this issue? Please help me if you can.