keras-applications
GloVe embeddings with Keras
Hi,
I'm setting up my Keras layer for GloVe embeddings as follows:
X = X[:set_size]
y = y[:set_size]
yy = []
for i in range(0,len(X)):
    yy = np.concatenate((yy,X[i,0:set_size]), axis=0)
X = "".join([" "+i if not i.startswith("'") and i not in string.punctuation else i for i in yy]).strip()
t = Tokenizer()
t.fit_on_texts(X)
encoded_docs = t.texts_to_sequences(X)
# pad documents to a max length of 4 words
max_length = 4
vocab_size = len(t.word_index) + 1
embeddings_index = dict()
f = open('glove.6B/glove.6B.50d.txt', encoding='utf8')
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()
print('Loaded %s word vectors.' % len(embeddings_index))
# create a weight matrix for words in training docs
embedding_matrix = zeros((vocab_size, 50))
for word, i in t.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
vocab_size = len(t.word_index) + 1
max_len = 200
sequence_input = Input(shape=(max_len,), dtype='int32', name="seq_input")
embedding_layer = Embedding(vocab_size, 50, weights=[embedding_matrix], input_length=200, trainable=False)
e = embedding_layer(sequence_input)
I then pass this layer to a GRU neural network.
testGru = GRU(units=16, return_sequences=False )(e)
classification = Dense(1, activation='sigmoid')(testGru)
model = Model(sequence_input, outputs=classification)
I then compile the model and call fit:
validation_split = int((0.05 * set_size / batch_size)) * batch_size / set_size
model.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
try:
    with redirect_stdout(sys.stderr):
        with graph.as_default():
            history = model.fit(np.array(X), y, validation_split=validation_split, shuffle=True, batch_size=batch_size, epochs=epochs, verbose=1, callbacks=fit_callbacks)
except Exception as e:
    raise e
I get the following error:
ValueError: Error when checking input: expected seq_input to have 2 dimensions, but got array with shape ()
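One possible reading of this message: after the "".join(...) step, X is a single Python string, so np.array(X) is a 0-dimensional array with shape (), while seq_input expects a 2-D array of shape (samples, max_len). The sketch below only illustrates that shape difference. The sample text, the integer sequences, and the helper pad_to_length are hypothetical; the helper is a simplified NumPy-only stand-in for Keras' pad_sequences, not the code from this post.

```python
import numpy as np

# Hypothetical stand-in for the joined text: one Python string,
# not a list of documents.
joined_text = "the quick brown fox"

# Wrapping a plain string in np.array gives a 0-dimensional array.
as_array = np.array(joined_text)
print(as_array.shape)  # ()

# Hypothetical integer sequences, standing in for the output of
# Tokenizer.texts_to_sequences on a list of documents.
encoded_docs = [[4, 9, 2], [7, 1], [3, 8, 5, 6, 2]]


def pad_to_length(sequences, max_len):
    """Left-pad with zeros, or keep the last max_len ids, so that
    every row has exactly max_len entries (max_len must be > 0)."""
    padded = np.zeros((len(sequences), max_len), dtype="int32")
    for row, seq in enumerate(sequences):
        trimmed = seq[-max_len:]
        if trimmed:
            padded[row, max_len - len(trimmed):] = trimmed
    return padded


# A 2-D integer array of shape (samples, max_len) is the kind of
# input an Input(shape=(max_len,)) layer expects.
model_input = pad_to_length(encoded_docs, max_len=4)
print(model_input.shape)  # (3, 4)
```

If this reading is right, the array passed to fit would be the padded form of encoded_docs (with max_len=200 to match seq_input) rather than np.array(X); that is an assumption about the intent, since the post does not show a padding step.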
Please help
Best regards, Cartik