machine learning - Keras: model.evaluate vs model.predict accuracy difference in multi-class NLP task
I am training a simple model in Keras for an NLP task with the following code. The variable names are self-explanatory for the train, test, and validation sets. The dataset has 19 classes, so the final layer of the network has 19 outputs. The labels are also one-hot encoded.
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, BatchNormalization, Dense
from keras.utils import np_utils

nb_classes = 19
model1 = Sequential()
# Frozen embedding layer initialised from a pre-trained matrix
model1.add(Embedding(nb_words, embedding_dim, weights=[embedding_matrix],
                     input_length=max_sequence_length, trainable=False))
model1.add(LSTM(num_lstm, dropout=rate_drop_lstm, recurrent_dropout=rate_drop_lstm))
model1.add(Dropout(rate_drop_dense))
model1.add(BatchNormalization())
model1.add(Dense(num_dense, activation=act))
model1.add(Dropout(rate_drop_dense))
model1.add(BatchNormalization())
model1.add(Dense(nb_classes, activation='sigmoid'))
model1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# One-hot encode the labels
ytrain_enc = np_utils.to_categorical(train_labels)
yval_enc = np_utils.to_categorical(val_labels)
ytestenc = np_utils.to_categorical(test_labels)

model1.fit(train_data, ytrain_enc, validation_data=(val_data, yval_enc),
           epochs=200, batch_size=384, shuffle=True, verbose=1)
After the first epoch, this gives me these outputs.
Epoch 1/200
216632/216632 [==============================] - 2442s - loss: 0.1427 - acc: 0.9443 - val_loss: 0.0526 - val_acc: 0.9826
Then I evaluate my model on the testing dataset, and this also shows me an accuracy of around 0.98.
model1.evaluate(test_data, y = ytestenc, batch_size=384, verbose=1)
However, the labels are one-hot encoded, so I need a prediction vector of classes in order to generate a confusion matrix etc. So I use:
predicted_classes = model1.predict_classes(test_data, batch_size=384, verbose=1)
temp = sum(test_labels == predicted_classes)
temp/len(test_labels)
0.83
This shows that the predicted classes are 83% accurate, whereas model1.evaluate shows 98% accuracy! What am I doing wrong here? Is my loss function okay with categorical class labels? Is my choice of the sigmoid activation function for the prediction layer okay? Or is there a difference in the way Keras evaluates a model? Please suggest what could be wrong. This is my first try at making a deep model, so I don't have much understanding of what's wrong here.
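I have found the problem. metrics=['accuracy'] chooses the accuracy function automatically based on the cost function. So using binary_crossentropy shows binary accuracy, not categorical accuracy. Binary accuracy scores each of the 19 outputs separately, so the many correct zeros in a one-hot label inflate the number. Using categorical_crossentropy automatically switches to categorical accuracy, and now it is the same as the value calculated manually using model1.predict(). Yu-Yang was right to point out the cost function and activation function for a multi-class problem.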
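To make the gap between the two metrics concrete, here is a minimal NumPy sketch. The two helper functions are my own approximations of what the Keras metrics compute, not the Keras implementations, and the labels and the constant "prediction" of 0.1 are made-up values for illustration only.

```python
import numpy as np

def binary_accuracy(y_true, y_prob):
    # Mean over every individual output unit after thresholding at 0.5.
    return float(np.mean(y_true == (y_prob > 0.5)))

def categorical_accuracy(y_true, y_prob):
    # Fraction of samples whose arg-max class matches the true class.
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

nb_classes = 19
labels = np.array([0, 3, 7, 18])       # hypothetical integer labels
y_true = np.eye(nb_classes)[labels]    # one-hot encoded, shape (4, 19)

# A useless "model" that outputs 0.1 for every class of every sample.
y_prob = np.full_like(y_true, 0.1)

print(binary_accuracy(y_true, y_prob))       # 18/19, roughly 0.947
print(categorical_accuracy(y_true, y_prob))  # 0.25 (argmax ties resolve to class 0)
```

Even a model that never predicts any class gets 18 of the 19 outputs "right" for each sample, which is why binary accuracy sits so far above the per-sample class accuracy.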
P.S.: One can get both categorical and binary accuracy by using metrics=['binary_accuracy', 'categorical_accuracy']
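For the confusion matrix mentioned in the question, the class vector can be taken as the arg-max of the predicted probabilities. The sketch below uses a small hand-written probs array as a stand-in for the output of model1.predict(test_data), and confusion_matrix is a hypothetical helper written here for illustration, not a Keras function.

```python
import numpy as np

def confusion_matrix(true_classes, pred_classes, nb_classes):
    # Rows are true classes, columns are predicted classes.
    cm = np.zeros((nb_classes, nb_classes), dtype=int)
    np.add.at(cm, (true_classes, pred_classes), 1)
    return cm

# Hypothetical stand-in for model1.predict(test_data): one row per sample.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.1, 0.7]])
test_labels = np.array([0, 1, 2, 2])   # integer labels, not one-hot

predicted_classes = probs.argmax(axis=1)
accuracy = float(np.mean(predicted_classes == test_labels))
cm = confusion_matrix(test_labels, predicted_classes, nb_classes=3)

print(accuracy)
print(cm)
```

The diagonal of cm counts the correct predictions, so cm.trace() divided by the number of samples should agree with the categorical accuracy.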