DeepTCR
Performance
Thanks for writing this package; it is very useful.
To check whether my test code is correct, I used the training set itself as the independent test set, but performance was very poor: the AUC on this "independent" test set was only 0.6654, even though training reported a testing AUC of 1.0. I don't know what is wrong. Do you have any suggestions? Thanks!
from DeepTCR.DeepTCR import DeepTCR_WF
from sklearn.metrics import roc_curve, auc
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
def test_roc_curve(test_y, test_y_score, color):
    fpr, tpr, threshold = roc_curve(test_y, test_y_score)
    roc_auc = auc(fpr, tpr)
    lw = 2
    plt.figure(figsize=(8, 5))
    plt.plot(fpr, tpr, color=color,
             lw=lw, label='tumor_top500 (area = %0.4f)' % roc_auc)
    plt.plot([0, 1], [0, 1], color='blue', lw=lw, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Independent Test Set')
    plt.legend(loc="lower right")
    plt.show()
# Train
# Instantiate the training object
DTCR_WF = DeepTCR_WF('lung_cancer')
# Load data from directories
DTCR_WF.Get_Data(directory='/home/tcr/lung_cancer/pbmc', n_jobs=40, aa_column_beta=0)
DTCR_WF.Get_Train_Valid_Test(test_size=0.5)
DTCR_WF.Train(batch_size=100)
# ROC curve
DTCR_WF.AUC_Curve(by='tumor_top500')
# Test
# Load the independent test set
DeepTCR_WF_test = DeepTCR_WF('testset')
DeepTCR_WF_test.Get_Data(directory='/home/tcr/lung_cancer/pbmc', n_jobs=40, aa_column_beta=0)
beta_sequences = DeepTCR_WF_test.beta_sequences
sample_labels = DeepTCR_WF_test.sample_id
# Predict
DTCR_WF.Sample_Inference(sample_labels=sample_labels, beta_sequences=beta_sequences, batch_size=100)
Inference_Pred = DTCR_WF.Inference_Pred
# ROC curve
pos = np.ones(121, dtype=np.float64)
neg = np.zeros(363, dtype=np.float64)
test_y = np.hstack((pos, neg))
test_y_score = Inference_Pred[:, 0]
test_roc_curve(test_y=test_y, test_y_score=test_y_score, color='yellow')
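The ROC step above hard-codes the labels by position (121 ones followed by 363 zeros), which is only correct if the predictions come back with all positive samples first. A minimal sketch of building the label vector by sample name instead is shown here. The sample names, the `sample_to_label` mapping, and the helper `labels_in_prediction_order` are hypothetical and only for illustration. In DeepTCR the order of the predicted samples may be available as `DTCR_WF.Inference_Sample_List`, but that is an assumption to check against the package documentation, as is which column of `Inference_Pred` holds the positive class.

```python
def labels_in_prediction_order(predicted_samples, sample_to_label):
    """Return one 0/1 label per prediction, in the same order as the predictions."""
    missing = [s for s in predicted_samples if s not in sample_to_label]
    if missing:
        raise KeyError(f"no label for samples: {missing}")
    return [sample_to_label[s] for s in predicted_samples]


# Hypothetical sample names and classes, for illustration only.
sample_to_label = {"p1.tsv": 1, "p2.tsv": 1, "n1.tsv": 0, "n2.tsv": 0}

# Assumed order in which predictions are returned (here: not positives-first).
predicted_samples = ["n1.tsv", "p1.tsv", "n2.tsv", "p2.tsv"]

test_y = labels_in_prediction_order(predicted_samples, sample_to_label)
print(test_y)  # [0, 1, 0, 1]
```

Looking labels up by name makes the ROC computation independent of the order in which samples are loaded or returned, and the `KeyError` surfaces any sample that has no known label.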
Results:
Epoch: 60 Training loss: 0.06519 Validation loss: 0.04857 Testing loss: 0.05209 Training Accuracy: 0.99667 Validation Accuracy: 1.0 Testing Accuracy: 1.0 Testing AUC: 1.0
However, the AUC on the independent test set was only 0.6654.
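A gap like this on identical data can come from bookkeeping rather than from the model, for example labels that are not in the same order as the scores, or a score column that is not the positive class. The sketch below uses made-up scores (not DeepTCR output) and a hypothetical helper `rank_auc` to show how each of these changes the AUC. It does not establish that either is the cause here; it is only a way to reason about the symptoms.

```python
def rank_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for four samples, for illustration only.
scores = [0.9, 0.8, 0.2, 0.1]
aligned = [1, 1, 0, 0]      # labels in the same order as the scores
misaligned = [1, 0, 1, 0]   # same labels, wrong order

print(rank_auc(aligned, scores))                    # 1.0
print(rank_auc(misaligned, scores))                 # 0.75
print(rank_auc(aligned, [1 - s for s in scores]))   # 0.0
```

In this toy case a wrong column flips the AUC to the opposite extreme, while misordered labels pull it toward 0.5 without inverting it, so an intermediate value such as 0.6654 looks more like the second pattern than the first.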