Performance

Open zshu000000 opened this issue 2 years ago • 0 comments

Thanks for writing this package; it is very useful.

To check whether my test code is correct, I reused the training set as an independent test set, but the performance was very poor: the AUC on this independent test set was only 0.6654. I don't know what is wrong. Do you have any suggestions? Thanks!

from DeepTCR.DeepTCR import DeepTCR_WF
from sklearn.metrics import roc_curve, auc
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def test_roc_curve(test_y, test_y_score, color):
    fpr, tpr, threshold = roc_curve(test_y, test_y_score)
    roc_auc = auc(fpr, tpr)
    lw = 2
    plt.figure(figsize=(8, 5))
    plt.plot(fpr, tpr, color=color,
             lw=lw, label='tumor_top500 (area = %0.4f)' % roc_auc)
    plt.plot([0, 1], [0, 1], color='blue', lw=lw, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Independent Test Set')
    plt.legend(loc="lower right")
    plt.show()

# Train

# Instantiate training object
DTCR_WF = DeepTCR_WF('lung_cancer')

# load data from directories
DTCR_WF.Get_Data(directory='/home/tcr/lung_cancer/pbmc', n_jobs=40, aa_column_beta=0)

DTCR_WF.Get_Train_Valid_Test(test_size=0.5)
DTCR_WF.Train(batch_size=100)

# ROC curve
DTCR_WF.AUC_Curve(by='tumor_top500')

# Test

# load independent test
DeepTCR_WF_test = DeepTCR_WF('testset')
DeepTCR_WF_test.Get_Data(directory='/home/tcr/lung_cancer/pbmc', n_jobs=40, aa_column_beta=0)

beta_sequences = DeepTCR_WF_test.beta_sequences
sample_labels = DeepTCR_WF_test.sample_id

# predict
DTCR_WF.Sample_Inference(sample_labels=sample_labels, beta_sequences=beta_sequences, batch_size=100)
Inference_Pred = DTCR_WF.Inference_Pred

# ROC curve
# Labels are hard-coded: the first 121 rows are treated as positive, the remaining 363 as negative
pos = np.ones(121, dtype=np.float64)
neg = np.zeros(363, dtype=np.float64)
test_y = np.hstack((pos, neg))
# Scores are taken from column 0 of the predictions
test_y_score = Inference_Pred[:, 0]
test_roc_curve(test_y=test_y, test_y_score=test_y_score, color='yellow')
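One thing that may be worth ruling out is the assumption, built into the hard-coded labels above, that the first 121 prediction rows are the positive samples. If the predictions come back in a different sample order, labels and scores no longer refer to the same samples. Below is a minimal sketch of building the labels by sample name instead. The names `sample_names` (the sample order of the prediction rows) and `label_by_sample` (the known class of each sample) are hypothetical and the file names are made up; how to obtain the actual row order from DeepTCR is not shown here.

```python
def labels_in_prediction_order(sample_names, label_by_sample):
    """Return one 0/1 label per prediction row, looked up by sample name."""
    missing = [s for s in sample_names if s not in label_by_sample]
    if missing:
        # Fail loudly instead of silently mislabeling a sample
        raise KeyError(f"no label for samples: {missing}")
    return [label_by_sample[s] for s in sample_names]


# Hypothetical data: known class per sample file (1 = positive, 0 = negative)
label_by_sample = {"p1.tsv": 1, "p2.tsv": 1, "n1.tsv": 0, "n2.tsv": 0}

# Hypothetical order in which the prediction rows were returned
sample_names = ["n1.tsv", "p1.tsv", "n2.tsv", "p2.tsv"]

test_y = labels_in_prediction_order(sample_names, label_by_sample)
```

With labels built this way, `test_y` follows whatever order the predictions use, so it no longer depends on the directory listing order.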

Results:
Epoch: 60 Training loss: 0.06519 Validation loss: 0.04857 Testing loss: 0.05209
Training Accuracy: 0.99667 Validation Accuracy: 1.0 Testing Accuracy: 1.0 Testing AUC: 1.0

However, the AUC value of the independent test set was only 0.6654.
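A gap between a testing AUC of 1.0 and 0.6654 on the same data could come from how labels and scores are paired rather than from the model itself, though that is only one possible explanation. The sketch below uses made-up scores to illustrate two such effects: pairing scores with labels in the wrong order, and using the score column of the opposite class. The AUC is computed with a small dependency-free pairwise comparison (the probability that a positive outranks a negative, which equals the area under the ROC curve); `pairwise_auc` is a hypothetical helper, not part of DeepTCR or scikit-learn.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A tie between a positive and a negative counts as half a correct pair
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for four samples from a model that separates them perfectly
scores = [0.9, 0.8, 0.2, 0.1]

aligned_labels = [1, 1, 0, 0]     # labels in the same order as the scores
misordered_labels = [1, 0, 1, 0]  # same labels, paired with the wrong rows

auc_aligned = pairwise_auc(aligned_labels, scores)
auc_misordered = pairwise_auc(misordered_labels, scores)

# Scores of the opposite class (1 - p for a two-class output)
auc_other_column = pairwise_auc(aligned_labels, [1 - s for s in scores])
```

In this toy case the aligned labels give an AUC of 1.0, the misordered labels give 0.75, and the opposite-class column gives 0.0. Comparing the AUC across each prediction column, and checking the sample order of the prediction rows, would help tell these cases apart.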

Oct 17 '22 01:10 zshu000000
