FlagEmbedding

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

Open sarang-26 opened this issue 1 year ago • 3 comments

I keep getting this ValueError when running the eval_msmarco.py script as part of fine-tuning the embedding. I am running the script on Colab with a T4 GPU.

Please check the output below.

!python -m FlagEmbedding.baai_general_embedding.finetune.eval_msmarco \
--encoder BAAI/bge-base-en-v1.5 \
--fp16 \
--add_instruction ""\
--k 100 \
--corpus_data /content/corpus_content_msmacro.json \
--query_data /content/eval_content_msmacro.json

2024-06-25 09:25:00.857213: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-25 09:25:00.857255: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-25 09:25:00.858912: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-25 09:25:00.867709: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-25 09:25:02.016193: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Generating train split: 1000 examples [00:00, 94186.29 examples/s]
Generating train split: 1000 examples [00:00, 181705.32 examples/s]
Inference Embeddings: 100% 4/4 [00:02<00:00, 1.94it/s]
Inference Embeddings: 100% 4/4 [00:00<00:00, 12.14it/s]
Searching: 100% 4/4 [00:00<00:00, 353.52it/s]
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/FlagEmbedding/baai_general_embedding/finetune/eval_msmarco.py", line 266, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/FlagEmbedding/baai_general_embedding/finetune/eval_msmarco.py", line 260, in main
    metrics = evaluate(retrieval_results, scores, ground_truths)
  File "/usr/local/lib/python3.10/dist-packages/FlagEmbedding/baai_general_embedding/finetune/eval_msmarco.py", line 201, in evaluate
    auc = roc_auc_score(pred_hard_encodings1d, preds_scores1d)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/metrics/_ranking.py", line 572, in roc_auc_score
    return _average_binary_score(
  File "/usr/local/lib/python3.10/dist-packages/sklearn/metrics/_base.py", line 75, in _average_binary_score
    return binary_metric(y_true, y_score, sample_weight=sample_weight)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/metrics/_ranking.py", line 339, in _binary_roc_auc_score
    raise ValueError(
ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

sarang-26 Jun 25 '24 09:06

Hi @sarang-26, it might be due to the runtime environment. You can try installing scikit-learn 1.3.2 with pip install scikit-learn==1.3.2; that version did not show this issue in my experiments.

staoxiao Jun 26 '24 05:06

pip install scikit-learn==1.3.2

@staoxiao I tried the above solution, but I still get the same error. I have not changed the dataset either; I just followed the instructions, and I still hit this issue.

sarang-26 Jun 27 '24 08:06

@sarang-26 Hi, I ran into the same problem too. Have you solved it yet?

ling-chun Aug 29 '24 02:08
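
For anyone hitting the same traceback: the message says that the label array passed as y_true to roc_auc_score contains only one class, so ROC AUC has nothing to compare. Judging only by the variable name in the traceback (pred_hard_encodings1d), that array probably marks which retrieved passages are ground-truth hits, so it could end up all zeros if none of the retrieved passages matches a ground-truth positive. That reading is an assumption and is not confirmed anywhere in this thread. The sketch below shows one way to guard against the single-class case. The function safe_roc_auc is hypothetical (it is not part of FlagEmbedding or scikit-learn) and uses a simple pairwise AUC in pure Python as a stand-in for roc_auc_score, so it runs without extra dependencies.

```python
from typing import Optional, Sequence


def safe_roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> Optional[float]:
    """Hypothetical helper: return ROC AUC, or None if y_true has a single class.

    Uses the pairwise definition of AUC (the fraction of positive/negative
    pairs where the positive has the higher score, ties counted as 0.5).
    This is O(n_pos * n_neg), which is fine for a sanity check only.
    """
    pos = [s for t, s in zip(y_true, y_score) if t]
    neg = [s for t, s in zip(y_true, y_score) if not t]

    # With only positives or only negatives, ROC AUC is undefined.
    if not pos or not neg:
        return None

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# All labels are 0 (no retrieved item is a hit): AUC is undefined.
print(safe_roc_auc([0, 0, 0], [0.9, 0.4, 0.1]))  # None

# Both classes present: 3 of 4 positive/negative pairs are ranked correctly.
print(safe_roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))  # 0.75
```

If the helper returns None on your data, it may be worth checking that the positives in the query file actually appear in the corpus file before looking at library versions.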