2BiVQA
2BiVQA is a no-reference, deep-learning-based video quality assessment (VQA) metric.
2BiVQA: Double Bi-LSTM based Video Quality Assessment of UGC Videos
This repository contains the code of our paper 2BiVQA: Double Bi-LSTM based Video Quality Assessment of UGC Videos. If you use any part of our code, please cite:
@article{telili20222bivqa,
title={2BiVQA: Double Bi-LSTM based Video Quality Assessment of UGC Videos},
author={Telili, Ahmed and Fezza, Sid Ahmed and Hamidouche, Wassim and Meftah, Hanene FZ},
journal={arXiv preprint arXiv:2208.14774},
year={2022}
}
- Requirements
- Features extraction
- Model Training
- Test
  - On KonViD-1K
  - On LIVE_VQC
- Demo
- Evaluate
- Performance Benchmark
- References
Requirements
pip install -r requirements.txt
Features extraction
Please note that the metadata should be a CSV file with two columns: video name and MOS (mean opinion score).
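The two-column layout described above can be produced with Python's standard `csv` module. This is only a minimal sketch: the header names (`video_name`, `MOS`), the presence of a header row, and the sample file names and scores are assumptions made for illustration, so check what `extract_features.py` actually expects before relying on them.

```python
import csv
import io

# Hypothetical video names and MOS values, for illustration only.
rows = [
    ("video_0001.mp4", 3.42),
    ("video_0002.mp4", 2.87),
]


def write_metadata(handle, rows):
    """Write a two-column metadata CSV: video name, then MOS."""
    writer = csv.writer(handle)
    # Header names are an assumption; adapt them to what the script reads.
    writer.writerow(["video_name", "MOS"])
    for name, mos in rows:
        writer.writerow([name, f"{mos:.2f}"])


# Write to an in-memory buffer so the sketch has no side effects.
buffer = io.StringIO()
write_metadata(buffer, rows)
metadata_text = buffer.getvalue()
print(metadata_text)
```

To write a real file instead, pass a handle opened with `open("metadata.csv", "w", newline="")` to `write_metadata`.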
python3 extract_features.py [-h] [-v 'path to videos directory']
[-f 'path to meta-data csv file']
[-o 'overlapping between patches']
[-fl 'flag: 0 for videos and 1 for images']
To extract features from images, set the flag to 1.
ResNet50 is used for feature extraction.
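To give an intuition for the `-o` (overlap) option, the sketch below computes the patch grid for one frame. It is not the repository's implementation: the 224x224 patch size (the usual ResNet50 input size), the interpretation of the overlap as a number of shared pixels, and the function names are all assumptions.

```python
def patch_origins(length, patch=224, overlap=0):
    """Return start offsets of patches of size `patch` along one axis.

    Consecutive patches share `overlap` pixels, so the stride is
    patch - overlap. Patches that would run past `length` are dropped.
    """
    if not 0 <= overlap < patch:
        raise ValueError("overlap must satisfy 0 <= overlap < patch")
    stride = patch - overlap
    return list(range(0, length - patch + 1, stride))


def patch_boxes(height, width, patch=224, overlap=0):
    """Return (top, left, bottom, right) boxes for a height x width frame."""
    return [
        (top, left, top + patch, left + patch)
        for top in patch_origins(height, patch, overlap)
        for left in patch_origins(width, patch, overlap)
    ]


# Hypothetical 960x540 frame, without overlap and with a half-patch overlap.
boxes_no_overlap = patch_boxes(540, 960, overlap=0)
boxes_overlap = patch_boxes(540, 960, overlap=112)
print(len(boxes_no_overlap), len(boxes_overlap))
```

A larger overlap yields more patches per frame, and therefore more feature vectors and longer extraction time.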
Model Training (optional):
You can skip this step and test the pre-trained models directly, as described in the next section.
To train your own model:
python End2End_train.py [-h] [-nf number of frames to be extracted] [-b batch_size]
To train your own spatial pooling model on other image datasets:
python spatial_train.py [-h] [-p number of patches] [-b batch_size]
Test:
To test the model:
a) On KonViD-1K:
python test_model.py --dataset konvid
Methods | SROCC | PLCC | KROCC | RMSE |
---|---|---|---|---|
2BiVQA | 0.8463 | 0.8404 | 0.6529 | 0.3620 |
b) On LIVE_VQC:
python test_model.py --dataset live
Methods | SROCC | PLCC | KROCC | RMSE |
---|---|---|---|---|
2BiVQA | 0.7614 | 0.8325 | 0.6212 | 9.9799 |
Demo:
To predict the quality of your own videos using a pre-trained model:
python demo.py [-h] [-nf number of frames to be extracted] [-m path to pretrained model] [-f path to videos dir]
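The `-nf` option fixes how many frames are taken from each video. How `demo.py` selects those frames is not described here; the sketch below shows one common choice, uniform sampling over the video length, purely as an assumption. The function name is hypothetical.

```python
def sample_frame_indices(total_frames, num_frames):
    """Pick `num_frames` indices spread evenly over [0, total_frames)."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    if num_frames >= total_frames:
        # Short video: use every available frame.
        return list(range(total_frames))
    step = total_frames / num_frames
    # Take the frame at the centre of each of the num_frames equal segments.
    return [int(step * i + step / 2) for i in range(num_frames)]


# Hypothetical 240-frame video, 8 frames requested.
indices = sample_frame_indices(total_frames=240, num_frames=8)
print(indices)
```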
Evaluate:
To evaluate the model, provide a CSV file with two columns, 'Mos' and 'Predicted':
python evaluate.py --mos_pred konvid.csv
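For reference, the four reported metrics (SROCC, PLCC, KROCC, RMSE) can be computed from such a file with the standard library alone. This is a minimal sketch, not the code of `evaluate.py`: the sample scores are made up, and it omits the nonlinear logistic mapping that many VQA studies apply to predictions before computing PLCC and RMSE.

```python
import csv
import io
import math


def _ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def plcc(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(_ranks(x), _ranks(y))


def krocc(x, y):
    """Kendall tau-b, computed over all pairs in O(n^2)."""
    conc = disc = only_x_tied = only_y_tied = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    denom = math.sqrt((conc + disc + only_x_tied) * (conc + disc + only_y_tied))
    return (conc - disc) / denom


def rmse(x, y):
    """Root mean squared error between the two score lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def load_scores(handle):
    """Read the 'Mos' and 'Predicted' columns as lists of floats."""
    mos, pred = [], []
    for row in csv.DictReader(handle):
        mos.append(float(row["Mos"]))
        pred.append(float(row["Predicted"]))
    return mos, pred


# Made-up scores standing in for a real results file.
sample = io.StringIO("Mos,Predicted\n1.0,1.5\n2.0,2.5\n3.0,2.0\n4.0,4.5\n5.0,4.0\n")
mos, pred = load_scores(sample)
results = {
    "SROCC": srocc(mos, pred),
    "PLCC": plcc(mos, pred),
    "KROCC": krocc(mos, pred),
    "RMSE": rmse(mos, pred),
}
print(results)
```

Higher SROCC, PLCC, and KROCC indicate better agreement with the subjective scores, while a lower RMSE is better. RMSE depends on the MOS scale of the dataset, so it is not comparable across datasets with different scales.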
Performance Benchmark:
KonViD-1K [1]:
Methods | SROCC | PLCC | KROCC | RMSE |
---|---|---|---|---|
BRISQUE | 0.6567 | 0.6576 | 0.4761 | 0.4813 |
NIQE | 0.5417 | 0.5530 | 0.3790 | 0.5336 |
ILNIQE | 0.5264 | 0.5400 | 0.3692 | 0.5406 |
VIIDEO | 0.2988 | 0.3002 | 0.2036 | 0.6101 |
GM-LOG | 0.6578 | 0.6636 | 0.4770 | 0.4818 |
HIGRADE | 0.7206 | 0.7269 | 0.5319 | 0.4391 |
FRIQUEE | 0.7472 | 0.7482 | 0.5509 | 0.4252 |
CORNIA | 0.7169 | 0.7135 | 0.5231 | 0.4486 |
HOSA | 0.7654 | 0.7664 | 0.5690 | 0.4142 |
V-BLIINDS | 0.7101 | 0.7037 | 0.5188 | 0.4595 |
TLVQM | 0.7729 | 0.7688 | 0.5770 | 0.4102 |
ResNet-50 | 0.8018 | 0.8104 | 0.6100 | 0.3749 |
VGG-19 | 0.7741 | 0.7845 | 0.5841 | 0.3958 |
KonCept512 | 0.7349 | 0.7489 | 0.5425 | 0.4260 |
VIDEVAL | 0.7832 | 0.7803 | 0.5845 | 0.4026 |
RAPIQUE | 0.8072 | 0.8175 | 0.6189 | 0.3623 |
2BiVQA | 0.8463 | 0.8404 | 0.6529 | 0.3620 |
LIVE VQC [2]:
Methods | SROCC | PLCC | KROCC | RMSE |
---|---|---|---|---|
BRISQUE | 0.5925 | 0.6380 | 0.4162 | 13.100 |
NIQE | 0.5957 | 0.6286 | 0.4252 | 13.110 |
ILNIQE | 0.5037 | 0.5437 | 0.3555 | 14.148 |
VIIDEO | 0.0332 | 0.2146 | 0.0231 | 16.654 |
GM-LOG | 0.5881 | 0.6212 | 0.4180 | 13.223 |
HIGRADE | 0.6103 | 0.6332 | 0.4391 | 13.027 |
FRIQUEE | 0.6579 | 0.7000 | 0.4770 | 12.198 |
CORNIA | 0.6719 | 0.7183 | 0.4849 | 11.832 |
HOSA | 0.6873 | 0.7414 | 0.5033 | 11.353 |
V-BLIINDS | 0.6939 | 0.7178 | 0.5078 | 11.765 |
TLVQM | 0.7988 | 0.8025 | 0.6080 | 10.145 |
ResNet-50 | 0.6636 | 0.7205 | 0.4786 | 11.591 |
VGG-19 | 0.6568 | 0.7160 | 0.4722 | 11.783 |
KonCept512 | 0.6645 | 0.7278 | 0.4793 | 11.626 |
VIDEVAL | 0.7522 | 0.7514 | 0.5639 | 11.100 |
RAPIQUE | 0.7415 | 0.7659 | 0.5576 | 10.6653 |
2BiVQA | 0.7614 | 0.8325 | 0.6212 | 9.9799 |
YouTube-UGC [3]:
Methods | SROCC | PLCC | KROCC | RMSE |
---|---|---|---|---|
BRISQUE | 0.3820 | 0.3952 | 0.2635 | 0.5919 |
NIQE | 0.2379 | 0.2776 | 0.1600 | 0.6174 |
ILNIQE | 0.2918 | 0.3302 | 0.1980 | 0.6052 |
VIIDEO | 0.0580 | 0.1534 | 0.0389 | 0.6359 |
GM-LOG | 0.3678 | 0.3920 | 0.2517 | 0.5896 |
HIGRADE | 0.7376 | 0.7216 | 0.5478 | 0.4471 |
FRIQUEE | 0.7652 | 0.7571 | 0.5688 | 0.4169 |
CORNIA | 0.5972 | 0.6057 | 0.4211 | 0.5136 |
HOSA | 0.6025 | 0.6047 | 0.4257 | 0.5132 |
V-BLIINDS | 0.5590 | 0.5551 | 0.3899 | 0.5356 |
TLVQM | 0.6693 | 0.6590 | 0.4816 | 0.4849 |
ResNet-50 | 0.7183 | 0.7097 | 0.5229 | 0.4538 |
VGG-19 | 0.7025 | 0.6997 | 0.5091 | 0.4562 |
KonCept512 | 0.5872 | 0.5940 | 0.4101 | 0.5135 |
VIDEVAL | 0.7787 | 0.7733 | 0.5830 | 0.4049 |
RAPIQUE | 0.7610 | 0.7620 | 0.5610 | 0.4060 |
2BiVQA | 0.7716 | 0.7904 | 0.5812 | 0.4047 |
All-Combined:
Methods | SROCC | PLCC | KROCC | RMSE |
---|---|---|---|---|
BRISQUE | 0.5695 | 0.5861 | 0.4030 | 0.5617 |
NIQE | 0.4622 | 0.4773 | 0.322 | 0.6112 |
ILNIQE | 0.4592 | 0.4741 | 0.3213 | 0.6119 |
VIIDEO | 0.1039 | 0.1621 | 0.0688 | 0.6804 |
GM-LOG | 0.5650 | 0.5942 | 0.3995 | 0.5588 |
HIGRADE | 0.7398 | 0.7368 | 0.5471 | 0.4674 |
FRIQUEE | 0.7568 | 0.7550 | 0.5651 | 0.4549 |
CORNIA | 0.6764 | 0.6974 | 0.4846 | 0.4946 |
HOSA | 0.6957 | 0.7082 | 0.5038 | 0.4893 |
V-BLIINDS | 0.6545 | 0.6599 | 0.4739 | 0.5200 |
TLVQM | 0.7271 | 0.7342 | 0.5347 | 0.4705 |
ResNet-50 | 0.7557 | 0.7747 | 0.5613 | 0.4385 |
VGG-19 | 0.7321 | 0.7482 | 0.5399 | 0.4610 |
KonCept512 | 0.6608 | 0.6763 | 0.4759 | 0.5091 |
VIDEVAL | 0.7960 | 0.7939 | 0.6032 | 0.4268 |
RAPIQUE | 0.8086 | 0.8186 | 0.6148 | 0.4076 |
2BiVQA | 0.8003 | 0.7941 | 0.6088 | 0.4218 |
References
[1] V. Hosu, F. Hahn, M. Jenadeleh, H. Lin, H. Men, T. Szirányi, S. Li, and D. Saupe, “The Konstanz natural video database (KoNViD-1k),” in 2017 Ninth International Conference on Quality of Multimedia Experience (QoMEX). IEEE, 2017, pp. 1–6.
[2] Z. Sinno and A. C. Bovik, “Large-scale study of perceptual video quality,” IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 612–627, 2018.
[3] Y. Wang, S. Inguva, and B. Adsumilli, “YouTube UGC dataset for video compression research,” in 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP). IEEE, 2019, pp. 1–5.