Export vulnerablecode-data
This is what the exported data looks like, at a path like /home/ziad/vulnerablecode-data/pypi/django/VCID-rf6e-vjeu-aaae.json:
{
  "vulnerability_id": "VCID-rf6e-vjeu-aaae",
  "aliases": [
    "CVE-2022-22818",
    "GHSA-95rw-fx8r-36v6"
  ],
  "summary": "Cross-site Scripting in Django",
  "affected_purls": [
    "pkg:pypi/[email protected]",
    .......
    "pkg:pypi/[email protected]",
    ......
    "pkg:pypi/[email protected]"
  ],
  "fixed_purl": [
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]",
    "pkg:pypi/[email protected]"
  ],
  "severities": [
    {
      "id": 25302,
      "reference_id": 166932,
      "scoring_system": "cvssv3.1_qr",
      "value": "MODERATE",
      "scoring_elements": ""
    }
  ],
  "references": [
    {
      "id": 164962,
      "url": "https://docs.djangoproject.com/en/4.0/releases/security/",
      "reference_id": ""
    },
    .......
    {
      "id": 166932,
      "url": "https://github.com/advisories/GHSA-95rw-fx8r-36v6",
      "reference_id": "GHSA-95rw-fx8r-36v6"
    }
  ],
  "weaknesses": []
}
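For illustration, the `<type>/<name>/<vulnerability_id>.json` layout seen in the path above could be produced roughly as follows. This is only a sketch, not the PR's actual code: `export_path` and `write_vulnerability` are hypothetical helper names, and the purl parsing is deliberately simplified (no namespace, qualifiers, or subpath).

```python
import json
from pathlib import Path


def export_path(base_dir, purl, vulnerability_id):
    """Return base_dir/<type>/<name>/<vulnerability_id>.json for a simple purl."""
    # Simplified parsing (assumption): only "pkg:<type>/<name>[@<version>]".
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    type_and_name = purl[len("pkg:"):].split("@", 1)[0]
    purl_type, _, name = type_and_name.partition("/")
    if not purl_type or not name:
        raise ValueError(f"purl needs a type and a name: {purl!r}")
    return Path(base_dir) / purl_type / name / f"{vulnerability_id}.json"


def write_vulnerability(base_dir, purl, data):
    """Write one vulnerability dict as JSON and return the file path."""
    path = export_path(base_dir, purl, data["vulnerability_id"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2))
    return path
```

The version part of the purl is dropped on purpose, so every affected version of a package maps to the same directory.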
And what should I do if a vulnerability doesn't have any related package?
None of these vulnerabilities have any related packages; they are old and not open source, like you said, @pombredanne: ignore.txt
@ziadhany LGTM! Please add some unit tests for this.
Done. @TG1999, have a look at the tests and let me know if I need to add more.
@pombredanne @TG1999 Can you suggest a way to improve the performance?
@ziadhany For a data dump type of export, I would suggest simplifying the data structure by handling each model separately. Trying to load all relationships at once is likely to provide poor performance.
You can look into the Django built-in dumpdata management command at https://docs.djangoproject.com/en/4.2/ref/django-admin/#dumpdata
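As a rough illustration of handling each model separately, the sketch below reads each table with its own query and joins the rows in memory. It uses the standard-library `sqlite3` module rather than the Django ORM so that it runs on its own; the table and column names (`vulnerability`, `alias`, `vulnerability_pk`) are assumptions made for the example, not the real VulnerableCode schema.

```python
import sqlite3


def export_by_table(conn):
    """Build one dict per vulnerability using one query per table."""
    # Query 1: the vulnerabilities themselves, keyed by primary key.
    vulns = {
        pk: {"vulnerability_id": vcid, "summary": summary, "aliases": []}
        for pk, vcid, summary in conn.execute(
            "SELECT id, vulnerability_id, summary FROM vulnerability ORDER BY id"
        )
    }
    # Query 2: every alias in one pass, attached in memory by foreign key.
    for vuln_pk, alias in conn.execute(
        "SELECT vulnerability_pk, alias FROM alias ORDER BY alias"
    ):
        vulns[vuln_pk]["aliases"].append(alias)
    return list(vulns.values())


# Tiny in-memory database with hypothetical tables, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vulnerability (id INTEGER PRIMARY KEY, vulnerability_id TEXT, summary TEXT);
CREATE TABLE alias (vulnerability_pk INTEGER, alias TEXT);
INSERT INTO vulnerability VALUES (1, 'VCID-rf6e-vjeu-aaae', 'Cross-site Scripting in Django');
INSERT INTO alias VALUES (1, 'CVE-2022-22818'), (1, 'GHSA-95rw-fx8r-36v6');
""")
records = export_by_table(conn)
```

The number of queries stays fixed at one per table, regardless of how many vulnerabilities are exported.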
I tried Django's dumpdata, but I don't think it can work for this task. So I tried .prefetch_related("vulnerabilities") to load the relationships, but the script is still slow compared to dumpdata.
Using prefetching makes performance worse; maybe I'm using it the wrong way.
There is a lot of query duplication: just 10 loop iterations take more than 2129.20 ms, without writing any file to disk.
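One way to confirm this kind of query duplication is to count the SQL statements a piece of code issues. The sketch below does that with the standard-library `sqlite3` trace callback on two hypothetical tables; it is not the PR's code, and the table names are assumptions. In Django, `django.db.connection.queries` (with `DEBUG` enabled) gives similar information.

```python
import sqlite3


def count_statements(conn, func):
    """Run func(conn) and return how many SQL statements SQLite executed."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        func(conn)
    finally:
        conn.set_trace_callback(None)
    return len(seen)


def aliases_per_row(conn):
    # One query for the parent rows, then one extra query per row.
    ids = [r[0] for r in conn.execute("SELECT id FROM vulnerability").fetchall()]
    return {
        pk: conn.execute(
            "SELECT alias FROM alias WHERE vulnerability_pk = ?", (pk,)
        ).fetchall()
        for pk in ids
    }


def aliases_batched(conn):
    # One query per table; rows are grouped in memory.
    ids = [r[0] for r in conn.execute("SELECT id FROM vulnerability").fetchall()]
    grouped = {pk: [] for pk in ids}
    for pk, alias in conn.execute("SELECT vulnerability_pk, alias FROM alias"):
        grouped[pk].append(alias)
    return grouped


# Hypothetical tables with 10 rows each, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vulnerability (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE alias (vulnerability_pk INTEGER, alias TEXT)")
conn.executemany("INSERT INTO vulnerability VALUES (?)", [(i,) for i in range(10)])
conn.executemany("INSERT INTO alias VALUES (?, ?)", [(i, f"ALIAS-{i}") for i in range(10)])

per_row = count_statements(conn, aliases_per_row)
batched = count_statements(conn, aliases_batched)
```

With 10 parent rows, the per-row version issues one statement per row plus one for the parent query, while the batched version issues two in total.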
@ziadhany what's pending on this ?
Yes, this PR is ready to be merged.
@ziadhany please see, tests are failing
@TG1999 Done! Could you please review and approve so we can merge?