bugbug
bugbug copied to clipboard
Write a test to make sure a new DB is downloaded when an already downloaded DB exists with an old schema
In the download method, https://github.com/mozilla/bugbug/blob/d1016a351603d948230188fa3f370a057a1a8127/bugbug/db.py#L89, we should handle the case where the DB is already downloaded but is using an older schema version than the current one.
It's possible that this is already kind of supported, as we check the ETag and the ETag changes when the schema version changes. We need to add a test to confirm.
For this, would we want to update and download a newer version or just exit the function with false like
if is_old_schema(path): return False
We should try to download a new version.
Hi, I want to contribute and I am a beginner, I wanted to know is the issue resolved? I am stuck here
if is_old_schema(path): return False
what to do if it is an old schema?
Can i take this work?
I want to contribute and I am a beginner, I wanted to know is the issue resolved? I am stuck here if is_old_schema(path): return False
No, the issue is still here. If you look at the code, you'll see that is_old_schema
is not checking the schema version of the downloaded DB, but the schema version of the remote DB.
what to do if it is an old schema?
Please read all the comments, it is explained above.
Can i take this work?
You can work on any issue which is open and has no PR linked to it. Did you read https://github.com/mozilla/bugbug/issues/1092 or CONTRIBUTING.md?
Hello @marco-c ,
I am thinking to implement a new function download_check_version() similar to download_check_etag() in the bugbug/utils.py which will be called after downloading the current DB. Then return false if that is the case. Am I going to the right direction?
Avijit
@avijit1258 the goal is to remove a DB when its schema version is old compared to the current schema. I'm not sure how what you proposed is going to do this, but I'm not sure exactly what you mean. Proposing a PR is the best way to show what you mean ;)
@marco-c How can I reproduce the situation you are telling? Can you give any url to try or way to figure out this? I have installed the repo and trying to figure things out. This is my first time trying to contribute to such a real project. thanks
@avijit1258 the first step is to look at the code I linked at in the first comment, and take a look at all the functions it calls and figure out how it works.
I have three questions.
- @marco-c As you suggested I have gone through download(), is_old_schema(), download_check_etag() and extract_file(). I need some clarification. After calling the download_check_etag() function, download() function calls extract_file(). I am wondering where (by which code) download() downloaded the DB.
def download(path, support_files_too=False):
# If a DB with the current schema is not available yet, we can't download.
print('Path: ', path)
if is_old_schema(path):
return False
zst_path = f"{path}.zst"
url = DATABASES[path]["url"]
try:
logger.info(f"Downloading {url} to {zst_path}")
updated = utils.download_check_etag(url, zst_path)
if updated:
extract_file(zst_path)
successful = True
if support_files_too:
for support_file in DATABASES[path]["support_files"]:
successful |= download_support_file(path, support_file)
return successful
except requests.exceptions.HTTPError:
logger.info(f"{url} is not yet available to download", exc_info=True)
return False
- In the first comment of download function, it says
If a DB with the current schema is not available yet, we can't download.
Here current means new or latest? or existing?
- Another question is if I am getting the issue right the purpose is to ensure the DB schema is not updated in the server while we are downloading the DB. For that, we are planning to check the DB schema after downloading with the latest schema version of the server?
Thanks
@marco-c I have added a PR with my proposed solution. #1206
No, the purpose is to avoid using an already downloaded DB if its schema is not the current schema. The schema of the downloaded DB can be read from the ".version" file, its current schema can be read from the DATABASES dict. Thinking about this more, I think this is basically already covered by the etag check. So we just need to write a test.
@marco-c Thanks for the clarification. Things are becoming more clear now. By saying test, do you mean checking it in the download function or writing a test case for download function in test file?
@marco-c Thanks for the clarification. Things are becoming more clear now. By saying test, do you mean checking it in the download function or writing a test case for download function in test file?
The second: writing a test case.
Can I just simply assert that new db is not downloaded assert updated '' new db not downloaded" And then in test I can write case where the current dB is older but newer can't be downloaded.
@marco-c can i work on this issue?I am new to open source and want to contribute.
Yes, please read CONTRIBUTING.md and https://github.com/mozilla/bugbug/issues/1092 for the contributing rules.
But isn't line 91 already check if its old schema and therefore if it is then returning false line88:# Download and extract databases. line89:def download(path, support_files_too=False): line90: # If a DB with the current schema is not available yet, we can't download. line91: if is_old_schema(path): line92: return False
@tarunjarvis5 yes, the code is doing that, but this issue is about adding a test to ensure it works correctly.
@marco-c Hi, can I work on this?
Yes, please read CONTRIBUTING.md and #1092 for the contributing rules.
Hello, I am an outreachy applicant. Can I work on it?
Hello, I am an outreachy applicant. Can I work on it?
Yes, please read CONTRIBUTING.md and https://github.com/mozilla/bugbug/issues/1092 for the contributing rules.
I'm interested in this, please confirm if its fine to add a new file for this test. Thanks
I'm interested in this, please confirm if its fine to add a new file for this test. Thanks
As stated in CONTRIBUTING.md, you can work on any issue without asking, as long as there is no open PR linked to it.
Hey @marco-c, Do I need to write assert statements for testing this function?
Hey @marco-c, Do I need to write assert statements for testing this function?
Yes, please read the comments above.
Workng on Bug ID id 1108 I checked the db.py and also i understand the problem . How to write the test file ?. I noticed a tests folder called 'tests' is this where the solution will be? Will I use pytest to run this ?
I also want to discover how to run the test (I have already installed -test-requirements.txt