bugbug icon indicating copy to clipboard operation
bugbug copied to clipboard

Write a test to make sure a new DB is downloaded when an already downloaded DB exists with an old schema

Open marco-c opened this issue 5 years ago • 34 comments

In the download method, https://github.com/mozilla/bugbug/blob/d1016a351603d948230188fa3f370a057a1a8127/bugbug/db.py#L89, we should handle the case where the DB is already downloaded but is using an older schema version than the current one.

marco-c avatar Nov 19 '19 18:11 marco-c

It's possible that this is already kind of supported, as we check the ETag and the ETag changes when the schema version changes. We need to add a test to confirm.

marco-c avatar Nov 19 '19 18:11 marco-c

For this, would we want to update and download a newer version or just exit the function with false like if is_old_schema(path): return False

Nathan-Wisner avatar Nov 30 '19 05:11 Nathan-Wisner

We should try to download a new version.

marco-c avatar Dec 01 '19 11:12 marco-c

Hi, I want to contribute and I am a beginner, I wanted to know is the issue resolved? I am stuck here

if is_old_schema(path): return False

what to do if it is an old schema?

mohanish2504 avatar Dec 01 '19 13:12 mohanish2504

Can i take this work?

Luis-gd avatar Dec 02 '19 13:12 Luis-gd

I want to contribute and I am a beginner, I wanted to know is the issue resolved? I am stuck here if is_old_schema(path): return False

No, the issue is still here. If you look at the code, you'll see that is_old_schema is not checking the schema version of the downloaded DB, but the schema version of the remote DB.

what to do if it is an old schema?

Please read all the comments, it is explained above.

marco-c avatar Dec 02 '19 13:12 marco-c

Can i take this work?

You can work on any issue which is open and has no PR linked to it. Did you read https://github.com/mozilla/bugbug/issues/1092 or CONTRIBUTING.md?

marco-c avatar Dec 02 '19 14:12 marco-c

Hello @marco-c ,

I am thinking to implement a new function download_check_version() similar to download_check_etag() in the bugbug/utils.py which will be called after downloading the current DB. Then return false if that is the case. Am I going to the right direction?

Avijit

avijit1258 avatar Dec 22 '19 07:12 avijit1258

@avijit1258 the goal is to remove a DB when its schema version is old compared to the current schema. I'm not sure how what you proposed is going to do this, but I'm not sure exactly what you mean. Proposing a PR is the best way to show what you mean ;)

marco-c avatar Dec 23 '19 00:12 marco-c

@marco-c How can I reproduce the situation you are telling? Can you give any url to try or way to figure out this? I have installed the repo and trying to figure things out. This is my first time trying to contribute to such a real project. thanks

avijit1258 avatar Dec 23 '19 10:12 avijit1258

@avijit1258 the first step is to look at the code I linked at in the first comment, and take a look at all the functions it calls and figure out how it works.

marco-c avatar Dec 24 '19 00:12 marco-c

I have three questions.

  1. @marco-c As you suggested I have gone through download(), is_old_schema(), download_check_etag() and extract_file(). I need some clarification. After calling the download_check_etag() function, download() function calls extract_file(). I am wondering where (by which code) download() downloaded the DB.
def download(path, support_files_too=False):
    # If a DB with the current schema is not available yet, we can't download.
    print('Path: ', path)
    if is_old_schema(path):
        return False
    zst_path = f"{path}.zst"

    url = DATABASES[path]["url"]
    try:
        logger.info(f"Downloading {url} to {zst_path}")
        updated = utils.download_check_etag(url, zst_path)

        if updated:
            extract_file(zst_path)
        successful = True
        if support_files_too:
            for support_file in DATABASES[path]["support_files"]:
                successful |= download_support_file(path, support_file)

        return successful
    except requests.exceptions.HTTPError:
        logger.info(f"{url} is not yet available to download", exc_info=True)
        return False
  1. In the first comment of download function, it says

If a DB with the current schema is not available yet, we can't download.

Here current means new or latest? or existing?

  1. Another question is if I am getting the issue right the purpose is to ensure the DB schema is not updated in the server while we are downloading the DB. For that, we are planning to check the DB schema after downloading with the latest schema version of the server?

Thanks

avijit1258 avatar Dec 26 '19 13:12 avijit1258

@marco-c I have added a PR with my proposed solution. #1206

avijit1258 avatar Dec 29 '19 00:12 avijit1258

No, the purpose is to avoid using an already downloaded DB if its schema is not the current schema. The schema of the downloaded DB can be read from the ".version" file, its current schema can be read from the DATABASES dict. Thinking about this more, I think this is basically already covered by the etag check. So we just need to write a test.

marco-c avatar Jan 03 '20 15:01 marco-c

@marco-c Thanks for the clarification. Things are becoming more clear now. By saying test, do you mean checking it in the download function or writing a test case for download function in test file?

avijit1258 avatar Jan 05 '20 00:01 avijit1258

@marco-c Thanks for the clarification. Things are becoming more clear now. By saying test, do you mean checking it in the download function or writing a test case for download function in test file?

The second: writing a test case.

marco-c avatar Jan 05 '20 02:01 marco-c

Can I just simply assert that new db is not downloaded assert updated '' new db not downloaded" And then in test I can write case where the current dB is older but newer can't be downloaded.

abhaykatheria avatar Jan 12 '20 10:01 abhaykatheria

@marco-c can i work on this issue?I am new to open source and want to contribute.

sakshamb2113 avatar Feb 04 '20 21:02 sakshamb2113

Yes, please read CONTRIBUTING.md and https://github.com/mozilla/bugbug/issues/1092 for the contributing rules.

marco-c avatar Feb 04 '20 23:02 marco-c

But isn't line 91 already check if its old schema and therefore if it is then returning false line88:# Download and extract databases. line89:def download(path, support_files_too=False): line90: # If a DB with the current schema is not available yet, we can't download. line91: if is_old_schema(path): line92: return False

tarunjarvis5 avatar Dec 24 '20 08:12 tarunjarvis5

@tarunjarvis5 yes, the code is doing that, but this issue is about adding a test to ensure it works correctly.

marco-c avatar Dec 24 '20 10:12 marco-c

@marco-c Hi, can I work on this?

BenDeBrasi avatar Apr 03 '21 17:04 BenDeBrasi

Yes, please read CONTRIBUTING.md and #1092 for the contributing rules.

marco-c avatar Apr 06 '21 10:04 marco-c

Hello, I am an outreachy applicant. Can I work on it?

timalsinab avatar Mar 07 '23 00:03 timalsinab

Hello, I am an outreachy applicant. Can I work on it?

Yes, please read CONTRIBUTING.md and https://github.com/mozilla/bugbug/issues/1092 for the contributing rules.

suhaibmujahid avatar Mar 07 '23 01:03 suhaibmujahid

I'm interested in this, please confirm if its fine to add a new file for this test. Thanks

fadebowaley avatar Mar 10 '23 15:03 fadebowaley

I'm interested in this, please confirm if its fine to add a new file for this test. Thanks

As stated in CONTRIBUTING.md, you can work on any issue without asking, as long as there is no open PR linked to it.

suhaibmujahid avatar Mar 10 '23 15:03 suhaibmujahid

Hey @marco-c, Do I need to write assert statements for testing this function?

KhadijaKamran avatar Mar 10 '23 21:03 KhadijaKamran

Hey @marco-c, Do I need to write assert statements for testing this function?

Yes, please read the comments above.

suhaibmujahid avatar Mar 10 '23 21:03 suhaibmujahid

Workng on Bug ID id 1108 I checked the db.py and also i understand the problem . How to write the test file ?. I noticed a tests folder called 'tests' is this where the solution will be? Will I use pytest to run this ?

I also want to discover how to run the test (I have already installed -test-requirements.txt

fadebowaley avatar Mar 13 '23 12:03 fadebowaley