
Dynamo import table tests failing on 5.0.9

Open mweinelt opened this issue 1 year ago • 4 comments

Hi!

We are seeing a number of tests in tests/test_dynamodb/test_dynamodb_import_table.py reliably fail on moto 5.0.9 with boto3 1.34.129 on Python 3.11.9 and 3.12.4.

The tests were introduced in https://github.com/getmoto/moto/commit/06d0b2a04bff119a967a154077683f70c4309988, but I haven't tried 5.0.7 or 5.0.8.

FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_import_from_empty_s3_bucket - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_import_table_single_file_with_multiple_items - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_import_table_multiple_files - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_some_successfull_files_and_some_with_unknown_data - assert 0 == 1
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_only_process_file_with_prefix - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_process_gzipped_file - AssertionError: assert 'FAILED' == 'COMPLETED'

Please see the complete test stack traces below:

_______________________ test_import_from_empty_s3_bucket _______________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

table_name = 'moto_test_183402'

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_import_from_empty_s3_bucket(table_name=None):
        client = boto3.client("dynamodb", region_name="us-east-1")
        s3 = boto3.client("s3", region_name="us-east-1")
    
        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]
    
        s3.create_bucket(Bucket=s3_bucket_name)
    
        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]
    
        import_details = wait_for_import(client, import_description)
    
>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:108: AssertionError
______________ test_import_table_single_file_with_multiple_items _______________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_import_table_single_file_with_multiple_items():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")
    
        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]
    
        s3.create_bucket(Bucket=s3_bucket_name)
    
        data = ""
        for i in range(5):
            data += (
                json.dumps({"Item": {"pk": {"S": f"msg{i}"}, "data": {"S": f"{uuid4()}"}}})
                + "\n"
            )
        for i in range(10, 15):
            data += json.dumps(
                {"Item": {"pk": {"S": f"msg{i}"}, "data": {"S": f"{uuid4()}"}}}
            )
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=data,
            Key=filename1,
        )
    
        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]
    
        import_details = wait_for_import(client, import_description)
    
>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:169: AssertionError
_______________________ test_import_table_multiple_files _______________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_import_table_multiple_files():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")
    
        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]
    
        s3.create_bucket(Bucket=s3_bucket_name)
    
        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file1),
            Key=filename1,
        )
    
        items_file2 = {"Item": {"pk": {"S": "msg2"}, "data": {"S": f"{uuid4()}"}}}
        filename2 = "completely_random_filename_without_extension"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file2),
            Key=filename2,
        )
    
        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]
    
        import_details = wait_for_import(client, import_description)
    
>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:244: AssertionError
____________ test_some_successfull_files_and_some_with_unknown_data ____________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_some_successfull_files_and_some_with_unknown_data():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")
    
        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]
    
        s3.create_bucket(Bucket=s3_bucket_name)
    
        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file1),
            Key=filename1,
        )
    
        items_file2 = {"pk": {"S": "msg2"}, "data": {"S": f"{uuid4()}"}}
        filename2 = "invaliddata"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file2),
            Key=filename2,
        )
    
        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]
    
        import_details = wait_for_import(client, import_description)
    
        assert import_details["ImportStatus"] == "FAILED"
>       assert import_details["ErrorCount"] == 1
E       assert 0 == 1

tests/test_dynamodb/test_dynamodb_import_table.py:310: AssertionError
______________________ test_only_process_file_with_prefix ______________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_only_process_file_with_prefix():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")
    
        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]
    
        s3.create_bucket(Bucket=s3_bucket_name)
    
        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "yesdata.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file1),
            Key=filename1,
        )
    
        items_file2 = {"Item": {"pk": {"S": "msg2"}, "data": {"S": f"{uuid4()}"}}}
        filename2 = "nodata.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file2),
            Key=filename2,
        )
    
        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name, "S3KeyPrefix": "yes"},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]
    
        import_details = wait_for_import(client, import_description)
    
>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:373: AssertionError
__________________________ test_process_gzipped_file ___________________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_process_gzipped_file():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")
    
        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]
    
        s3.create_bucket(Bucket=s3_bucket_name)
    
        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=gzip.compress(json.dumps(items_file1).encode("utf-8")),
            Key=filename1,
        )
    
        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="GZIP",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]
    
        import_details = wait_for_import(client, import_description)
    
>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:429: AssertionError

mweinelt · Jun 20 '24 19:06

Hi @mweinelt, what command do you use to run the tests?

The gw13 part of the traceback looks like it comes from the xdist plugin, i.e. the tests are being run in parallel, and that isn't necessarily supported.

bblommers · Jun 20 '24 20:06

Yes, we are running it massively in parallel, with up to 40 cores on a few of my machines. But I've also seen this with 6 cores on an 8700K.

python3 -m pytest -m "not network and not requires_docker" --dist loadscope --numprocesses=0

mweinelt · Jun 20 '24 20:06

The only way to explicitly serialize a group of tests is through loadgroup (https://github.com/pytest-dev/pytest-xdist/issues/385#issuecomment-1304877301). But that would mean moving away from loadscope, which is probably undesirable.
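As a sketch (hypothetical test and group names), serializing the import-table tests under `--dist loadgroup` would mean marking each of them with the same group so xdist schedules them on one worker:

```python
import pytest

# Hypothetical sketch: under `pytest --dist loadgroup`, tests carrying
# the same xdist_group name are all sent to the same worker, which
# serializes them relative to each other.
@pytest.mark.xdist_group(name="dynamodb_import_table")
def test_import_table_multiple_files():
    ...

@pytest.mark.xdist_group(name="dynamodb_import_table")
def test_process_gzipped_file():
    ...
```

Every test in the suite would then need an appropriate group marker to keep its current distribution behavior, which is part of why moving away from `loadscope` is unattractive.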

mweinelt · Jun 20 '24 20:06

I had these exact same tests fail for me recently on my local machine. I destroyed and recreated my virtual env (Python 3.11.9) for moto (using make init) and all DynamoDB tests are passing again. I was not running with xdist, but are you maybe caching your dependencies on CI or locally? One or more of them might be out of date or out of sync.

bpandola · Jun 21 '24 23:06

I can't actually reproduce this, but regardless, running all tests in parallel is not supported. I would suggest just using make test, which runs in parallel only the tests that support it.

bblommers · Jul 24 '24 20:07