Update azure-storage-blob version
We restricted version in https://github.com/apache/airflow/pull/18443 due to error in static checks https://github.com/apache/airflow/pull/17068#issuecomment-925264410 Tests pass locally with the updated version
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.
Any reason versions between 12.9 and 12.13 need to be excluded?
Any reason versions between 12.9 and 12.13 need to be excluded?
I'll check if I can relax the version
mmm We have
airflow/providers/microsoft/azure/hooks/wasb.py:302: error: Argument 1 to
"write" of "BufferedWriter" has incompatible type "Union[bytes, str]"; expected
"Union[bytes, Union[bytearray, memoryview, array[Any], mmap, _CData]]"
which seems to be related to https://github.com/Azure/azure-sdk-for-python/issues/24661 and will be fixed only in 12.14.0b1 so we can not relax the version
As for the second static failure it's kinda strange. mypy fails with:
airflow/providers/microsoft/azure/hooks/wasb.py:232: error: Argument
"delimiter" to "walk_blobs" of "ContainerClient" has incompatible type
"Optional[str]"; expected "str" [arg-type]
...me_starts_with=prefix, include=include, delimiter=delimiter, **kwargs)
Code line: https://github.com/apache/airflow/blob/f18c609d127f54fbbf4dae6b290c6cdcfc7f98d0/airflow/providers/microsoft/azure/hooks/wasb.py#L216
but the types did not change between the versions: https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.8.1/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L779
https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.13.0/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L780
In 12.8.1 (when mypy pass) it’s the same as in 12.13.0 which mypy fails!
As for the second static failure it's kinda strange. mypy fails with:
airflow/providers/microsoft/azure/hooks/wasb.py:232: error: Argument "delimiter" to "walk_blobs" of "ContainerClient" has incompatible type "Optional[str]"; expected "str" [arg-type] ...me_starts_with=prefix, include=include, delimiter=delimiter, **kwargs)Code line:
https://github.com/apache/airflow/blob/f18c609d127f54fbbf4dae6b290c6cdcfc7f98d0/airflow/providers/microsoft/azure/hooks/wasb.py#L216
but the types did not change between the versions: https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.8.1/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L779
https://github.com/Azure/azure-sdk-for-python/blob/azure-storage-blob_12.13.0/sdk/storage/azure-storage-blob/azure/storage/blob/_container_client.py#L780
In 12.8.1 (when mypy pass) it’s the same as in 12.13.0 which mypy fails!
The delimiter typing should really be delimiter: str = '/', though. I think we've been getting away with this in mypy for a while 🙂
The
delimitertyping should really bedelimiter: str = '/',though. I think we've been getting away with this in mypy for a while 🙂
Yes this is what i did as it also required param in the upstream lib but I'm not sure I get why mypy behaves differently on the same function signature and typing? a mystery.
OK delimiter is solved. So now :
- Test failures are not related to this PR. It will be fixed in https://github.com/apache/airflow/pull/25511 - I'll rebase after it's merged.
- write function typing issue has been fixed upstream we must wait to
12.14.0
after that we should get a green build
- Test failures are not related to this PR. It will be fixed in Limit Flask to <2.3 in the wake of 2.2 breaking our tests #25511 - I'll rebase after it's merged.
Merged . Rebasing this one on top :)
Just one static-check left!
Just one static-check left!
Yes. This is the one that we must wait to 12.14.0 The issue already fixed upstream but they didn't release it yet. I wrote to them about it: https://github.com/Azure/azure-sdk-for-python/issues/24661#issuecomment-1204300000
Once 12.14.0 I'll modify the PR to use that version as minimum
Still failing with
Run mypy for providers.................................................................Failed
- hook id: run-mypy
- exit code: 1
airflow/providers/microsoft/azure/hooks/wasb.py:305: error: Incompatible types
in assignment (expression has type "Union[bytes, str]", variable has type
"bytes") [assignment]
content: bytes = stream.readall()
Lets see if https://github.com/apache/airflow/pull/25426/commits/bb1c54579887613cbc73592f21fea6bdd1ea6788 resolves it
I think Mypy is too smart and can tell that open(..., "wb") does not accept str, so that would also fail. This is probably the solution with the least overhead
content = stream.readall()
assert isinstance(content, bytes)
fileblob.write(content)
So it solved the static issue but created problems for the unit tests. I think the best option here is just to wait for upstream to release 12.14.0 should be in a month+-
So it solved the static issue but created problems for the unit tests.
I think the best option here is just to wait for upstream to release 12.14.0 should be in a month+-
Maybe if the assertion was under TYPE_CHECKING scope the unit test wouldn't fail? Just an idea since the assert statement is to hopefully make mypy happy.
Maybe if the assertion was under TYPE_CHECKING scope the unit test wouldn't fail? Just an idea since the assert statement is to hopefully make mypy happy.
I think not really - the idea is to make it consistent, I think :). It's going to change when the fixed version of liubrary is out, so it will change then and someone might rely on it being bytes.
Yeah let’s wait for upstream.
Testing with 12.14.0b2
Run mypy for providers.................................................................Failed
- hook id: run-mypy
- exit code: 1
airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "offset"
to "download_blob" of "BlobClient" has incompatible type "Optional[int]";
expected "int" [arg-type]
return blob_client.download_blob(offset=offset, length=length,...
^
airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "length"
to "download_blob" of "BlobClient" has incompatible type "Optional[int]";
expected "int" [arg-type]
...turn blob_client.download_blob(offset=offset, length=length, **kwargs)
^
Found 2 errors in 1 file (checked 951 source files)
Looks like https://github.com/Azure/azure-sdk-for-python/commit/8c243548334e2d58bce77caa35ede6a792b75a93 changed also type hints for another function.
Testing with
12.14.0b2Run mypy for providers.................................................................Failed - hook id: run-mypy - exit code: 1 airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "offset" to "download_blob" of "BlobClient" has incompatible type "Optional[int]"; expected "int" [arg-type] return blob_client.download_blob(offset=offset, length=length,... ^ airflow/providers/microsoft/azure/hooks/wasb.py:364: error: Argument "length" to "download_blob" of "BlobClient" has incompatible type "Optional[int]"; expected "int" [arg-type] ...turn blob_client.download_blob(offset=offset, length=length, **kwargs) ^ Found 2 errors in 1 file (checked 951 source files)Looks like Azure/azure-sdk-for-python@8c24354 changed also type hints for another function.
def download_blob(
self, offset: int = None,
length: int = None,
...
^ I'm by no means a mypy expert, but the typing for offset and length are incorrect?
New release of azure-storage-blob is expected next week: https://github.com/Azure/azure-sdk-for-python/issues/24661#issuecomment-1268952696 then this PR will be green :)
azure-storage-blob 12.14.0 was released so this PR should be ready
This change broke the wasb log task handler.
The download method now requires offset and length arguments, but these are not provided from the task handler code. Also, how would it provide anything else than "read the whole file".
This change broke the wasb log task handler.
The
downloadmethod now requiresoffsetandlengtharguments, but these are not provided from the task handler code. Also, how would it provide anything else than "read the whole file".
I think you shoudl open an issue (or better PR :) for that.