snowflake-connector-python SNOW-534004: Added database and schema to the queries related to temporary stage

Please answer these questions before submitting your pull requests. Thanks!

What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

Fixes #1034
Fill out the following pre-review checklist:
- [x] I am adding a new automated test(s) to verify correctness of my new code
- [ ] I am adding new logging messages
- [ ] I am adding a new telemetry message
- [ ] I am modifying authorization mechanisms
- [ ] I am adding new credentials
- [ ] I am modifying OCSP code
- [ ] I am adding a new dependency
Please describe how your code solves the related issue.

Database and schema parameters are not used in the write_pandas code that is related to temporary stage - it refers only to the stage’s name. Therefore, in case the connection provided to write_pandas has no database and/or schema, execution fails, even if database and/or schema are provided as arguments to the write_pandas itself. Thus, indication of database and schema in the queries related to the temporary stage solves the issue.

Oct 04 '22 07:10 jekaterinakletnaja

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

Oct 04 '22 07:10 github-actions[bot]

I have read the CLA Document and I hereby sign the CLA

Oct 04 '22 07:10 jekaterinakletnaja

Hi @jekaterinakletnaja , thank you for looking into the issue and submitting a fix, we really appreciate your contribution 🎉 ! Please address my comments, I will keep monitoring this PR, let's try to merge this change soon.

Oct 11 '22 19:10 sfc-gh-stan

Hi @jekaterinakletnaja , thank you for looking into the issue and submitting a fix, we really appreciate your contribution 🎉 ! Please address my comments, I will keep monitoring this PR, let's try to merge this change soon.

Hi @sfc-gh-stan, appreciate your comments and suggestions! I reflected those in the latest commit and added tests.

At the same time, while adding tests and running previous ones related to write_pandas, I’ve run into a few issues that potentially are not related to the changes in this PR, but might be different bugs (however, due to the lack of deep knowledge about the code I might be wrong). Could you please take a look at my comments below and suggest how to approach them the best?

The existing test test_empty_dataframe_write_pandas in snowflake-connector-python/test/integ/pandas/test_pandas_tools.py is failing with

...
    def chunk_helper(lst: T, n: int) -> Iterator[tuple[int, T]]:
        """Helper generator to chunk a sequence efficiently with current index like if enumerate was called on sequence."""
>       for i in range(0, len(lst), n):
E       ValueError: range() arg 3 must not be zero

Indeed, in case a df is empty, chunk_size is 0 according to it’s current definition, but 0 is not accepted as step for range.

~~There are new tests (added by me in order to test creation and usage of stage_location and file_format_location, for example test_stage_location_building_db_schema). They are failing at this line with message~~

.0 = <map object at 0x7f9d855177d0>

>       [f"{quote}{c}{quote} {column_type_mapping[c]}" for c in df.columns]
    )
E   KeyError: 'name'

~~even in case when all assertion related code (for ex., lines 428-443 for test_stage_location_building_db_schema) is commented. It is because column_type_mapping (which is the result of execution of infer_schema_sql) is an empty list, but we are searching for columns names in it. Do you have a feeling what might be wrong here? Is it expected that column_type_mapping is an empty list?~~

UPD: I’ve got what was the problem in p.2, my bad. It is solved now by the latest commit. Question in p.1 is still actual though.

Oct 12 '22 21:10 jekaterinakletnaja

Hi @jekaterinakletnaja , thank you for looking into the issue and submitting a fix, we really appreciate your contribution 🎉 ! Please address my comments, I will keep monitoring this PR, let's try to merge this change soon.

Hi @sfc-gh-stan, appreciate your comments and suggestions! I reflected those in the latest commit and added tests.

At the same time, while adding tests and running previous ones related to write_pandas, I’ve run into a few issues that potentially are not related to the changes in this PR, but might be different bugs (however, due to the lack of deep knowledge about the code I might be wrong). Could you please take a look at my comments below and suggest how to approach them the best?

The existing test test_empty_dataframe_write_pandas in snowflake-connector-python/test/integ/pandas/test_pandas_tools.py is failing with
...
    def chunk_helper(lst: T, n: int) -> Iterator[tuple[int, T]]:
        """Helper generator to chunk a sequence efficiently with current index like if enumerate was called on sequence."""
>       for i in range(0, len(lst), n):
E       ValueError: range() arg 3 must not be zero
Indeed, in case a df is empty, chunk_size is 0 according to it’s current definition, but 0 is not accepted as step for range.

~There are new tests (added by me in order to test creation and usage of stage_location and file_format_location, for example test_stage_location_building_db_schema). They are failing at this line with message~
.0 = <map object at 0x7f9d855177d0>

>       [f"{quote}{c}{quote} {column_type_mapping[c]}" for c in df.columns]
    )
E   KeyError: 'name'
~even in case when all assertion related code (for ex., lines 428-443 for test_stage_location_building_db_schema) is commented. It is because column_type_mapping (which is the result of execution of infer_schema_sql) is an empty list, but we are searching for columns names in it. Do you have a feeling what might be wrong here? Is it expected that column_type_mapping is an empty list?~

UPD: I’ve got what was the problem in p.2, my bad. It is solved now by the latest commit. Question in p.1 is still actual though.

This was actually a bug we recently fixed with #1261. You should be able to resolve the test failure by pulling the latest code. It shouldn't be affected by your changes.
Glad you're able to resolve it on your own 👍 Parsing these queries is tricky, again I think it's completely fine if we only test one query for each of table, stage and file format. For example, we could only test creation, because if copy into, select don't specify the same location, it should result in failure in other tests.

Oct 13 '22 17:10 sfc-gh-stan

Hey @sfc-gh-stan, I like your idea for tests combo! Please let me know whether latest changes cover it in full.

Oct 14 '22 08:10 jekaterinakletnaja

Hey @sfc-gh-stan, I like your idea for tests combo! Please let me know whether latest changes cover it in full.

Hi @jekaterinakletnaja , your code LGTM and has passed all the merge gates 🎉 ! Normally this would be merged right away, but since the bug fix is technically a behavior change, it can only be merged into to main in the Python connector's next release in January 2023. I will keep monitoring and make sure that happens. Again, thank you so much for this excellent contribution! We would like to give you credit in the next release note if you don't mind 😉 ?

Oct 18 '22 14:10 sfc-gh-stan

Hi @jekaterinakletnaja , your code LGTM and has passed all the merge gates 🎉 ! Normally this would be merged right away, but since the bug fix is technically a behavior change, it can only be merged into to main in the Python connector's next release in January 2023. I will keep monitoring and make sure that happens. Again, thank you so much for this excellent contribution! We would like to give you credit in the next release note if you don't mind 😉 ?

Hi @sfc-gh-stan, thanks for the wonderful news! 🎉
It’s an honour to be in the release notes 😄

Looking forward to January 2023!

Oct 18 '22 15:10 jekaterinakletnaja

snowflake-connector-python snowflake-connector-python copied to clipboard

SNOW-534004: Added database and schema to the queries related to temporary stage

snowflake-connector-python
snowflake-connector-python copied to clipboard