moto
moto copied to clipboard
Support for S3 select?
It looks like the S3 select feature isn't supported by moto at the moment. Moto version: moto==1.3.16
.
Here is a small snippet of code to demonstrate:
import boto3
import awswrangler as wr
import pandas as pd
from moto import mock_s3, mock_glue
@mock_s3
@mock_glue
def test_s3_select():
s3_client = boto3.client("s3", region_name="us-east-1")
s3_client.create_bucket(Bucket="bucket")
wr.s3.to_csv(df=pd.DataFrame({'col': [1, 2, 3]}), path="s3://bucket/prefix/test.csv")
r = s3_client.select_object_content(
Bucket="bucket",
Key="prefix/test.csv",
ExpressionType="SQL",
Expression="select count(*) from s3object",
InputSerialization={
"CSV": {"FileHeaderInfo": "Use", "AllowQuotedRecordDelimiter": True}
},
OutputSerialization={"CSV": {}},
)
print(r)
test_s3_select()
It just creates a bucket, and upload a dummy csv file into the bucket using aws-data-wrangler.
This code works on a real csv file stored in a real bucket, but not when S3 is mocked with moto. Here is the stacktrace I get:
Traceback (most recent call last):
File "test_select.py", line 34, in <module>
test_s3_select()
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/core/models.py", line 100, in wrapper
result = func(*args, **kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/core/models.py", line 100, in wrapper
result = func(*args, **kwargs)
File "test_select.py", line 26, in test_s3_select
OutputSerialization={"CSV": {}},
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/client.py", line 316, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/client.py", line 622, in _make_api_call
operation_model, request_dict, request_context)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/client.py", line 641, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
return self._send_request(request_dict, operation_model)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 137, in _send_request
success_response, exception):
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 256, in _needs_retry
caught_exception=caught_exception, request_dict=request_dict)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
return self._emit(event_name, kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
response = handler(**kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/retryhandler.py", line 183, in __call__
if self._checker(attempts, response, caught_exception):
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/retryhandler.py", line 251, in __call__
caught_exception)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/retryhandler.py", line 269, in _should_retry
return self._checker(attempt_number, response, caught_exception)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/retryhandler.py", line 317, in __call__
caught_exception)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/retryhandler.py", line 223, in __call__
attempt_number, caught_exception)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/retryhandler.py", line 359, in _check_caught_exception
raise caught_exception
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/endpoint.py", line 197, in _do_get_response
responses = self._event_emitter.emit(event_name, request=request)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
return self._emit(event_name, kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
response = handler(**kwargs)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/core/models.py", line 323, in __call__
request, request.url, request.headers
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/s3/responses.py", line 958, in key_or_control_response
response = self._key_response(request, full_url, headers)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/s3/responses.py", line 1122, in _key_response
return self._key_response_post(request, body, bucket_name, query, key_name)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/s3/responses.py", line 1654, in _key_response_post
self._set_action("KEY", "POST", query)
File "/Users/jpfrancoia/Library/Caches/pypoetry/virtualenvs/glue-jobs-KVdNkci1-py3.6/lib/python3.6/site-packages/moto/s3/responses.py", line 524, in _set_action
self.data["Action"] = ACTION_MAP[action_resource_type][method]["DEFAULT"]
KeyError: 'DEFAULT'
Is this expected? If yes, do you know if it's possible to support this feature?
Kind regards
Hi @JPFrancoia, what version of moto are you on? The expected behaviour for the latest version would be to throw a NotYetImplemented-exception, to make it more obvious what's going on.
It should be possible to support this though, with the limitation that it would always return a static result (regardless of the data/SQL input).
Marking it as an enhancement.
Hi @bblommers , thanks for your answer. I amended my question to include moto's version (it's the latest on pypi at the moment).
It should be possible to support this though, with the limitation that it would always return a static result (regardless of the data/SQL input).
Just to make sure I understand, in this case the SQL query is a simple count. Will I get a number back? What about the serialization formats?
Hi @JPFrancoia, the exact implementation details will depend on whoever wants to tackle this, and how specific they want to make it. As a community-driven project, anyone can send a PR with this feature - so I can't say how many SQL features/functions will be included.
Some thoughts on this: S3 uses PartiQL to execute these statements
DynamoDB now also supports the ability to query data using PartiQL (using the new execute_statement())
Supposedly it's also used in Redshift and QLDR.
Support for PartiQL in moto would be very useful, considering the range of services. As far as I know the only client is currently written in Kotlin, so there's no easy integration though.
Any chances this one is in the roadmap? Would be a helpful testing enhancement for the project I'm working on.
Any chances this one is in the roadmap? Would be a helpful testing enhancement for the project I'm working on.
- 1 ??
An initial, experimental implementation of S3 Select is now available as of moto >= 4.1.6.dev16.
@JPFrancoia Your example above now returns the correct result. Note that this feature includes a new dependency, so make sure that you're installing moto[s3]
or moto[all]
to automatically pull this in.
Not all Select-features are covered yet - considering the amount of possible queries, I expect to only add support for them on a case-by-case basis. I know that this is a popular feature, so if anyone runs into an issue with a specific query, please open a new issue for that - that is going to make it much easier to track.