fake-gcs-server
404 when doing a resumable upload POST
Image version : latest (v1.30.2)
I'm using Apache Beam to do resumable uploads into a fake GCS bucket (for testing purposes), but I get this error:
"GET /storage/v1/b/data?alt=json HTTP/1.1\" 200 112"
"POST /resumable/upload/storage/v1/b/data/o?alt=json&name=aac%2Ftest1%2Fbeam-temp-data-820e4a4e464311ecac030242ac150002%2F18266b8a-3b30-4bf3-bda5-af203113e46d.data.csv&uploadType=resumable HTTP/1.1\" 404 59"
I also confirmed that the path test1 was present:
"GET /storage/v1/b/data/o?maxResults=1&projection=noAcl&prefix=aac%2Ftest1%2F2021103019551635623713%2F&delimiter=%2F&prettyPrint=false HTTP/1.1\" 200 533"
It works with the real GCS service, so I was wondering whether the POST being sent has a version compatibility problem or whether resumable uploads simply aren't supported yet.
Thanks !
@BigJerBD hey, would you be able to share a snippet on how to reproduce the issue? I can definitely look into this some time this weekend or early next week.
Hi @BigJerBD, I'm also trying to use Apache Beam FileSystems to upload and download, but I keep getting this error: HttpError accessing <https://www.googleapis.com/resumable/upload/storage/v1/b/
. It seems that Apache Beam keeps accessing www.googleapis.com no matter how I set the environment variable. Could you please share a snippet showing how you do this? Thanks a lot!
I'll try to share a snippet that reproduces the error this weekend.
It's been a while, so I've probably lost it and will have to reproduce it again :sweat_smile:
Hi, I monkey-patched Apache Beam to replace www.googleapis.com with fake-gcs-server, and then I got the same error as @BigJerBD (a 404!).
My script is test.py (Apache Beam version: apache-beam==2.36.0):
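import gzip

from apache_beam.io.filesystem import CompressionTypes
from apache_beam.io.filesystems import FileSystems


def test_GCS():
    URL = "gs://sample-bucket/test.gz"
    # write to test buckets
    with FileSystems.create(URL, compression_type=CompressionTypes.UNCOMPRESSED) as f:
        f.write(gzip.compress(b"hello world"))


if __name__ == "__main__":
    # Importing gcsio applies the monkey-patch defined in gcsio.py
    from .gcsio import *
    test_GCS()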
And the gcsio.py file, which is used to monkey-patch Apache Beam, is:
# Monkey-patch init function of GcsIO
import apache_beam.io.gcp.gcsio
from apache_beam.io.gcp.internal.clients import storage
from apache_beam.internal.gcp import auth
from apache_beam.internal.http_client import get_new_http
from google.auth.credentials import AnonymousCredentials


def new_init(self, storage_client=None):
    # raise Exception("This is a test")
    if storage_client is None:
        storage_client = storage.StorageV1(
            url="http://0.0.0.0:4443/storage/v1/",
            credentials=auth.get_service_credentials(),
            get_credentials=False,
            http=get_new_http(),
            response_encoding='utf8'
        )
    self.client = storage_client
    self._rewrite_cb = None
    self.bucket_to_project_number = {}


# Monkey Patch the GcsIO to upload
apache_beam.io.gcp.gcsio.GcsIO.__init__ = new_init
And I got the following error with the resumable URL:
And the following log output is from the fake-gcs-server Docker container:
time="2022-04-23T00:36:40Z" level=info msg="172.17.0.1 - - [23/Apr/2022:00:36:40 +0000] \"GET /storage/v1/b/sample-bucket?alt=json HTTP/1.1\" 200 153"
time="2022-04-23T00:36:40Z" level=info msg="172.17.0.1 - - [23/Apr/2022:00:36:40 +0000] \"POST /resumable/upload/storage/v1/b/sample-bucket/o?alt=json&name=test.gz&uploadType=resumable HTTP/1.1\" 404 59"
Thanks a lot for your help, and I hope this helps!
@wwwjn thank you very much for the snippet! This is indeed similar to what I did when I was trying to use fake-gcs-server.
Apache Beam or not, since this was also giving a 404, I was wondering whether this feature is implemented in fake-gcs-server at all.
Thanks ! :)
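To check whether the 404 depends on Apache Beam at all, the same POST can be sent with only the Python standard library. This is a minimal sketch, not a confirmed diagnosis: the helper names `resumable_upload_url` and `post_status` are made up for illustration, the path prefix is copied from the log lines above, and the host `http://localhost:4443` is an assumption about where the emulator is listening.

```python
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def resumable_upload_url(base, bucket, name, prefix="/resumable/upload/storage/v1"):
    """Build the upload-initiation URL seen in the fake-gcs-server logs.

    `prefix` defaults to the path that appears in the logs above; pass a
    different prefix to compare how the server responds to other paths.
    """
    query = urlencode({"alt": "json", "name": name, "uploadType": "resumable"})
    return f"{base}{prefix}/b/{quote(bucket, safe='')}/o?{query}"


def post_status(url):
    """POST an empty JSON body to `url` and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the code is what we want to see.
        return err.code


if __name__ == "__main__":
    # Assumed emulator address; adjust to wherever fake-gcs-server is running.
    url = resumable_upload_url("http://localhost:4443", "sample-bucket", "test.gz")
    print(url, post_status(url))
```

If this also returns 404 against the emulator, the problem is on the server side rather than in how Beam is patched.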
Hi @fsouza, is there any progress on this bug? Thanks a lot for your help!
Hey, I haven't had a chance to look at it yet, but I assume the fix should be simple. I'll check it out in the coming weeks.
Thanks a lot! If there is anything I could do, feel free to just let me know!
For anyone like me arriving here from a Google search who simply wants to override the URL so that Apache Beam points to fake-gcs-server, there's an issue tracking this here: https://github.com/apache/beam/issues/21255
For now, the solution is still to patch the url in the test. This worked for me:
from unittest import mock

import apache_beam.io.gcp.internal.clients.storage


@mock.patch.object(
    apache_beam.io.gcp.internal.clients.storage.StorageV1,
    "BASE_URL",
    "http://localhost:4443/storage/v1/",
)
def test_gcs_source():
    pass  # test implementation here should now call the emulator
where http://localhost:4443 is the URL of your fake-gcs-server instance.
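For readers unfamiliar with `mock.patch.object`, here is a small sketch of what the decorator does, using a stand-in class so it runs without Beam installed. `StandInStorageClient`, `object_url`, and `url_while_patched` are hypothetical names for illustration only; the real Beam client builds its requests differently.

```python
from unittest import mock


class StandInStorageClient:
    """Hypothetical stand-in for a client class that exposes BASE_URL."""

    BASE_URL = "https://www.googleapis.com/storage/v1/"

    def object_url(self, bucket, name):
        # Instances read BASE_URL from the class, so patching the class
        # attribute changes the URL every instance builds.
        return f"{self.BASE_URL}b/{bucket}/o/{name}"


# Assumed emulator address, matching the URL used in the patch above.
EMULATOR_URL = "http://localhost:4443/storage/v1/"


@mock.patch.object(StandInStorageClient, "BASE_URL", EMULATOR_URL)
def url_while_patched():
    # Inside the decorated function the class attribute is replaced.
    return StandInStorageClient().object_url("sample-bucket", "test.gz")


if __name__ == "__main__":
    print(url_while_patched())
    # Once the function returns, the original value is restored.
    print(StandInStorageClient.BASE_URL)
```

Because the replacement only lasts for the duration of the decorated function, the override stays scoped to the test and does not leak into other tests.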