Error loading object from GCS emulator via the insert job API
What happened?
The emulator tried to load an object from the GCS emulator using a malformed URL; the request path contains the /storage/v1/ prefix twice.
Log from the fsouza/fake-gcs-server emulator:
time=2025-04-27T11:46:19.843Z level=INFO msg="172.20.0.4 - - [27/Apr/2025:11:46:19 +0000] \"GET /storage/v1//storage/v1/b/md-handler-999/o/sample.parquet?alt=media&prettyPrint=false&projection=full HTTP/1.1\" 404 10\n
For comparison, the GCS get object API:
https://cloud.google.com/storage/docs/json_api/v1/objects/get#http-request
GET https://storage.googleapis.com/storage/v1/b/bucket/o/object
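One plausible way to end up with the path seen in the log is plain concatenation of an endpoint that already ends in /storage/v1/ with a request path that also starts with /storage/v1/. The sketch below reproduces that shape and shows one way to guard against it. It is an illustration only: the helper names are hypothetical, and it is an assumption that this is the actual cause inside the emulator.

```python
from urllib.parse import quote

API_PREFIX = "/storage/v1"


def object_media_path(bucket: str, obj: str) -> str:
    """Build the objects.get path (with alt=media) in the JSON API layout."""
    return f"{API_PREFIX}/b/{quote(bucket, safe='')}/o/{quote(obj, safe='')}?alt=media"


def naive_join(endpoint: str, path: str) -> str:
    # Plain concatenation: the prefix is duplicated if both sides carry it.
    return endpoint + path


def safe_join(endpoint: str, path: str) -> str:
    # Strip a trailing slash and a trailing API prefix from the endpoint
    # before appending a path that already includes the prefix.
    base = endpoint.rstrip("/")
    if base.endswith(API_PREFIX):
        base = base[: -len(API_PREFIX)]
    return base + path


if __name__ == "__main__":
    path = object_media_path("md-handler-999", "sample.parquet")
    # Endpoint value is an assumption chosen to mirror the log above.
    print(naive_join("http://cloudstorage_emulator:4443/storage/v1/", path))
    print(safe_join("http://cloudstorage_emulator:4443/storage/v1/", path))
```

The first printed URL has the doubled prefix from the log; the second matches the documented request format.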
What did you expect to happen?
The object should load correctly from the GCS emulator when I create an insert job.
How can we reproduce it (as minimally and precisely as possible)?
Set the STORAGE_EMULATOR_HOST environment variable and try to insert a job.
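For anyone reproducing this, a load job request could be built roughly as follows. This is a sketch under assumptions: the body fields follow the public BigQuery jobs.insert REST API, while the base URL http://localhost:9050, the dataset name, and the table name are placeholders not taken from this report. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_load_job_request(base_url, project, dataset, table, source_uri):
    """Build (but do not send) a jobs.insert request for a Parquet load job."""
    body = {
        "configuration": {
            "load": {
                "sourceUris": [source_uri],
                "sourceFormat": "PARQUET",
                "destinationTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    "tableId": table,
                },
            }
        }
    }
    url = f"{base_url.rstrip('/')}/bigquery/v2/projects/{project}/jobs"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Base URL, dataset, and table are hypothetical placeholders.
    req = build_load_job_request(
        "http://localhost:9050", "test", "my_dataset", "my_table",
        "gs://md-handler-999/sample.parquet",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```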
Anything else we need to know?
If this issue makes sense, I can submit a PR to fix it.
I ran into the same issue and fixed it in a PR a few months ago in a backwards-compatible way, but unfortunately the maintainer doesn't seem to be active any more 😿
Setting the fsouza/fake-gcs-server parameter -external-url=http://localhost:4443 could fix that.
Here is an example docker-compose.yaml I have used:
services:
  bigquery_emulator:
    image: ghcr.io/goccy/bigquery-emulator
    container_name: bigquery-emulator
    command:
      - --project=test
    platform: linux/x86_64
    environment:
      - STORAGE_EMULATOR_HOST=http://cloudstorage_emulator:4443
  cloudstorage_emulator:
    image: fsouza/fake-gcs-server
    container_name: cloudstorage-emulator
    command:
      - -scheme=http
      - -backend=memory
      - -external-url=http://cloudstorage_emulator:4443
Hope it helps, please let me know if you face any issues.
@snikolakis @daaain nice suggestions, but according to Google's documentation, we only need to set STORAGE_EMULATOR_HOST and don't need to change the client URL.
I have already tested this and submitted a PR with the change:
https://github.com/goccy/bigquery-emulator/pull/405
https://cloud.google.com/go/docs/reference/cloud.google.com/go/storage/latest
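The idea of relying only on STORAGE_EMULATOR_HOST can be sketched as resolving a single base URL from the environment and leaving the API path to the request builder. The function name and the scheme handling below are assumptions for illustration; this is not the code from the PR or from the Go storage client.

```python
import os

DEFAULT_ENDPOINT = "https://storage.googleapis.com"


def storage_base_url(env=None):
    """Return the storage base URL, preferring STORAGE_EMULATOR_HOST if set."""
    env = os.environ if env is None else env
    host = env.get("STORAGE_EMULATOR_HOST", "").strip()
    if not host:
        return DEFAULT_ENDPOINT
    # Assumption: accept the value with or without a scheme, defaulting to http.
    if "://" not in host:
        host = "http://" + host
    # No API prefix is appended here, so a path starting with /storage/v1/
    # can be added later without duplicating it.
    return host.rstrip("/")


if __name__ == "__main__":
    print(storage_base_url({"STORAGE_EMULATOR_HOST": "cloudstorage_emulator:4443"}))
```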