Storage Write API functionality doesn't work - UNKNOWN: failed to find stream
What happened?
I'm trying to implement a very simple PoC using Storage Write API and test it using the emulator (v0.6.3), but I keep getting this error:
com.google.api.gax.rpc.UnknownException: io.grpc.StatusRuntimeException: UNKNOWN: failed to find stream from projects/test-project/datasets/local-bigquery/tables/local-table/_default
Enabling the debug level doesn't help to find the root cause, container logs have nothing suspicious.
Actual code that uses Storage Write API:
JSONArray rowContentArray = new JSONArray();
JSONObject rowContent;
rowContent = new JSONObject();
rowContent.put("timestamp", DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(timestamp).atOffset(ZoneOffset.UTC)));
rowContent.put("id", deviceIdentifier);
rowContent.put("json", objectMapper.writeValueAsString(json));
rowContent.put("partition", partition);
rowContentArray.put(rowContent);
TableSchema tableSchema = TableSchema.newBuilder()
.addFields(TableFieldSchema.newBuilder()
.setName("id")
.setType(TableFieldSchema.Type.STRING)
.setMode(TableFieldSchema.Mode.NULLABLE)
.build())
.addFields(TableFieldSchema.newBuilder()
.setName("timestamp")
.setType(TableFieldSchema.Type.TIMESTAMP)
.setMode(TableFieldSchema.Mode.NULLABLE)
.build())
.addFields(TableFieldSchema.newBuilder()
.setName("json")
.setType(TableFieldSchema.Type.JSON)
.setMode(TableFieldSchema.Mode.NULLABLE)
.build())
.addFields(TableFieldSchema.newBuilder()
.setName("partition")
.setType(TableFieldSchema.Type.INT64)
.setMode(TableFieldSchema.Mode.NULLABLE)
.build())
.build();
BigQueryWriteSettings settings = BigQueryWriteSettings.newBuilder()
.setEndpoint(properties.getGrpcHost())
.setCredentialsProvider(NoCredentialsProvider.create())
.setTransportChannelProvider(
EnhancedBigQueryReadStubSettings.defaultGrpcTransportProviderBuilder()
.setChannelConfigurator(ManagedChannelBuilder::usePlaintext)
.build()
)
.build();
bigQueryWriteClient = BigQueryWriteClient.create(settings);
writer = JsonStreamWriter.newBuilder(parentTable.toString(), tableSchema, bigQueryWriteClient)
.setExecutorProvider(FixedExecutorProvider.create(Executors.newScheduledThreadPool(100)))
.setChannelProvider(BigQueryWriteSettings.defaultGrpcTransportProviderBuilder()
.setKeepAliveTime(org.threeten.bp.Duration.ofMinutes(1))
.setKeepAliveTimeout(org.threeten.bp.Duration.ofMinutes(1))
.setKeepAliveWithoutCalls(true)
.setChannelsPerCpu(2)
.build())
.setEnableConnectionPool(true)
.setRetrySettings(retrySettings)
.build()
ApiFuture<AppendRowsResponse> future = writer.append(rowContentArray);
BTW the emulator works fine when using older insertAll API.
I would appreciate any help in understanding the root cause and how I can troubleshoot the problem.
What did you expect to happen?
The integration test should run successfully
How can we reproduce it (as minimally and precisely as possible)?
This is a sample project demonstrating the issue:
https://github.com/tarasrng/big-query-emulator-write
Test run command (requires Java 21 installed):
./mvnw clean install
Anything else we need to know?
Actual code: https://github.com/tarasrng/big-query-emulator-write/blob/main/src/main/java/org/example/bigqueryemulatorwrite/repository/BigQueryRepository.java
Table schema creation: https://github.com/tarasrng/big-query-emulator-write/blob/main/src/main/java/org/example/bigqueryemulatorwrite/repository/migration/SchemaCreator.java
Emulator config: https://github.com/tarasrng/big-query-emulator-write/blob/main/src/main/java/org/example/bigqueryemulatorwrite/config/BigQueryEmulatorConfig.java
Test: https://github.com/tarasrng/big-query-emulator-write/blob/main/src/test/java/org/example/bigqueryemulatorwrite/BigQueryEmulatorWriteApplicationTests.java
We are having the same issue. We would very much appreciate this functionlity.
having same error. this might mean we need to explicitly create the stream first. FYI
after explicitly creating the stream. im getting following error.
[UNAUTHENTICATED: Credentials require channel with PRIVACY_AND_INTEGRITY security level. Observed security level: NONE](https://stackoverflow.com/questions/65359948/unauthenticated-credentials-require-channel-with-privacy-and-integrity-security)
https://stackoverflow.com/questions/65359948/unauthenticated-credentials-require-channel-with-privacy-and-integrity-security
this example might help. in which example stream is created explicitly. instead using default stream. https://cloud.google.com/bigquery/docs/write-api-batch#java
Apparently, PR #226 assumed that the default stream is formatted as projects/<project>/datasets/<datasets>/tables/<table>/streams/_default, but at least for the Java client, this is not correct, as it's using projects/<project>/datasets/<datasets>/tables/<table>/_default -- so without the streams at the end... and there is no way to create the default stream because you can only create streams using the former format.
So not sure what that PR was all about and who tested it, but at least for Java it's not working at all.