java.net.UnknownHostException during Cluster Snapshot upload
A java.net.UnknownHostException occurs when the cluster snapshot is uploaded to an internal S3-compatible service. The endpoint URL is being rewritten before the request is sent: the configured endpoint is "aws.s3.endpoint":"https://service.data.company.com", but the AWS SDK fails to resolve starrocks.service.data.company.com, which is the endpoint with the bucket name (starrocks) prepended.
Steps to reproduce the behavior (Required)
- Create a storage volume (in this example, it is built-in)
describe storage volume builtin_storage_volume\G
*************************** 1. row ***************************
Name: builtin_storage_volume
Type: S3
IsDefault: true
Location: s3://starrocks
Params: {"aws.s3.access_key":"******","aws.s3.secret_key":"******","aws.s3.endpoint":"https://service.data.company.com","aws.s3.region":"","aws.s3.use_instance_profile":"false","aws.s3.use_aws_sdk_default_behavior":"false"}
Enabled: true
Comment:
1 row in set (0.00 sec)
- Enable snapshots
ADMIN SET AUTOMATED CLUSTER SNAPSHOT ON;
Expected behavior (Required)
Snapshot uploaded to S3 bucket.
Real behavior (Required)
mysql> SELECT * FROM information_schema.cluster_snapshot_jobs;
+------------------------------------------+----------+---------------------+---------------+-------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SNAPSHOT_NAME | JOB_ID | CREATED_TIME | FINISHED_TIME | STATE | DETAIL_INFO | ERROR_MESSAGE |
+------------------------------------------+----------+---------------------+---------------+-------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| automated_cluster_snapshot_1750189655708 | 11828821 | 2025-06-17 19:47:35 | NULL | ERROR | | upload image failed, err msg: Failed to copy local /opt/starrocks/fe/meta/image to s3://starrocks/4670ab84-aaaf-4606-a370-b762c206b7fe/meta/image/automated_cluster_snapshot_1750189655708 |
...
java.net.UnknownHostException: getFileStatus on s3://starrocks/4670ab84-aaaf-4606-a370-b762c206b7fe/meta/image/automated_cluster_snapshot_1750191657846:
software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service.
See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service.
See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: starrocks.service.data.company.com
StarRocks version (Required)
3.5.0-10d7323
@lobrandon1217 is this endpoint https://service.data.company.com a virtual-host-style endpoint for the bucket or a general endpoint for all buckets?
It is a general endpoint for all buckets
Give it a try: add aws.s3.enable_path_style_access = true to the storage volume's params and see if it works.
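For context, the failing hostname in the log matches what virtual-hosted-style addressing produces: the bucket name becomes a subdomain of the endpoint. With path-style addressing the bucket goes into the URL path and the hostname stays unchanged. The sketch below only illustrates that difference; `s3_request_url` is a hypothetical helper written for this comment, not the AWS SDK's or StarRocks' actual code path.

```python
from urllib.parse import urlsplit


def s3_request_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build an illustrative S3 request URL for the given addressing style.

    Hypothetical helper for illustration only; real SDKs also handle
    regions, ports, and URL encoding.
    """
    parts = urlsplit(endpoint)
    if path_style:
        # Path style: hostname is untouched, bucket is the first path segment.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{key}"
    # Virtual-hosted style: bucket is prepended to the hostname.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{key}"


endpoint = "https://service.data.company.com"
virtual = s3_request_url(endpoint, "starrocks", "meta/image", path_style=False)
path = s3_request_url(endpoint, "starrocks", "meta/image", path_style=True)

print(urlsplit(virtual).hostname)  # starrocks.service.data.company.com
print(urlsplit(path).hostname)     # service.data.company.com
```

If the internal service has no wildcard DNS record for bucket subdomains, only the second hostname resolves, which would explain the UnknownHostException.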
@kevincai Not sure if I am doing something wrong, but the setting does not work for me.
For testing, I created a new storage volume:
CREATE STORAGE VOLUME backups
TYPE = S3
LOCATIONS = ('s3://starrocks-dev-backups')
PROPERTIES (
'enabled' = 'true',
'aws.s3.endpoint' = 'https://service.data.company.com',
'aws.s3.use_instance_profile' = 'false',
'aws.s3.use_aws_sdk_default_behavior' = 'false',
'aws.s3.access_key' = '***',
'aws.s3.secret_key' = '***',
'aws.s3.enable_path_style_access' = 'true'
);
But the setting does not show up in the Params column:
describe storage volume backups;
+---------+------+-----------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+---------+
| Name | Type | IsDefault | Location | Params | Enabled | Comment |
+---------+------+-----------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+---------+
| backups | S3 | false | s3://starrocks-dev-backups | {"aws.s3.access_key":"******","aws.s3.secret_key":"******","aws.s3.endpoint":"https://service.data.company.com","aws.s3.region":"us-east-1","aws.s3.use_instance_profile":"false","aws.s3.use_aws_sdk_default_behavior":"false"} | true | |
+---------+------+-----------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+---------+
And the snapshot job still hits the same error:
ADMIN SET AUTOMATED CLUSTER SNAPSHOT ON STORAGE VOLUME backups;
SELECT * FROM information_schema.cluster_snapshot_jobs;
+------------------------------------------+---------+---------------------+---------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SNAPSHOT_NAME | JOB_ID | CREATED_TIME | FINISHED_TIME | STATE | DETAIL_INFO | ERROR_MESSAGE |
+------------------------------------------+---------+---------------------+---------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| automated_cluster_snapshot_1750386647443 | 1265993 | 2025-06-20 02:30:47 | NULL | ERROR | | upload image failed, err msg: Failed to copy local /opt/starrocks/fe/meta/image to s3://starrocks-dev-backups/eea438b3-1321-45b2-862b-1bb4b9edcf12/meta/image/automated_cluster_snapshot_1750386647443 |
+------------------------------------------+---------+---------------------+---------------+-----------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Log file snippet:
java.net.UnknownHostException: getFileStatus on s3://starrocks-dev-backups/eea438b3-1321-45b2-862b-1bb4b9edcf12/meta/image/automated_cluster_snapshot_1750386647443: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: starrocks-dev-backups.service.data.company.com
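For anyone comparing their own volume, here is a small sketch that checks which requested properties are absent from the Params JSON printed by DESCRIBE STORAGE VOLUME. `missing_params` is a hypothetical helper name, and the Params string is copied from the output above; whether a property missing from this output is also ignored at runtime is an assumption, not something confirmed in this thread.

```python
import json


def missing_params(params_json: str, expected: list[str]) -> list[str]:
    """Return the expected property names that are absent from the Params JSON."""
    params = json.loads(params_json)
    return [name for name in expected if name not in params]


# Params column copied from the DESCRIBE output for the "backups" volume.
params_json = (
    '{"aws.s3.access_key":"******","aws.s3.secret_key":"******",'
    '"aws.s3.endpoint":"https://service.data.company.com",'
    '"aws.s3.region":"us-east-1","aws.s3.use_instance_profile":"false",'
    '"aws.s3.use_aws_sdk_default_behavior":"false"}'
)

# Properties passed in the CREATE STORAGE VOLUME statement.
requested = [
    "aws.s3.endpoint",
    "aws.s3.use_instance_profile",
    "aws.s3.use_aws_sdk_default_behavior",
    "aws.s3.access_key",
    "aws.s3.secret_key",
    "aws.s3.enable_path_style_access",
]

print(missing_params(params_json, requested))
# ['aws.s3.enable_path_style_access']
```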
Hi @lobrandon1217, I am having the same issue. Did you find anything?
will be fixed in #62591