scylla-migrator
Connection refused error
I tried to run the migrator with the config below. Spark version: 2.4.4. OS: Ubuntu (AWS).
```yaml
source:
  type: dynamodb
  table: anand-mig-1
  endpoint:
    host: dynamodb.us-east-1.amazonaws.com
    port: 8000
  credentials:
    accessKey: abc
    secretKey: xxx
  maxMapTasks: 1

target:
  type: dynamodb
  table: anand-mig-1
  endpoint:
    host: xxx
    port: 8000
  credentials:
    accessKey: none
    secretKey: none
  maxMapTasks: 1
  streamChanges: false

renames: []

# Below are unused but mandatory settings
savepoints:
  path: /app/savepoints
  intervalSeconds: 300

skipTokenRanges: []

validation:
  compareTimestamps: true
  ttlToleranceMillis: 60000
  writetimeToleranceMillis: 1000
  failuresToFetch: 100
  floatingPointTolerance: 0.001
  timestampMsTolerance: 0
```
Error message:
ubuntu@ip-10-0-0-129:~/install/spark/spark-2.4.4-bin-hadoop2.7/bin$ ./spark-submit --class com.scylladb.migrator.Migrator --master spark://ip-10-0-0-129.ec2.internal:7077 --conf spark.driver.host=ip-10-0-0-129.ec2.internal --conf spark.scylla.config=/home/ubuntu/altws/dynamodb-to-alternator-basic.yaml /home/ubuntu/install/migrator/scylla-migrator/migrator/target/scala-2.11/scylla-migrator-assembly-0.0.1.jar
24/04/29 10:52:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
24/04/29 10:52:24 INFO SparkContext: Running Spark version 2.4.4
24/04/29 10:52:24 INFO SparkContext: Submitted application: scylla-migrator
24/04/29 10:52:24 INFO SecurityManager: Changing view acls to: ubuntu
24/04/29 10:52:24 INFO SecurityManager: Changing modify acls to: ubuntu
24/04/29 10:52:24 INFO SecurityManager: Changing view acls groups to:
24/04/29 10:52:24 INFO SecurityManager: Changing modify acls groups to:
24/04/29 10:52:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); groups with view permissions: Set(); users with modify permissions: Set(ubuntu); groups with modify permissions: Set()
24/04/29 10:52:24 INFO Utils: Successfully started service 'sparkDriver' on port 41351.
24/04/29 10:52:24 INFO SparkEnv: Registering MapOutputTracker
24/04/29 10:52:24 INFO SparkEnv: Registering BlockManagerMaster
24/04/29 10:52:24 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
24/04/29 10:52:24 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
24/04/29 10:52:24 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-df0c3f25-bc98-4cf9-baa1-8b61e406912f
24/04/29 10:52:24 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
24/04/29 10:52:24 INFO SparkEnv: Registering OutputCommitCoordinator
24/04/29 10:52:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
24/04/29 10:52:24 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://ip-10-0-0-129.ec2.internal:4040
24/04/29 10:52:24 INFO SparkContext: Added JAR file:/home/ubuntu/install/migrator/scylla-migrator/migrator/target/scala-2.11/scylla-migrator-assembly-0.0.1.jar at spark://ip-10-0-0-129.ec2.internal:41351/jars/scylla-migrator-assembly-0.0.1.jar with timestamp 1714387944952
24/04/29 10:52:24 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ip-10-0-0-129.ec2.internal:7077...
24/04/29 10:52:25 INFO TransportClientFactory: Successfully created connection to ip-10-0-0-129.ec2.internal/10.0.0.129:7077 after 19 ms (0 ms spent in bootstraps)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20240429105225-0017
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/0 on worker-20240428032644-10.0.0.129-45665 (10.0.0.129:45665) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/0 on hostPort 10.0.0.129:45665 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/1 on worker-20240428032659-10.0.0.129-42885 (10.0.0.129:42885) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/1 on hostPort 10.0.0.129:42885 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/2 on worker-20240428032649-10.0.0.129-36371 (10.0.0.129:36371) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/2 on hostPort 10.0.0.129:36371 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/3 on worker-20240428032656-10.0.0.129-46715 (10.0.0.129:46715) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/3 on hostPort 10.0.0.129:46715 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/4 on worker-20240428032646-10.0.0.129-36563 (10.0.0.129:36563) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/4 on hostPort 10.0.0.129:36563 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/5 on worker-20240428032651-10.0.0.129-44873 (10.0.0.129:44873) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/5 on hostPort 10.0.0.129:44873 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/6 on worker-20240428032702-10.0.0.129-38949 (10.0.0.129:38949) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/6 on hostPort 10.0.0.129:38949 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20240429105225-0017/7 on worker-20240428032654-10.0.0.129-43929 (10.0.0.129:43929) with 2 core(s)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: Granted executor ID app-20240429105225-0017/7 on hostPort 10.0.0.129:43929 with 2 core(s), 1024.0 MB RAM
24/04/29 10:52:25 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 36277.
24/04/29 10:52:25 INFO NettyBlockTransferService: Server created on ip-10-0-0-129.ec2.internal:36277
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/1 is now RUNNING
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/6 is now RUNNING
24/04/29 10:52:25 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/2 is now RUNNING
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/5 is now RUNNING
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/0 is now RUNNING
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/7 is now RUNNING
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/4 is now RUNNING
24/04/29 10:52:25 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20240429105225-0017/3 is now RUNNING
24/04/29 10:52:25 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, ip-10-0-0-129.ec2.internal, 36277, None)
24/04/29 10:52:25 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-0-0-129.ec2.internal:36277 with 366.3 MB RAM, BlockManagerId(driver, ip-10-0-0-129.ec2.internal, 36277, None)
24/04/29 10:52:25 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, ip-10-0-0-129.ec2.internal, 36277, None)
24/04/29 10:52:25 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, ip-10-0-0-129.ec2.internal, 36277, None)
24/04/29 10:52:25 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
24/04/29 10:52:26 INFO migrator: Loaded config: MigratorConfig(DynamoDB(Some(DynamoDBEndpoint(dynamodb.us-east-1.amazonaws.com,8000)),None,Some(AWSCredentials(ASI..., <redacted>)),anand-mig-1,None,None,None,Some(1)),DynamoDB(Some(DynamoDBEndpoint(35.227.81.47,8000)),None,Some(AWSCredentials(non..., <redacted>)),anand-mig-1,None,None,None,Some(1),false,None),List(),Savepoints(300,/app/savepoints),Set(),Validation(true,60000,1000,100,0.001,0))
24/04/29 10:52:27 WARN ApacheUtils: NoSuchMethodException was thrown when disabling normalizeUri. This indicates you are using an old version (< 4.5.8) of Apache http client. It is recommended to use http client version >= 4.5.9 to avoid the breaking change introduced in apache client 4.5.7 and the latency in exception handling. See https://github.com/aws/aws-sdk-java/issues/1919 for more information
Exception in thread "main" com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to dynamodb.us-east-1.amazonaws.com:8000 [dynamodb.us-east-1.amazonaws.com/52.119.234.84] failed: Connection refused (Connection refused)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1201)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1147)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5110)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5077)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeDescribeTable(AmazonDynamoDBClient.java:1981)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:1947)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:1993)
at com.scylladb.migrator.readers.DynamoDB$.readRDD(DynamoDB.scala:52)
at com.scylladb.migrator.readers.DynamoDB$.readRDD(DynamoDB.scala:19)
at com.scylladb.migrator.alternator.AlternatorMigrator$.migrate(AlternatorMigrator.scala:20)
at com.scylladb.migrator.Migrator$.main(Migrator.scala:43)
at com.scylladb.migrator.Migrator.main(Migrator.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to dynamodb.us-east-1.amazonaws.com:8000 [dynamodb.us-east-1.amazonaws.com/52.119.234.84] failed: Connection refused (Connection refused)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:159)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy13.connect(Unknown Source)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:394)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1323)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
... 29 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:339)
at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:142)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
... 45 more
Hey @anand-chandrashekar, could you please try replacing the endpoint with just the region in the config, as follows?

```diff
 source:
   type: dynamodb
   table: anand-mig-1
-  endpoint:
-    host: dynamodb.us-east-1.amazonaws.com
-    port: 8000
+  region: us-east-1
   credentials:
     …
```
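Applied to the config from this thread, the resulting `source` section would read as follows (a sketch: it reuses the placeholder credentials from the original post and assumes the rest of the config is unchanged):

```yaml
source:
  type: dynamodb
  table: anand-mig-1
  region: us-east-1
  credentials:
    accessKey: abc
    secretKey: xxx
  maxMapTasks: 1
```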
Hi @julienrf I got a different error. cc: @gcarmin
Exception in thread "main" com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The security token included in the request is invalid. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: HQS4CFKMN28L2FSUKQQCVUQG4VVV4KQNSO5AEMVJF66Q9ASUAAJG)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1799)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5110)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5077)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeDescribeTable(AmazonDynamoDBClient.java:1981)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:1947)
at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:1993)
at com.scylladb.migrator.readers.DynamoDB$.readRDD(DynamoDB.scala:52)
at com.scylladb.migrator.readers.DynamoDB$.readRDD(DynamoDB.scala:19)
at com.scylladb.migrator.alternator.AlternatorMigrator$.migrate(AlternatorMigrator.scala:20)
at com.scylladb.migrator.Migrator$.main(Migrator.scala:43)
at com.scylladb.migrator.Migrator.main(Migrator.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Could you please confirm that your credentials work for the region us-east-1? Do they work in the AWS console?
Yes, my credentials work.
I still suspect there is something wrong with the credentials (see this discussion). Could you please double-check the permissions associated with the actual credentials (see the documentation)?
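For what it's worth, the stack trace fails on `DescribeTable`, so the source-side credentials need at least read access to the table. A minimal IAM policy sketch (the ARN is an assumption derived from the region and table name in this thread, with the account ID left as a wildcard):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable",
        "dynamodb:Scan"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:*:table/anand-mig-1"
    }
  ]
}
```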
I've set up AWS properly and I can run `list-tables`.
Exception in thread "main" com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The security token included in the request is invalid. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ...
```
ubuntu@ip-10-0-0-129:~/install/spark/spark-2.4.4-bin-hadoop2.7/bin$ aws dynamodb list-tables
{
    "TableNames": [
        "anand-mig-1",
    ]
}
```
Is it possible for the migrator to connect via HTTPS (port 443)? The `aws dynamodb list-tables` command goes over that port.
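Yes, that is effectively what the `region` setting does: with only a region configured, the AWS SDK resolves the standard regional endpoint, which is HTTPS on port 443, whereas an explicit `endpoint` with `port: 8000` forces a plain connection to a port AWS does not listen on (hence the original "Connection refused"). A toy illustration of that resolution logic (not the migrator's actual code, just a sketch of the config semantics):

```python
def resolve_endpoint(region=None, host=None, port=None):
    """Illustrative sketch of how a DynamoDB endpoint URL is derived
    from the migrator-style config: an explicit endpoint wins over
    the region, and only port 443 implies HTTPS."""
    if host is not None:
        # Explicit host/port from the `endpoint` section of the config.
        scheme = "https" if port == 443 else "http"
        return f"{scheme}://{host}:{port}"
    # With only a region, the SDK uses the standard HTTPS endpoint (port 443).
    return f"https://dynamodb.{region}.amazonaws.com"


# Region-only config resolves to the standard HTTPS endpoint:
print(resolve_endpoint(region="us-east-1"))
# Explicit endpoint with port 8000 forces a non-TLS connection attempt:
print(resolve_endpoint(host="dynamodb.us-east-1.amazonaws.com", port=8000))
```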
It was indeed a permission issue. I was able to get past it and get the tool working. Closing this. Thank you @julienrf