aws-cdk

(aws_opensearchservice): i4g instances say they need EBS

Open kensentor opened this issue 1 year ago • 2 comments

Describe the bug

Building an opensearch.Domain with an opensearch.CapacityConfig whose data_node_instance_type is i4g.2xlarge.search fails at cdk synth with an error stating that the instance type requires EBS storage.

Regression Issue

  • [ ] Select this option if this issue appears to be a regression.

Last Known Working CDK Version

No response

Expected Behavior

I expect to be able to create an Opensearch cluster using i4g instances without using EBS.

Current Behavior

An error is thrown:

Traceback (most recent call last):
  File "/Users/kennethhoffmann/adept/scaling-eureka/infra/management_stack/app.py", line 16, in <module>
    ManagementStack(
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_runtime.py", line 118, in __call__
    inst = super(JSIIMeta, cast(JSIIMeta, cls)).__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/adept/scaling-eureka/infra/management_stack/management_stack/management_stack.py", line 61, in __init__
    initialize_opensearch(
  File "/Users/kennethhoffmann/adept/scaling-eureka/infra/management_stack/management_stack/opensearch.py", line 245, in initialize_opensearch
    domain = opensearch.Domain(
             ^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_runtime.py", line 118, in __call__
    inst = super(JSIIMeta, cast(JSIIMeta, cls)).__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/aws_cdk/aws_opensearchservice/__init__.py", line 7657, in __init__
    jsii.create(self.__class__, self, [scope, id, props])
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_kernel/__init__.py", line 334, in create
    response = self.provider.create(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_kernel/providers/process.py", line 365, in create
    return self._process.send(request, CreateResponse)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_kernel/providers/process.py", line 342, in send
    raise RuntimeError(resp.error) from JavaScriptError(resp.stack)
RuntimeError: EBS volumes are required when using instance types other than R3, I3, R6GD, or IM4GN.

Reproduction Steps

from aws_cdk import (
    aws_ec2 as ec2,
    aws_opensearchservice as opensearch,
)

# VPC creation omitted; stack, vpc, sg_name, use_https and ebs_options
# are defined elsewhere.

subnets = vpc.select_subnets(
    one_per_az=True,
    subnet_type=ec2.SubnetType.PUBLIC,
)

security_group = ec2.SecurityGroup(
    stack,
    sg_name,
    allow_all_outbound=True,
    allow_all_ipv6_outbound=True,
    description="foo.",
    security_group_name="Opensearch Security Group",
    vpc=vpc,
)

capacity_config = opensearch.CapacityConfig(
    data_nodes=8,
    data_node_instance_type="i4g.2xlarge.search",
    master_node_instance_type="m7g.large.search",
    master_nodes=3,
    multi_az_with_standby_enabled=False,
)

os_version = opensearch.EngineVersion.open_search("2.15")

domain = opensearch.Domain(
    stack,
    "OS",
    capacity=capacity_config,
    ebs=ebs_options,
    enable_auto_software_update=True,
    encryption_at_rest=opensearch.EncryptionAtRestOptions(enabled=True),
    enforce_https=use_https,
    fine_grained_access_control=None,
    node_to_node_encryption=True,
    security_groups=[security_group],
    tls_security_policy=opensearch.TLSSecurityPolicy.TLS_1_2,
    version=os_version,
    vpc=vpc,
    vpc_subnets=[ec2.SubnetSelection(subnets=subnets.subnets)],
    # Zone awareness is configured on the domain, not on the subnet selection.
    zone_awareness=opensearch.ZoneAwarenessConfig(
        availability_zone_count=2, enabled=True
    ),
)

Possible Solution

The list of instance types that do not require EBS appears to be outdated. Neither i4i nor i4g uses EBS, according to https://docs.aws.amazon.com/opensearch-service/latest/developerguide/supported-instance-types.html

Additional Information/Context

No response

CDK CLI Version

2.162.1 (build 10aa526)

Framework Version

No response

Node.js Version

20.11.1

OS

macOS 14.6.1

Language

Python

Language Version

Python (3.12)

Other information

No response

kensentor, Oct 15 '24 17:10

Hi @kensentor, thanks for reaching out.

I tried to reproduce the issue with this minimal sample code:


    const opensearchDomain = new opensearch.Domain(this, 'OpenSearchDomain', {
      version: opensearch.EngineVersion.ELASTICSEARCH_7_10,
      capacity: {
        masterNodes: 1,
        dataNodes: 1,
        dataNodeInstanceType: 'i4g.2xlarge.search',
      },
    });

and it successfully synthesized into this template with these properties -

{
 "Resources": {
  "OpenSearchDomain85D65221": {
   "Type": "AWS::OpenSearchService::Domain",
   "Properties": {
    "ClusterConfig": {
     "DedicatedMasterCount": 1,
     "DedicatedMasterEnabled": true,
     "DedicatedMasterType": "r5.large.search",
     "InstanceCount": 1,
     "InstanceType": "i4g.2xlarge.search",
     "MultiAZWithStandbyEnabled": true,
     "ZoneAwarenessEnabled": false
    },
    "DomainEndpointOptions": {
     "EnforceHTTPS": false,
     "TLSSecurityPolicy": "Policy-Min-TLS-1-0-2019-07"
    },
    "EBSOptions": {
     "EBSEnabled": true,
     "VolumeSize": 10,
     "VolumeType": "gp2"
    },
    "EncryptionAtRestOptions": {
     "Enabled": false
    },
    "EngineVersion": "Elasticsearch_7.10",
    "LogPublishingOptions": {},
    "NodeToNodeEncryptionOptions": {
     "Enabled": false
    }
   },
   "UpdateReplacePolicy": "Retain",
   "DeletionPolicy": "Retain",
   "Metadata": {
    "aws:cdk:path": "EbsissueStack/OpenSearchDomain/Resource"
   }
  

I am also trying to reproduce it the other way, by creating a CapacityConfig object and passing it in. I will share my findings soon.

khushail, Oct 16 '24 00:10

Hi @khushail -

I've found no difference with using the CapacityConfig object vs. the dict. Using the following dict gave me the same error as using the object:

     capacity={
            "data_nodes":8,
            "data_node_instance_type":"i4g.2xlarge.search",
            "master_node_instance_type":"m5.large.search",
            "master_nodes":3,
            "multi_az_with_standby_enabled":False,
        }
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_runtime.py", line 118, in __call__
    inst = super(JSIIMeta, cast(JSIIMeta, cls)).__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/adept/scaling-eureka/infra/management_stack/management_stack/management_stack.py", line 61, in __init__
    initialize_opensearch(
  File "/Users/kennethhoffmann/adept/scaling-eureka/infra/management_stack/management_stack/opensearch.py", line 245, in initialize_opensearch
    domain = opensearch.Domain(
             ^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_runtime.py", line 118, in __call__
    inst = super(JSIIMeta, cast(JSIIMeta, cls)).__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/aws_cdk/aws_opensearchservice/__init__.py", line 7657, in __init__
    jsii.create(self.__class__, self, [scope, id, props])
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_kernel/__init__.py", line 334, in create
    response = self.provider.create(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_kernel/providers/process.py", line 365, in create
    return self._process.send(request, CreateResponse)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kennethhoffmann/.pyenv/versions/3.12/envs/management/lib/python3.12/site-packages/jsii/_kernel/providers/process.py", line 342, in send
    raise RuntimeError(resp.error) from JavaScriptError(resp.stack)
RuntimeError: EBS volumes are required when using instance types other than R3, I3, R6GD, or IM4GN.

In order to replicate it, I believe you need to add an EbsOptions to the domain:

ebs_options = opensearch.EbsOptions(enabled=False)

and then pass that to the domain's ebs argument.

That will fail with i4g.2xlarge.search but succeed with i3.2xlarge.search.
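The behaviour described above can be sketched in plain Python, without CDK. This is only an illustration, not the library's code: the accepted families are copied from the error message, while the function name `validate_storage` and the prefix-based matching are assumptions made for the example.

```python
# Families accepted without EBS, taken from the error message in this issue.
# Matching by string prefix is an assumption about how the check works.
INSTANCE_STORE_PREFIXES = ("r3", "i3", "r6gd", "im4gn")


def validate_storage(data_node_instance_type: str, ebs_enabled: bool) -> None:
    """Raise if EBS is disabled for a type outside the accepted families."""
    uses_instance_store = data_node_instance_type.lower().startswith(
        INSTANCE_STORE_PREFIXES
    )
    if not ebs_enabled and not uses_instance_store:
        raise ValueError(
            "EBS volumes are required when using instance types other than "
            "R3, I3, R6GD, or IM4GN."
        )


# i3 is in the accepted list, so disabling EBS passes the check.
validate_storage("i3.2xlarge.search", ebs_enabled=False)

# i4g is not in the list, so the same configuration is rejected.
try:
    validate_storage("i4g.2xlarge.search", ebs_enabled=False)
except ValueError as err:
    print(f"rejected: {err}")
```

With EBS left enabled, the sketch accepts any instance type, which would be consistent with the first reproduction attempt synthesizing without error.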

kensentor, Oct 16 '24 13:10

It seems that the new i4g & i4i instances are not yet supported and the validation checks only for previous i3 instance types.

joe-cstl, Oct 21 '24 15:10

@kensentor, yes, that was the condition needed to reproduce the scenario. I set enabled: false in the EBS options and passed dataNodeInstanceType: 'i4g.2xlarge.search', and it failed during synth:

[Screenshot of the synth failure, 2024-10-21 4:19:41 PM]

However, keeping the same flag value but changing dataNodeInstanceType to 'i3.2xlarge.search' succeeds.

khushail, Oct 21 '24 23:10

The root cause is in this code:

https://github.com/aws/aws-cdk/blob/366b4927c50168113dd4057f6255ab6c76278135/packages/aws-cdk-lib/aws-opensearchservice/lib/domain.ts#L1598

https://github.com/aws/aws-cdk/blob/366b4927c50168113dd4057f6255ab6c76278135/packages/aws-cdk-lib/aws-opensearchservice/lib/domain.ts#L1595C1-L1599C6

    // Only R3, I3, R6GD, and IM4GN support instance storage, per
    // https://aws.amazon.com/opensearch-service/pricing/
    if (!ebsEnabled && !isEveryDatanodeInstanceType('r3', 'i3', 'r6gd', 'im4gn')) {
      throw new Error('EBS volumes are required when using instance types other than R3, I3, R6GD, or IM4GN.');
    }
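One way the check could be extended is sketched below in Python rather than the library's TypeScript. It is a hypothetical illustration, not the change that was made in aws-cdk: the set `INSTANCE_STORE_FAMILIES` and the helpers `instance_family` and `ebs_required` are names invented here, and comparing the exact family (the part before the first dot) is a deliberate swap for the prefix-style arguments in the snippet above.

```python
# Hypothetical extended set: the families from the current check plus the
# i4g and i4i families mentioned in this issue.
INSTANCE_STORE_FAMILIES = frozenset({"r3", "i3", "r6gd", "im4gn", "i4g", "i4i"})


def instance_family(instance_type: str) -> str:
    """Return the family part of a type such as 'i4g.2xlarge.search'."""
    return instance_type.split(".", 1)[0].lower()


def ebs_required(data_node_instance_type: str) -> bool:
    """True when the data node family is not in the instance-store set above."""
    return instance_family(data_node_instance_type) not in INSTANCE_STORE_FAMILIES


for name in ("i3.2xlarge.search", "i4g.2xlarge.search", "m5.large.search"):
    print(name, "-> EBS required:", ebs_required(name))
```

Keeping the families in one named set would make later additions a one-line change, and exact family comparison avoids accepting a name that merely starts with a listed family.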

The AWS docs also list i4g instances as supported: https://aws.amazon.com/opensearch-service/pricing/

Marking this as P2: the core team won't address it immediately, but it will be on their radar. Contributions from the community are welcome.

khushail, Oct 21 '24 23:10

@khushail I need your help for a review https://github.com/aws/aws-cdk/pull/31948

aymen-chetoui, Oct 31 '24 09:10

@aymen-chetoui I see that one of our core team members is already reviewing it. Thanks for submitting the PR.

Let me know if any other help is needed.

khushail, Nov 01 '24 16:11

Comments on closed issues and PRs are hard for our team to see. If you need help, please open a new issue that references this one.

github-actions[bot], Nov 01 '24 17:11
