[Bug Report]: Machine Learning Services module testing fails as there is a missing dependency config (NSG Rules)

Open ahmadabdalla opened this issue 1 year ago • 0 comments

Describe the bug

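Azure Machine Learning compute requires specific inbound ports (29876 and 29877, from the BatchNodeManagement service tag) to be open on the subnet's NSG, as per the documentation. The NSG currently deployed as a test dependency does not include these rules, so the module deployment fails.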

Options to resolve this:

1. Update current NSG dependency resource to include the following rules:

        "securityRules": {
            "value": [
                {
                    "name": "Inbound_BatchNodeManagement_29877",
                    "properties": {
                        "protocol": "Tcp",
                        "sourcePortRange": "*",
                        "sourceAddressPrefix": "BatchNodeManagement",
                        "destinationPortRange": "29877",
                        "destinationAddressPrefix": "VirtualNetwork",
                        "access": "Allow",
                        "priority": 1010,
                        "direction": "Inbound"
                    }
                },
                {
                    "name": "Inbound_BatchNodeManagement_29876",
                    "properties": {
                        "protocol": "Tcp",
                        "sourcePortRange": "*",
                        "sourceAddressPrefix": "BatchNodeManagement",
                        "destinationPortRange": "29876",
                        "destinationAddressPrefix": "VirtualNetwork",
                        "access": "Allow",
                        "priority": 1020,
                        "direction": "Inbound"
                    }
                }
            ]
        }

2. Create a new subnet in the existing VNet dedicated to ML, with a new NSG to go with it.
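Whichever option is chosen, the dependency NSG can be checked before running the pipeline. Below is a minimal sketch, not part of the repository: the function name `missing_ports` and the constants are hypothetical, and the required ports and accepted source prefixes are taken only from the error message in this issue. It assumes each rule uses a single-port `destinationPortRange` and does not evaluate priorities or port ranges.

```python
# Ports the deployment error in this issue reports as required (assumption:
# these two are the complete set for this scenario).
REQUIRED_PORTS = {"29876", "29877"}
# Source prefixes the error message lists as acceptable; a regional tag such as
# "BatchNodeManagement.westeurope" is handled by the prefix check below.
ACCEPTED_SOURCES = {"BatchNodeManagement", "Internet"}


def missing_ports(security_rules):
    """Return the required ports not covered by an inbound TCP allow rule."""
    covered = set()
    for rule in security_rules:
        props = rule.get("properties", {})
        source = props.get("sourceAddressPrefix", "")
        source_ok = (
            source in ACCEPTED_SOURCES
            or source.startswith("BatchNodeManagement.")
        )
        if (
            props.get("direction", "").lower() == "inbound"
            and props.get("access", "").lower() == "allow"
            and props.get("protocol", "").lower() in ("tcp", "*")
            and source_ok
            and props.get("destinationPortRange") in REQUIRED_PORTS
        ):
            covered.add(props["destinationPortRange"])
    return sorted(REQUIRED_PORTS - covered)


if __name__ == "__main__":
    # Only one of the two rules from option 1 is present here.
    rules = [
        {
            "name": "Inbound_BatchNodeManagement_29877",
            "properties": {
                "protocol": "Tcp",
                "sourcePortRange": "*",
                "sourceAddressPrefix": "BatchNodeManagement",
                "destinationPortRange": "29877",
                "destinationAddressPrefix": "VirtualNetwork",
                "access": "Allow",
                "priority": 1010,
                "direction": "Inbound",
            },
        }
    ]
    print(missing_ports(rules))  # ['29876']
```

Passing the `value` array of the `securityRules` parameter from option 1 should return an empty list.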

To reproduce

Run the ML pipeline with its current configuration.

Code snippet

Exception: /home/runner/work/_temp/9bd9f660-1d53-41e2-9187-45cbde3fc8f7.ps1:54
  Line |
    54 |    throw $res.exception
       |    ~~~~~~~~~~~~~~~~~~~~
       | 21:02:03 - The deployment 'workspaces-20220813T2008149448Z'
       | failed with error(s). Showing 1 out of 1 error(s). Status
       | Message: At least one resource deployment operation failed.
       | Please list deployment operations for details. Please see
       | https://aka.ms/DeployOperations for usage details. (Code:
       | DeploymentFailed)  - {   "status": "Failed",   "error": { 
       | "code": "ResourceDeploymentFailure",     "message": "The
       | resource operation completed with terminal provisioning state
       | 'Failed'.",     "details": [       {         "code":
       | "BadRequest",         "message": "The subnet has a network
       | security group
       | (/subscriptions/***/resourceGroups/validation-rg/providers/Microsoft.Network/networkSecurityGroups/adp-***-az-nsg-x-001) that is missing the following rules below. Please add those rules or increase the priority to allow traffic if the rules already exist. For network security group requirements, please refer to https://docs.microsoft.com/azure/machine-learning/how-to-secure-training-vnet?tabs=azure-studio#azure-machine-learning-compute-clusterinstance-1. \n {\n  \"name\": \"Inbound_BatchNodeManagement_29877\",\n  \"properties\": {\n    \"protocol\": \"TCP\",\n    \"sourcePortRange\": \"*\",\n    \"destinationPortRange\": \"29877\",\n    \"sourceAddressPrefix\": \"BatchNodeManagement OR Internet OR BatchNodeManagement.westeurope\",\n    \"destinationAddressPrefix\": \"VirtualNetwork\",\n    \"priority\": \"Higher than 65500\",\n    \"direction\": \"Inbound\"\n  }\n}\n{\n  \"name\": \"Inbound_BatchNodeManagement_29876\",\n  \"properties\": {\n    \"protocol\": \"TCP\",\n    \"sourcePortRange\": \"*\",\n    \"destinationPortRange\": \"29876\",\n    \"sourceAddressPrefix\": \"BatchNodeManagement OR Internet OR BatchNodeManagement.westeurope\",\n    \"destinationAddressPrefix\": \"VirtualNetwork\",\n    \"priority\": \"Higher than 65500\",\n    \"direction\": \"Inbound\"\n  }\n}"       }     ]   } } (Code:Conflict)   CorrelationId: 536957e4-fd4d-405a-b5ee-f7f19cb0fbe1
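The names of the rules reported as missing can be pulled out of the error text, which may help when triaging several failed runs. This is a small sketch that assumes the message has the same shape as the one above, with quotes possibly backslash-escaped; the function name `missing_rule_names` is hypothetical.

```python
import re

# Matches "name": "..." whether or not the quotes are backslash-escaped,
# as they are in the nested error message of the deployment log.
NAME_PATTERN = re.compile(r'\\?"name\\?":\s*\\?"([^"\\]+)\\?"')


def missing_rule_names(message):
    """Return every rule name found in the error message, in order."""
    return NAME_PATTERN.findall(message)


if __name__ == "__main__":
    # Shortened excerpt of the message from this issue (escaped quotes kept).
    excerpt = (
        r'{\n  \"name\": \"Inbound_BatchNodeManagement_29877\",\n  \"properties\": {}\n}'
        r'\n{\n  \"name\": \"Inbound_BatchNodeManagement_29876\",\n  \"properties\": {}\n}'
    )
    print(missing_rule_names(excerpt))
```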

Relevant log output

No response

ahmadabdalla, Aug 13 '22 22:08