metaflow-ui

Unable to view DAG

Open npow opened this issue 3 years ago • 9 comments

Description

Unable to view DAG.

DAG encountered an unexpected error. This should not happen and might be caused by unexpected data.

Steps to Reproduce

  1. Navigate to Metaflow UI
  2. Click on a flow that completed successfully; it will show the Timeline tab
  3. Click on DAG tab

Expected behavior:

See the DAG.

Reproduces how often:

Every time.

Versions

OS: macOS Catalina 10.15.7
Application version: v1.0.0
Service version: 2.2.1--

Additional Information

These flows are deployed to Step Functions. I see this in the Developer console:

react-dom.production.min.js:209 TypeError: Cannot read properties of undefined (reading 'box_ends')
    at gO (DAGUtils.ts:58:28)
    at hO (DAGUtils.ts:104:10)
    at tE (index.tsx:22:33)
    at Zi (react-dom.production.min.js:153:146)
    at Fa (react-dom.production.min.js:175:309)
    at _l (react-dom.production.min.js:263:406)
    at gs (react-dom.production.min.js:246:265)
    at ms (react-dom.production.min.js:246:194)
    at ls (react-dom.production.min.js:239:172)
    at react-dom.production.min.js:123:115

npow avatar Feb 23 '22 18:02 npow

Can you share the response from the /dag request from the Console Network Tab?

obgibson avatar Feb 23 '22 19:02 obgibson

@obgibson Here you go:

{
   "data":{
      "file":"pyspark_runner.py",
      "parameters":[
         {
            "name":"base_cfg",
            "type":"Parameter"
         },
         {
            "name":"override_cfg",
            "type":"Parameter"
         }
      ],
      "constants":[
         
      ],
      "steps":{
         "start":{
            "name":"start",
            "type":"start",
            "line":24,
            "doc":"",
            "decorators":[
               {
                  "name":"conda",
                  "attributes":{
                     "libraries":"{}",
                     "python":null,
                     "disabled":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"batch",
                  "attributes":{
                     "cpu":"1",
                     "gpu":"0",
                     "memory":"4096",
                     "image":"python:3.7",
                     "queue":"arn:aws:batch:us-east-1:667858956048:job-queue/job-queue-metaflow",
                     "iam_role":"arn:aws:iam::667858956048:role/metaflow-BatchS3TaskRole-1U80KJ4G0N189",
                     "execution_role":null,
                     "shared_memory":null,
                     "max_swap":null,
                     "swappiness":null,
                     "host_volumes":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"step_functions_internal",
                  "attributes":{
                     
                  },
                  "statically_defined":false
               }
            ],
            "next":[
               "create_cluster"
            ]
         },
         "create_cluster":{
            "name":"create_cluster",
            "type":"linear",
            "line":37,
            "doc":"",
            "decorators":[
               {
                  "name":"conda",
                  "attributes":{
                     "libraries":"{}",
                     "python":null,
                     "disabled":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"batch",
                  "attributes":{
                     "cpu":"1",
                     "gpu":"0",
                     "memory":"4096",
                     "image":"python:3.7",
                     "queue":"arn:aws:batch:us-east-1:667858956048:job-queue/job-queue-metaflow",
                     "iam_role":"arn:aws:iam::667858956048:role/metaflow-BatchS3TaskRole-1U80KJ4G0N189",
                     "execution_role":null,
                     "shared_memory":null,
                     "max_swap":null,
                     "swappiness":null,
                     "host_volumes":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"step_functions_internal",
                  "attributes":{
                     
                  },
                  "statically_defined":false
               }
            ],
            "next":[
               "submit_job"
            ]
         },
         "submit_job":{
            "name":"submit_job",
            "type":"linear",
            "line":43,
            "doc":"",
            "decorators":[
               {
                  "name":"conda",
                  "attributes":{
                     "libraries":"{}",
                     "python":null,
                     "disabled":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"batch",
                  "attributes":{
                     "cpu":"1",
                     "gpu":"0",
                     "memory":"4096",
                     "image":"python:3.7",
                     "queue":"arn:aws:batch:us-east-1:667858956048:job-queue/job-queue-metaflow",
                     "iam_role":"arn:aws:iam::667858956048:role/metaflow-BatchS3TaskRole-1U80KJ4G0N189",
                     "execution_role":null,
                     "shared_memory":null,
                     "max_swap":null,
                     "swappiness":null,
                     "host_volumes":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"step_functions_internal",
                  "attributes":{
                     
                  },
                  "statically_defined":false
               }
            ],
            "next":[
               "end"
            ]
         },
         "end":{
            "name":"end",
            "type":"end",
            "line":50,
            "doc":"",
            "decorators":[
               {
                  "name":"conda",
                  "attributes":{
                     "libraries":"{}",
                     "python":null,
                     "disabled":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"batch",
                  "attributes":{
                     "cpu":"1",
                     "gpu":"0",
                     "memory":"4096",
                     "image":"python:3.7",
                     "queue":"arn:aws:batch:us-east-1:667858956048:job-queue/job-queue-metaflow",
                     "iam_role":"arn:aws:iam::667858956048:role/metaflow-BatchS3TaskRole-1U80KJ4G0N189",
                     "execution_role":null,
                     "shared_memory":null,
                     "max_swap":null,
                     "swappiness":null,
                     "host_volumes":null
                  },
                  "statically_defined":false
               },
               {
                  "name":"step_functions_internal",
                  "attributes":{
                     
                  },
                  "statically_defined":false
               }
            ],
            "next":[
               
            ]
         }
      },
      "graph_structure":[
         "start",
         "create_cluster",
         "submit_job",
         "end"
      ],
      "doc":"",
      "decorators":[
         {
            "name":"conda_base",
            "attributes":{
               "libraries":{
                  "flatten_json":"0.1.7",
                  "omegaconf":"2.1.1",
                  "pydash":"5.1.0"
               },
               "python":"3.8",
               "disabled":null
            },
            "statically_defined":true
         },
         {
            "name":"project",
            "attributes":{
               "name":"pyspark"
            },
            "statically_defined":true
         },
         {
            "name":"schedule",
            "attributes":{
               "cron":null,
               "weekly":false,
               "daily":true,
               "hourly":true
            },
            "statically_defined":true
         }
      ]
   },
   "status":200,
   "links":{
      "self":"https://na.metaflow.wpo.amazon.dev/flows/PySparkRunner/runs/738/dag"
   },
   "query":{
      
   }
}
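
The console error refers to `box_ends`, and no step in the response above carries a field with that name. A minimal Python sketch for confirming this on a saved response follows. The helper name `steps_missing_field` and the trimmed sample are illustrative only, and whether the missing field is the actual cause of the crash is an assumption that the response alone does not prove.

```python
import json
from typing import Dict, List


def steps_missing_field(dag_response: Dict, field: str = "box_ends") -> List[str]:
    """Return the names of steps in a /dag response that lack `field`."""
    steps = dag_response.get("data", {}).get("steps", {})
    return [name for name, step in steps.items() if field not in step]


# Trimmed copy of the response above: decorators and other fields are
# omitted, and only the keys relevant to the check are kept.
sample = json.loads("""
{
  "data": {
    "steps": {
      "start": {"name": "start", "type": "start", "next": ["create_cluster"]},
      "create_cluster": {"name": "create_cluster", "type": "linear", "next": ["submit_job"]},
      "submit_job": {"name": "submit_job", "type": "linear", "next": ["end"]},
      "end": {"name": "end", "type": "end", "next": []}
    }
  }
}
""")

missing = steps_missing_field(sample)
print(missing)
```

For the full response, load the JSON copied from the Network tab in place of `sample`.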

npow avatar Feb 24 '22 20:02 npow

@npow What is the version of Metaflow that you are using? Can you try it with the latest Metaflow version?

savingoyal avatar Feb 25 '22 01:02 savingoyal

It's the latest. I built off master:

>>> metaflow.__version__
'2.5.2'

npow avatar Feb 25 '22 02:02 npow

@npow It looks like you don't have the latest metaflow-ui and metaflow-service. The versions should be 1.1.1 and 2.2.2, respectively. The DAG schema changed recently.
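
Comparing the versions reported in the issue against the ones named here can be sketched in Python as follows. The helper names are hypothetical, and treating 1.1.1 and 2.2.2 as minimums rather than exact required versions is an assumption.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first MAJOR.MINOR.PATCH triple, ignoring prefixes and suffixes."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def needs_upgrade(deployed: str, minimum: str) -> bool:
    """True when the deployed version is older than the minimum."""
    return parse_version(deployed) < parse_version(minimum)


# Versions as reported in the issue description.
deployed = {"metaflow-ui": "v1.0.0", "metaflow-service": "2.2.1--"}
# Versions named in this comment (assumed here to be minimums).
minimum = {"metaflow-ui": "1.1.1", "metaflow-service": "2.2.2"}

outdated = [name for name in deployed if needs_upgrade(deployed[name], minimum[name])]
print(outdated)
```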

obgibson avatar Feb 25 '22 21:02 obgibson

I used the CloudFormation template to deploy the Metaflow UI/service. Do I just re-run the template to update?

npow avatar Mar 01 '22 03:03 npow

Hey @npow, you can either edit the template you're using to point to a newer Docker image, or switch to the new template. Note that the new template includes some changes, such as auth through Amazon Cognito, so if you're upgrading from an older template that didn't have that, you'd have to do some extra steps, such as creating an SSL certificate and a domain name.

oavdeev avatar Mar 04 '22 23:03 oavdeev

@oavdeev We should make the auth optional (but highly suggested) because otherwise, the domain name requirement makes deploying the stack a bit non-trivial.

savingoyal avatar Mar 05 '22 01:03 savingoyal

I don't see how to specify the image tag in the CloudFormation template. I'm using this template, which hasn't been updated since October 2021: https://raw.githubusercontent.com/Netflix/metaflow-tools/master/aws/cloudformation/metaflow-cfn-template.yml

Mappings:
  ServiceInfo:
    StackName:
      value: 'metaflow-infrastructure'
    ServiceName:
      value: 'metadata-service-v2'
    ImageUrl:
      value: 'netflixoss/metaflow_metadata_service'
    ContainerPort:
      value: 8080
    ContainerCpu:
      value: 512
    ContainerMemory:
      value: 1024
    Path:
      value: '*'
    Priority:
      value: 1
    DesiredCount:
      value: 1
    Role:
      value: ""

  ServiceInfoUI:
    ServiceName:
      value: 'metaflow-ui-service'
    ImageUrl:
      value: 'netflixoss/metaflow_metadata_service'
    ContainerPort:
      value: 8083
    ContainerCpu:
      value: 4096
    ContainerMemory:
      value: 16384
    DesiredCount:
      value: 1

npow avatar Mar 10 '22 05:03 npow