[BUG] Spark Log Setting Not Applied in Flyte Sandbox Helm Values YAML
Describe the bug
I ran into a problem where the Spark driver pod logs cannot be displayed from the Flyte console. The log link works for a regular Python task pod, but not for a Spark task. For the Spark driver pod, the generated link looks like this:
http://localhost:30082/#!/log/flytesnacks-development/....../pod?namespace=flytesnacks-development
Expected behavior
When I manually adjusted the URL, I reached the correct log page; the generated link should point there directly.
Additional context to reproduce
My Helm values YAML file (the two log-related settings are excerpted again right after it):
```yaml
docker-registry:
  enabled: false
  image:
    registry: harbor.linecorp.com/ecacda
    repository: cr.flyte.org/flyteorg/registry
    tag: 2.8.1
    pullPolicy: Always
  persistence:
    enabled: false
  service:
    type: NodePort
    nodePort: 30000

flyte-binary:
  nameOverride: flyte-sandbox
  enabled: true
  configuration:
    database:
      host: '{{ printf "%s-postgresql" .Release.Name | trunc 63 | trimSuffix "-" }}'
      password: postgres
    storage:
      metadataContainer: my-s3-bucket
      userDataContainer: my-s3-bucket
      provider: s3
      providerConfig:
        s3:
          disableSSL: true
          v2Signing: true
          endpoint: http://{{ printf "%s-minio" .Release.Name | trunc 63 | trimSuffix "-" }}.{{ .Release.Namespace }}:9000
          authType: accesskey
          accessKey: minio
          secretKey: miniostorage
    logging:
      level: 6
      plugins:
        kubernetes:
          enabled: true
          templateUri: |-
            http://10.233.112.73/#/log/{{.namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
    inline:
      storage:
        signedURL:
          stowConfigOverride:
            endpoint: http://10.227.231.9:30003
      plugins:
        k8s:
          default-env-vars:
            - FLYTE_AWS_ENDPOINT: http://{{ printf "%s-minio" .Release.Name | trunc 63 | trimSuffix "-" }}.{{ .Release.Namespace }}:9000
            - FLYTE_AWS_ACCESS_KEY_ID: minio
            - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
        spark:
          spark-config-default:
            - spark.driver.cores: "1"
            - spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
            - spark.hadoop.fs.s3a.endpoint: http://{{ printf "%s-minio" .Release.Name | trunc 63 | trimSuffix "-" }}.{{ .Release.Namespace }}:9000
            - spark.hadoop.fs.s3a.access.key: "minio"
            - spark.hadoop.fs.s3a.secret.key: "miniostorage"
            - spark.hadoop.fs.s3a.path.style.access: "true"
            - spark.kubernetes.allocation.batch.size: "50"
            - spark.hadoop.fs.s3a.acl.default: "BucketOwnerFullControl"
            - spark.hadoop.fs.s3n.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
            - spark.hadoop.fs.AbstractFileSystem.s3n.impl: "org.apache.hadoop.fs.s3a.S3A"
            - spark.hadoop.fs.s3.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
            - spark.hadoop.fs.AbstractFileSystem.s3.impl: "org.apache.hadoop.fs.s3a.S3A"
            - spark.hadoop.fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
            - spark.hadoop.fs.AbstractFileSystem.s3a.impl: "org.apache.hadoop.fs.s3a.S3A"
          logs:
            mixed:
              kubernetes-enabled: true
              kubernetes-url: |-
                http://10.233.112.73/#/log/{{ .namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
      cluster_resources:
        refreshInterval: 5m
        customData:
          - production:
              - projectQuotaCpu:
                  value: "5"
              - projectQuotaMemory:
                  value: "4000Mi"
          - staging:
              - projectQuotaCpu:
                  value: "2"
              - projectQuotaMemory:
                  value: "3000Mi"
          - development:
              - projectQuotaCpu:
                  value: "4"
              - projectQuotaMemory:
                  value: "5000Mi"
        refresh: 5m
    inlineConfigMap: '{{ include "flyte-sandbox.configuration.inlineConfigMap" . }}'
  clusterResourceTemplates:
    inlineConfigMap: '{{ include "flyte-sandbox.clusterResourceTemplates.inlineConfigMap" . }}'
  deployment:
    image:
      repository: harbor.linecorp.com/ecacda/cr.flyte.org/flyteorg/flyte-binary
      tag: native
      pullPolicy: Always
    waitForDB:
      image:
        repository: harbor.linecorp.com/ecacda/cr.flyte.org/flyteorg/bitnami/postgresql
        tag: 15.1.0-debian-11-r20
        pullPolicy: Always
  rbac:
    # This is strictly NOT RECOMMENDED in production clusters, and is only for use
    # within local Flyte sandboxes.
    # When using cluster resource templates to create additional namespaced roles,
    # Flyte is required to have a superset of those permissions. To simplify
    # experimenting with new backend plugins that require additional roles be created
    # with cluster resource templates (e.g. Spark), we add the following:
    extraRules:
      - apiGroups:
          - '*'
        resources:
          - '*'
        verbs:
          - '*'
  enabled_plugins:
    # -- Tasks specific configuration [structure](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#GetConfig)
    tasks:
      # -- Plugins configuration, [structure](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#TaskPluginConfig)
      task-plugins:
        # -- [Enabled Plugins](https://pkg.go.dev/github.com/lyft/flyteplugins/go/tasks/config#Config).
        # Enable sagemaker*, athena if you install the backend plugins
        enabled-plugins:
          - container
          - sidecar
          - k8s-array
          - agent-service
          - spark
        default-for-task-types:
          container: container
          sidecar: sidecar
          container_array: k8s-array
          spark: spark
          # -- Uncomment to enable task type that uses Flyte Agent
          # bigquery_query_job_task: agent-service

kubernetes-dashboard:
  enabled: true
  image:
    repository: kubernetesui/dashboard
    tag: v2.7.0
    pullPolicy: Always
  extraArgs:
    - --enable-insecure-login
    - --enable-skip-login
  protocolHttp: true
  service:
    externalPort: 80
    type: LoadBalancer
  rbac:
    create: true
    clusterRoleMetrics: false
    clusterReadOnlyRole: true

minio:
  enabled: true
  image:
    registry: harbor.linecorp.com/ecacda
    repository: cr.flyte.org/flyteorg/bitnami/minio
    tag: 2023.1.25-debian-11-r0
    pullPolicy: Always
  auth:
    rootUser: minio
    rootPassword: miniostorage
  defaultBuckets: my-s3-bucket
  extraEnvVars:
    - name: MINIO_BROWSER_REDIRECT_URL
      value: http://localhost:30080/minio
  service:
    type: NodePort
    nodePorts:
      api: 30003
  persistence:
    enabled: true
    existingClaim: '{{ include "flyte-sandbox.persistence.minioVolumeName" . }}'
  volumePermissions:
    enabled: true
    image:
      registry: harbor.linecorp.com/ecacda
      repository: cr.flyte.org/flyteorg/bitnami/bitnami-shell
      tag: 11-debian-11-r76
      pullPolicy: Always

postgresql:
  enabled: true
  image:
    registry: harbor.linecorp.com/ecacda
    repository: cr.flyte.org/flyteorg/bitnami/postgresql
    tag: 15.1.0-debian-11-r20
    pullPolicy: Always
  auth:
    postgresPassword: postgres
  shmVolume:
    enabled: false
  primary:
    service:
      type: NodePort
      nodePorts:
        postgresql: 30001
    persistence:
      enabled: true
      existingClaim: '{{ include "flyte-sandbox.persistence.dbVolumeName" . }}'
  volumePermissions:
    enabled: true
    image:
      registry: harbor.linecorp.com/ecacda
      repository: cr.flyte.org/flyteorg/bitnami/bitnami-shell
      tag: 11-debian-11-r76
      pullPolicy: Always

sandbox:
  # dev Routes requests to an instance of Flyte running locally on a developer's
  # development environment. This is only usable if the flyte-binary chart is disabled.
  dev: true
  buildkit:
    enabled: true
    image:
      repository: moby/buildkit
      tag: buildx-stable-1
      pullPolicy: Always
  proxy:
    enabled: true
    image:
      repository: envoyproxy/envoy
      tag: v1.23-latest
      pullPolicy: Always
```
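For clarity, these are the two log-link settings buried in the values above, excerpted with the intermediate keys collapsed (both sit under flyte-binary). The first one is what works for regular Python task pods; the second, Spark-specific block is the one that does not seem to be applied:

```yaml
# Generic Kubernetes log link (works for regular Python task pods)
configuration:
  logging:
    plugins:
      kubernetes:
        enabled: true
        templateUri: |-
          http://10.233.112.73/#/log/{{.namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
  # Spark-specific log link (does not seem to be applied)
  inline:
    plugins:
      spark:
        logs:
          mixed:
            kubernetes-enabled: true
            kubernetes-url: |-
              http://10.233.112.73/#/log/{{ .namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
```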
Screenshots
Are you sure this issue hasn't been raised already?
- [X] Yes
Have you read the Code of Conduct?
- [X] Yes
🤖 Automated reply (dosu-bot):
The issue you're experiencing might be because the log configuration in your task_logs.yaml file is only set up for general Kubernetes pods, not specifically for Spark driver pods.
The kubernetes-template-uri you've provided is a general template that works for most Kubernetes pods. However, Spark driver pods might expose their logs under a different structure or location, which is why their links might not be showing up correctly.
To resolve this, you would need to find out the correct log structure or location for Spark driver pods and adjust the kubernetes-template-uri accordingly.
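As a starting point, here is a minimal sketch of what a Spark-specific override could look like in the flyte-binary inline configuration from the values file above. It assumes the Spark plugin's logs.mixed block accepts the same kubernetes-enabled and kubernetes-template-uri keys as the generic Kubernetes log plugin; that assumption, and the exact nesting, are precisely what needs to be verified against the rendered configuration:

```yaml
# Sketch only -- assumes the Spark plugin's log config honors the same keys
# (kubernetes-enabled, kubernetes-template-uri) as the generic log plugin.
configuration:
  inline:
    plugins:
      spark:
        logs:
          mixed:
            kubernetes-enabled: true
            # Templated links usually go in kubernetes-template-uri;
            # kubernetes-url is typically just the console's base URL.
            kubernetes-template-uri: |-
              http://10.233.112.73/#/log/{{ .namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
```

Checking whether the spark.logs block appears at all in the ConfigMap that flyte-binary renders from these values would also help distinguish a nesting problem in the values file from a problem with the key names themselves.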
For more details, you can refer to the Flyte codebase, specifically the getEventInfoForSpark function in the flyteplugins/go/tasks/plugins/k8s/spark/spark.go file. This function initializes different log plugins based on the configuration and fetches the logs for the Spark Driver pod.
Please note that this is a general suggestion and the exact solution might vary depending on your specific setup and configuration. If you're still having trouble, I would recommend reaching out to the Spark community for more specific guidance.
Sources
- kustomize/overlays/sandbox/flyte/config/propeller/plugins/task_logs.yaml
- flyteplugins/go/tasks/plugins/k8s/spark/spark.go
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
#self-assign