Add Support for HA Deployment Profiles for Big Data Components in Kubernetes

Open jigarpatel1007 opened this issue 8 months ago • 2 comments

🆕 Title:

Enhancement Proposal: Add Support for HA Deployment Profiles for Big Data Components in Kubernetes


Is your feature request related to a problem? Please describe.

CloudEon provides Kubernetes-based deployment and lifecycle management for open-source big data platforms. However, current deployment templates often assume default or single-instance setups, which may not meet the reliability needs of production clusters.

For users deploying components like HDFS, Hive, Spark, and Kafka at scale — especially in telecom and enterprise contexts — there is a strong need for pre-validated High Availability (HA) profiles.


Describe the solution you'd like

I propose adding built-in HA deployment profiles for key components such as:

  • HDFS NameNodes (Active-Standby)
  • Kafka clusters with replication and a ZooKeeper ensemble
  • Hive Metastore with failover
  • Spark Standalone with redundant masters

These profiles would be:

  • Declarative (via values.yaml or CRD templates)
  • Toggled via a single parameter, such as profile: ha or profile: dev
  • Equipped with optional Prometheus/Grafana hooks for monitoring readiness

This helps users quickly spin up resilient production-grade clusters without manually editing Helm charts or manifests.
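To make the toggle concrete, here is a minimal sketch of how a profile parameter could resolve into effective values by layering a profile overlay on top of base defaults. Everything in it is an assumption for illustration: `BASE_VALUES`, `PROFILE_OVERLAYS`, `deep_merge`, and `resolve_values` are hypothetical names, and the keys only mirror this proposal, not any existing CloudEon template.

```python
from copy import deepcopy

# Hypothetical base defaults (single-instance) and per-profile overlays.
# The keys mirror this proposal; they are not taken from an existing chart.
BASE_VALUES = {
    "global": {"profile": "dev"},
    "hdfs": {"nameNode": {"replicas": 1}, "journalNode": {"enabled": False}},
    "zookeeper": {"replicaCount": 1},
}

PROFILE_OVERLAYS = {
    "dev": {},
    "ha": {
        "hdfs": {
            "nameNode": {"replicas": 2},
            "journalNode": {"enabled": True, "replicas": 3},
        },
        "zookeeper": {"replicaCount": 3},
    },
}


def deep_merge(base, overlay):
    """Return a new dict in which overlay values recursively override base."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_values(profile, user_overrides=None):
    """Layer base values, the profile overlay, then explicit user overrides."""
    if profile not in PROFILE_OVERLAYS:
        raise ValueError(f"unknown profile: {profile!r}")
    values = deep_merge(BASE_VALUES, PROFILE_OVERLAYS[profile])
    values = deep_merge(values, {"global": {"profile": profile}})
    return deep_merge(values, user_overrides or {})
```

Applying user overrides last means a team could still adjust a single value (for example, a larger ZooKeeper ensemble) without abandoning the curated profile.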


🔄 Describe alternatives you’ve considered

Manually customizing Helm charts per component is the main alternative, but it increases the risk of errors and breaks consistency across deployments. Curated HA templates would allow for standardized, tested deployments across the user base.


📌 Additional context

I work as a Senior Systems Architect focused on cloud-native automation and infrastructure design for real-time telecom workloads. While I don’t contribute code directly, I regularly support platform teams in:

  • Defining deployment standards
  • Hardening big data stacks for high traffic
  • Aligning cluster setup with SLAs and CI/CD workflows

I’d be happy to assist in designing the profiles, testing configuration logic, and reviewing draft templates.


🚀 Benefits to the CloudEon Project

  • Helps users scale from POC to production seamlessly
  • Increases CloudEon's appeal in enterprise/telco environments
  • Reduces user friction and errors during configuration
  • Promotes best practices in resource resilience and failover handling

jigarpatel1007 avatar Apr 17 '25 21:04 jigarpatel1007

Hello, this is Xie Jinfeng. I have received your email, thank you!

xlostpath avatar Apr 17 '25 21:04 xlostpath

📦 Sample HA Profile

values-ha.yaml

global:
  profile: ha
  monitoring:
    enabled: true
    prometheusOperator: true

hdfs:
  nameNode:
    replicas: 2
    mode: ha
    haEnabled: true
  journalNode:
    enabled: true
    replicas: 3
  dataNode:
    replicas: 3
  config:
    dfs.replication: 3

zookeeper:
  enabled: true
  replicaCount: 3

kafka:
  replicas: 3
  persistence:
    enabled: true
    size: 50Gi
  externalAccess:
    enabled: true
  config:
    offsets.topic.replication.factor: 3
    transaction.state.log.replication.factor: 3
    default.replication.factor: 3

hive:
  metastore:
    replicas: 2
    haEnabled: true
    readinessProbe:
      enabled: true
  config:
    hive.metastore.event.db.notification.listener: org.apache.hive.hcatalog.listener.DbNotificationListener

spark:
  master:
    replicas: 2
    haMode: true
  worker:
    replicas: 3
  persistence:
    enabled: true
    size: 20Gi

ingress:
  enabled: true
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
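A profile like the one above could be checked before deployment. The following is a rough sketch of such a check, not an existing CloudEon feature: `validate_ha_values` and `SAMPLE` are hypothetical names, the dictionary is a hand-copied subset of the sample values, and the rules encode only general guidance (odd-sized quorums of at least three for ZooKeeper and JournalNodes, and replication factors that do not exceed the number of nodes available to hold replicas).

```python
# Hand-copied subset of the sample profile (assumed nesting, for illustration).
SAMPLE = {
    "hdfs": {
        "nameNode": {"replicas": 2, "haEnabled": True},
        "journalNode": {"enabled": True, "replicas": 3},
        "dataNode": {"replicas": 3},
        "config": {"dfs.replication": 3},
    },
    "zookeeper": {"enabled": True, "replicaCount": 3},
    "kafka": {
        "replicas": 3,
        "config": {
            "offsets.topic.replication.factor": 3,
            "transaction.state.log.replication.factor": 3,
            "default.replication.factor": 3,
        },
    },
}


def validate_ha_values(values):
    """Return a list of human-readable problems; empty means none were found."""
    problems = []

    # Majority quorums want an odd member count of at least three.
    zk = values.get("zookeeper", {})
    zk_count = zk.get("replicaCount", 0)
    if zk.get("enabled") and (zk_count < 3 or zk_count % 2 == 0):
        problems.append("zookeeper.replicaCount should be an odd number >= 3")

    hdfs = values.get("hdfs", {})
    if hdfs.get("nameNode", {}).get("haEnabled"):
        jn = hdfs.get("journalNode", {})
        jn_count = jn.get("replicas", 0)
        if not jn.get("enabled") or jn_count < 3 or jn_count % 2 == 0:
            problems.append("hdfs.journalNode needs an odd replica count >= 3")

    # Blocks cannot reach a replication factor above the DataNode count.
    data_nodes = hdfs.get("dataNode", {}).get("replicas", 0)
    if hdfs.get("config", {}).get("dfs.replication", 0) > data_nodes:
        problems.append("dfs.replication exceeds hdfs.dataNode.replicas")

    # A Kafka replication factor cannot exceed the number of brokers.
    kafka = values.get("kafka", {})
    brokers = kafka.get("replicas", 0)
    for key, factor in kafka.get("config", {}).items():
        if key.endswith("replication.factor") and factor > brokers:
            problems.append(f"kafka config {key}={factor} exceeds {brokers} brokers")

    return problems
```

Calling `validate_ha_values(SAMPLE)` returns an empty list for the subset shown here; a profile with, say, a two-node ZooKeeper ensemble would be reported as a problem.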

jigarpatel1007 avatar Apr 17 '25 21:04 jigarpatel1007