
How to set up the bolt port through an AWS ELB load balancer

Balatriveni opened this issue 6 years ago • 7 comments

We have a Neo4j EC2 instance that we want to add to an Auto Scaling group with an AWS ELB load balancer. When I access the Neo4j browser with the load balancer DNS name it works, but the bolt connection does not work. How do I set up the AWS load balancer for the Neo4j EC2 instance?

Balatriveni · Jan 22 '19

Apparently, for writes, you need to connect to the leader (if you're using a causal cluster) or the read/write server (single-instance setup), so that won't work for autoscaling. I would set up read replicas instead; that lets you autoscale for read purposes.

Then you need to set up TCP (layer 4) port forwarding on the ELB for the bolt port.
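
For reference, roughly what that layer-4 listener looks like with the AWS CLI for a classic ELB; the load balancer name is a placeholder, and with an NLB you'd use aws elbv2 create-listener instead:

# Forward TCP 7687 (bolt) straight through the classic ELB to the instances.
# "my-neo4j-elb" is a placeholder - substitute your own load balancer name.
aws elb create-load-balancer-listeners \
  --load-balancer-name my-neo4j-elb \
  --listeners "Protocol=TCP,LoadBalancerPort=7687,InstanceProtocol=TCP,InstancePort=7687"

# Point the health check at the bolt port too, so unhealthy instances are dropped.
aws elb configure-health-check \
  --load-balancer-name my-neo4j-elb \
  --health-check Target=TCP:7687,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=2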

SGudbrandsson · Jan 31 '19

I just solved this by doing a couple of things:

  1. Add a helper deployment that checks which server is the leader and updates a writeable label on the core pods.
  2. Add a NodePort service that exposes the bolt port directly on NodePort 30092.

This allowed me to connect to bolt://HOSTNAMETHATRESOLVESTOCLUSTER:30092/

I didn't manage to get bolt+routing to work since the internal DNS names aren't the same as the external DNS names, so I came up with this solution.

I made a YAML file that sets up everything for you. There are a couple of caveats:

  1. You need to change the namespace to whatever namespace you use.
  2. You might need to tweak the password secret name based on your setup.
  3. Change the NodePort if it's already taken on your cluster (a quick check is shown after this list).
  4. If the leader changes, it can take up to 30 seconds for this solution to point to the new leader. You can change the interval in the script to whatever you want.
  5. The kubectl image version should match your cluster's version, so you might need to update that; a mismatched kubectl can produce weird errors out of the blue.
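
If you're not sure whether the NodePort is already in use (caveat 3), a quick way to check is to look for it across all services; 30092 is just the port used in the YAML below:

# Look for any existing service already claiming NodePort 30092
kubectl get svc --all-namespaces | grep 30092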

Here's the YAML that I made:

# ServiceAccount, Role and RoleBinding that let the labeling pod get/list/patch pods
kind: ServiceAccount
apiVersion: v1
metadata:
  name: neo4j-pod-labeling
  namespace: neo4j

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: neo4j-pod-labeling
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "update", "list", "patch"]

---

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: neo4j-pod-labeling
subjects:
- kind: ServiceAccount
  name: neo4j-pod-labeling
  namespace: neo4j
roleRef:
  kind: Role
  name: neo4j-pod-labeling
  apiGroup: rbac.authorization.k8s.io

---

# Helper deployment: every 30s it finds the leader via the /db/manage/server/core/writable
# endpoint and labels each core pod writeable=true or writeable=false
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: neo4j-core-auto-labeling
  namespace: neo4j
spec:
  replicas: 1
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: deployment-neo4j-neo4j-core-auto-labeling
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        workload.user.cattle.io/workloadselector: deployment-neo4j-neo4j-core-auto-labeling
    spec:
      containers:
      - env:
        - name: NEO4J_SECRETS_PASSWORD
          valueFrom:
            secretKeyRef:
              key: neo4j-password
              name: neo4j-neo4j-secrets
              optional: false
        image: lachlanevenson/k8s-kubectl:v1.11.2
        imagePullPolicy: IfNotPresent
        name: neo4j-core-auto-labeling
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: {}
          privileged: false
          readOnlyRootFilesystem: false
          runAsNonRoot: false
        stdin: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        tty: true
        command:
          - "sh"
          - "-c"
        args:
          - |
            while true; do
              PODS=$(kubectl get pods -l app=neo4j -l component=core --namespace=neo4j --output=jsonpath={.items..metadata.name})
              AUTH="neo4j:$NEO4J_SECRETS_PASSWORD"
              printf "Leader hunt result: "
              for POD in $PODS; do
                SERVER=$(kubectl get pods $POD --namespace=neo4j --output=jsonpath={.status.podIP})
                LEADER=$(wget -qO- http://$AUTH@$SERVER:7474/db/manage/server/core/writable 2> /dev/null)
                if [[ "$LEADER" == "true" ]]; then
                  printf "$POD"
                else
                  LEADER=false
                fi
                kubectl label pods $POD writeable=$LEADER &> /dev/null
              done
              printf "\n"
              sleep 30
            done
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccountName: neo4j-pod-labeling
      terminationGracePeriodSeconds: 30

---

# NodePort service that only selects the core pod currently labeled writeable=true
apiVersion: v1
kind: Service
metadata:
  name: neo4jbroker
  namespace: neo4j
spec:
  type: NodePort
  ports:
  - port: 7687
    nodePort: 30092
  selector:
    app: neo4j
    component: core
    release: neo4j
    writeable: "true"

@mneedham can you integrate this super raw stuff I did into your helm chart template?

SGudbrandsson · Feb 02 '19

This is a nice hack @sigginet, I was able to use it for my Neo4j cluster on GKE.

I'm still pretty desperately seeking a true solution for bolt+routing in k8s though. The pod labeling method doesn't help much when the leader pod changes all of a sudden in the middle of a periodic commit transaction 😢 (maybe it's just me and I don't know how to properly manage the connection session?)

ev-dev · Feb 06 '19

@ev-dev if your internal and external DNS names are the same and resolve to the same container, then you should be able to use bolt+routing without a problem.
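
If they don't match, one thing that might be worth trying (untested with this chart) is pointing the Neo4j 3.x advertised-address setting at the externally resolvable name, so the routing table hands clients an address they can actually reach. The hostname below is just a placeholder:

# Neo4j 3.x setting that controls the host name returned in the routing table;
# the config file location depends on your install/image.
echo "dbms.connectors.default_advertised_address=public.example.com" >> conf/neo4j.conf

# Official Docker image equivalent as an environment variable
# (dots become underscores, underscores are doubled):
#   NEO4J_dbms_connectors_default__advertised__address=public.example.com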

Is your cluster electing different leaders often?

SGudbrandsson · Feb 06 '19

thanks for clarifying @sigginet 👍

I've been trying to set up the service to be accessible via an alternative address, as opposed to <service>.<namespace>, by routing through an ingress that forwards to a different FQDN (which uses an external DNS with A records on the root name and a * wildcard, both set to route all traffic to the cluster IP).

No success with access outside the cluster when connecting to something like: bolt+routing://my-db.pretty-address.net:7687

The same ingress/service setup works with http/https/bolt, with the crucial exception that, because of the service's round-robin, transactions run through the bolt connection can fail when they get sent to a follower pod.

And indeed my cluster's leader does change far too frequently, which seems to be partly because our workloads often run huge transactions that trigger autoscaling.

ev-dev · Feb 09 '19

It's ridiculous how unresponsive and behind Neo (core team and as a business) is to the community of developers about how to get Neo4j deployed in k8s. Look at almost any other mainstream DB and the setup is quick and easy (I'm looking at you Elasticsearch). I'm partially venting here because we've been looking at hosting our own Neo4j cluster in k8s for over a year and it's simply untenable at this point.

jordanfowler · Jul 11 '19

Hi @jordanfowler

This comment was brought to my attention just recently. I missed it because this repo is old and outdated, as clearly stated in the top matter of the README on the repo. It hasn't been maintained in some time, but that doesn't change the fact that Google has it indexed and people land here. I can't change that.

Rather than linking all of the blog posts, I'll just drop a link here to the main community site where k8s questions are regularly posted and answered:

https://community.neo4j.com/search?q=kubernetes

In that list, you'll find articles about backup & recovery, orchestration considerations, bolt+routing, exposing outside of the k8s cluster, AKS, PKS, GKE, and other topics. Of course it doesn't cover everything - all are welcome to come post a question and see if we can improve the community knowledge base.

moxious · Jul 23 '19