loki icon indicating copy to clipboard operation
loki copied to clipboard

feat: area/helm: Modified helm template to use http_listen_port and grpc_listen_port instead of hardcoded value

Open Sheikh-Abubaker opened this issue 1 year ago • 9 comments

What this PR does / why we need it: This PR basically modifies helm template to use parameters http_listen_port and grpc_listen_port instead of hardcoded value to handle the cases where user assigns some other port other than default. Which issue(s) this PR fixes: Fixes #11641

Special notes for your reviewer:

Checklist

  • [x] Reviewed the CONTRIBUTING.md guide (required)
  • [ ] Documentation added
  • [ ] Tests updated
  • [ ] CHANGELOG.md updated
    • [ ] If the change is worth mentioning in the release notes, add add-to-release-notes label
  • [ ] Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • [x] For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • [ ] If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

Sheikh-Abubaker avatar Jan 10 '24 21:01 Sheikh-Abubaker

The memberlist port is still hardcoded to 7946 (grep for it in the helm templates) though.

May I substitute those harcdoded value with varibles ? as we did it for http and grpc ?

Sheikh-Abubaker avatar Jan 11 '24 16:01 Sheikh-Abubaker

The memberlist port is still hardcoded to 7946 (grep for it in the helm templates) though.

May I substitute those harcdoded value with varibles ? as we did it for http and grpc ?

The problem with that port number is that there is currently no variable in the helm values that defines the port, so the default port is hardcoded everywhere: if you grep for the port number 7946 you'll only find occurrences in service.yaml files, but not in config files! So if we search-and-replace it everywhere, loki will spawn the process on the default 7946, but the services would be listening on another port number.

From the loki docs it seems it can be set with the following variable:

# Gossip port to advertise to other members in the cluster. Used for NAT
# traversal.
# CLI flag: -memberlist.advertise-port
[advertise_port: <int> | default = 7946]

In the values file we can define values to be set in the memberlist section of the config. Therefore I added to my values the following:

loki:
  extraMemberlistConfig:  
    advertise_port: 7498

And the value shows up correctly in the configmap:

[alberto@fedora]$ k get cm/loki -o yaml
apiVersion: v1
data:
  config.yaml: |2
    ...
    memberlist:
      advertise_port: 7498
      join_members:
      - loki-memberlist

And loki seems to read it without any issue:

$ curl loki-backend:3101/config

target: backend
http_prefix: ""
...
memberlist:
  node_name: ""
  ...
  advertise_addr: ""
  advertise_port: 7498

But the pods are not obeying it somehow:

[alberto@fedora]$ k exec -it pod/loki-backend-0 -c loki -- sh
/ $ netstat -tuplen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 :::3101                 :::*                    LISTEN      1/loki
tcp        0      0 :::9096                 :::*                    LISTEN      1/loki
tcp        0      0 :::7946                 :::*                    LISTEN      1/loki

For this reason I'd open a separate issue. If you find a solution ping me in the new PR and I'll happily test it!

bebosudo avatar Jan 11 '24 18:01 bebosudo

But the pods are not obeying it somehow:

[alberto@fedora]$ k exec -it pod/loki-backend-0 -c loki -- sh
/ $ netstat -tuplen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 :::3101                 :::*                    LISTEN      1/loki
tcp        0      0 :::9096                 :::*                    LISTEN      1/loki
tcp        0      0 :::7946                 :::*                    LISTEN      1/loki

For this reason I'd open a separate issue. If you find a solution ping me in the new PR and I'll happily test it!

@bebosudo We'll together find the solution together about the pods not obeying, I'm onto completing the above memberlist port task

Sheikh-Abubaker avatar Jan 11 '24 18:01 Sheikh-Abubaker

But the pods are not obeying it somehow:

[alberto@fedora]$ k exec -it pod/loki-backend-0 -c loki -- sh
/ $ netstat -tuplen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 :::3101                 :::*                    LISTEN      1/loki
tcp        0      0 :::9096                 :::*                    LISTEN      1/loki
tcp        0      0 :::7946                 :::*                    LISTEN      1/loki

For this reason I'd open a separate issue. If you find a solution ping me in the new PR and I'll happily test it!

@bebosudo We'll together find the solution together about the pods not obeying, as of now I'll do the above memberlist port task

Sheikh-Abubaker avatar Jan 11 '24 18:01 Sheikh-Abubaker

I'd discourage putting the memberlist port task change in this PR: these are different changes and it'd be better fixing them in different PRs. 1 PR, 1 simple fix.

bebosudo avatar Jan 11 '24 19:01 bebosudo

I'd discourage putting the memberlist port task change in this PR: these are different changes and it'd be better fixing them in different PRs. 1 PR, 1 simple fix.

@bebosudo okk, is there anything I'm supposed to do for this PR ?

Sheikh-Abubaker avatar Jan 11 '24 19:01 Sheikh-Abubaker

@bebosudo okk, is there anything I'm supposed to do for this PR ?

Nope, it's ready for review by a maintainer :-)

bebosudo avatar Jan 11 '24 19:01 bebosudo

Nope, it's ready for review by a maintainer :-)

It was great working with you, looking forward to connecting. resolving issues and opening more PR's together :-)

Sheikh-Abubaker avatar Jan 11 '24 19:01 Sheikh-Abubaker

Hey guys! I think this PR introduced an issue on the templates/monitoring/_helpers-monitoring.tpl. So this PR changed from hardcoded to the templated values, but in the printf function, its using %s instead of %d which causes this error in the agent.

grafana-agent 2024/03/17 22:40:27 error loading config file /var/lib/grafana-agent/config/agent.yml: parse "http://hummus-loki.max-dev.svc.cluster.local:%!s(float64=3100)/loki/api/v1/push": invalid port ":%!s(float64=3100)" after host 

If I overwrite the http_listen_port to "3100" (with quotes), then it works, but then the loki core fails with same error: want int instead of string.

I manage to get it to work by using %v instead of %s. If using:

  • %d: same error
  • %f: works, but then print 3100.0000 which then fails
  • %v: work beautifully

Ive been trying the single binary installation, here's my values:

  loki:
    commonConfig:
      replication_factor: 1
    storage:
      type: "filesystem"
  singleBinary:
    replicas: 1

MaxThom avatar Mar 17 '24 22:03 MaxThom

Hey guys! I think this PR introduced an issue on the templates/monitoring/_helpers-monitoring.tpl. So this PR changed from hardcoded to the templated values, but in the printf function, its using %s instead of %d which causes this error in the agent.

@MaxThom Apologies, that did indeed cause a problem. The fix is here: https://github.com/grafana/loki/pull/12242

MichelHollands avatar Mar 18 '24 10:03 MichelHollands