Grafana Oncall Plugin not connected

Open Ennakin opened this issue 1 year ago • 38 comments

What went wrong?

What happened:

  • After updating the OnCall plugin to version 1.9.30, I get a 500 status code every time I open the OnCall section. Image The configuration looks OK, though. Image

I am running Grafana in hobby mode in containers. The engine version is also 1.9.30, and the Grafana version is 11.3.0-76679. The integrations are still working and alerts are still being sent. There is an error in the Grafana container logs: level=error msg="Request Completed" method=POST path=/api/ds/query status=500

What did you expect to happen:

  • Have access to the OnCall section pages of the Grafana interface

How do we reproduce it?

  1. Open Grafana and go to OnCall section
  2. Now click any page
  3. Wait for the browser to crash. Error message says: "Grafana Oncall Plugin not connected"

Grafana OnCall Version

v.1.9.30

Product Area

Helm/Kubernetes/Docker

Grafana OnCall Platform?

Docker

User's Browser?

Google Chrome

Anything else to add?

No response

Ennakin avatar Sep 30 '24 14:09 Ennakin

I am getting the same error :(

curl -X GET 'https://my-user:[email protected]/api/plugins/grafana-oncall-app/resources/plugin/status'

"error setting up request headers: failed to parse JSON response: json: cannot unmarshal object into Go value of type []plugin.OrgUser "

Grafana v10.2.2

Grafana OnCall Version v.1.9.31

felipevacar avatar Oct 01 '24 18:10 felipevacar

I should mention that I have self-hosted Grafana with multiple organizations, so enabling externalServiceAccounts didn't work for me (as I learned from https://github.com/grafana/grafana-plugin-examples/blob/main/examples/app-with-service-account/README.md). My error is "failed to parse JSON response: json: cannot unmarshal object into Go value of type []plugin.OnCallPermission body={"message":"Unlicensed","traceID":""}". I'm not running Grafana in Enterprise mode. What's with the license?

Ennakin avatar Oct 01 '24 20:10 Ennakin

@Ennakin Multiple organizations in Grafana are not supported by OnCall.

Is the accessControlOnCall feature flag enabled? If you are not running Enterprise, it should not be on. That flag is what tells OnCall to check for RBAC permissions.

mderynck avatar Oct 02 '24 15:10 mderynck

> @Ennakin Multiple organizations in Grafana are not supported by OnCall.
>
> Is the accessControlOnCall feature flag enabled? If you are not running Enterprise, it should not be on. That flag is what tells OnCall to check for RBAC permissions.

The accessControlOnCall flag isn't enabled, either through the config file or through docker-compose environment variables. The RBAC section in grafana.ini looks like this: Image The feature toggles section: Image

So should the externalServiceAccounts toggle be enabled? When I enable it, I get a "PluginAppClientSecret not set in config" error.

Ennakin avatar Oct 02 '24 16:10 Ennakin

externalServiceAccounts should be on and accessControlOnCall should be off. You may also want to double-check the feature_toggles section in the UI under Administration->General->Settings.

mderynck avatar Oct 02 '24 16:10 mderynck

> externalServiceAccounts should be on and accessControlOnCall should be off. You may also want to double-check the feature_toggles section in the UI under Administration->General->Settings.

Thank you! I checked the settings area, and externalServiceAccounts is the only feature that is on. What is PluginAppClientSecret?

Ennakin avatar Oct 02 '24 16:10 Ennakin

PluginAppClientSecret is the token for the external service account associated with the Plugin. Under Administration->Users and Access->Service accounts there should be one called extsvc-grafana-oncall-app and it should have 1 token.
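
The same check can also be done from the command line instead of the UI. This is only a sketch: it assumes Grafana's service account search endpoint (`/api/serviceaccounts/search`) also lists plugin-managed accounts, and `GRAFANA_URL`, `GRAFANA_AUTH`, and the function name are placeholders that do not come from this thread.

```shell
# Placeholders (assumptions, not from the thread): adjust to your own instance.
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
GRAFANA_AUTH="${GRAFANA_AUTH:-admin:admin}"

# Search service accounts by name through the Grafana HTTP API
# and print the raw JSON response.
find_service_account() {
  name="$1"
  curl -sS -u "$GRAFANA_AUTH" \
    "$GRAFANA_URL/api/serviceaccounts/search?query=$name"
}
```

Calling `find_service_account extsvc-grafana-oncall-app` should return a JSON document; an empty result list would suggest the plugin never created its account.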

mderynck avatar Oct 02 '24 18:10 mderynck

There is only the sa-autogen-OnCall account, which was generated a few months ago when grafana-oncall was installed, and it has 1 token.

Ennakin avatar Oct 02 '24 18:10 Ennakin

Try going to Administration->Plugins and data->Plugins->Grafana OnCall and make sure there is an IAM tab on that screen. Also check the Grafana log file for any errors on startup regarding the plugin. That service account should get created when the plugin is loaded.

mderynck avatar Oct 02 '24 19:10 mderynck

IAM tab is in place. I still get 'msg="Request Completed" method=GET path=/api/plugins/grafana-oncall-app/resources/plugin/status status=500 ... msg="Error making sync request" error="error getting settings from context: PluginAppClientSecret not set in config "' in grafana logs

Ennakin avatar Oct 03 '24 06:10 Ennakin

Is there any chance I can use a post method to create this service account?

> Try going to Administration->Plugins and data->Plugins->Grafana OnCall and make sure there is an IAM tab on that screen. Also check the Grafana log file for any errors on startup regarding the plugin. That service account should get created when the plugin is loaded.

Ennakin avatar Oct 07 '24 06:10 Ennakin

And if I switched to grafana-enterprise would I need a license to use oncall plugin?

Ennakin avatar Oct 07 '24 11:10 Ennakin

> Is there any chance I can use a post method to create this service account?

This service account can't be created by the user; it should be created automatically by the plugin.

> And if I switched to grafana-enterprise would I need a license to use oncall plugin?

You need a license to use all the features of grafana-enterprise. OnCall does not have a license of its own; it just conforms to the Grafana version it is installed on.

mderynck avatar Oct 07 '24 19:10 mderynck

In the enterprise mode with 'enable = externalServiceAccounts, accessControlOncall' setting I still get 'PluginAppClientSecret not set in config' error. Please let me know, if I missed anything

Ennakin avatar Oct 08 '24 04:10 Ennakin

> This service account can't be created by the user; it should be created automatically by the plugin.

Is it supposed to work even without Kubernetes? And why doesn't the OnCall plugin have permission to create a service account in the IAM section? Image

Ennakin avatar Oct 09 '24 05:10 Ennakin

Hello, I have the exact same issue. The most disturbing part is that it was working, but as soon as I restarted Grafana the OnCall pages were displaying "Plugin not connected".

I tried to install and uninstall using the API, the UI and ansible without success.

Grafana Version 11.3.0 Grafana Oncall Version v1.11.5

Grafana logs when I hit the retry button :

logger=context userId=1 orgId=1 uname=admin t=2024-10-24T17:12:40.458090452+02:00 level=info msg="Request Completed" method=GET path=/api/live/ws status=-1 remote_addr=redacted_ip time_ms=4 duration=4.800207ms size=0 referer= handler=/api/live/ws status_source=server
logger=context userId=10 orgId=1 uname=sa-1-sa-autogen-oncall t=2024-10-24T17:12:40.59189954+02:00 level=info msg="Request Completed" method=GET path=/api/plugins/grafana-incident-app/settings status=404 remote_addr=redacted_ip time_ms=11 duration=11.167478ms size=64 referer= handler=/api/plugins/:pluginId/settings status_source=server
logger=plugin.grafana-oncall-app t=2024-10-24T17:12:40.593068999+02:00 level=error msg="getting incident plugin settings" error="request did not return 200: 404"
logger=context userId=10 orgId=1 uname=sa-1-sa-autogen-oncall t=2024-10-24T17:12:40.607834375+02:00 level=info msg="Request Completed" method=GET path=/api/plugins/grafana-labels-app/settings status=404 remote_addr=redacted_ip time_ms=8 duration=8.283405ms size=64 referer= handler=/api/plugins/:pluginId/settings status_source=server
logger=plugin.grafana-oncall-app t=2024-10-24T17:12:40.608603801+02:00 level=error msg="getting labels plugin settings" error="request did not return 200: 404"
logger=plugin.grafana-oncall-app t=2024-10-24T17:12:40.612673313+02:00 level=info msg=GetUser user="map[Email:admin@localhost Login:admin Name:admin Role:Admin]"
logger=context userId=10 orgId=1 uname=sa-1-sa-autogen-oncall t=2024-10-24T17:12:40.641276486+02:00 level=info msg="Request Completed" method=GET path=/api/access-control/users/1/permissions status=404 remote_addr=redacted_ip time_ms=8 duration=8.404385ms size=24 referer= handler=notfound status_source=server
logger=plugin.grafana-oncall-app t=2024-10-24T17:12:40.641949991+02:00 level=error msg="Error getting user" error="failed to parse JSON response: json: cannot unmarshal object into Go value of type []plugin.OnCallPermission body={\"message\":\"Not found\"}\n"
logger=plugin.grafana-oncall-app t=2024-10-24T17:12:40.642178723+02:00 level=error msg="Error validating oncall plugin settings" error="error setting up request headers: failed to parse JSON response: json: cannot unmarshal object into Go value of type []plugin.OnCallPermission body={\"message\":\"Not found\"}\n "
logger=context userId=1 orgId=1 uname=admin t=2024-10-24T17:12:40.642738448+02:00 level=error msg="Request Completed" method=GET path=/api/plugins/grafana-oncall-app/resources/plugin/status status=500 remote_addr=redacted_ip time_ms=77 duration=77.752268ms size=174 referer=https://REDACTED/a/grafana-oncall-app/alert-groups handler=/api/plugins/:pluginId/resources/* status_source=downstream

Edit: updated to v1.11.5, same issue.
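
When reading a log dump like the one above, it can help to keep only the error lines written by the plugin's own logger. The helper below is a minimal sketch that relies only on the `logger=plugin.grafana-oncall-app` and `level=error` fields visible in these logs; the function name is made up for illustration.

```shell
# Read Grafana log lines on stdin and print only error-level lines from the
# OnCall plugin logger, trimmed to the part after "level=error".
oncall_plugin_errors() {
  grep 'logger=plugin.grafana-oncall-app' \
    | grep 'level=error' \
    | sed 's/^.* level=error //'
}
```

For a containerized setup this could be fed with something like `docker logs <grafana_container> 2>&1 | oncall_plugin_errors`, where the container name is whatever your deployment uses.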

ced455 avatar Oct 24 '24 15:10 ced455

We rolled back to Grafana 11.1.1, OnCall 1.9.30, and oncall-plugin 1.9.26. This is the only configuration where it works more or less fine.

Ennakin avatar Oct 25 '24 14:10 Ennakin

The second I wrote the previous comment, we faced another issue:

error setting up request headers: failed to parse JSON response: json: cannot unmarshal object into Go value of type []plugin.OrgUser body={"message":"Unauthorized","traceID":""}

This happens every time I try to connect to the plugin.

Ennakin avatar Oct 25 '24 14:10 Ennakin

Got the same problem with

  • Grafana OSS v11.3.0
  • Grafana oncall 1.11.5
  • Grafana plugin 1.11.5

I added the following feature toggles as described in the thread:

[feature_toggles]
enable = externalServiceAccounts
accessControlOnCall = false

I do see the IAM tab in the plugin settings.

But it still does not work:

logger=plugin.grafana-oncall-app t=2024-10-28T16:09:01.132349586+01:00 level=error msg="Error getting settings from context" error="PluginAppClientSecret not set in config"

seebag avatar Oct 28 '24 15:10 seebag

Same situation with "PluginAppClientSecret not set in config" here, with GF_FEATURE_TOGGLES_ENABLE=externalServiceAccounts. I tried to remove and reinstall the plugin, but it does not create a service account automatically. If we can't create extsvc-grafana-oncall-app manually, how should we proceed?

As others stated, the create action has no scope here: Image

EDIT: Downgrading from grafana v11.3 to v11.2.3, deleting the plugin and re-installing it does the job. The issue is clearly with grafana v11.3.

RobinFrcd avatar Oct 28 '24 16:10 RobinFrcd

Grafana 11.3.0 is currently disabled in the e2e tests (#5207), so I guess OnCall is not compatible with it at the moment.

Looking at the changes in Grafana 11.3.0, https://github.com/grafana/grafana/pull/93849 seems like a possible source of the problem.

bpedersen2 avatar Oct 30 '24 15:10 bpedersen2

Since Grafana has gained RBAC support for all editions, I assume it's safe to work on accessControlOnCall for OSS? https://grafana.com/docs/grafana/latest/whatsnew/whats-new-in-v11-3/#developers-support-rbac-in-plugins

sunshine-luganodes avatar Nov 01 '24 08:11 sunshine-luganodes

Also affected by the upgrade. In the future, release notes should indicate breaking changes such as major auth reconfigurations.

Grafana set accessControlOnCall to GA/on by default which I assume is what's caused all these issues. In the docs I haven't found how to disable feature flags that are GA.

bck01215 avatar Nov 01 '24 14:11 bck01215

Disabling the feature toggle "accessControlOnCall" helped. For a Helm deployment of Grafana, you need to add this value:

grafana.ini:
  feature_toggles:
    accessControlOnCall: 'false'

Kuzbekov avatar Nov 04 '24 12:11 Kuzbekov

> Disabling the feature toggle "accessControlOnCall" helped. For a Helm deployment of Grafana, you need to add this value:
>
> grafana.ini:
>   feature_toggles:
>     accessControlOnCall: 'false'

It helped, but it was not enough.

I also had to add the GF_AUTH_MANAGED_SERVICE_ACCOUNTS_ENABLED=true env variable. https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#managed_service_accounts_enabled
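
For setups without Helm (for example the Docker setup from the original report), the settings mentioned so far could be collected in a grafana.ini fragment. This is a sketch under assumptions: the `[auth]` section name is inferred from the `GF_AUTH_` prefix of the environment variable, and the output file name is a placeholder.

```shell
# Assumed output path; merge the result into the grafana.ini your container mounts.
ini_file="${INI_FILE:-grafana-oncall.ini}"

# Write the toggles discussed in this thread as an ini fragment.
cat > "$ini_file" <<'EOF'
[feature_toggles]
enable = externalServiceAccounts
accessControlOnCall = false

[auth]
# Assumed ini equivalent of GF_AUTH_MANAGED_SERVICE_ACCOUNTS_ENABLED=true
managed_service_accounts_enabled = true
EOF
```

Restart Grafana after changing the file so the plugin gets a chance to create its service account on load.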

tarvip avatar Nov 21 '24 10:11 tarvip

Hi I finally managed to get OnCall to work using these commands here:

curl -X POST 'https://admin:<admin_password>@<grafana_host>/api/plugins/grafana-oncall-app/settings' \
  -H "Content-Type: application/json" \
  -d '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"http://oncall-engine:8080/", "grafanaUrl":"http://<grafana_address>/"}}'
curl -X POST 'https://admin:<admin_password>@<grafana_host>/api/plugins/grafana-oncall-app/resources/plugin/install'

Check that everything works properly

curl -X GET 'https://admin:<admin_password>@<grafana_host>/api/plugins/grafana-oncall-app/resources/plugin/status' | jq

However I'd like to avoid having to run commands. Could you please tell me how to configure it programmatically? (Using the Helm chart?)

Smana avatar Dec 02 '24 19:12 Smana

>> Disabling the feature toggle "accessControlOnCall" helped. For a Helm deployment of Grafana, you need to add this value:
>>
>> grafana.ini:
>>   feature_toggles:
>>     accessControlOnCall: 'false'
>
> It helped, but it was not enough.
>
> I also had to add the GF_AUTH_MANAGED_SERVICE_ACCOUNTS_ENABLED=true env variable. https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#managed_service_accounts_enabled

This is how my configuration looks:

feature_toggles:
  enable: 'correlations autoMigrateOldPanels traceQLStreaming externalServiceAccounts'
  accessControlOnCall: 'false'

I also had to manually set GF_AUTH_MANAGED_SERVICE_ACCOUNTS_ENABLED=true in my helm chart for it to work.
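
Put together, the Helm side of this could look roughly like the values file generated below. It is a sketch, not a verified chart configuration: it assumes the upstream Grafana Helm chart's top-level `grafana.ini` and `env` keys, and the file name is a placeholder.

```shell
# Assumed file name; pass it to helm with -f when upgrading your release.
values_file="${VALUES_FILE:-oncall-values.yaml}"

# Combine the feature toggles and the managed service accounts env variable
# from this thread into one Helm values file.
cat > "$values_file" <<'EOF'
grafana.ini:
  feature_toggles:
    enable: 'externalServiceAccounts'
    accessControlOnCall: 'false'
env:
  GF_AUTH_MANAGED_SERVICE_ACCOUNTS_ENABLED: "true"
EOF
```

Other toggles already listed under `enable` (such as the ones in the configuration above) would need to be kept in the same space-separated string.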

maffelbaffel avatar Dec 03 '24 22:12 maffelbaffel

@maffelbaffel I followed your guide and it works perfectly. Thank you very much.

Do you encounter the parsing error in Grafana OnCall Insights dashboard? Image

nthtrung09it avatar Dec 15 '24 03:12 nthtrung09it

> @maffelbaffel I followed your guide and it works perfectly. Thank you very much.
>
> Do you encounter the parsing error in Grafana OnCall Insights dashboard? Image

I have the same error here, did you find a fix?

gabriel-suela avatar Jan 07 '25 17:01 gabriel-suela

>> @maffelbaffel I followed your guide and it works perfectly. Thank you very much. Do you encounter the parsing error in Grafana OnCall Insights dashboard? Image
>
> I have the same error here, did you find a fix?

This is because the queries in this dashboard are flawed.

round(delta(sum($alert_groups_total{slug=~"$instance", team=~"$team", integration=~"$integration", service_name=~"$service_name"})[$__range:])) >= 0

The dollar sign in front of $alert_groups_total should not be there. Re-importing the dashboard had no effect for me, so I copied the dashboard and fixed the errors by removing the dollar sign.
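
The manual fix can be sketched as a small filter over an exported copy of the dashboard JSON. It only touches the `$alert_groups_total` name shown in the query above; whether other metric names in the dashboard need the same treatment is not confirmed here, and the function name is made up for illustration.

```shell
# Read dashboard JSON (or a single query) on stdin and drop the stray "$"
# in front of the alert_groups_total metric name. Dashboard variables such
# as $instance or $__range are left untouched.
fix_alert_groups_total() {
  sed 's/\$alert_groups_total/alert_groups_total/g'
}
```

For example, `fix_alert_groups_total < exported-dashboard.json > fixed-dashboard.json`, with both file names being placeholders, produces a copy that can be imported as a new dashboard.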

maffelbaffel avatar Jan 08 '25 13:01 maffelbaffel