backstage icon indicating copy to clipboard operation
backstage copied to clipboard

Make catalog logging a little quieter where entities are erroring

Open punkle opened this issue 1 year ago • 7 comments

Hey, I just made a Pull Request!

When an error is raised by a url reader or any other processor in the catalog, those errors are logged to the application logs with a warning level as well as written to the refresh_state table for retrieval later.

Given that these errors are already visible at the application layer via the refresh state table, I would argue that the log line is superfluous. The log line can be useful in some cases so I have left it in at debug level.

In the end this change makes the catalog logging a little quieter where entities are erroring.

I dont feel terribly strongly about this, and am happy to close this if there is any contention.

:heavy_check_mark: Checklist

  • [x] A changeset describing the change and affected packages. (more info)
  • [ ] Added or updated documentation
  • [ ] Tests for new functionality and regression tests for bug fixes
  • [ ] Screenshots attached (for UI changes)
  • [x] All your commits have a Signed-off-by line in the message. (more info)

punkle avatar Feb 16 '24 13:02 punkle

Changed Packages

Package Name Package Path Changeset Bump Current Version
@backstage/plugin-catalog-backend plugins/catalog-backend patch v1.21.0

backstage-goalie[bot] avatar Feb 16 '24 13:02 backstage-goalie[bot]

Uffizzi Cluster pr-23022 was deleted.

github-actions[bot] avatar Feb 16 '24 13:02 github-actions[bot]

For a long time while I was at Keyloop we had no access to the database, I think this is true for other adopters as well. Also, for early Adopters these are errors they will run into first as they learn. I feel like moving them to debug makes them harder to surface especially as turning on the debug logs creates very verbose output making it harder to catch things like this.

That said, happy to go with what every the Maintainer decide on this 👍

awanlin avatar Feb 16 '24 13:02 awanlin

@awanlin The errors are shown in the application in a few places:

  • InspectEntityDialog, you can get there from the three dots in the top right of the catalog item.
  • The plugin gives you an overview of the processing errors also: @backstage/plugin-catalog-unprocessed-entities

punkle avatar Feb 16 '24 14:02 punkle

My concern here is this makes integrating with log based tooling a lot more difficult. We use logs to generate both metrics and alerting, and while this would be possible by enabling debug logging, that would also generate a lot of noise and cost when it comes to storing those logs.

mikeyhc avatar Feb 22 '24 16:02 mikeyhc

Can you elaborate a little? Which metrics and alerts are you triggering from this log line?

punkle avatar Feb 23 '24 07:02 punkle

Sure! We use this to generate metrics and alert around total number of erroring entities. Seeing a steep uptick in this is often indicative of a new automation generating invalid entities, and has helped us proactively reached out to catalog users. Also as a large number of erroring entities can negatively impact processing times it has allowed us to head off incidents related to that.

mikeyhc avatar Feb 23 '24 15:02 mikeyhc

@mikeyhc I suppose what we could do to satisfy both concerns is to emit an event on the event broker and then make a configurable event subscriber that writes a log line or a produces a metric or something..

Then if you dont want the log line, dont configure the subscriber. If you do want the log line, configure the subscriber. Something like that.

Would that work?

punkle avatar Feb 26 '24 09:02 punkle

@punkle sounds good to me

mikeyhc avatar Feb 26 '24 14:02 mikeyhc

@mikeyhc Ive implemented that suggestion in the latest version of this pull request. Let me know if you have any feedback on the specifics.

punkle avatar Feb 26 '24 14:02 punkle

This PR has been automatically marked as stale because it has not had recent activity from the author. It will be closed if no further activity occurs. If the PR was closed and you want it re-opened, let us know and we'll re-open the PR so that you can continue the contribution!

github-actions[bot] avatar Mar 07 '24 13:03 github-actions[bot]

not stale

punkle avatar Mar 08 '24 15:03 punkle

This PR has been automatically marked as stale because it has not had recent activity from the author. It will be closed if no further activity occurs. If the PR was closed and you want it re-opened, let us know and we'll re-open the PR so that you can continue the contribution!

github-actions[bot] avatar Mar 15 '24 15:03 github-actions[bot]

not stale

punkle avatar Mar 15 '24 16:03 punkle

This PR has been automatically marked as stale because it has not had recent activity from the author. It will be closed if no further activity occurs. If the PR was closed and you want it re-opened, let us know and we'll re-open the PR so that you can continue the contribution!

github-actions[bot] avatar Mar 22 '24 16:03 github-actions[bot]

Not stale

punkle avatar Mar 25 '24 07:03 punkle

This PR has been automatically marked as stale because it has not had recent activity from the author. It will be closed if no further activity occurs. If the PR was closed and you want it re-opened, let us know and we'll re-open the PR so that you can continue the contribution!

github-actions[bot] avatar Apr 01 '24 07:04 github-actions[bot]

This PR has been automatically marked as stale because it has not had recent activity from the author. It will be closed if no further activity occurs. If the PR was closed and you want it re-opened, let us know and we'll re-open the PR so that you can continue the contribution!

github-actions[bot] avatar Apr 09 '24 09:04 github-actions[bot]

Thank you for contributing to Backstage! The changes in this pull request will be part of the 1.26.0 release, scheduled for Tue, 14 May 2024.

github-actions[bot] avatar Apr 09 '24 09:04 github-actions[bot]

yay, thank you!

punkle avatar Apr 09 '24 09:04 punkle

Hey @punkle :wave: I have a question regarding your changes. I was trying to get the old way of catalog entities errors logging but I'm having a hard time doing that. Would you be able to help me? I followed your instructions from changeset and have something like this in my backend index.ts file:

interface EventsPayload {
  entity: string;
  location?: string;
  errors: Error[];
}

interface EventsParamsWithPayload extends EventParams {
  eventPayload: EventsPayload;
}
const eventsModuleCustomExtensions = createBackendModule({
  pluginId: 'events', // name of the plugin that the module is targeting
  moduleId: 'catalog-errors',
  register(env) {
    env.registerInit({
      deps: {
        events: eventsServiceRef,
        logger: coreServices.logger
      },
      async init({ events, logger}) { 
        events.subscribe({
          id: 'catalog',
          topics: ['experimental.catalog.errors'],
          async onEvent(params: EventsParamsWithPayload): Promise<void> {
            const {entity, location, errors} = params.eventPayload;
            for (const error of errors) {
              logger.warn(error.message, {
                entity,
                location,
              });
            }
          }
        })
      },
    });
  },
});

And in the app-config.yaml file I've added:

events:
  http:
    topics:
      - catalog
      - experimental.catalog.errors

Do you know what am I missing? despite adding that custom extension module I cannot see the warnings in my logs. :( Also I used eventsServiceRef instead of EventsExtensionPoint. addSubscribers() because it is marked as deprecated (Deprecated: use EventsService. subscribe instead).

Would you be able to help me get old logging working again?

knowacki23 avatar Jun 04 '24 12:06 knowacki23

@knowacki23 I am not very familiar with the changes around the backend system. Can you tell if the events service is being configured in the catalog backend? If the events service is not configured it will not produce the events for the errors.

punkle avatar Jun 04 '24 12:06 punkle

I've never used events before so I followed: backend migration guide and README.md of events-backend plugin

so my backend config when it comes to the events service is limited to adding:

// packages/backend/src/index.ts
backend.add(import('@backstage/plugin-events-backend/alpha'));

and adding entries mentioned above to the app-config.yaml

I assume that is not enough then? :disappointed:

knowacki23 avatar Jun 04 '24 12:06 knowacki23

@freben / @punkle. Is there any value in adding documentation on this feature to backstage.io? I was looking around and a “debugging” catalog/catalog ingestion might be a nice section to add somewhere around here https://backstage.io/docs/features/software-catalog/. Or maybe refreshing this section https://backstage.io/docs/features/software-catalog/life-of-an-entity/#errors.

I’m happy to take a shot at documenting this so it’s easier to find and not just in a changelog but didn’t know if I was missing a PR or if this was already planned!

codingdiaz avatar Jun 06 '24 21:06 codingdiaz

@codingdiaz tons of value IMO! It's come up a few times already on Discord as well

awanlin avatar Jun 06 '24 21:06 awanlin

@awanlin I was just testing this with the new backend system. Do you happen to know if anyone has gotten that working?

Similar to @knowacki23 this was my implementation

import { CATALOG_ERRORS_TOPIC } from '@backstage/plugin-catalog-backend';
import {
  coreServices,
  createBackendModule,
} from '@backstage/backend-plugin-api';
import { eventsServiceRef, EventParams } from '@backstage/plugin-events-node';

interface EventsPayload {
  entity: string;
  location?: string;
  errors: Error[];
}

interface EventsParamsWithPayload extends EventParams {
  eventPayload: EventsPayload;
}

const eventsModuleCatalogLogging = createBackendModule({
  pluginId: 'events',
  moduleId: 'catalog-errors',
  register(env) {
    env.registerInit({
      deps: {
        events: eventsServiceRef,
        logger: coreServices.logger,
      },
      async init({ events, logger }) {
        events.subscribe({
          id: 'catalog',
          topics: [CATALOG_ERRORS_TOPIC],
          async onEvent(params: EventParams): Promise<void> {
            const event = params as EventsParamsWithPayload;
            const { entity, location, errors } = event.eventPayload;
            for (const error of errors) {
              logger.warn(error.message, {
                entity,
                location,
              });
            }
          },
        });
      },
    });
  },
});

backend.add(import('@backstage/plugin-events-backend/alpha'));
backend.add(eventsModuleCatalogLogging);

I mucked with the example entities and one isn’t loading after a change but no errors are thrown.

codingdiaz avatar Jun 06 '24 22:06 codingdiaz

Ah I just got this working. I think this was need and isn’t released yet I upgraded to the latest next release and it worked 🥳 .

https://github.com/backstage/backstage/commit/6a257f0de49e948d7d2a1cb6da40083158ac34b3

codingdiaz avatar Jun 06 '24 23:06 codingdiaz

I put up a PR to document this here: https://github.com/backstage/backstage/pull/25094 for reference.

codingdiaz avatar Jun 07 '24 00:06 codingdiaz

Ah I just got this working. I think this was need and isn’t released yet I upgraded to the latest next release and it worked 🥳 .

6a257f0

That's a great news! Thank you for pointing this out! :smiley:

knowacki23 avatar Jun 10 '24 06:06 knowacki23