Support for adding an "internal note" to a dataset

Open lucasrodes opened this issue 10 months ago • 5 comments

One-liner

Option to add an internal note to datasets.

Description

Sometimes, datasets in Admin could benefit from having an internal description, for example when the dataset is for external use or for experimenting.

In the dataset edit page, it looks as if we had thought of it, since there is the "Internal notes" field.

The help text for this field suggests that its content comes from dataset.description. However, I tested this (see https://github.com/owid/etl/pull/3784), and it doesn't seem to work.

lucasrodes, Jan 09 '25 17:01

That field has essentially been deprecated. Internal notes were taken from the first source’s description.

The question is how "internal" it should be. If we're fine with having it in the ETL repository, we can add a new field internal_notes to DatasetMeta and create a new column in the datasets table in MySQL. If we want it to be private, we'd have to make it work through metadata editing (which no one uses).
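A minimal sketch of the first option, for illustration only. `DatasetMetaSketch` is a simplified stand-in, not the real DatasetMeta class, and the column name and type in the SQL statement are assumptions:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DatasetMetaSketch:
    """Simplified stand-in for DatasetMeta (hypothetical, not the real class)."""

    short_name: str
    title: Optional[str] = None
    description: Optional[str] = None
    # Proposed new field: a free-text note for the team.
    # It would live in the ETL repository, so it would be public.
    internal_notes: Optional[str] = None


# Hypothetical migration for the MySQL `datasets` table.
# The column name and type are assumptions.
ADD_COLUMN_SQL = "ALTER TABLE datasets ADD COLUMN internal_notes TEXT NULL"


def to_datasets_row(meta: DatasetMetaSketch) -> Dict[str, Any]:
    """Map the metadata to the values that would be written to the new column."""
    return {"name": meta.short_name, "internal_notes": meta.internal_notes}


meta = DatasetMetaSketch(
    short_name="example_dataset",
    internal_notes="Used for an external collaboration, do not delete.",
)
row = to_datasets_row(meta)
```

Making the field optional means existing datasets would not need any change until someone adds a note.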

Marigold, Jan 10 '25 10:01

Hey @lucasrodes, what would you use them for, vs having comments in the code, or notes at the chart level?

Note that even "internal" notes are public.

Is this about caveats, anomalies, or judgements of data quality, for example?

larsyencken, Jan 23 '25 11:01

Hey @larsyencken, it is mostly about caveats and minor details that would be nice to make easily accessible to authors via the Admin. I think it is generally OK for these to be public. Otherwise, we can think of something that can only be added through the Admin.

More context: Some datasets may be in production for a particular use other than just creating charts. E.g., specific collaboration with external providers, specific projects from an author, etc.

This issue came up while doing housekeeping and checking datasets with no chart associated. See the following issues:

  • https://github.com/owid/owid-issues/issues/1740: Datasets not used by any chart are most likely okay to delete, but some might not be.
  • https://github.com/owid/owid-issues/issues/1798: Similar to the above issue, some datasets are only used for draft charts, so maybe they are just for exploration.

In general, if a dataset is kept for a good reason (even if not used by charts), it'd be nice for anyone in the team to know that reason. This can help with housekeeping tasks.
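As a rough sketch of how such a note could feed into housekeeping: the record shape below is hypothetical (it is not the actual Admin schema), and the rule "no charts and no note" is just one possible filter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    # Hypothetical shape of a dataset record, for illustration only.
    name: str
    chart_count: int
    internal_notes: Optional[str] = None


def archive_candidates(datasets: List[DatasetRecord]) -> List[str]:
    """Return names of datasets with no charts and no note explaining why they are kept."""
    return [
        d.name
        for d in datasets
        if d.chart_count == 0 and not (d.internal_notes or "").strip()
    ]
```

A dataset with no charts but a note such as "Used for XXX, do not delete" would be skipped, so only the unexplained ones need a manual check.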

Also, brief thread in slack →

lucasrodes, Jan 23 '25 11:01

We could have a new ETL dataset metadata field that lands in the grapher admin dataset "Internal notes" field (which is currently not used). Maybe it's something we could easily edit from the ETL dashboard and display there. This would let us keep notes to remember for our next update.

But maybe we'd need to have a few examples to justify how much we need this. @lucasrodes can you think of any example? Thanks.

pabloarosado, Feb 06 '25 10:02

Thanks for weighing in, @pabloarosado.

On your point:

But maybe we'd need to have a few examples to justify how much we need this

  • As I mentioned, it would have been extremely helpful to me when doing housekeeping tasks. When archiving datasets, it is often not clear whether the dataset is there for a reason, even if it doesn't have any chart associated. See work from https://github.com/owid/owid-issues/issues/1740 or https://github.com/owid/owid-issues/issues/1798. Some examples:
  • I've asked the team in case they have other instances where this could be helpful (see slack thread):
    • Hannah:

      Probably yes. In the past I have used it to flag to others if it’s a test dataset, or one that’s not yet ready to be promoted. I’d add something like “Do not share”. Not sure if others actually found this useful.

      I’ve also — on occasion — added notes like “Used for XXX, do not delete”. Although that’s less common recently.

    • Ed:

      Yes, definitely! We've used this in the past and would use it again if it was editable.

I'll keep adding examples as they appear.

lucasrodes, Feb 06 '25 10:02

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Apr 07 '25 23:04