argilla icon indicating copy to clipboard operation
argilla copied to clipboard

Add `huggingface_hub.utils.telemetry`

Open davidberenstein1957 opened this issue 1 year ago • 7 comments

Pull Request Template

Adds telemetry for:

  • [x] users
  • [x] workspaces
  • [x] datasets
  • [x] login users
  • [x] server errors
  • [x] records
  • [x] responses
  • [x] suggestions
  • [x] metadata
  • [x] vectors
  • [x] deprecate old telemetry KEY ("huggingface_hub includes an helper to send telemetry data. This information helps us debug issues and prioritize new features. Users can disable telemetry collection at any time by setting the HF_HUB_DISABLE_TELEMETRY=1 environment variable. Telemetry is also disabled in offline mode (i.e. when setting HF_HUB_OFFLINE=1)."
  • [x] write documentation done int #5253
  • [x] add automatic task distribution should be done after #5136
  • [x] include https://github.com/gradio-app/gradio/pull/8884

General Idea: I’ve structured data to come in through URLs/topics like dataset/settings/vectorsettings/create or dateset/records/suggestions/read along with some generalized metadata per URL/topics, like count or type of suggestion or setting.

To discuss:

  • What to do with list methods. I currently track list-like and send each individual with read, along with a read with a count. I did this because it might be interesting to get the total number of users, workspaces etc. Should we move this over to list as a separate CRUD action? Do we also want to capture each individual update
  • A similar logic applies to bulk operations. bulk_crud as separate CRUD actions?
  • I don't track user/dataset/workspace-specific list operations, like list_users_workspace or list_datasets_user.
  • I don't track metadata and vector updates on a record level, however, we DO keep track of operations on suggestions and responses.
  • @frascuchon was there a reason to include the header along with user/login operations? otherwise I will rewrite this a bit and include the user/login as user/read.

Follow up

  • https://github.com/argilla-io/argilla/issues/5224 @frascuchon

Closes #5204

Type of change

  • Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

NA

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • I confirm My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

davidberenstein1957 avatar Jul 12 '24 12:07 davidberenstein1957

@dvsrepo added some initial work, still in progress but added you more so you could keep track.

davidberenstein1957 avatar Jul 12 '24 12:07 davidberenstein1957

Looks good, just left two small comments!

dvsrepo avatar Jul 15 '24 13:07 dvsrepo

Docs for this PR have been deployed hidden from versioning: https://argilla-io.github.io/argilla/docs_add-telemetry-info-to-reference

github-actions[bot] avatar Jul 17 '24 13:07 github-actions[bot]

Codecov Report

Attention: Patch coverage is 94.36620% with 8 lines in your changes missing coverage. Please review.

Project coverage is 91.42%. Comparing base (7495136) to head (0c51124). Report is 2 commits behind head on develop.

Files with missing lines Patch % Lines
argilla-server/src/argilla_server/settings.py 70.00% 3 Missing :warning:
...lla-server/src/argilla_server/telemetry/_client.py 95.83% 2 Missing :warning:
...la-server/src/argilla_server/telemetry/_helpers.py 88.23% 2 Missing :warning:
argilla-server/src/argilla_server/_app.py 96.00% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #5218      +/-   ##
===========================================
+ Coverage    91.20%   91.42%   +0.21%     
===========================================
  Files          139      141       +2     
  Lines         5745     5798      +53     
===========================================
+ Hits          5240     5301      +61     
+ Misses         505      497       -8     
Flag Coverage Δ
argilla-server 91.42% <94.36%> (+0.21%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jul 17 '24 13:07 codecov[bot]

I've assigned the milestone v2.1.0. This PR changes all the endpoints and we should test it properly before releasing it.

frascuchon avatar Jul 19 '24 09:07 frascuchon

@frascuchon sure, no problem :)

davidberenstein1957 avatar Jul 22 '24 05:07 davidberenstein1957

@burtenshaw, Paco felt comfortable with merging this into develop and start testing the incoming metrics from there. Would you be able to five it a final lookover so we can merge after?

davidberenstein1957 avatar Aug 19 '24 14:08 davidberenstein1957