argilla
argilla copied to clipboard
Add `huggingface_hub.utils.telemetry`
Pull Request Template
Adds telemetry for:
- [x] users
- [x] workspaces
- [x] datasets
- [x] login users
- [x] server errors
- [x] records
- [x] responses
- [x] suggestions
- [x] metadata
- [x] vectors
- [x] deprecate old telemetry KEY ("huggingface_hub includes an helper to send telemetry data. This information helps us debug issues and prioritize new features. Users can disable telemetry collection at any time by setting the HF_HUB_DISABLE_TELEMETRY=1 environment variable. Telemetry is also disabled in offline mode (i.e. when setting HF_HUB_OFFLINE=1)."
- [x] write documentation done int #5253
- [x] add automatic task distribution should be done after #5136
- [x] include https://github.com/gradio-app/gradio/pull/8884
General Idea:
I’ve structured data to come in through URLs/topics like dataset/settings/vectorsettings/create or dateset/records/suggestions/read along with some generalized metadata per URL/topics, like count or type of suggestion or setting.
To discuss:
- What to do with
listmethods. I currently tracklist-likeand send each individual withread, along with areadwith a count. I did this because it might be interesting to get the total number of users, workspaces etc. Should we move this over tolistas a separate CRUD action? Do we also want to capture each individual update - A similar logic applies to bulk operations.
bulk_crudas separate CRUD actions? - I don't track user/dataset/workspace-specific list operations, like list_users_workspace or list_datasets_user.
- I don't track metadata and vector updates on a record level, however, we DO keep track of operations on suggestions and responses.
- @frascuchon was there a reason to include the
headeralong withuser/loginoperations? otherwise I will rewrite this a bit and include theuser/loginasuser/read.
Follow up
- https://github.com/argilla-io/argilla/issues/5224 @frascuchon
Closes #5204
Type of change
- Improvement (change adding some improvement to an existing functionality)
How Has This Been Tested
NA
Checklist
- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
@dvsrepo added some initial work, still in progress but added you more so you could keep track.
Looks good, just left two small comments!
Docs for this PR have been deployed hidden from versioning: https://argilla-io.github.io/argilla/docs_add-telemetry-info-to-reference
Codecov Report
Attention: Patch coverage is 94.36620% with 8 lines in your changes missing coverage. Please review.
Project coverage is 91.42%. Comparing base (
7495136) to head (0c51124). Report is 2 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #5218 +/- ##
===========================================
+ Coverage 91.20% 91.42% +0.21%
===========================================
Files 139 141 +2
Lines 5745 5798 +53
===========================================
+ Hits 5240 5301 +61
+ Misses 505 497 -8
| Flag | Coverage Δ | |
|---|---|---|
| argilla-server | 91.42% <94.36%> (+0.21%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I've assigned the milestone v2.1.0. This PR changes all the endpoints and we should test it properly before releasing it.
@frascuchon sure, no problem :)
@burtenshaw, Paco felt comfortable with merging this into develop and start testing the incoming metrics from there. Would you be able to five it a final lookover so we can merge after?