osf.io
[WIP][ENG-3568] keen replacement
Purpose
store analytics data in our own elasticsearch index(es), to prepare for eventually dropping (some) third-party analytics
Changes
- move contents of osf/metrics.py into multiple files in osf/metrics/
- move daily reporting logic from scripts/analytics/ to osf/metrics/reporters/
  - save results in our own elasticsearch, in addition to sending to keen
- add CountedUsage metric to support usage reporting
- update legacy-frontend js to track pageview events using CountedUsage (in addition to sending to keen)
- new management commands:
  - daily_reporters_go to generate daily reports (also added to the admin's management-command page)
  - fake_metrics_reports to generate fake daily reports for local testing
- new api routes:
  - /_/metrics/events/counted_usage/: POST to record osf usage, e.g. pageviews
  - /_/metrics/reports/: GET available report types
  - /_/metrics/reports/<report_name>/latest/: GET latest report of the given type
  - /_/metrics/reports/<report_name>/recent/: GET list of recent reports (up to 1000 with ?days_back=1000)
  - /_/metrics/query/node_analytics/<node_guid>/<timespan>: GET data necessary to support the existing node analytics page
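
For anyone poking at the GET routes by hand, here is a small sketch of URL builders for them. The base URL, the report name `example_report`, the guid `abc12`, and the timespan value `week` are placeholders for illustration, not values taken from this PR; check the route definitions for what is actually accepted.

```python
from urllib.parse import urlencode

# Assumption: a locally running API server; change to match your setup.
API_BASE = "http://localhost:8000"


def reports_url(report_name=None, which=None, days_back=None):
    """Build a URL for the /_/metrics/reports/ routes listed above.

    With no arguments, returns the route that lists available report types.
    `which` must be "latest" or "recent" when a report name is given.
    """
    path = "/_/metrics/reports/"
    if report_name is not None:
        if which not in ("latest", "recent"):
            raise ValueError("which must be 'latest' or 'recent'")
        path += f"{report_name}/{which}/"
    url = API_BASE + path
    if days_back is not None:
        # days_back is the query parameter mentioned for the "recent" route
        url += "?" + urlencode({"days_back": days_back})
    return url


def node_analytics_url(node_guid, timespan):
    """Build a URL for the node_analytics query route (no trailing slash,
    matching the route as written above)."""
    return f"{API_BASE}/_/metrics/query/node_analytics/{node_guid}/{timespan}"


if __name__ == "__main__":
    print(reports_url())
    print(reports_url("example_report", "latest"))
    print(reports_url("example_report", "recent", days_back=1000))
    print(node_analytics_url("abc12", "week"))
```

The resulting URLs can be passed to curl or any HTTP client; the POST body expected by the counted_usage route is not described here, so it is left out of the sketch.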
QA Notes
Please make verification statements inspired by your code and what your code touches.
- Verify POSTing pageview info for a node page to /_/metrics/events/counted_usage/ responds 201 Created and increases viewcount in the node_analytics query
- Verify POSTing duplicate pageview info multiple times in the same 30-second window is counted as only one pageview
- Verify the latest report of each type has a report_date of yesterday
- Verify running daily_reporters_go multiple times for the same date does NOT result in duplicate reports
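
As background for the two "no duplicates" checks above, one common way to get that behavior is to derive each document's id deterministically, so a repeated write overwrites the same document instead of adding a new one. The sketch below illustrates that idea only; it is not necessarily how this PR implements it. The function names, the fields used to build the ids, and the plain dict standing in for an elasticsearch index are all assumptions.

```python
import hashlib
from datetime import date, datetime, timedelta, timezone

WINDOW_SECONDS = 30  # window size taken from the QA note above


def pageview_doc_id(session_id, page_url, when):
    """Hypothetical id: same session + page + 30-second bucket gives the same id."""
    bucket = int(when.timestamp()) // WINDOW_SECONDS
    raw = f"{session_id}|{page_url}|{bucket}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def report_doc_id(report_name, report_date):
    """Hypothetical id: one document per report type per date."""
    return f"{report_name}:{report_date.isoformat()}"


def save(index, doc_id, doc):
    """Stand-in for indexing with an explicit id: a repeat write overwrites."""
    index[doc_id] = doc


if __name__ == "__main__":
    index = {}
    t0 = datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

    # Two POSTs ten seconds apart land in the same bucket: one document.
    for when in (t0, t0 + timedelta(seconds=10)):
        save(index, pageview_doc_id("session-1", "/abc12/", when), {"at": when})
    print(len(index))

    # Running a reporter twice for the same date: still one report document.
    reports = {}
    for _ in range(2):
        save(reports, report_doc_id("example_report", date(2022, 1, 1)), {"count": 3})
    print(len(reports))
```

Note that fixed buckets mean two views a few seconds apart can still straddle a bucket boundary and be counted separately, which is worth keeping in mind when timing the manual check.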
What are the areas of risk?
- losing data by mis-recording
Any concerns/considerations/questions that development raised?