HPI
Human Programming Interface - a way to unify, access and interact with all of my personal data [my modules]
TLDR: I'm using the HPI (Human Programming Interface) package as a means of unifying, accessing and interacting with all of my personal data.
It's a Python library (named my), a collection of modules for:
- social networks: posts, comments, favorites, searches
- shell/program histories (zsh, bash, python, mpv, firefox)
- programming (github/commits)
- instant messaging
- media histories (movies, TV shows, music, video game achievements/history); see https://sean.fish/feed/
This is built on top of karlicoss/HPI. It started out as a fork, but has since been converted to my own set of modules. It is installed alongside the upstream repository (meaning you can use modules from both upstream and here simultaneously); see Install.
My Modules
- my.zsh and my.bash, access to my shell history w/ timestamps
- my.mail.imap and my.mail.mbox to parse local IMAP syncs of my mail/mbox files -- see doc/MAIL_SETUP.md
- my.mpv.history_daemon, accesses movies/music w/ activity/metadata that have played on my machine, facilitated by a mpv history daemon
- my.discord.data_export, parses ~1,000,000 messages/events from the discord data export, parser here
- my.todotxt.active to parse my current todo.txt file; my.todotxt.git_history tracks my history using backups of those files in git_doc_history
- my.rss.newsboat, keeps track of when I added/removed RSS feeds (for newsboat)
- my.ipython, for timestamped python REPL history
- my.ttt, to parse shell/system history tracked by ttt
- my.activitywatch.active_window, to parse active window events (what application I'm using/what the window title is) using window_watcher and activitywatch on android
- my.chess.export, to track my chess.com/lichess.org games, using chess_export
- my.trakt.export, providing me a history/my ratings for Movies/TV Show (episodes) using traktexport
- my.listenbrainz.export, exporting my music listening history from ListenBrainz (open-source Last.fm) using listenbrainz_export
- my.offline.listens, for offline music listen history, using offline_listens
- my.mal.export, for anime/manga history using malexport
- my.grouvee.export, for my video game history/backlog using grouvee_export
- my.runelite.screenshots, parses data from the automatic runelite screenshots
- my.minecraft.advancements, parses advancement (local achievement data) from the ~/.minecraft directory
- my.project_euler, when I solved Project Euler problems
- my.linkedin.privacy_export, to parse the privacy export from linkedin
- my.scramble.history for merged (timed) rubiks cube solves from multiple sources, using scramble_history
'Historical' Modules
These are modules to parse GDPR exports/data from services I used to use, but don't anymore. They're here to provide more context into the past.
- my.apple.privacy_export, parses Game Center and location data from the apple privacy export
- my.facebook.gdpr, to parse the GDPR export from Facebook
- my.league.export, gives League of Legends game history using lolexport
- my.steam.scraper, for steam achievement data and game playtime using steamscraper
- my.piazza.scraper, parsing piazza (university forum) posts using piazza-scraper
- my.blizzard.gdpr, for general battle.net event data parsed from a GDPR export
- my.skype.gdpr to parse a couple datetimes from the Skype GDPR export (seems all my data from years ago is long gone)
- my.spotify.gdpr, to parse the GDPR export from Spotify, mostly to access songs from my playlists from years ago
- my.twitch, merging the data export and my messages parsed from the overrustle logs dump
See here for my HPI config
Promnesia Sources for these HPI modules
I also have some more personal scripts/modules in a separate repo; HPI-personal
In-use from karlicoss/HPI
- my.browser, to parse browser history using browserexport
- my.google.takeout.parser, parses lots of (~500,000) events (youtube, searches, phone usage, comments, location history) from google takeouts, using google_takeout_parser
- my.coding.commits to track git commits across the system
- my.github to track github events/commits and parse the GDPR export, using ghexport
- my.reddit, get saved posts, comments. Uses rexport to create backups of recent activity periodically, and pushshift to get old comments.
- my.smscalls, exports call/sms history using SMS Backup & Restore
- my.stackexchange.stexport, for stackexchange data using stexport
Partially in-use/with overrides:
- my.location, though since I also have some locations from apple.privacy_export, I have a my.location.apple which I then merge into my.location.all in my overridden all.py file on my personal repo
- similarly, I do use my.ip and my.location.via_ip from upstream, but I have overridden all.py and module files here
'Overriding' an all.py file means replacing the all.py from the upstream repo (this means it can use my sources here to grab more locations/ips, since those don't exist in the upstream). For more info see reorder_editable, and the module design docs for HPI, but you might be able to get the gist by comparing:
- my.location.all in karlicoss/HPI
- my.location.all in seanbreckenridge/HPI-personal
Since I've mangled my PYTHONPATH (see reorder_editable), it imports from my repo instead of karlicoss/HPI. all.py files tend to be pretty small -- so overriding/changing a line to add a source is the whole point.
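As a rough illustration of why an all.py stays small, here is a minimal sketch of the pattern: the file only chains together per-source generators, so adding a source is a one-line change. The Location class and both source functions are hypothetical stand-ins made up for this example; they are not the real my.location modules or their signatures.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import chain
from typing import Iterator


@dataclass(frozen=True)
class Location:
    # hypothetical, simplified record (not the real HPI location type)
    dt: datetime
    lat: float
    lon: float


def upstream_locations() -> Iterator[Location]:
    # stand-in for a source that already exists in the upstream repo
    yield Location(datetime(2020, 1, 1, tzinfo=timezone.utc), 37.77, -122.42)


def apple_locations() -> Iterator[Location]:
    # stand-in for an extra source that only exists in the overriding repo
    yield Location(datetime(2019, 6, 1, tzinfo=timezone.utc), 40.71, -74.01)


def locations() -> Iterator[Location]:
    # the "overridden all.py": same as upstream, plus one extra source
    yield from chain(upstream_locations(), apple_locations())


if __name__ == "__main__":
    # print the merged locations, oldest first
    for loc in sorted(locations(), key=lambda loc: loc.dt):
        print(loc)
```

Callers keep importing locations() from the same place; only the file that wins on the import path decides which sources get merged.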
Companion Tools/Libraries
Disregarding tools which actively collect data (like ttt/window_watcher) or repositories which have their own exporter/parsers which are used here, there are a couple other tools/libraries I've created for this project:
- ipgeocache - for any IPs gathered from data exports, provides geolocation info, so I have partial location info going back to 2013
- sqlite_backup - to safely copy/backup application sqlite databases that may currently be in use
- git_doc_history - a bash script to copy/backup files into git history, with a python library to help traverse and create a history/parse diffs between commits
- HPI_API - automatically creates a JSON API/server for HPI modules
- url_metadata - caches youtube subtitles, url metadata (title, description, image links), and a html/plaintext summary for any URL
I also use this in my_feed, which creates a feed of media/data using HPI, live at https://sean.fish/feed/
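To give a sense of the problem sqlite_backup (listed above) addresses, here is a small sketch that snapshots a database while another connection still has it open, using the online backup API from Python's standard sqlite3 module. This is only an illustration of the general idea and is not necessarily how sqlite_backup itself works; the snapshot function, file names and table are made up for the example.

```python
import sqlite3
import tempfile
from pathlib import Path


def snapshot(src_path: Path, dest_path: Path) -> None:
    # copy the whole database through sqlite's online backup API,
    # rather than copying the file on disk while it may be changing
    src = sqlite3.connect(str(src_path))
    dest = sqlite3.connect(str(dest_path))
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "live.sqlite"
    copy = Path(tmp) / "copy.sqlite"

    # simulate an application that keeps its database open
    app = sqlite3.connect(str(live))
    app.execute("CREATE TABLE visits (url TEXT)")
    app.execute("INSERT INTO visits VALUES ('https://example.com')")
    app.commit()

    snapshot(live, copy)  # taken while `app` is still connected
    app.close()

    # read from the copy, never from the live database
    check = sqlite3.connect(str(copy))
    rows = check.execute("SELECT url FROM visits").fetchall()
    check.close()

print(rows)
```

Parsing the copy instead of the live file means a module never has to read the database an application is actively writing to.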
Ad-hoc and interactive
Some basic examples.
When was I most using reddit?
>>> import collections, my.reddit.all, pprint
>>> pprint.pprint(collections.Counter([c.created.year for c in my.reddit.all.comments()]))
Counter({2016: 3288,
         2017: 801,
         2015: 523,
         2018: 209,
         2019: 65,
         2014: 4,
         2020: 3})
Most common shell commands?
>>> import collections, pprint, my.zsh
# lots of these are git-related aliases
>>> pprint.pprint(collections.Counter([c.command for c in my.zsh.history()]).most_common(10))
[('ls', 51059),
 ('gst', 11361),
 ('ranger', 6530),
 ('yst', 4630),
 ('gds', 3919),
 ('ec', 3808),
 ('clear', 3651),
 ('cd', 2111),
 ('yds', 1647),
 ('ga -A', 1333)]
What websites do I visit most?
>>> import collections, pprint, my.browser.export, urllib.parse
>>> pprint.pprint(collections.Counter([urllib.parse.urlparse(h.url).netloc for h in my.browser.export.history()]).most_common(5))
[('github.com', 20953),
 ('duckduckgo.com', 10146),
 ('www.youtube.com', 10126),
 ('discord.com', 8425),
 ('stackoverflow.com', 2906)]
Song I've listened to most?
>>> import collections, my.mpv.history_daemon
>>> collections.Counter([m.path for m in my.mpv.history_daemon.history()]).most_common(1)[0][0]
'/home/sean/Music/JPEFMAFIA/JPEGMAFIA - LP! - 2021 - V0/JPEGMAFIA - LP! - 05 HAZARD DUTY PAY!.mp3'
Movie I've watched most?
>>> import my.trakt
>>> from collections import Counter
>>> Counter(e.media_data.title for e in my.trakt.history()).most_common(1)
[('Up', 92)] # (the pixar movie)
hpi also has a JSON query interface, so I can do quick computations using shell tools like:
# how many calories have I eaten today (from https://github.com/seanbreckenridge/ttally)
$ hpi query ttally.__main__.food --recent 1d -s | jq -r '(.quantity)*(.calories)' | datamash sum 1
2258.5
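The same computation can be sketched in Python for cases where jq and datamash aren't available. This assumes the query output is one JSON object per line and that each object has the quantity and calories fields used in the jq expression above; the sample records are made up.

```python
import json
from typing import Iterable


def total_calories(lines: Iterable[str]) -> float:
    # sum quantity * calories over JSON objects, one object per line
    total = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        item = json.loads(line)
        total += item["quantity"] * item["calories"]
    return total


# made-up sample records standing in for the output of the query
sample = [
    '{"food": "rice", "quantity": 1.5, "calories": 200}',
    '{"food": "egg", "quantity": 2, "calories": 78}',
]
print(total_calories(sample))  # prints 456.0
```

Passing sys.stdin instead of sample would let the function sit at the end of a shell pipe.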
Install
For the basic setup, I recommend you clone and install both directories as editable installs:
# clone and install upstream as an editable package
git clone https://github.com/karlicoss/HPI ./HPI-karlicoss
python3 -m pip install --user -e ./HPI-karlicoss
# clone and install my repository as an editable package
git clone https://github.com/seanbreckenridge/HPI ./HPI-seanb
python3 -m pip install --user -e ./HPI-seanb
Editable install means any changes to python files reflect immediately, which is very convenient for debugging and developing new modules. To update, you can just git pull in those directories.
If you care about overriding modules, make sure your easy-install.pth is ordered correctly:
python3 -m pip install --user reorder_editable
python3 -m reorder_editable reorder ./HPI-seanb ./HPI-karlicoss
Then, you likely need to run hpi module install for any modules you plan on using -- this can be done incrementally as you set up new modules. E.g.:
- hpi module install my.trakt.export to install dependencies
- Check the stub config or my config and set up the config block in your HPI configuration file
- Run hpi doctor my.trakt.export to check for any possible config issues/if your data is being loaded properly
(The install script does that for all my modules, but you likely don't want to do that)
It's possible to install both my packages because HPI is a namespace package. For more information on that, and some of the complications one can run into, see reorder_editable, and the module design docs for HPI.
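To make the namespace-package behaviour concrete, here is a self-contained sketch that builds two throwaway "repositories" which both contribute modules to one package, and shows that path order decides which all.py wins. The package name demo_ns, the directory names and the make_repo helper are all made up for this demonstration; it mimics the mechanism only and does not touch a real HPI install or easy-install.pth.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_repo(root: Path, name: str, modules: dict) -> Path:
    # create <root>/<name>/demo_ns/<module>.py files, with no __init__.py,
    # so demo_ns is treated as a namespace package
    pkg = root / name / "demo_ns"
    pkg.mkdir(parents=True)
    for mod, source in modules.items():
        (pkg / f"{mod}.py").write_text(source)
    return root / name


tmp = Path(tempfile.mkdtemp())  # throwaway directory for the demo
upstream = make_repo(tmp, "upstream", {
    "browser": "SOURCE = 'upstream'\n",
    "all": "SOURCE = 'upstream'\n",
})
personal = make_repo(tmp, "personal", {
    "zsh": "SOURCE = 'personal'\n",
    "all": "SOURCE = 'personal'\n",
})

# the personal repo comes first on the path, so its modules take priority
sys.path[:0] = [str(personal), str(upstream)]
importlib.invalidate_caches()

browser = importlib.import_module("demo_ns.browser")
zsh = importlib.import_module("demo_ns.zsh")
merged = importlib.import_module("demo_ns.all")

print(browser.SOURCE)  # only upstream has this module
print(zsh.SOURCE)      # only the personal repo has this module
print(merged.SOURCE)   # both define it; the first path entry wins
```

Swapping the two entries added to sys.path makes the upstream all.py win instead, which is the kind of ordering problem reorder_editable is meant to deal with.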
If you're having issues installing/re-installing, check the TROUBLESHOOTING_INSTALLS.md file.
If you recently updated and it seems like something has broken, check the CHANGELOG for any possible breaking changes.