masto.sh
Only fetches a subset of statuses
Running this against my account, I have a total of 6402 statuses according to the web UI (as of this morning), but the database only has 4368 rows in it. Investigating...
Interesting. If I remember correctly, "statuses" are not strictly "posts" but "interactions" in general, so the count might also include favourites and reblogs.
You could be right here (and, given I do developer relations at Mastodon, I should probably know the answer... 😬 ... let me go ask the eng team!)
I've modified my local copy to exclude reblogs (which avoids the null rows), and I've also added a regex that strips the markup from the post content values. Not entirely sure where you are going with this, but happy to contribute if useful.
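For anyone following along, here is a rough sketch of what that kind of local change could look like. It is not the actual patch. The function names `statuses_url` and `strip_markup` are made up for illustration, and it assumes the script fetches from the account statuses endpoint, which accepts an `exclude_reblogs` query parameter. The regex approach is deliberately naive and will not cope with every piece of HTML.

```shell
#!/bin/sh
# Hypothetical helpers; neither name comes from masto.sh itself.

# Build a statuses URL that asks the server to leave out reblogs.
# $1 = instance host, $2 = numeric account id (both assumed inputs).
statuses_url() {
  printf 'https://%s/api/v1/accounts/%s/statuses?exclude_reblogs=true\n' "$1" "$2"
}

# Reduce HTML post content to plain text with a few sed substitutions:
#   1. turn <br> tags and paragraph boundaries into spaces
#   2. drop every remaining tag
#   3. decode a handful of common entities (&amp; last, to avoid double decoding)
strip_markup() {
  printf '%s\n' "$1" |
    sed -e 's/<br[^>]*>/ /g' \
        -e 's/<\/p>[[:space:]]*<p>/ /g' \
        -e 's/<[^>]*>//g' \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&quot;/"/g' \
        -e 's/&amp;/\&/g'
}

# Example usage with made-up values.
statuses_url "mastodon.example" "12345"
strip_markup '<p>Hello <a href="https://example.com">world</a></p><p>Second &amp; last</p>'
```

The same tag-stripping pattern could instead be applied inside the database with DuckDB's `regexp_replace` and its `'g'` option, which would leave the original markup column untouched.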
... although I suppose, now that I'm discovering the different output formats the DuckDB CLI supports, there might be value in retaining the markup, for example to render subsets of query results into HTML. 🤔