hasjob Partition JobImpression and UserEvent tables

Hasjob's JobImpression table has become a bottleneck in production. It along with the UserEvent table is now the biggest consumer of disk space. From production (using query from this article):

hasjob=> SELECT relname AS "relation", pg_size_pretty(pg_relation_size(C.oid)) AS "size"
  FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY pg_relation_size(C.oid) DESC
  LIMIT 10;
              relation              |  size   
------------------------------------+---------
 job_impression                     | 6426 MB
 user_event                         | 6147 MB
 ix_job_impression_event_session_id | 2823 MB
 job_impression_pkey                | 2681 MB
 ix_job_impression_datetime         | 1834 MB
 pg_toast_89587                     | 248 MB
 event_session                      | 229 MB
 job_application                    | 206 MB
 user_event_pkey                    | 202 MB
 pg_toast_89514                     | 122 MB
(10 rows)

(user_event would have been larger but is not indexed for now.)

PostgreSQL's documentation recommends using vertical partitioning in such timestamp-sensitive scenarios so that indexes are smaller and inserts are more efficient. This will be somewhat cumbersome to implement as we'll no longer have SQLAlchemy's elegant abstractions, but shouldn't be too hard to manage and will be more or less transparent from within code.

One consequence of using this mechanism is that (a) we can no longer use session_id in a unique constraint as that may span partition borders and will slow down inserts, and (b) all read queries for a given session will need to add job_impression.datetime >= session.created_at AND job_impression.datetime <= COALESCE(session.ended_at, NOW()) so that the query planner limits which partitions are read from. Session sweeping as described in #221 becomes important now.

However, the JobImpression table is typically queried on the basis of jobpost_id and not session_id, so this makes partition-specific queries somewhat trickier. We could consider that since jobs are supposed to expire after 30 days and will not be impressed thereafter (except for the unpublicised archive mode), these queries could be bounded to between jobpost.created_at and jobpost.datetime + interval '30 days' (for a total period that may exceed 30 days depending on how long it took for the draft to be published, and depending on manual updates to the datetime column as occasionally done for customer service).

Mar 29 '16 19:03 jace

Engine Yard has a nice blog post on PostgreSQL partitioning, including a trigger that creates child tables automatically.

Mar 30 '16 09:03 jace

One expectation with partitioning is that old data that is no longer needed can be exported to a backup and removed from the database. In our case we still read old data occasionally from permalinks, particularly for viewcounts. The only answer to that problem is having this data permanently cached in another table (rather than cached in redis).

Apr 01 '16 12:04 jace

In our case we still read old data occasionally from permalinks, particularly for viewcounts.

can we find out the maximum interval between 2 job impressions after a job has expired? As in, how long was a job in stale mode before someone opened it with permalink again. And how often does that occur for, say, more than 90 days interval, giving 2 months of grace period once a job has expired. if it's not that frequent, we could move those records to a separate table periodically. and while querying by permalink, if job expiry date is older than 3 months, we check in that other table for impression data and not always.

Also, how important is viewcount for a job that's been expired for more than 3 months? Do we really need to show that at the cost of query time? Until we figure out a nice way to deal with this, we could disable showing view counts for those jobs. Every job view must be suffering from querying this large a table.

Jun 20 '16 13:06 iambibhas

The pg_partman extension for PostgreSQL moves all the hard logic into PostgreSQL itself. We should use it.

Jun 27 '16 07:06 jace

current stat -

              relation              |  size
------------------------------------+---------
 user_event                         | 13 GB
 job_impression                     | 8000 MB
 job_impression_pkey                | 4930 MB
 ix_job_impression_event_session_id | 4404 MB
 ix_job_impression_datetime         | 4139 MB
 event_session                      | 760 MB
 user_event_pkey                    | 450 MB
 job_application                    | 324 MB
 anon_user                          | 321 MB
 pg_toast_89587                     | 319 MB
(10 rows)

Apr 17 '18 06:04 iambibhas

hasjob hasjob copied to clipboard

Partition JobImpression and UserEvent tables

hasjob
hasjob copied to clipboard