
Improve columns compression

Open cyriltovena opened this issue 3 years ago • 5 comments

I wrote a tool to inspect Parca-generated files and I think there's an opportunity for improvement.

The period and duration columns could probably use a better encoding. As of today they take as much space as value and timestamp, even though their data is mostly redundant.

See:

❯ go run cmd/parquet-tool/main.go
schema: message parca {
        required int64 duration (INT(64,true));
        optional binary labels.__address__ (STRING);
        optional binary labels.__profile_path__ (STRING);
        optional binary labels.__scheme__ (STRING);
        optional binary labels.app_kubernetes_io_component (STRING);
        optional binary labels.app_kubernetes_io_instance (STRING);
        optional binary labels.app_kubernetes_io_name (STRING);
        optional binary labels.container (STRING);
        optional binary labels.controller_revision_hash (STRING);
        optional binary labels.instance (STRING);
        optional binary labels.job (STRING);
        optional binary labels.loki_gossip_member (STRING);
        optional binary labels.name (STRING);
        optional binary labels.namespace (STRING);
        optional binary labels.pod (STRING);
        optional binary labels.pod_template_hash (STRING);
        optional binary labels.statefulset_kubernetes_io_pod_name (STRING);
        required binary name (STRING);
        required int64 period (INT(64,true));
        required binary period_type (STRING);
        required binary period_unit (STRING);
        optional int64 pprof_num_labels.bytes (INT(64,true));
        required binary sample_type (STRING);
        required binary sample_unit (STRING);
        required binary stacktrace (STRING);
        required int64 timestamp (INT(64,true));
        required int64 value (INT(64,true));
}
Num Rows: 3796901
         Row group: 0
                 Row Count: 3796901
                 Row size: 164 MB
                 Columns:
+-------------------------------------------+------------+---------+---------------------+-----------------------+-------------+-------+
|                    COL                    |    TYPE    | NUMVAL  | TOTALCOMPRESSEDSIZE | TOTALUNCOMPRESSEDSIZE | COMPRESSION |   %   |
+-------------------------------------------+------------+---------+---------------------+-----------------------+-------------+-------+
| duration                                  | INT64      | 3796901 | 30 MB               | 30 MB                 |        0.00 | 18.50 |
| labels.__address__                        | BYTE_ARRAY | 3796901 | 8.4 kB              | 8.4 kB                |        0.00 |  0.01 |
| labels.__profile_path__                   | BYTE_ARRAY | 3796901 | 6.5 kB              | 6.5 kB                |        0.00 |  0.00 |
| labels.__scheme__                         | BYTE_ARRAY | 3796901 | 6.1 kB              | 6.1 kB                |        0.00 |  0.00 |
| labels.app_kubernetes_io_component        | BYTE_ARRAY | 3796901 | 6.3 kB              | 6.3 kB                |        0.00 |  0.00 |
| labels.app_kubernetes_io_instance         | BYTE_ARRAY | 3796901 | 6.1 kB              | 6.1 kB                |        0.00 |  0.00 |
| labels.app_kubernetes_io_name             | BYTE_ARRAY | 3796901 | 6.1 kB              | 6.1 kB                |        0.00 |  0.00 |
| labels.container                          | BYTE_ARRAY | 3796901 | 7.1 kB              | 7.1 kB                |        0.00 |  0.00 |
| labels.controller_revision_hash           | BYTE_ARRAY | 3796901 | 6.8 kB              | 6.8 kB                |        0.00 |  0.00 |
| labels.instance                           | BYTE_ARRAY | 3796901 | 8.4 kB              | 8.4 kB                |        0.00 |  0.01 |
| labels.job                                | BYTE_ARRAY | 3796901 | 6.1 kB              | 6.1 kB                |        0.00 |  0.00 |
| labels.loki_gossip_member                 | BYTE_ARRAY | 3796901 | 6.4 kB              | 6.4 kB                |        0.00 |  0.00 |
| labels.name                               | BYTE_ARRAY | 3796901 | 7.1 kB              | 7.1 kB                |        0.00 |  0.00 |
| labels.namespace                          | BYTE_ARRAY | 3796901 | 6.1 kB              | 6.1 kB                |        0.00 |  0.00 |
| labels.pod                                | BYTE_ARRAY | 3796901 | 8.5 kB              | 8.5 kB                |        0.00 |  0.01 |
| labels.pod_template_hash                  | BYTE_ARRAY | 3796901 | 7.0 kB              | 7.0 kB                |        0.00 |  0.00 |
| labels.statefulset_kubernetes_io_pod_name | BYTE_ARRAY | 3796901 | 7.3 kB              | 7.3 kB                |        0.00 |  0.00 |
| name                                      | BYTE_ARRAY | 3796901 | 5.7 kB              | 5.7 kB                |        0.00 |  0.00 |
| period                                    | INT64      | 3796901 | 30 MB               | 30 MB                 |        0.00 | 18.50 |
| period_type                               | BYTE_ARRAY | 3796901 | 5.7 kB              | 5.7 kB                |        0.00 |  0.00 |
| period_unit                               | BYTE_ARRAY | 3796901 | 5.7 kB              | 5.7 kB                |        0.00 |  0.00 |
| pprof_num_labels.bytes                    | INT64      | 3796901 | 30 MB               | 30 MB                 |        0.00 | 18.10 |
| sample_type                               | BYTE_ARRAY | 3796901 | 5.8 kB              | 5.8 kB                |        0.00 |  0.00 |
| sample_unit                               | BYTE_ARRAY | 3796901 | 5.5 kB              | 5.5 kB                |        0.00 |  0.00 |
| stacktrace                                | BYTE_ARRAY | 3796901 | 13 MB               | 13 MB                 |        0.00 |  7.83 |
| timestamp                                 | INT64      | 3796901 | 30 MB               | 30 MB                 |        0.00 | 18.50 |
| value                                     | INT64      | 3796901 | 30 MB               | 30 MB                 |        0.00 | 18.50 |
+-------------------------------------------+------------+---------+---------------------+-----------------------+-------------+-------+

cyriltovena avatar Jul 06 '22 12:07 cyriltovena

This is really awesome!

Thoughts at first glance:

  • I agree period can probably be optimized away almost entirely with an RLEDictionary encoding.
  • I would try delta encoding on duration and timestamp.
  • There might even be a saving when using DELTA_LENGTH_BYTE_ARRAY on stacktrace.
  • pprof_num_labels I suspect will benefit from RLEDictionary in the same way as other label-values.
  • Value is a tough one. A Gorilla-style XOR encoding is probably best here, but it would need a custom implementation.

Do you want to try these and give it another look?
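
To illustrate why the first two suggestions should help, here is a minimal sketch of run-length and delta encoding on hypothetical sample data. The helper names (`rleEncode`, `deltaEncode`, `deltaDecode`) and the sample values are assumptions made up for this example. Parquet's actual RLE dictionary and delta encodings add dictionary lookups, bit-packing, and block structure on top of these basic ideas, so this is not what the library does internally.

```go
package main

import "fmt"

// run is one (value, count) pair of a run-length encoding.
type run struct {
	value int64
	count int
}

// rleEncode collapses consecutive equal values into runs.
func rleEncode(values []int64) []run {
	var runs []run
	for _, v := range values {
		if n := len(runs); n > 0 && runs[n-1].value == v {
			runs[n-1].count++
			continue
		}
		runs = append(runs, run{value: v, count: 1})
	}
	return runs
}

// deltaEncode keeps the first value as-is and stores every later
// value as the difference from its predecessor.
func deltaEncode(values []int64) []int64 {
	out := make([]int64, len(values))
	var prev int64
	for i, v := range values {
		out[i] = v - prev
		prev = v
	}
	return out
}

// deltaDecode reverses deltaEncode with a running sum.
func deltaDecode(deltas []int64) []int64 {
	out := make([]int64, len(deltas))
	var acc int64
	for i, d := range deltas {
		acc += d
		out[i] = acc
	}
	return out
}

func main() {
	// Hypothetical data: a constant period and timestamps 10s apart (in ms).
	period := []int64{10000000, 10000000, 10000000, 10000000, 10000000}
	timestamps := []int64{1657108800000, 1657108810000, 1657108820000, 1657108830000}

	fmt.Println(rleEncode(period))                    // a single run covering all rows
	fmt.Println(deltaEncode(timestamps))              // first value, then small deltas
	fmt.Println(deltaDecode(deltaEncode(timestamps))) // round-trips to the input
}
```

A constant column such as period collapses into one run regardless of row count, and regularly spaced timestamps turn into small repeated deltas that need far fewer bits than full 64-bit values.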

brancz avatar Jul 06 '22 13:07 brancz

Double-delta encoding for timestamp sounds very good! The only caveat might be jitter, which Prometheus also sees. Mentioning this here in case the delta encoding doesn't turn out to be as promising as expected.
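
A small sketch of the jitter concern, using a hypothetical `deltaOfDelta` helper and made-up timestamps: with perfectly regular intervals the second-order differences are all zero, while a few milliseconds of jitter leave small non-zero residuals that still have to be stored.

```go
package main

import "fmt"

// deltaOfDelta returns second-order differences: out[0] is the first
// value, out[1] is the first delta, and every later entry is the change
// between two consecutive deltas.
func deltaOfDelta(ts []int64) []int64 {
	out := make([]int64, len(ts))
	for i := range ts {
		switch i {
		case 0:
			out[i] = ts[0]
		case 1:
			out[i] = ts[1] - ts[0]
		default:
			out[i] = (ts[i] - ts[i-1]) - (ts[i-1] - ts[i-2])
		}
	}
	return out
}

func main() {
	// Hypothetical timestamps in ms: one perfectly regular series,
	// one with a few milliseconds of jitter.
	regular := []int64{1000, 2000, 3000, 4000}
	jittery := []int64{1000, 2003, 2998, 4001}

	fmt.Println(deltaOfDelta(regular)) // [1000 1000 0 0]
	fmt.Println(deltaOfDelta(jittery)) // [1000 1003 -8 8]
}
```

The residuals from jitter are still much smaller than the raw values, so the encoding should help, but probably not as much as it would on perfectly regular data.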

metalmatze avatar Jul 11 '22 09:07 metalmatze

Something else just struck me: I'm not sure we want to keep the pprof_num_labels.bytes label. It seems redundant?

cyriltovena avatar Jul 13 '22 06:07 cyriltovena

Redundant with what? That label contains the allocation size. Like I said in my comment, I think it would benefit greatly from being RLEDictionary encoded.

brancz avatar Jul 13 '22 13:07 brancz

I finally had a look at this and built a tool to re-encode Parquet files: https://github.com/polarsignals/frostdb/pull/136

Using this tool and a little bit of experimentation, I could take this Parquet file of Parca data:

Num Rows: 5575249
         Row group: 0
                 Row Count: 5575249
                 Row size: 225 MB
                 Columns:
+------------------------+------------+---------+---------------------+-----------------------+-------------+-------+
|          COL           |    TYPE    | NUMVAL  | TOTALCOMPRESSEDSIZE | TOTALUNCOMPRESSEDSIZE | COMPRESSION |   %   |
+------------------------+------------+---------+---------------------+-----------------------+-------------+-------+
| duration               | INT64      | 5575249 | 45 MB               | 45 MB                 |        0.00 | 19.81 |
| labels.instance        | BYTE_ARRAY | 5575249 | 9.0 kB              | 9.0 kB                |        0.00 |  0.00 |
| labels.job             | BYTE_ARRAY | 5575249 | 9.0 kB              | 9.0 kB                |        0.00 |  0.00 |
| name                   | BYTE_ARRAY | 5575249 | 7.7 kB              | 7.7 kB                |        0.00 |  0.00 |
| period                 | INT64      | 5575249 | 45 MB               | 45 MB                 |        0.00 | 19.81 |
| period_type            | BYTE_ARRAY | 5575249 | 7.7 kB              | 7.7 kB                |        0.00 |  0.00 |
| period_unit            | BYTE_ARRAY | 5575249 | 7.7 kB              | 7.7 kB                |        0.00 |  0.00 |
| pprof_num_labels.bytes | INT64      | 5575249 | 44 MB               | 44 MB                 |        0.00 | 19.72 |
| sample_type            | BYTE_ARRAY | 5575249 | 8.1 kB              | 8.1 kB                |        0.00 |  0.00 |
| sample_unit            | BYTE_ARRAY | 5575249 | 8.1 kB              | 8.1 kB                |        0.00 |  0.00 |
| stacktrace             | BYTE_ARRAY | 5575249 | 2.3 MB              | 2.3 MB                |        0.00 |  1.01 |
| timestamp              | INT64      | 5575249 | 45 MB               | 45 MB                 |        0.00 | 19.81 |
| value                  | INT64      | 5575249 | 45 MB               | 45 MB                 |        0.00 | 19.81 |
+------------------------+------------+---------+---------------------+-----------------------+-------------+-------+

Note the 225 MB row size (and 239202558 bytes, ~239 MB real file size), down to:

Num Rows: 5575249
         Row group: 0
                 Row Count: 5575249
                 Row size: 24 MB
                 Columns:
+------------------------+------------+---------+---------------------+-----------------------+-------------+-------+
|          COL           |    TYPE    | NUMVAL  | TOTALCOMPRESSEDSIZE | TOTALUNCOMPRESSEDSIZE | COMPRESSION |   %   |
+------------------------+------------+---------+---------------------+-----------------------+-------------+-------+
| duration               | INT64      | 5575249 | 44 kB               | 44 kB                 |        0.00 |  0.18 |
| labels.instance        | BYTE_ARRAY | 5575249 | 8.6 kB              | 8.6 kB                |        0.00 |  0.04 |
| labels.job             | BYTE_ARRAY | 5575249 | 8.6 kB              | 8.6 kB                |        0.00 |  0.04 |
| name                   | BYTE_ARRAY | 5575249 | 7.7 kB              | 7.7 kB                |        0.00 |  0.03 |
| period                 | INT64      | 5575249 | 6.2 kB              | 6.2 kB                |        0.00 |  0.03 |
| period_type            | BYTE_ARRAY | 5575249 | 7.7 kB              | 7.7 kB                |        0.00 |  0.03 |
| period_unit            | BYTE_ARRAY | 5575249 | 7.7 kB              | 7.7 kB                |        0.00 |  0.03 |
| pprof_num_labels.bytes | INT64      | 5575249 | 3.4 MB              | 3.4 MB                |        0.00 | 13.77 |
| sample_type            | BYTE_ARRAY | 5575249 | 7.9 kB              | 7.9 kB                |        0.00 |  0.03 |
| sample_unit            | BYTE_ARRAY | 5575249 | 7.9 kB              | 7.9 kB                |        0.00 |  0.03 |
| stacktrace             | BYTE_ARRAY | 5575249 | 962 kB              | 2.3 MB                |      136.18 |  3.95 |
| timestamp              | INT64      | 5575249 | 3.8 MB              | 8.3 MB                |      118.56 | 15.62 |
| value                  | INT64      | 5575249 | 3.3 MB              | 10 MB                 |      208.53 | 13.72 |
+------------------------+------------+---------+---------------------+-----------------------+-------------+-------+

Note the 24 MB row size (and 18650221 bytes, ~18.6 MB real file size).

That is a 12.8x improvement in real file size (239202558 / 18650221 bytes), meaning 12.8x lower memory usage and 12.8x less disk space used when persisting Parca data.
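
For reference, the arithmetic behind that figure can be checked with a few lines of Go. The `ratio` helper is just a name chosen for this example; the inputs are the file sizes and row sizes quoted above.

```go
package main

import "fmt"

// ratio returns how many times smaller `after` is than `before`.
func ratio(before, after float64) float64 {
	return before / after
}

func main() {
	// Real file sizes in bytes, before and after re-encoding.
	fmt.Printf("file size: %.1fx\n", ratio(239202558, 18650221))
	// Row sizes in MB as reported by the tool, before and after.
	fmt.Printf("row size:  %.1fx\n", ratio(225, 24))
}
```

The 12.8x figure comes from the real file sizes. The row sizes reported by the tool (225 MB to 24 MB) give a somewhat smaller ratio of roughly 9.4x.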

brancz avatar Jul 25 '22 07:07 brancz