materialize icon indicating copy to clipboard operation
materialize copied to clipboard

Elevated CPU usage in computed, many blobs written out even with all dataflows idle

Open philip-stoev opened this issue 3 years ago • 7 comments

What version of Materialize are you using?

environmentd v0.26.1-dev (54ba3052f)

How did you install Materialize?

Debug build

What is the issue?

Creating a simple materialized view over static tables with 20 indexes on it results in the following:

  1. The blob directory fills up with 10 blobs/second . It does not seem that any compaction/deletion is happening since I was up to 10K blobs with no end in sight
  2. The CPU usage of computed is 75-100%, and perf looks like this:
+   14.03%     0.00%  timely:work-0    computed            [.] <core::pin::Pin<P> as core::future::future::Future>::poll                       ◆
+   14.02%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   13.66%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   13.64%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.64%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.62%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.56%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.56%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.44%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.43%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.42%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+   12.42%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    8.66%     0.00%  tokio-runtime-w  computed            [.] tracing::span::Span::in_scope                                                   ▒
+    7.76%     0.00%  tokio-runtime-w  computed            [.] mz_persist_client::impl::compact::Compactor::compact::{{closure}}::{{closure}}  ▒
+    7.75%     0.00%  tokio-runtime-w  computed            [.] tokio::runtime::handle::Handle::block_on                                        ▒
+    7.75%     0.00%  tokio-runtime-w  computed            [.] tokio::runtime::enter::Enter::block_on                                          ▒
+    7.58%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    5.08%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    5.06%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    5.05%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    4.99%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    4.66%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    2.47%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    2.42%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    2.42%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    2.41%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    2.41%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    2.41%     0.00%  tokio-runtime-w  computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    1.85%     0.00%  timely:work-0    computed            [.] <core::future::from_generator::GenFuture<T> as core::future::future::Future>::po▒
+    1.80%     1.79%  tokio-runtime-w  computed            [.] mz_persist::indexed::columnar::ColumnarRecordsRef::validate                     ▒
+    1.65%     1.65%  timely:work-0    computed            [.] core::ptr::mut_ptr::<impl *mut T>::guaranteed_eq                                ▒

Relevant log output

Nothing relevant in the logs.

philip-stoev avatar Jul 26 '22 12:07 philip-stoev

gh13879.sql.zip

To reproduce, pipe this file through the psql client. It contains a simple schema, a few data rows, one CREATE MATERIALIZED VIEW statement and then 20 CREATE DEFAULT INDEX ON .

philip-stoev avatar Jul 26 '22 12:07 philip-stoev

The CPU usage is only reproducible with a Debug build. Under release, CPU usage is negligible and storage and malloc related futures do not stand out.

The fact that blobs accumulate is reproducible on both builds.

philip-stoev avatar Jul 26 '22 13:07 philip-stoev

An expanded section of the perf report above shows that it ultimately ends up in malloc:

   16.05%     0.00%  timely:work-0    computed            [.] <core::pin::Pin<P> as core::future::future::Future>::poll                       ◆
   - <core::pin::Pin<P> as core::future::future::Future>::poll                                                                                 ▒
      - 16.03% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                              ▒
         - 15.68% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                           ▒
            - 15.67% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                        ▒
               - 14.50% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                     ▒
                  - 14.48% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                                  ▒
                     - 14.46% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                               ▒
                        - 14.45% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                            ▒
                           - 14.32% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                         ▒
                              - 14.31% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                      ▒
                                 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                          ▒
                                    - 14.31% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll                ▒
                                       - 1.25% <tikv_jemallocator::Jemalloc as core::alloc::global::GlobalAlloc>::dealloc                      ▒
                                          + 1.09% sdallocx                                                                                     ▒
                                       - 0.96% <tikv_jemallocator::Jemalloc as core::alloc::global::GlobalAlloc>::alloc                        ▒
                                            0.85% malloc                                                                                       ▒
                                         0.93% <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll              ▒

philip-stoev avatar Jul 26 '22 13:07 philip-stoev

Hey, this could explain https://github.com/MaterializeInc/materialize/issues/13861! cc @danhhz

benesch avatar Jul 26 '22 14:07 benesch

I'm not surprised. At the end of the day yesterday investigating https://github.com/MaterializeInc/materialize/issues/13861, I thought I saw neither storaged (with 1 pubnub source) nor computed (with 1 materialized view on that source) doing gc while environmentd seemed to be gc'ing just fine. I suspect something to do with the persist_sink impl is holding a capability that prevents us from gc, but didn't get a chance to dig into it before I stopped for the day.

danhhz avatar Jul 26 '22 14:07 danhhz

If something in sinks is blocking compaction, that's a bug. otherwise, this might just not be performant.

JLDLaughlin avatar Aug 04 '22 15:08 JLDLaughlin

I tried to reproduce this issue, but failed to to so. Here's what I tried:

  • Run in both release and debug builds.
  • Increase the number of default indexes to 1000.
  • Run with jemalloc and the system allocator.

For each variant I measured using perf where computed spends its time. The only thing I noticed is that debug builds spend a significant amount of time in Persist's validate implementation, which might not be active in release builds.

@philip-stoev could you please re-check if the problem still occurs for you? Thanks.

antiguru avatar Aug 08 '22 13:08 antiguru

CPU usage is indeed limited to Debug, so I guess nothing can be done here. The proliferation of blob files no longer occurs.

philip-stoev avatar Aug 16 '22 10:08 philip-stoev