Parquet page index causes server panic

Open jiacai2050 opened this issue 1 year ago • 3 comments

Describe this problem

This error arose in one of our test clusters. Here is the backtrace:

2023-06-28 08:59:04.789 ERRO [common_util/src/panic.rs:42] thread 'ceres-compact' panicked 'called `Option::unwrap()` on a `None` value' at "/usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/parquet-38.0.0/src/file/page_index/index_reader.rs:159"
   0: common_util::panic::set_panic_hook::{{closure}}
             at ceresdb/common_util/src/panic.rs:41:18
   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/alloc/src/boxed.rs:2002:9
      std::panicking::rust_panic_with_hook
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/std/src/panicking.rs:692:13
   2: std::panicking::begin_panic_handler::{{closure}}
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/std/src/panicking.rs:577:13
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/std/src/sys_common/backtrace.rs:137:18
   4: rust_begin_unwind
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/std/src/panicking.rs:575:5
   5: core::panicking::panic_fmt
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/panicking.rs:64:14
   6: core::panicking::panic
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/panicking.rs:114:5
   7: core::option::Option<T>::unwrap
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/option.rs:823:21
      parquet::file::page_index::index_reader::get_location_offset_and_total_length::{{closure}}
             at usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/parquet-38.0.0/src/file/page_index/index_reader.rs:159:42
      core::iter::adapters::map::map_fold::{{closure}}
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/iter/adapters/map.rs:84:28
      core::iter::traits::iterator::Iterator::fold
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/iter/traits/iterator.rs:2438:21
      <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/iter/adapters/map.rs:124:9
      <i32 as core::iter::traits::accum::Sum>::sum
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/iter/traits/accum.rs:50:17
      core::iter::traits::iterator::Iterator::sum
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/iter/traits/iterator.rs:3408:9
      parquet::file::page_index::index_reader::get_location_offset_and_total_length
             at usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/parquet-38.0.0/src/file/page_index/index_reader.rs:157:24
   8: parquet::arrow::async_reader::<impl parquet::arrow::arrow_reader::ArrowReaderBuilder<parquet::arrow::async_reader::AsyncReader<T>>>::new_with_options::{{closure}}
             at usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/parquet-38.0.0/src/arrow/async_reader/mod.rs:250:21
      parquet_ext::meta_data::meta_with_page_indexes::{{closure}}
             at ceresdb/components/parquet_ext/src/meta_data.rs:82:13
      analytic_engine::sst::parquet::async_reader::Reader::load_meta_data_from_storage::{{closure}}
             at ceresdb/analytic_engine/src/sst/parquet/async_reader.rs:368:13
   9: analytic_engine::sst::parquet::async_reader::Reader::read_sst_meta::{{closure}}
             at ceresdb/analytic_engine/src/sst/parquet/async_reader.rs:400:64
      analytic_engine::sst::parquet::async_reader::Reader::init_if_necessary::{{closure}}
             at ceresdb/analytic_engine/src/sst/parquet/async_reader.rs:323:49
  10: <analytic_engine::sst::parquet::async_reader::Reader as analytic_engine::sst::reader::SstReader>::meta_data::{{closure}}
             at ceresdb/analytic_engine/src/sst/parquet/async_reader.rs:533:33
  11: <core::pin::Pin<P> as core::future::future::Future>::poll
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/future/future.rs:125:9
      <analytic_engine::sst::parquet::async_reader::ThreadedReader as analytic_engine::sst::reader::SstReader>::meta_data::{{closure}}
             at ceresdb/analytic_engine/src/sst/parquet/async_reader.rs:648:31
  12: <core::pin::Pin<P> as core::future::future::Future>::poll
             at rustc/11d96b59307b1702fffe871bfc2d0145d070881e/library/core/src/future/future.rs:125:9
      analytic_engine::row_iter::record_batch_stream::stream_from_sst_file::{{closure}}
             at ceresdb/analytic_engine/src/row_iter/record_batch_stream.rs:333:38
      analytic_engine::row_iter::record_batch_stream::filtered_stream_from_sst_file::{{closure}}
             at ceresdb/analytic_engine/src/row_iter/record_batch_stream.rs:291:5
  13: analytic_engine::row_iter::merge::MergeBuilder::build::{{closure}}
             at ceresdb/analytic_engine/src/row_iter/merge.rs:218:17
  14: analytic_engine::instance::flush_compaction::<impl analytic_engine::instance::SpaceStore>::compact_input_files::{{closure}}
             at ceresdb/analytic_engine/src/instance/flush_compaction.rs:812:28
      analytic_engine::instance::flush_compaction::<impl analytic_engine::instance::SpaceStore>::compact_table::{{closure}}
             at ceresdb/analytic_engine/src/instance/flush_compaction.rs:709:13
  15: analytic_engine::compaction::scheduler::ScheduleWorker::do_table_compaction_task::{{closure}}
             at ceresdb/analytic_engine/src/compaction/scheduler.rs:503:17

Server version

main, commit id 946b3f89e9f6b18c51716a6fe1c5ddb549488dd5

Steps to reproduce

N/A

Expected behavior

No response

Additional Information

No response

jiacai2050 Jun 28 '23 09:06

Upgrading parquet may resolve this panic. See https://github.com/apache/arrow-rs/commit/1434d1f4ddbe50e7729b7b69bdb8b7e10934f806

chunshao90 Jul 07 '23 03:07

After https://github.com/CeresDB/ceresdb/pull/1086, this may be solved. @MachaelLee @Rachelint

tanruixiang Jul 21 '23 08:07

2023-07-24 03:30:57.450 ERRO [analytic_engine/src/compaction/scheduler.rs:506] Failed to compact table, table_name:prometheus_sd_consul_rpc_duration_seconds, table_id:184, request_id:15054054, err:Failed to build merge iterator, table:prometheus_sd_consul_rpc_duration_seconds, err:Failed to build record batch from sst, err:Fail to read sst meta, err:Failed to decode page indexes for meta data, file_path:0/184/173.sst, err:Parquet error: failed to build page indexes in metadata, err:Parquet error: missing offset index.

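After the upgrade, the panic has become an error ("missing offset index"), so the compaction task fails and logs it instead of crashing the thread.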

jiacai2050 Jul 24 '23 03:07
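
For readers following the thread, here is a minimal sketch of the difference between the two behaviours reported above: an `unwrap()` inside a `map(..).sum()` (the shape visible in the backtrace) versus a fallible version that returns an error. `ChunkMeta`, `total_length_unwrap` and `total_length_checked` are hypothetical names used only for illustration; this is not the parquet crate's actual code, and the error string is simply borrowed from the log line above.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for per-column-chunk metadata (NOT the parquet crate's type).
#[derive(Debug, Clone, Copy)]
struct ChunkMeta {
    // Byte length of the chunk's offset index; None if the file has no page index for it.
    offset_index_length: Option<i32>,
}

// Panicking variant: same shape as the frames in the backtrace
// (Option::unwrap called from a closure inside map(..).sum()).
fn total_length_unwrap(chunks: &[ChunkMeta]) -> usize {
    chunks
        .iter()
        .map(|c| c.offset_index_length.unwrap())
        .sum::<i32>() as usize
}

// Fallible variant: a missing index becomes an error the caller can log and handle.
fn total_length_checked(chunks: &[ChunkMeta]) -> Result<usize, String> {
    let mut total: usize = 0;
    for c in chunks {
        let len = c
            .offset_index_length
            .ok_or_else(|| "missing offset index".to_string())?;
        // Reject negative lengths instead of letting a cast wrap around.
        total += usize::try_from(len).map_err(|_| "negative offset index length".to_string())?;
    }
    Ok(total)
}

fn main() {
    let complete = [
        ChunkMeta { offset_index_length: Some(16) },
        ChunkMeta { offset_index_length: Some(24) },
    ];
    let incomplete = [
        ChunkMeta { offset_index_length: Some(16) },
        ChunkMeta { offset_index_length: None },
    ];

    println!("{}", total_length_unwrap(&complete));
    println!("{:?}", total_length_checked(&complete));
    println!("{:?}", total_length_checked(&incomplete));
    // total_length_unwrap(&incomplete) would panic with
    // "called `Option::unwrap()` on a `None` value".
}
```

With the fallible variant, a caller such as a compaction task can report the failing file and continue, which matches the error log above. Whether such an SST should then be read without page indexes is a separate design question that this sketch does not address.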