disk manager: boost::filesystem::status not implemented in aarch64 linux 4.14

Open wey-gu opened this issue 2 years ago • 11 comments

Please check the FAQ documentation before raising an issue

Describe the bug (required)

Reported via https://discuss.nebula-graph.com.cn/t/topic/8487/37?u=wey : a "boost::filesystem::status() not implemented" error was encountered during meta boot.

It may be raised by the boost::filesystem::exists call in the following function:

DiskManager::DiskManager(const std::vector<std::string>& dataPaths,
                         std::shared_ptr<thread::GenericWorker> bgThread)
    : bgThread_(bgThread) {
  try {
    // atomic is not copy-constructible
    std::vector<std::atomic_uint64_t> freeBytes(dataPaths.size() + 1);
    Paths* paths = new Paths();
    paths_.store(paths);
    size_t index = 0;
    for (const auto& path : dataPaths) {
      auto absolute = boost::filesystem::absolute(path);
      if (!boost::filesystem::exists(absolute)) {
        if (!boost::filesystem::create_directories(absolute)) {
          LOG(FATAL) << folly::sformat("DataPath:{} does not exist, create failed.", path);
        }
      } else if (!boost::filesystem::is_directory(absolute)) {
        LOG(FATAL) << "DataPath is not a valid directory: " << path;
      }
      auto canonical = boost::filesystem::canonical(path);
      auto info = boost::filesystem::space(canonical);
      paths->dataPaths_.emplace_back(std::move(canonical));
      freeBytes[index++] = info.available;
    }
    freeBytes_ = std::move(freeBytes);
  } catch (boost::filesystem::filesystem_error& e) {
    LOG(FATAL) << "DataPath invalid: " << e.what();
  }
  if (bgThread_) {
    bgThread_->addRepeatTask(FLAGS_disk_check_interval_secs * 1000, &DiskManager::refresh, this);
  }
}

Your Environments (required)

  • OS: uname -a kernel 4.14, aarch64
  • Compiler: g++ --version or clang++ --version
  • CPU: lscpu
  • Commit id : 3.0.2

How To Reproduce(required)

Steps to reproduce the behavior:

  1. Step 1
  2. Step 2
  3. Step 3

Expected behavior

Additional context

wey-gu avatar Apr 24 '22 10:04 wey-gu

Another user reported here: https://discuss.nebula-graph.com.cn/t/topic/8428/10?u=wey

Linux ds1 4.14.0-115.el7a.0.1.aarch64 #1 SMP Sun Nov 25 20:54:21 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux

wey-gu avatar Apr 25 '22 02:04 wey-gu

I'm not sure whether this is a bug in Nebula or an upstream bug, but several sources report being unable to run Nebula on CentOS 7.4 ARM, so I am labeling it as a bug for now.

wey-gu avatar Apr 25 '22 06:04 wey-gu

Other users have been encountering this these days, too:

https://discuss.nebula-graph.com.cn/t/topic/11239/17

wey-gu avatar Nov 23 '22 12:11 wey-gu

Reopening this, as more users are hitting it.

ref: https://community-chat.nebula-graph.io/t/5161068/hi-everyone-i-ran-into-a-problem-starting-nebula-graph-metad

Update: this can also be reproduced on x86_64 with Linux kernel 4.15.

wey-gu avatar Nov 25 '22 02:11 wey-gu

See: https://github.com/boostorg/filesystem/commit/3e8c8b15f940145481c5eb73bc6c108b5bace1da . Compiling with a Boost release later than 1.77 should fix this in some situations, but I'm not sure of it. I need to verify that later.

SuperYoko avatar Mar 02 '23 08:03 SuperYoko

See: boostorg/filesystem@3e8c8b1 . Compiling with a Boost release later than 1.77 should fix this in some situations, but I'm not sure of it. I need to verify that later.

Upstream awareness of this issue looks promising :D

wey-gu avatar Mar 02 '23 09:03 wey-gu

Another user hit this issue: https://github.com/vesoft-inc/nebula-console/issues/170

wey-gu avatar May 09 '23 01:05 wey-gu

I also encountered this issue when starting a nebula-graph Docker container (3.6.0), and I solved it by updating libseccomp-2.3.1-3 to libseccomp-2.3.1-4. My environment: CentOS 7.6, Linux kernel 4.14.0-115.xxx, ARM architecture.

However, when I copied the nebula binary from the Docker container to the host, it started up successfully (without updating libseccomp).

CoderXionghs avatar Sep 24 '23 12:09 CoderXionghs

Thanks @CoderXionghs, your findings will help many others who encounter this.

It's also very valuable to learn that updating a single library helps, which points to libseccomp as a related cause, without changing the kernel.

wey-gu avatar Sep 24 '23 14:09 wey-gu

@SuperYoko @dutor what do you think of this? It seems that libseccomp, which is involved in syscall filtering, could affect how the syscall behind boost::filesystem::status() behaves. Should we highlight this in the docs?

wey-gu avatar Sep 25 '23 03:09 wey-gu

I thought vesoft-inc/nebula-ent#2357 had already addressed this issue. But it seems there are other cases left. I have dealt with this issue several times, and there is no easy way to resolve it completely because of how Docker works.

For this specific case, we could make a workaround to replace these boost interfaces. I think @SuperYoko could help on this.

dutor avatar Sep 26 '23 02:09 dutor
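
For illustration only, the kind of replacement suggested above might look like the sketch below. It is not the project's actual fix. It classifies a data path with plain POSIX stat(2) instead of boost::filesystem::exists / is_directory. The names PathKind and classifyPath are hypothetical and appear in neither Nebula nor Boost. Whether this really avoids the filtered syscall depends on which syscall the libc in the image issues for stat(), which has not been verified here.

```cpp
#include <sys/stat.h>

#include <cerrno>
#include <string>

// Hypothetical helper types and names, for illustration only.
enum class PathKind { kMissing, kDirectory, kOther, kError };

// Classifies `path` with POSIX stat(2) rather than boost::filesystem::status().
// On kError, errno is left as set by stat() so the caller can log it.
inline PathKind classifyPath(const std::string& path) {
  struct stat st;
  if (::stat(path.c_str(), &st) != 0) {
    // A missing path component is "does not exist"; anything else
    // (EACCES, ENOSYS, ...) is reported as a real error.
    if (errno == ENOENT || errno == ENOTDIR) {
      return PathKind::kMissing;
    }
    return PathKind::kError;
  }
  return S_ISDIR(st.st_mode) ? PathKind::kDirectory : PathKind::kOther;
}
```

In the DiskManager constructor, kMissing would correspond to the create_directories branch and kOther to the "DataPath is not a valid directory" branch, while kError could be logged together with strerror(errno).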