
CFG cycles cause memory leak in clang extractor

Open bennofs opened this issue 4 years ago • 0 comments

I noticed that memory usage increases continuously when using the clang extractor. Here's a simple demo program to reproduce the issue:

#include <clang/Frontend/FrontendActions.h>
#include <clang/Frontend/CompilerInstance.h>

#include <filesystem>
#include <fstream>
#include <iostream>
#include <memory>

#include "clang_ast/clang_extractor.h"
#include "common/clang_driver.h"

using namespace compy;

constexpr char kProgramLoop[] =
    "int cyclic(int a) {"
    "while (1) {"
    "  if (a == 4) return cyclic(a + 1);"
    "  a += 10;"
    "  a /= 2;"
    "}"
    "}";

int main(int, char**) {
  // Init extractor
  std::shared_ptr<ClangDriver> clang_;

  clang_.reset(new ClangDriver(ClangDriver::ProgrammingLanguage::C,
                               ClangDriver::OptimizationLevel::O0,
                               {}, {}));

  compy::clang::ClangExtractor extractor(clang_);
  std::cout << "iter" << "," << "bytes" << std::endl;
  for (int i = 0; i < 10000; ++i) {
    //auto fa = std::make_unique<::clang::SyntaxOnlyAction>();
    //clang_->Invoke(kProgramLoop, {fa.get()}, {});
    extractor.GraphFromString(kProgramLoop);
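    // Sample this process's memory counters (in pages) after every extraction.
    std::ifstream statm("/proc/self/statm");
    long stat_total, stat_rss, stat_shared, stat_text, stat_data, stat_library, stat_dirty;
    statm >> stat_total >> stat_rss >> stat_shared >> stat_text >> stat_data >> stat_library >> stat_dirty;
    // Convert resident pages to bytes; the shift by 12 assumes 4 KiB pages.
    std::cout << i << "," << (stat_rss<<12) << std::endl;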
  }
}

This is caused by cyclic control flow. Cycles in the CFG lead to reference cycles in the ExtractionInfo graph, so the nodes in a cycle keep each other alive and their refcounts never reach zero when the top-level object is discarded.
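To illustrate the mechanism in isolation, here is a minimal sketch. It assumes the graph edges are owning `std::shared_ptr` references; `StrongNode` and `WeakNode` are hypothetical stand-ins, not the actual ExtractionInfo types. It shows a two-node cycle surviving after all outside handles are gone, and the same cycle being freed when the edges are `std::weak_ptr` instead.

```cpp
#include <memory>
#include <vector>

// Hypothetical node with owning edges (assumed to mirror how the leak arises).
struct StrongNode {
  std::vector<std::shared_ptr<StrongNode>> successors;
};

// Hypothetical node with non-owning edges.
struct WeakNode {
  std::vector<std::weak_ptr<WeakNode>> successors;
};

// Returns true if a cycle a -> b -> a with owning edges is still alive
// after every outside handle has been dropped.
bool StrongCycleOutlivesHandles() {
  std::weak_ptr<StrongNode> probe;  // observes `a` without owning it
  {
    auto a = std::make_shared<StrongNode>();
    auto b = std::make_shared<StrongNode>();
    a->successors.push_back(b);
    b->successors.push_back(a);
    probe = a;
  }  // a and b go out of scope here, but each node still owns the other
  const bool leaked = !probe.expired();
  // Break the cycle by hand so this demo does not leak itself.
  if (auto a = probe.lock()) a->successors.clear();
  return leaked;
}

// Same shape, but the edges do not own their targets.
bool WeakCycleOutlivesHandles() {
  std::weak_ptr<WeakNode> probe;
  {
    auto a = std::make_shared<WeakNode>();
    auto b = std::make_shared<WeakNode>();
    a->successors.push_back(b);
    b->successors.push_back(a);
    probe = a;
  }  // both nodes are destroyed here
  return !probe.expired();
}
```

Calling `StrongCycleOutlivesHandles()` yields `true` and `WeakCycleOutlivesHandles()` yields `false`. With non-owning edges something else has to own the nodes, for example a container held by the top-level graph object; whether that fits the extractor's design is an open question.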

To confirm that theory, I made a graph with three variants of the above code:

  1. extractor, the code above
  2. syntax-only, which uses the driver to run the SyntaxOnly action (to check whether the leak comes from our frontend action or from something in the driver)
  3. no-loop, where kProgramLoop is modified to remove the cyclic control flow (changing the while into an if and adding a return statement to silence warnings)
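For reference, the no-loop input might look roughly like the sketch below. This is a reconstruction from the description in item 3, not the exact source used for the measurement; the name `kProgramNoLoop` and the trailing `return a;` are assumptions.

```cpp
#include <string_view>

// Hypothetical reconstruction of the no-loop variant: the while becomes an if
// and a final return is added so that every path returns a value.
constexpr char kProgramNoLoop[] =
    "int cyclic(int a) {"
    "if (1) {"
    "  if (a == 4) return cyclic(a + 1);"
    "  a += 10;"
    "  a /= 2;"
    "}"
    "return a;"
    "}";

// Compile-time sanity check that no loop construct remains in the input.
static_assert(std::string_view(kProgramNoLoop).find("while") ==
                  std::string_view::npos,
              "no-loop variant must not contain a while loop");
```

The recursive call is kept on purpose: a CFG is built per function, so recursion alone does not introduce a back edge.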

The result clearly shows that the memory leak only happens when using our extractor with cyclic control flow:

[Image: memusage]

bennofs • Apr 20 '21 12:04