Expensive stream flushes

Open asl opened this issue 1 year ago • 5 comments

We have the following code in lib/indent.h:

namespace IndentCtl {
inline std::ostream &endl(std::ostream &out) {
    return out << std::endl << indent_t::getindent(out);
}
inline std::ostream &indent(std::ostream &out) {
    ++indent_t::getindent(out);
    return out;
}
inline std::ostream &unindent(std::ostream &out) {
    --indent_t::getindent(out);
    return out;
}

So, doing out << IndentCtl::endl essentially flushes the output stream. This is more or less fine for cout / cerr; however, IndentCtl is also used in Utils::Json. As a result, serialization of JSON objects becomes terribly expensive due to constant flushes of the output file stream, e.g.:

void JsonArray::serialize(std::ostream &out) const {
    bool isSmall = true;
    for (auto v : *this) {
        if (!v->is<JsonValue>()) isSmall = false;
    }
    out << "[";
    if (!isSmall) out << IndentCtl::indent;
    bool first = true;
    for (auto v : *this) {
        if (!first) {
            out << ",";
            if (isSmall) out << " ";
        }
        if (!isSmall) out << IndentCtl::endl;
        if (v == nullptr)
            out << "null";
        else
            v->serialize(out);
        first = false;
    }
    if (!isSmall) out << IndentCtl::unindent << IndentCtl::endl;
    out << "]";
}
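
To make the cost concrete, here is a minimal, self-contained sketch (not p4c code) that counts how often a stream buffer is asked to sync. CountingBuf, flushesWithEndl and flushesWithNewline are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>

// Hypothetical helper: discards all characters and counts flush requests.
class CountingBuf : public std::streambuf {
 public:
    int syncCalls = 0;

 protected:
    // No put area is set, so every character arrives here; report success.
    int_type overflow(int_type ch) override { return traits_type::not_eof(ch); }

    // Called by std::ostream::flush(), and therefore by std::endl.
    int sync() override {
        ++syncCalls;
        return 0;
    }
};

// Writes `lines` lines terminated by std::endl; returns the number of flushes.
int flushesWithEndl(int lines) {
    assert(lines >= 0);
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "item" << std::endl;
    return buf.syncCalls;
}

// Same output, but terminated by '\n'; returns the number of flushes.
int flushesWithNewline(int lines) {
    assert(lines >= 0);
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "item" << '\n';
    return buf.syncCalls;
}
```

flushesWithEndl(1000) reports 1000 sync calls, while flushesWithNewline(1000) reports none. On a file stream, each of those syncs typically turns into a separate write to the OS, which is where the time goes.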

I see two possible ways of fixing this:

  • Make IndentCtl::endl use \n instead of endl
  • Switch Util::JSon not to use endl (though we'd need to expose current indent level then)

Opinions? Tagging @ChrisDodd @grg @fruffy @vlstill

asl, Sep 20 '24 19:09

I have no problem with switching out endl, considering that this seems to be the recommended way: https://clang.llvm.org/extra/clang-tidy/checks/performance/avoid-endl.html

The only question is whether we can guarantee that we do end up flushing correctly?

fruffy, Sep 20 '24 20:09

> The only question is whether we can guarantee that we do end up flushing correctly?

Streams force flush on closure, so this should not be a problem. The only potential outcome is that we might see incomplete writes in case of e.g. crashes.
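
A small sketch of the flush-on-close behaviour, assuming a plain std::ofstream whose destructor does run. The function name and the file path passed to it are hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Writes two lines using '\n' only, lets the stream go out of scope,
// then reads the file back to show that nothing was lost.
std::string writeThenReadBack(const std::string &path) {
    {
        std::ofstream out(path);
        assert(out.is_open());
        out << "first\n" << "second\n";  // no std::endl, no explicit flush
    }  // ~ofstream closes the file, which writes out the buffered data

    std::ifstream in(path);
    std::stringstream contents;
    contents << in.rdbuf();
    return contents.str();
}
```

writeThenReadBack("flush_on_close_demo.txt") returns both lines intact. If the process dies before the closing brace is reached, whatever is still buffered never reaches the file, which is the incomplete-write case mentioned above.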

asl, Sep 20 '24 20:09

> I see two possible ways of fixing this:
>
>   • Make IndentCtl::endl use \n instead of endl
>   • Switch Util::JSon not to use endl (though we'd need to expose current indent level then)
>
> Opinions? Tagging @ChrisDodd @grg @fruffy @vlstill

Since endl is supposed to flush (by analogy with std::endl), making it not flush would be problematic. Perhaps add IndentCtl::nl to output a newline without flushing?
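
A rough sketch of what such an nl manipulator could look like. This is not the real IndentCtl: the IndentSketch namespace, the iword-based level() helper (standing in for indent_t::getindent) and the two-space indent width are all assumptions made so the example is self-contained.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace IndentSketch {  // hypothetical stand-in, not the real IndentCtl

// Per-stream indent depth, stored in an iword slot of the stream.
inline long &level(std::ostream &out) {
    static const int slot = std::ios_base::xalloc();
    return out.iword(slot);
}

inline void writeIndent(std::ostream &out) {
    const long depth = level(out);
    for (long i = 0; i < depth; ++i) out << "  ";  // two spaces per level (assumed)
}

inline std::ostream &indent(std::ostream &out) {
    ++level(out);
    return out;
}

inline std::ostream &unindent(std::ostream &out) {
    assert(level(out) > 0);
    --level(out);
    return out;
}

// Newline plus indentation, without flushing.
inline std::ostream &nl(std::ostream &out) {
    out << '\n';
    writeIndent(out);
    return out;
}

// Flushing variant, keeping the analogy with std::endl.
inline std::ostream &endl(std::ostream &out) {
    out << std::endl;
    writeIndent(out);
    return out;
}

}  // namespace IndentSketch

// Renders a small array in the style of JsonArray::serialize, using nl.
std::string renderWithNl() {
    std::ostringstream out;
    out << "[" << IndentSketch::indent;
    out << IndentSketch::nl << "1,";
    out << IndentSketch::nl << "2";
    out << IndentSketch::unindent << IndentSketch::nl << "]";
    return out.str();
}
```

renderWithNl() yields "[", then "1," and "2" on their own lines indented by two spaces, then "]" on a final unindented line, with no flush along the way. Keeping endl next to nl leaves existing callers that rely on flushing untouched.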

One of my thoughts for IndentCtl was to redo it by making it a streambuf, intercepting the overflow, and inserting indentation there. That way it could do it automatically (and transparently) based on what was output, and could also wrap long lines properly. But that is a much bigger challenge.
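
For illustration only, a minimal sketch of the streambuf idea: a filtering buffer that forwards characters to another buffer and inserts indentation at the start of each non-empty line. IndentingBuf and renderNested are hypothetical names, the two-space width is an assumption, and line wrapping is left out entirely.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filtering buffer: no put area is set, so every character
// passes through overflow(), where indentation can be injected.
class IndentingBuf : public std::streambuf {
 public:
    explicit IndentingBuf(std::streambuf *dest) : dest_(dest) { assert(dest_ != nullptr); }
    void setLevel(int level) { level_ = level; }

 protected:
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof())) return traits_type::not_eof(ch);
        const char c = traits_type::to_char_type(ch);
        if (atLineStart_ && c != '\n') {
            // Two spaces per level (assumed width); empty lines stay empty.
            for (int i = 0; i < level_ * 2; ++i) {
                if (traits_type::eq_int_type(dest_->sputc(' '), traits_type::eof()))
                    return traits_type::eof();
            }
        }
        atLineStart_ = (c == '\n');
        return dest_->sputc(c);
    }

    // Flush requests are passed through to the wrapped buffer.
    int sync() override { return dest_->pubsync(); }

 private:
    std::streambuf *dest_;
    int level_ = 0;
    bool atLineStart_ = true;
};

// Writes a nested array using plain '\n'; the buffer adds the indentation.
std::string renderNested() {
    std::ostringstream sink;
    IndentingBuf buf(sink.rdbuf());
    std::ostream out(&buf);
    out << "[\n";
    buf.setLevel(1);
    out << "1,\n2\n";
    buf.setLevel(0);
    out << "]";
    return sink.str();
}
```

The caller only ever writes '\n', and renderNested() still comes out with "1," and "2" indented by two spaces between the brackets. The price is one virtual call per character in this naive form, which hints at why a production version would be a bigger job.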

ChrisDodd, Sep 21 '24 01:09

> Since endl is supposed to flush (by analogy with std::endl), making it not flush would be problematic. Perhaps add IndentCtl::nl to output a newline without flushing?

Glad you suggested this :) This is exactly my existing solution downstream.

> One of my thoughts for IndentCtl was to redo it by making it a streambuf, intercepting the overflow, and inserting indentation there. That way it could do it automatically (and transparently) based on what was output, and could also wrap long lines properly. But that is a much bigger challenge.

This does not seem like it would be worth the complexity, right.

asl, Sep 21 '24 01:09

And it turned out to be a bit tricky. The reason is that the GC does not run destructors for objects. As a result, the streams are never closed, and whatever is left in their buffers stays there without an explicit flush...
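
A sketch of the situation, under the assumption that a never-deleted heap object behaves like a GC-managed one whose destructor is skipped. Report, writeWithExplicitFlush and the file path are hypothetical names used only here.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a GC-managed object that owns an output stream.
struct Report {
    std::ofstream out;
    explicit Report(const std::string &path) : out(path) {}
};

std::string writeWithExplicitFlush(const std::string &path) {
    // Never deleted: mimics a collector that reclaims memory without
    // running ~Report(), so the stream is never closed.
    static Report *report = nullptr;
    report = new Report(path);
    assert(report->out.is_open());

    report->out << "first\n" << "second\n";  // '\n' only, nothing flushed yet
    report->out.flush();  // the one explicit flush that makes the data visible

    std::ifstream in(path);
    std::stringstream contents;
    contents << in.rdbuf();
    return contents.str();
}
```

With the flush() call in place, writeWithExplicitFlush("explicit_flush_demo.txt") returns both lines even though the stream is never destroyed. Without it, the data may simply sit in the stream's buffer, so dropping endl needs a deliberate flush at the end of serialization.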

asl, Oct 15 '24 06:10