
Append to CSV

Open jalius opened this issue 11 months ago • 2 comments

Is there a good way to append to a CSV buffer? I see this issue tangentially mentioned in https://github.com/stephenberry/glaze/issues/1019 but it's not clear if this is implemented yet.

For now, I could write my CSV to a string buffer and append from the buffer to the file system manually. Not sure if that is any faster than just writing the entire file repeatedly in my use case.
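For reference, the manual workaround described above could look roughly like the sketch below. It uses only the standard library; `append_csv_chunk` and `read_file` are hypothetical helper names, not part of Glaze, and the CSV text is assumed to have been serialized into a string already.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Glaze): append already-serialized CSV rows
// to the end of a file without rewriting what is already there.
inline bool append_csv_chunk(const std::string& path, std::string_view rows)
{
    assert(!path.empty());
    // std::ios::app positions every write at the current end of the file.
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out) {
        return false;
    }
    out.write(rows.data(), static_cast<std::streamsize>(rows.size()));
    return static_cast<bool>(out);
}

// Hypothetical helper for checking the result: read a whole file into a string.
inline std::string read_file(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Opening in append mode means only the new rows are written on each call, so the cost scales with the size of the chunk rather than the size of the file.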

jalius avatar Feb 12 '25 23:02 jalius

Here is somewhat of a hack that @anders-wind came up with. I haven't tested this exact code, but it creates a wrapping buffer that can be passed to Glaze to trick Glaze into appending data starting at the offset. Should Glaze have a structure like this as a permanent solution? Perhaps, and I'd love to hear your thoughts if you find this approach useful.

#include <cstddef> // std::size_t, std::ptrdiff_t

// Wraps an existing buffer and exposes only the region starting at `offset`,
// so a writer that starts at index 0 ends up appending to the inner buffer.
template <typename InnerBufferT>
struct AppendStringBuffer
{
    InnerBufferT& buffer;
    std::size_t offset{0};

    using reference = char&;

    constexpr auto* data()
    {
        return buffer.data() + offset;
    }

    constexpr const auto* data() const
    {
        return buffer.data() + offset;
    }

    constexpr auto operator[](std::size_t index) -> char&
    {
        return buffer[offset + index];
    }

    constexpr auto begin()
    {
        return buffer.begin() + static_cast<std::ptrdiff_t>(offset);
    }

    constexpr auto end()
    {
        return buffer.end();
    }

    // Sizes are relative to `offset`, so data before it is never truncated.
    constexpr auto resize(std::size_t val) -> void
    {
        buffer.resize(offset + val);
    }

    constexpr auto size() const -> std::size_t
    {
        return buffer.size() - offset;
    }

    [[nodiscard]]
    constexpr auto empty() const -> bool
    {
        return size() == 0;
    }
};

stephenberry avatar Feb 13 '25 17:02 stephenberry

Yes, we are using the above in production. We then use it like this:

    auto buffer = std::string{"My already existing data."};
    auto offset_buffer = AppendStringBuffer<std::string> {.buffer = buffer, .offset = buffer.size()};
    auto ec =
        glz::write<glz::opts {.format = glz::JSON}>(value, offset_buffer); // now the value will be serialized and appended to buffer
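To illustrate why the offset trick works, here is a minimal sketch that does not depend on Glaze. `OffsetView` is a cut-down stand-in for the wrapper above, and `toy_write` is a hypothetical writer that, like a serializer, assumes it owns the buffer from index 0; both names are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Compact stand-in for the wrapper above: index 0 of the view corresponds to
// position `offset` in the wrapped string.
struct OffsetView
{
    std::string& buffer;
    std::size_t offset{0};

    char& operator[](std::size_t i) { return buffer[offset + i]; }
    void resize(std::size_t n) { buffer.resize(offset + n); }
    std::size_t size() const { return buffer.size() - offset; }
};

// Hypothetical toy writer, NOT Glaze: it writes from index 0, grows the
// buffer as needed, and trims it to the number of bytes written.
template <class Buffer>
void toy_write(std::string_view payload, Buffer& out)
{
    if (out.size() < payload.size()) {
        out.resize(payload.size());
    }
    for (std::size_t i = 0; i < payload.size(); ++i) {
        out[i] = payload[i];
    }
    out.resize(payload.size());
}

// Appends `payload` to `existing` by routing the writer through the view.
inline void append_with_view(std::string& existing, std::string_view payload)
{
    OffsetView view{existing, existing.size()};
    assert(view.size() == 0); // the view starts out empty
    toy_write(payload, view);
}
```

Passing a plain `std::string` to `toy_write` overwrites its contents, while passing the view leaves everything before `offset` untouched, because every index and size the writer sees is shifted by `offset`.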

anders-wind avatar Feb 13 '25 17:02 anders-wind

You can now avoid writing the headers when writing CSV. This means that you can just call .append(more_csv) to your output, or add the new lines to a file. See examples with the merge here: #1724
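As a rough illustration of the pattern, the sketch below writes the header once and then appends header-less chunks. `to_csv`, its `with_header` flag, and `Sample` are hypothetical stand-ins, not Glaze's API; see #1724 for the actual option.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Sample
{
    int id;
    int value;
};

// Hypothetical stand-in for a CSV writer with a header switch (not Glaze's API).
inline std::string to_csv(const std::vector<Sample>& rows, bool with_header)
{
    std::string out;
    if (with_header) {
        out += "id,value\n";
    }
    for (const auto& r : rows) {
        out += std::to_string(r.id);
        out += ',';
        out += std::to_string(r.value);
        out += '\n';
    }
    return out;
}

// Write the header once, then append a later chunk without it.
inline std::string build_log()
{
    std::string csv = to_csv({Sample{1, 10}, Sample{2, 20}}, true);
    csv.append(to_csv({Sample{3, 30}}, false));
    assert(csv.rfind("id,value\n", 0) == 0); // header sits at the start only
    return csv;
}
```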

stephenberry avatar May 02 '25 15:05 stephenberry