Request an option to disable column names in CSV output and to write in a row-wise structure

Open YanzhaoW opened this issue 1 year ago • 1 comments

Hi,

I would like to request two features regarding CSV writing.

First, it would be really nice if users could disable printing the column names. In some situations the whole data set is not available up front, and new values of the structure are read and written inside an event loop. Each time the structure is converted to a string, the output contains the column names again, but a correct CSV file has the column names only once, at the top.

For example:

#include <glaze/glaze.hpp>
#include <print>
#include <sstream>
#include <string>
#include <vector>

struct CsvStruct {
    std::vector<int> header1 {1, 2, 3};
    std::vector<float> header2 {4., 5, 6};
    std::vector<std::string> header3 {"a", "b", "c"};
};

auto main() -> int {
    auto my_csv = CsvStruct{};
    auto sstream = std::stringstream{};
    auto buffer = std::string{};
    auto ec =
        glz::write<glz::opts{.format = glz::csv, .layout=glz::colwise}>(
            my_csv, buffer);
    sstream << buffer;
    ec =
        glz::write<glz::opts{.format = glz::csv, .layout=glz::colwise}>(
            my_csv, buffer);
    sstream << buffer;
    std::print("{}", sstream.str());
    return 0;
}

outputs a string:

header1,header2,header3
1,4,a
2,5,b
3,6,c
header1,header2,header3
1,4,a
2,5,b
3,6,c

which is an ill-formed CSV file, because the header row appears twice.
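Until such an option exists, one possible workaround is to drop the first line of every chunk after the first one. The sketch below uses only the standard library; `strip_header_line` and `append_csv_chunk` are hypothetical helper names, not part of Glaze, and the sketch assumes that each serialized chunk starts with exactly one header line terminated by `\n`.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of Glaze): returns `chunk` without its first
// line, so that a repeated header row can be dropped before appending.
inline std::string_view strip_header_line(std::string_view chunk) {
    const auto newline = chunk.find('\n');
    if (newline == std::string_view::npos) {
        return {};  // the chunk held only a header line (or was empty)
    }
    return chunk.substr(newline + 1);
}

// Appends one serialized CSV chunk to `out`. The header is kept only for the
// first chunk, i.e. while `out` is still empty.
inline void append_csv_chunk(std::string& out, std::string_view chunk) {
    out += out.empty() ? chunk : strip_header_line(chunk);
}
```

Inside the event loop, the buffer produced by each write would be passed to `append_csv_chunk` instead of being streamed directly. This costs one extra scan for the first newline per chunk, which is why a library-level option would still be preferable.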

The second request is to support writing a CSV string from a vector of structs. In most cases, each row in a CSV file represents one data point, so it is very common to have something like std::vector<DataPoint>. It would be great to have an API like:

#include <glaze/glaze.hpp>
#include <string>
#include <vector>

struct CsvStruct{
    int header1 = 1;
    float header2 = 2.;
    std::string header3 = "a";
};

auto main() -> int {
    auto my_csv = std::vector<CsvStruct>{};
    my_csv.emplace_back();
    auto buffer = std::string{};
    auto ec =
        glz::write<glz::opts{.format = glz::csv, .layout=glz::rowwise}>(
            my_csv, buffer);
    return 0;
}
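To make the intended result concrete, here is a hand-written sketch of the row-wise layout being requested: the header is written once, followed by one line per element. It uses only the standard library and does not reflect how Glaze implements anything; `DataPoint` and `to_rowwise_csv` are hypothetical names, and no quoting or escaping of field values is attempted.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical row type mirroring the struct in the request.
struct DataPoint {
    int header1 = 1;
    float header2 = 2.F;
    std::string header3 = "a";
};

// Hand-written sketch of the requested layout: the column names appear once,
// then each element of the vector becomes one CSV row.
// Note: fields containing commas, quotes or newlines are not escaped.
inline std::string to_rowwise_csv(const std::vector<DataPoint>& points) {
    auto out = std::ostringstream{};
    out << "header1,header2,header3\n";
    for (const auto& point : points) {
        out << point.header1 << ',' << point.header2 << ',' << point.header3 << '\n';
    }
    return out.str();
}
```

The obvious drawback is that the column names and field order are repeated by hand for every struct, which is exactly what a reflection-based writer would avoid.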

Many thanks in advance

YanzhaoW, Sep 17 '24 23:09

Thanks for your suggestions. I've had an issue for a while about supporting CSVs without column or row keys (#853), so this is extra motivation to get that done.

Your example of std::vector<DataPoint> is also a good suggestion.

I'm not sure when I'll get to these, because I'm making other improvements to Glaze right now, but I'll keep this issue open until these features are added.

stephenberry, Sep 17 '24 23:09

Support for these features has been merged in #1724. You can now disable printing column names and write out std::vector<T> types, where T is a struct whose fields are the columns. See the notes on that merge for examples.

stephenberry, May 02 '25 15:05