
[Request] Custom to/from json methods

Open lshamis opened this issue 4 years ago • 5 comments

A few days ago, @jkeiser added templatized get methods (https://github.com/simdjson/simdjson/pull/643) to json objects.

I'd like to request that we go a bit further than primitives and support user-defined from_json and (maybe) to_json.

Something like the following example:

struct Car {
  std::string make;
  std::string model;
  int year;
  std::vector<double> tire_pressure;
};
void from_json(const dom::object& j, Car& c) {
  c.make = j["make"].get<std::string>();
  c.model = j["model"].get<std::string>();
  c.year = j["year"].get<int>();
  c.tire_pressure = j["tire_pressure"].get<std::vector<double>>();
}

// or
// void from_json(const dom::object& j, Car& c) {
//   j["make"].get_to(c.make);
//   j["model"].get_to(c.model);
//   j["year"].get_to(c.year);
//   j["tire_pressure"].get_to(c.tire_pressure);
// }

...

auto cars_json = R"( [
  { "make": "Toyota", "model": "Camry",  "year": 2018, 
       "tire_pressure": [ 40.1, 39.9 ] },
  { "make": "Kia",    "model": "Soul",   "year": 2012, 
       "tire_pressure": [ 30.1, 31.0 ] },
  { "make": "Toyota", "model": "Tercel", "year": 1999, 
       "tire_pressure": [ 29.8, 30.0 ] }
] )"_padded;
dom::parser parser;
auto jcars = parser.parse(cars_json).get<dom::array>();

for (dom::object jcar : jcars) {
  auto [car, err] = jcar.get<Car>();
  ...
}

(I know this example isn't consistent about whether the return value carries an error. I'm not sure which way is better.)
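To make the request concrete, here is a minimal sketch of how a get<T>() could dispatch to a user-defined from_json through argument-dependent lookup, in both a throwing and an error-returning flavor. It does not use simdjson at all: toy::object, at, try_get, and make_car_json are hypothetical names made up for this illustration, and the Car here is trimmed to two fields to keep it short.

```cpp
#include <exception>
#include <map>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for a parsed JSON object. This is NOT a simdjson type.
namespace toy {
using value = std::variant<long long, double, std::string>;

struct object {
  std::map<std::string, value> fields;

  // Typed field access. Throws std::out_of_range for a missing key and
  // std::bad_variant_access for a type mismatch.
  template <typename T>
  const T& at(const std::string& key) const {
    return std::get<T>(fields.at(key));
  }

  // Throwing flavor: the unqualified call is resolved by argument-dependent
  // lookup when get<T>() is instantiated, so it finds the user's from_json.
  template <typename T>
  T get() const {
    T out{};
    from_json(*this, out);
    return out;
  }

  // Error-returning flavor: same hook, but failures become a flag so the
  // caller can write `auto [car, failed] = j.try_get<Car>();`.
  template <typename T>
  std::pair<T, bool> try_get() const {
    T out{};
    try {
      from_json(*this, out);
    } catch (const std::exception&) {
      return {T{}, true};
    }
    return {out, false};
  }
};
}  // namespace toy

// Trimmed-down version of the Car from the example above.
struct Car {
  std::string make;
  long long year = 0;
};

// User-provided hook, declared next to the user's type so ADL can find it.
void from_json(const toy::object& j, Car& c) {
  c.make = j.at<std::string>("make");
  c.year = j.at<long long>("year");
}

// Builds a small object by hand, standing in for parsed input.
toy::object make_car_json() {
  toy::object j;
  j.fields["make"] = std::string("Toyota");
  j.fields["year"] = 2018LL;
  return j;
}
```

With this shape, make_car_json().get<Car>() throws on a missing or mistyped field, while try_get<Car>() reports the same failure through the second element of the pair. Offering both would let callers choose, at the cost of a larger API surface.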

lshamis avatar Apr 01 '20 18:04 lshamis

I think we have been discussing ideas closely related to this proposal.

lemire avatar Apr 01 '20 18:04 lemire

@jkeiser Could this issue be solved with On Demand?

The idea here is for the users to provide a function that maps to a custom data type.

lemire avatar Feb 01 '21 21:02 lemire

Probably doable with either one ... I think nlohmann_json does something like this.

jkeiser avatar Feb 02 '21 04:02 jkeiser

@jkeiser I find it more attractive with On Demand because you directly materialize the object you want, bypassing the DOM entirely. It is one step up from the "point" (Kostya) deserialization problem.

I am marking it "On Demand" even though it is not strictly On Demand.

lemire avatar Feb 02 '21 14:02 lemire

+1

That would be nice, and I could then replace https://github.com/nlohmann/json in all my projects.

paulocoutinhox avatar Jul 07 '22 23:07 paulocoutinhox

It turns out we can already support such applications without changing the library. The real question is whether we want to document it.

#include "simdjson.h"
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

using namespace simdjson;

/**
 * A custom type that we want to parse.
 */
struct Car {
  std::string make;
  std::string model;
  int64_t year;
  std::vector<double> tire_pressure;
};

/***
 * This code will compile and run *without* exception support: all exception handling is
 * done through the error_code mechanism.
 */

/**
 * Let us extend the value::get() template
 */

/**
 * We don't have to specialize get<std::vector<double>>, but doing so might be generally useful.
 */
template <>
simdjson_inline simdjson_result<std::vector<double>>
simdjson::ondemand::value::get() noexcept {
  // This works on a value. If the std::vector<double> is the whole document, we also need to
  // implement simdjson::ondemand::document::get<std::vector<double>>.
  ondemand::array array;
  if(auto error = get_array().get(array); error) { return error; }
  std::vector<double> vec;
  for (auto v : array) {
    double val;
    if(auto error = v.get_double().get(val); error) { return error; }
    vec.push_back(val);
  }
  return vec;
}


template <>
simdjson_inline simdjson_result<Car> simdjson::ondemand::value::get() noexcept {
  // This works on a value. If the car is the whole document, we also need to implement
  // simdjson::ondemand::document::get<Car>.
  ondemand::object obj;
  auto error = get_object().get(obj);
  if (error) {
    return error;
  }
  Car car;
  // Instead of repeated obj["something"] lookups, we iterate through the object once,
  // which we expect to be faster.
  for (auto field : obj) {
    raw_json_string key;
    if (auto error = field.key().get(key); error) { return error; }
    if (key == "make") {
      if (auto error = field.value().get_string(car.make); error) { return error; }
    } else if (key == "model") {
      if (auto error = field.value().get_string(car.model); error) { return error; }
    } else if (key == "year") {
      if (auto error = field.value().get_int64().get(car.year); error) { return error; }
    } else if (key == "tire_pressure") {
      if (auto error = field.value().get<std::vector<double>>().get(car.tire_pressure); error) { return error; }
    }
  }
  return car;
}

int main(void) {
  padded_string json = R"( [ { "make": "Toyota", "model": "Camry",  "year": 2018,
       "tire_pressure": [ 40.1, 39.9 ] },
  { "make": "Kia",    "model": "Soul",   "year": 2012,
       "tire_pressure": [ 30.1, 31.0 ] },
  { "make": "Toyota", "model": "Tercel", "year": 1999,
       "tire_pressure": [ 29.8, 30.0 ] }
])"_padded;
  ondemand::parser parser;
  ondemand::document doc;
  if(auto error = parser.iterate(json).get(doc); error) { std::cout << error << std::endl; return EXIT_FAILURE; }
  for (auto val : doc) {
    Car c;
    if (auto error = val.get(c); error) { std::cout << error << std::endl; return EXIT_FAILURE; }
    std::cout << c.make << std::endl;
  }
  return EXIT_SUCCESS;
}

lemire avatar Dec 11 '23 18:12 lemire