Provide Stream output for transformers

Open snuyanzin opened this issue 10 months ago • 5 comments

The problem with current methods such as net.datafaker.transformations.JsonTransformer#generate(net.datafaker.transformations.Schema<IN,?>, int) is that they build the whole String and only then return it. As a result, memory consumption grows with the number of records, and a test such as the following fails with an OutOfMemoryError:

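    import static net.datafaker.transformations.Field.field;

    import java.util.Random;
    import net.datafaker.providers.base.BaseFaker;
    import net.datafaker.transformations.JsonTransformer;
    import net.datafaker.transformations.Schema;
    import org.junit.jupiter.api.Test;

    @Test
    void test2() {
        BaseFaker faker = new BaseFaker(new Random(10L));
        Schema<Object, ?> schema = Schema.of(
            field("Text", () -> faker.name().firstName()),
            field("Bool", () -> faker.name().lastName())
        );

        JsonTransformer<Object> transformer = JsonTransformer.builder().build();
        // Builds all 50,000,000 records into a single String before returning.
        String json = transformer.generate(schema, 50_000_000);
        System.out.println(json);
    }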

There is not much we can do about this method: with this approach the giant string value has to be stored somewhere anyway.

Another approach: instead of generating the final string value, we could generate a stream of values and return that.
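To make the idea concrete, here is a minimal sketch in plain Java, without datafaker. The class StreamingRecordsSketch and the helpers jsonRecords and writeAll are hypothetical names used only for illustration; they are not the library's API and not necessarily how the linked PRs implement it. The sketch also skips JSON escaping. The point is only that a lazily generated Stream lets the consumer handle one record at a time instead of holding the whole output in memory.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamingRecordsSketch {

    // Hypothetical helper (not datafaker API): lazily yields one JSON object
    // per stream element. Nothing is generated until the stream is consumed.
    // Values are not JSON-escaped; this is an illustration only.
    static Stream<String> jsonRecords(Supplier<String> firstName,
                                      Supplier<String> lastName,
                                      long limit) {
        return Stream.generate(() ->
                "{\"Text\": \"" + firstName.get() + "\", \"Bool\": \"" + lastName.get() + "\"}")
            .limit(limit);
    }

    // Writes records one by one, so only the current record needs to be
    // held by this method at any time.
    static void writeAll(Stream<String> records, Writer out) {
        records.forEach(record -> {
            try {
                out.write(record);
                out.write('\n');
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    public static void main(String[] args) {
        // A StringWriter keeps this demo self-contained; for large outputs
        // a buffered file or socket writer would be the sensible target.
        StringWriter out = new StringWriter();
        writeAll(jsonRecords(() -> "Ann", () -> "Lee", 3), out);
        System.out.print(out);
    }
}
```

With this shape the caller decides where the records go (a file, a socket, a test assertion), and the record count no longer determines how much memory the generation step needs.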

snuyanzin avatar Apr 26 '24 09:04 snuyanzin

Partially covered by https://github.com/datafaker-net/datafaker/pull/1177; however, not all formats are supported yet.

snuyanzin avatar Jun 01 '24 11:06 snuyanzin

Do I understand correctly that this proposal suggests implementing a solution similar to the one used for JsonTransformer, but also for the following transformers?

  • XmlTransformer
  • JavaObjectTransformer
  • SqlTransformer

RVRhub avatar Jun 01 '24 20:06 RVRhub

Yes, you are right.

snuyanzin avatar Jun 01 '24 20:06 snuyanzin

My PR covers SQL transformer support: https://github.com/datafaker-net/datafaker/pull/1264

gatear avatar Jun 20 '24 12:06 gatear

There is also a PR for JavaObjectTransformer: https://github.com/datafaker-net/datafaker/pull/1313

gatear avatar Jul 25 '24 14:07 gatear