Robert (Bobby) Evans

Results 207 comments of Robert (Bobby) Evans

The "could not find parent" error appears to be related to the order in which some columns show up in the code in question. https://github.com/rapidsai/cudf/blob/933e32ab9ad8e5057282c48129ddbd745c538967/cpp/src/io/json/json_column.cu#L657 Appears to be related to the...

Looks like you need to run it more than once.

```
import ai.rapids.cudf._
val sb = Schema.builder()
sb.addColumn(DType.STRING, "key_0_0")
val s = sb.build
val opts = JSONOptions.builder.withKeepQuotes(true).withLines(true).withNormalizeSingleQuotes(true).withRecoverWithNull(true).withNormalizeWhitespace(true).build()
val t =...
```

It looks like it is related to memory pooling, and possibly reading uninitialized memory. It works fine if pooling is disabled, but fails regularly if it is enabled. (at least...

I was able to cut the file down to 861262 lines (about 26 MiB) and I am still able to see errors. Will keep working on this...

I should clarify. I have not been able to make it fail with C++ yet, just Java.

Digging deeper, the tokenization is returning a different set of tokens. I am not sure why yet. The data looks fine for the first part of the run, and then...

A little more info. This is only happening in Java on the async allocator, not the arena. This is all really confusing to me.

If I set the config for recover with null to false, that appears to fix the problem. Recover with nulls is odd because it is updating the data inline in...

@GregoryKimball @shrshi I really would appreciate some help in understanding what the next steps should be for debugging this. I have a test case that I can repro nearly 100%...

Okay, it is a race somewhere. I put a bunch of `stream.synchronize` calls in the JSON parsing code and the problem appears to have gone away. I will try...