jsonld-streaming-parser.js icon indicating copy to clipboard operation
jsonld-streaming-parser.js copied to clipboard

Stream goes into flowing mode immediately

Open LaurensRietveld opened this issue 5 years ago • 5 comments

We've experienced some odd issues that are probably caused by race conditions. I think it boils down to the stream going into flowing mode immediately

Afaik, the stream should only go into flowing mode when:

  • We attach an on-data listener
  • We pipe it to another writable stream
  • We explicitly call resume()

I would expect the below snippet to terminate fast and essentially be a no-op. In reality it isn't, and depending on the size of test.jsonld it will take a while for it to finish.

const fs = require("fs");
const { JsonLdParser } = require("./");
fs.createReadStream("./test.jsonld")
  .pipe(
    new JsonLdParser({
      baseIRI: "http://base"
    })
  )
  //Not registering an on-data listener, so not expecting this stream to start flowing
  //.on('data', () => {})

Would you agree with my observation with the stream going into flowing mode too quickly?

LaurensRietveld avatar Jun 10 '20 09:06 LaurensRietveld

I did not explicitly test such cases, so the parser may indeed be too eager with its parsing tasks. Thanks for reporting!

rubensworks avatar Jun 10 '20 12:06 rubensworks

I have not managed to reproduce this bug. It seems to me that both in streaming and in regular mode the parser used as a Transform node and the import method do not start flowing immediatly. I wrote tests in #99 to check this behavior.

Tpt avatar Aug 17 '22 11:08 Tpt

@Tpt Might be interesting to also check for the case where element by element is being read (.read()), to ensure it does not enter flowing mode (explicitly or implicitly).

But in any case, it's possible that this specific issue was already fixed by some other change.

rubensworks avatar Aug 17 '22 11:08 rubensworks

@Tpt Might be interesting to also check for the case where element by element is being read (.read()), to ensure it does not enter flowing mode (explicitly or implicitly).

I have updated #99 with tests. It seems to work properly.

Tpt avatar Aug 17 '22 13:08 Tpt

Ok, great! Not sure if we're handling all cases by only checking readableFlowing though. Specifically, when streamingProfile is true, then the JSON should be read chunk by chunk. So we must be sure that the underlying stream is not being read further than required. (I guess this overlaps with #71)

rubensworks avatar Aug 17 '22 14:08 rubensworks