jsforce icon indicating copy to clipboard operation
jsforce copied to clipboard

PK-Chunking support + Better handling of batched jobs + surfacing errors

Open david0673 opened this issue 7 years ago • 14 comments

Making it possible to pass options into Bulk.query (to utilize PKChunking comfortably) Surfacing errors from Bulk.prototype.query Lazy loading query streams from job batches in order to avoid timeouts from idle open connections

Credit: some parts of this PR were copied over from MixMaxHQ's PR (https://github.com/mixmaxhq/jsforce/commit/78349443154e2228db9070ae9ee70c3360363746)

PK-Chunking Usage Sample (credit to @iyobo for the sample)

conn.bulk.query(soql, {pkChunking: {
   chunkSize: ...,
   parent:...,
   startRow: ...
}})

david0673 avatar Jul 21 '18 19:07 david0673

Hi @wearhere! Glad you liked it :)

The portions of code you commented on were actually the ones i copied over from mixmaxhq's PR. I'll do my best to respond, but i'm CC'ing the original author(s) so that they can respond as well.

CC @ghmeier @leonicolas

david0673 avatar Jul 23 '18 17:07 david0673

@wearhere I've made changes in accordance to your comments, though i was not sure how to simulate the relevant flows.

@ghmeier @leonicolas, because i made the changes, the comments @wearhere made are now hidden; here are the links: https://github.com/jsforce/jsforce/pull/780#discussion_r204219083 https://github.com/jsforce/jsforce/pull/780#discussion_r204219139 https://github.com/jsforce/jsforce/pull/780#discussion_r204219203

david0673 avatar Jul 23 '18 18:07 david0673

Thanks @david0673 . Could you leave some tips on how to use chunking as is suggested here?

From what I see in the code, we'd add options to bulk.query(..) i.e:

sfconn.bulk.query(soql, {pkChunking: {
   chunkSize: ...,
   parent:...,
   startRow: ...
}})

iyobo avatar Jul 26 '18 01:07 iyobo

Thats exactly right @iyobo Thanks for providing the sample, I will include it in the PR description!

david0673 avatar Jul 26 '18 01:07 david0673

@david0673 There is actually something wrong with this branch. This used to work, but now it doesn't.

sfconn.bulk.query(soql)
                .stream()
                .on('error', function (e) {
                    errHandler(e, this);
                })
                .pipe(process.stdout)
                .on('end', function () {
                    console.log('This ends now!')
                    
                });

Now it just hangs, doesn't go through pipes, and never ends. This is the case with or without the new options params. This might also be why CI is still waiting for status to be reported.

How was this tested?

iyobo avatar Jul 26 '18 14:07 iyobo

@iyobo thanks for pointing that out, this was because when conn.bulk.load receives a non Object value for options, it assumes that the value was meant to be the soql, & essentially truncates the option parameter. I've now submitted a fix for this issue.

it is possible this is why the status from CI is not yet reported, but i'd like to believe there is a timeout set & it would've failing by now on that scenario

Previously I've tested these only by running selected flows, but now i ran:

  • npm run test:browser - no tests seem to run
  • npm run node - many successful tests but overall failed

If you look at the main branch you can see that the build is failing regardless of this branch, so i'm not sure what more i can do (within reason)

david0673 avatar Jul 28 '18 00:07 david0673

@david0673 Thanks for being awesome! Now I'm curious if we aren't bubbling up errors as we should if the null options caused the earlier stall. That Null pointer error seemed to have been swallowed up by the flow of things.

iyobo avatar Jul 30 '18 15:07 iyobo

@iyobo thanks for being flattering! 😄 that is definitely possible; one step at a time 😛

david0673 avatar Jul 30 '18 20:07 david0673

@david0673 I've seen that you added my changes in this PR. We are near to have the PK Chunking support in JSForce :)

leonicolas avatar Jul 31 '18 18:07 leonicolas

@leonicolas, yes, i did, as a subset of the things copied from this PR: https://github.com/mixmaxhq/jsforce/commit/78349443154e2228db9070ae9ee70c3360363746 Though i was having timeout issues related to holding the batch result streams open too early, which is why i changed it to "lazy-load" those batch result streams

david0673 avatar Aug 01 '18 16:08 david0673

@leonicolas hows it going? @stomita anything i can do to help get this PR merged?

david0673 avatar Aug 13 '18 21:08 david0673

Hi Guys,

Any plans to merge this any time soon?

Thanks!

Josh

jkrinsky avatar Mar 04 '19 18:03 jkrinsky

Hey Guys,

Any chance this makes it in soon? We are doing a pretty big migration and have to manually export the files with Data Loader as the bulk.query is troublesome with objects over 2 mil.

Thanks!

Brandon Mikeska

brandonmikeska avatar Apr 25 '19 19:04 brandonmikeska

Hello, any updates on getting this into the release?

BEPTEP avatar Jul 22 '22 16:07 BEPTEP