gosnowflake icon indicating copy to clipboard operation
gosnowflake copied to clipboard

Memory improvements for PUT operations

Open asonawalla opened this issue 3 years ago • 15 comments

Description

See Issue #536 (SNOW-535791) for more background.

I've addressed some of the low-hanging fruit memory improvements in this PR. Specifically, using the non-streaming API (and having the caller be responsible for staging the full data on disk instead) now follows a code path that doesn't read the whole file into memory. The streaming API still reads the whole file into memory and will need more effort to fix beyond the changes suggested here.

In addition to the changes, I've also added a few benchmarks on the functions I've been working with, which can be run with a command similar to SKIP_SETUP=1 go test -bench '^Benchmark.*$' -run '^$'. Running these tests in the baseline benchmarks commit and again at HEAD show that (at least for file-based functions) the number of allocations and bytes allocated don't scale with input size after the changes are applied. Note that these benchmarks don't need to connect to a test snowflake instance (hence SKIP_SETUP), but they do take a bit of time to run since they generate a lot of fake data.

Checklist

  • [x] Code compiles correctly
  • [ ] Run make fmt to fix inconsistent formats
  • [ ] Run make lint to get lint errors and fix all of them
  • [ ] Created tests which fail without the change (if possible)
  • [ ] All tests passing
  • [ ] Extended the README / documentation, if necessary

asonawalla avatar Jan 27 '22 21:01 asonawalla

Note: I'm having trouble running the unit tests in this project (some of the shell scripts seem to not be properly escaping special characters in my snowflake password). Would appreciate someone on the snowflake team giving this PR the green light to run in CI once they've had a chance to review that the code is safe.

asonawalla avatar Jan 27 '22 21:01 asonawalla

Thank you for submitting this PR. Unfortunately, this fails our bulk array binding tests where the meta.srcStream passed into getDigestAndSizeForStream does not read its data into the bytes properly (once it's read, the read/seeker is not reset, thereby emptying the buffer). If you find a remedy that addresses this, I will be happy to review and merge this.

sfc-gh-jbahk avatar Jan 30 '22 04:01 sfc-gh-jbahk

Thanks @sfc-gh-jbahk. If I understand what you're saying correctly, there's a test somewhere that calls getDigestAndSizeForStream with a buffer that relies on that buffer being reset after the call. If so, the easiest fix might be to call Reset() on the buffer at the level of the caller so we can re-use the stream version of the method for files as well.

I'm probably missing something obvious, but I don't see such a test. Do you mind pointing me to the failing test?

asonawalla avatar Jan 30 '22 04:01 asonawalla

It's the BulkArrayBinding tests.

sfc-gh-jbahk avatar Jan 30 '22 04:01 sfc-gh-jbahk

Oh I see - the bind uploader calls the "put" command; I was looking for direct usages of the method.

I'm still having trouble getting tests running on my local machine, let me try again tomorrow. In the mean time, I'm pushing what I think should be the fix.

asonawalla avatar Jan 30 '22 05:01 asonawalla

Whoops, Reset() on a buffer clears it, not seeks to 0, so the patch is wrong. I'll make some time to try again in the next day or two.

asonawalla avatar Jan 30 '22 05:01 asonawalla

No worries; I appreciate you putting in the time for this. As for the tests, it might not be possible unless you have internal credentials that are able to run the whole suite.

sfc-gh-jbahk avatar Jan 30 '22 06:01 sfc-gh-jbahk

@asonawalla do you have any updates on this?

sfc-gh-jbahk avatar Feb 03 '22 20:02 sfc-gh-jbahk

@sfc-gh-jbahk unfortunately I had to shift attention to some other work this week, but I do still hope to wrap this up soon. My plan is to get the tests working before I make more changes, then likely revert the reader changes to the stream API. That way this PR will be in good shape so that at least the file-based API keeps memory usage under control, and we can tackle the stream's memory usage some time in the future.

asonawalla avatar Feb 03 '22 20:02 asonawalla

FYI, I think I have most of the tests running successfully on the master branch, but still seeing this failure (which looks related, so I'm trying to avoid skipping it):

=== RUN   TestPutOverwrite
    put_get_test.go:326: expected SKIPPED, got UPLOADED
--- FAIL: TestPutOverwrite (2.61s)

asonawalla avatar Feb 03 '22 20:02 asonawalla

@asonawalla thank you. Where did you get that excerpt from? That test works locally for myself.

sfc-gh-jbahk avatar Feb 03 '22 21:02 sfc-gh-jbahk

It's on running make test on the repo root - here it is in context of the rest of the output.

asonawalla avatar Feb 03 '22 21:02 asonawalla

I actually just noticed that there's another failure in there, but just based on the names of these tests, that one seems less important to get right for this change.

asonawalla avatar Feb 03 '22 21:02 asonawalla

Ah, I see. Thanks. I might try and merge some of these changes faster to help some perf issues on our end actually.

sfc-gh-jbahk avatar Feb 03 '22 21:02 sfc-gh-jbahk

https://github.com/snowflakedb/gosnowflake/pull/527 I updated this to incorporate some of your changes (opening vs reading).

sfc-gh-jbahk avatar Feb 03 '22 21:02 sfc-gh-jbahk

Hi @asonawalla ! Do you still want to merge this PR? Can you rebase and solve conflicts? I'd like to merge it when it's ready.

sfc-gh-pfus avatar Oct 30 '23 07:10 sfc-gh-pfus

Hey @sfc-gh-pfus, the most important part here (the buffer re-allocation) was captured in #527, so I'm going to close this out.

The problems with the streaming PUT described in this issue still appear to be real, but this code doesn't address that.

asonawalla avatar Oct 30 '23 15:10 asonawalla

Thank you @asonawalla for your input anyway!

sfc-gh-pfus avatar Oct 31 '23 06:10 sfc-gh-pfus