embulk-output-bigquery
Replace google-api-client with specific Google APIs
Hello, I would like to replace the google-api-client gem used by embulk-output-bigquery with the gems for the specific Google services it needs, since google-api-client has been deprecated. https://googleapis.dev/ruby/google-api-client/v0.53.0/
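As a rough illustration of the kind of dependency change proposed here, the following is a minimal sketch of a gemspec that depends on per-service gems instead of google-api-client. The gem names google-apis-bigquery_v2 and google-apis-storage_v1, the unpinned versions, and the placeholder metadata are assumptions for illustration, not the exact contents of this PR.

```ruby
require "rubygems"

# Hypothetical gemspec sketch: the metadata values are placeholders, and the
# two per-service gem names are assumptions about which Google APIs are needed.
spec = Gem::Specification.new do |s|
  s.name    = "embulk-output-bigquery"
  s.version = "0.0.0"
  s.summary = "Sketch only"
  s.authors = ["example"]

  # Previously (deprecated umbrella gem): s.add_dependency "google-api-client"
  s.add_dependency "google-apis-bigquery_v2"
  s.add_dependency "google-apis-storage_v1"
end

# List the runtime dependencies declared above.
runtime_deps = spec.runtime_dependencies.map(&:name)
puts runtime_deps
```

Depending only on the per-service gems means RubyGems no longer has to download, install, and document every generated Google API client bundled in google-api-client.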
By the way, when I ran docker build with the following Dockerfile, I got a java.lang.OutOfMemoryError while installing embulk-output-bigquery, probably because google-api-client is too large.
https://github.com/Nozomuts/test-embulk-docker/blob/main/Dockerfile
Details of docker build
docker build -t embulk:hoge . --progress plain
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 898B done
#2 DONE 0.0s
#3 [internal] load metadata for docker.io/library/amazoncorretto:8
#3 ...
#4 [auth] library/amazoncorretto:pull token for registry-1.docker.io
#4 DONE 0.0s
#3 [internal] load metadata for docker.io/library/amazoncorretto:8
#3 DONE 2.3s
#5 [1/8] FROM docker.io/library/amazoncorretto:8@sha256:a07ee8b023bbd2110daede9bf18a3700f81ff1f5b3aa10ae9b6034c0c4960fc4
#5 DONE 0.0s
#6 [internal] load build context
#6 transferring context: 38B done
#6 DONE 0.0s
#7 [6/8] COPY --chown=appuser:appgroup ./embulk.properties /home/appuser/.embulk/embulk.properties
#7 CACHED
#8 [3/8] RUN yum install -y shadow-utils mysql jq && yum clean all
#8 CACHED
#9 [4/8] RUN groupadd -g 1001 appgroup && useradd -u 1001 -g appgroup appuser
#9 CACHED
#10 [5/8] RUN curl -o ./embulk.jar -L https://dl.embulk.org/embulk-0.11.2.jar && curl -o ./jruby-complete.jar -L https://repo1.maven.org/maven2/org/jruby/jruby-complete/9.4.5.0/jruby-complete-9.4.5.0.jar && chmod +x ./embulk.jar ./jruby-complete.jar
#10 CACHED
#11 [2/8] WORKDIR /app
#11 CACHED
#12 [7/8] RUN chown -R appuser:appgroup /app
#12 CACHED
#13 [8/8] RUN java -jar ./embulk.jar gem install embulk -v 0.11.2 && java -jar ./embulk.jar gem install embulk-output-bigquery embulk-filter-column embulk-input-s3 embulk-input-mysql liquid msgpack
#13 0.623 2024-02-29 02:37:57.718 +0000 [INFO] (main): m2_repo is set as a sub directory of embulk_home: /home/appuser/.embulk/lib/m2/repository
#13 0.623 2024-02-29 02:37:57.721 +0000 [INFO] (main): gem_home is set as a sub directory of embulk_home: /home/appuser/.embulk/lib/gems
#13 0.623 2024-02-29 02:37:57.721 +0000 [INFO] (main): gem_path is set empty.
#13 0.623 2024-02-29 02:37:57.721 +0000 [DEBUG] (main): Embulk system property "default_guess_plugin" is set to: "gzip,bzip2,json,csv"
#13 0.726 2024-02-29 02:37:57.824 +0000 [INFO] (main): Loaded JRuby runtime 9.4.5.0
#13 1.876 2024-02-29 02:37:58.974 +0000 [INFO] (main): Environment variable "GEM_HOME" is not set. Setting "GEM_HOME" to "/home/appuser/.embulk/lib/gems" from Embulk system property "gem_home" for the "gem" command.
#13 5.033 Fetching msgpack-1.7.2-java.gem
#13 5.047 Fetching embulk-0.11.2-java.gem
#13 5.192 Successfully installed msgpack-1.7.2-java
#13 5.492 Successfully installed embulk-0.11.2-java
#13 5.851 Parsing documentation for msgpack-1.7.2-java
#13 6.318 Installing ri documentation for msgpack-1.7.2-java
#13 6.459 Parsing documentation for embulk-0.11.2-java
#13 6.960 Installing ri documentation for embulk-0.11.2-java
#13 7.226 Done installing documentation for msgpack, embulk after 1 seconds
#13 7.226 2 gems installed
#13 7.227 Exiting RubyGems with exit_code 0
#13 7.589 2024-02-29 02:38:04.684 +0000 [INFO] (main): m2_repo is set as a sub directory of embulk_home: /home/appuser/.embulk/lib/m2/repository
#13 7.589 2024-02-29 02:38:04.686 +0000 [INFO] (main): gem_home is set as a sub directory of embulk_home: /home/appuser/.embulk/lib/gems
#13 7.589 2024-02-29 02:38:04.686 +0000 [INFO] (main): gem_path is set empty.
#13 7.589 2024-02-29 02:38:04.686 +0000 [DEBUG] (main): Embulk system property "default_guess_plugin" is set to: "gzip,bzip2,json,csv"
#13 7.639 2024-02-29 02:38:04.736 +0000 [INFO] (main): Loaded JRuby runtime 9.4.5.0
#13 8.672 2024-02-29 02:38:05.769 +0000 [INFO] (main): Environment variable "GEM_HOME" is not set. Setting "GEM_HOME" to "/home/appuser/.embulk/lib/gems" from Embulk system property "gem_home" for the "gem" command.
#13 13.38 Fetching thor-1.3.1.gem
#13 13.41 Fetching concurrent-ruby-1.2.3.gem
#13 13.44 Fetching tzinfo-2.0.6.gem
#13 13.45 Fetching rexml-3.2.6.gem
#13 13.47 Fetching retriable-3.1.2.gem
#13 13.48 Fetching time_with_zone-0.3.1.gem
#13 13.49 Fetching thwait-0.2.0.gem
#13 13.50 Fetching e2mmap-0.1.0.gem
#13 13.59 Fetching uber-0.1.0.gem
#13 13.60 Fetching trailblazer-option-0.1.2.gem
#13 13.62 Fetching representable-3.2.0.gem
#13 13.66 Fetching mini_mime-1.1.5.gem
#13 13.68 Fetching httpclient-2.8.3.gem
#13 13.72 Fetching multi_json-1.15.0.gem
#13 13.73 Fetching jwt-2.8.0.gem
#13 13.75 Fetching declarative-0.0.20.gem
#13 13.77 Fetching faraday-2.9.0.gem
#13 13.81 Fetching faraday-net_http-3.1.0.gem
#13 13.83 Fetching public_suffix-5.0.4.gem
#13 13.86 Fetching google-apis-core-0.14.0.gem
#13 13.87 Fetching signet-0.19.0.gem
#13 13.88 Fetching os-1.1.4.gem
#13 13.89 Fetching google-cloud-env-2.1.1.gem
#13 13.91 Fetching googleauth-1.11.0.gem
#13 13.92 Fetching addressable-2.8.6.gem
#13 13.98 Fetching google-apis-discovery_v1-0.16.0.gem
#13 14.01 Fetching gems-1.2.0.gem
#13 14.04 Fetching zeitwerk-2.6.13.gem
#13 14.05 Fetching minitest-5.22.2.gem
#13 14.07 Fetching i18n-1.14.1.gem
#13 14.09 Fetching activesupport-6.1.7.7.gem
#13 14.11 Fetching embulk-output-bigquery-0.7.0.gem
#13 14.12 Fetching google-apis-generator-0.14.0.gem
#13 14.14 Fetching google-api-client-0.53.0.gem
#13 14.89 Successfully installed concurrent-ruby-1.2.3
#13 15.04 Successfully installed tzinfo-2.0.6
#13 15.13 Successfully installed time_with_zone-0.3.1
#13 15.17 Successfully installed e2mmap-0.1.0
#13 15.21 Successfully installed thwait-0.2.0
#13 15.33 Successfully installed thor-1.3.1
#13 15.47 Successfully installed rexml-3.2.6
#13 15.52 Successfully installed retriable-3.1.2
#13 15.57 Successfully installed uber-0.1.0
#13 15.61 Successfully installed trailblazer-option-0.1.2
#13 15.65 Successfully installed declarative-0.0.20
#13 15.81 Successfully installed representable-3.2.0
#13 15.88 Successfully installed mini_mime-1.1.5
#13 16.00 Successfully installed httpclient-2.8.3
#13 16.08 Successfully installed multi_json-1.15.0
#13 16.20 Successfully installed jwt-2.8.0
#13 16.23 Successfully installed faraday-net_http-3.1.0
#13 16.37 Successfully installed faraday-2.9.0
#13 16.41 Successfully installed public_suffix-5.0.4
#13 16.53 Successfully installed addressable-2.8.6
#13 16.57 Successfully installed signet-0.19.0
#13 16.61 Successfully installed os-1.1.4
#13 16.64 Successfully installed google-cloud-env-2.1.1
#13 16.72 Successfully installed googleauth-1.11.0
#13 16.79 Successfully installed google-apis-core-0.14.0
#13 16.85 Successfully installed google-apis-discovery_v1-0.16.0
#13 16.88 Successfully installed gems-1.2.0
#13 16.91 Successfully installed zeitwerk-2.6.13
#13 16.96 Successfully installed minitest-5.22.2
#13 17.02 Successfully installed i18n-1.14.1
#13 17.40 Successfully installed activesupport-6.1.7.7
#13 17.45 Successfully installed google-apis-generator-0.14.0
#13 19.52 *******************************************************************************
#13 19.52 The google-api-client gem is deprecated and will likely not be updated further.
#13 19.52
#13 19.52 Instead, please install the gem corresponding to the specific service to use.
#13 19.52 For example, to use the Google Drive V3 client, install google-apis-drive_v3.
#13 19.52 For more information, see the FAQ in the OVERVIEW.md file or the YARD docs.
#13 19.52 *******************************************************************************
#13 19.52 Successfully installed google-api-client-0.53.0
#13 19.56 Successfully installed embulk-output-bigquery-0.7.0
#13 19.89 Parsing documentation for concurrent-ruby-1.2.3
#13 21.63 Installing ri documentation for concurrent-ruby-1.2.3
#13 28.74 Parsing documentation for tzinfo-2.0.6
#13 29.06 Installing ri documentation for tzinfo-2.0.6
#13 29.86 Parsing documentation for time_with_zone-0.3.1
#13 29.87 Installing ri documentation for time_with_zone-0.3.1
#13 29.89 Parsing documentation for e2mmap-0.1.0
#13 29.90 Installing ri documentation for e2mmap-0.1.0
#13 29.92 Parsing documentation for thwait-0.2.0
#13 29.94 Installing ri documentation for thwait-0.2.0
#13 29.96 Parsing documentation for thor-1.3.1
#13 30.44 Installing ri documentation for thor-1.3.1
#13 30.65 Parsing documentation for rexml-3.2.6
#13 31.30 Installing ri documentation for rexml-3.2.6
#13 31.99 Parsing documentation for retriable-3.1.2
#13 32.01 Installing ri documentation for retriable-3.1.2
#13 32.02 Parsing documentation for uber-0.1.0
#13 32.04 Installing ri documentation for uber-0.1.0
#13 32.07 Parsing documentation for trailblazer-option-0.1.2
#13 32.08 Installing ri documentation for trailblazer-option-0.1.2
#13 32.10 Parsing documentation for declarative-0.0.20
#13 32.13 Installing ri documentation for declarative-0.0.20
#13 32.16 Parsing documentation for representable-3.2.0
#13 32.36 Installing ri documentation for representable-3.2.0
#13 32.48 Parsing documentation for mini_mime-1.1.5
#13 32.52 Installing ri documentation for mini_mime-1.1.5
#13 32.61 Parsing documentation for httpclient-2.8.3
#13 32.98 Installing ri documentation for httpclient-2.8.3
#13 33.31 Parsing documentation for multi_json-1.15.0
#13 33.40 Installing ri documentation for multi_json-1.15.0
#13 33.44 Parsing documentation for jwt-2.8.0
#13 33.60 Installing ri documentation for jwt-2.8.0
#13 33.68 Parsing documentation for faraday-net_http-3.1.0
#13 33.70 Installing ri documentation for faraday-net_http-3.1.0
#13 33.71 Parsing documentation for faraday-2.9.0
#13 33.90 Installing ri documentation for faraday-2.9.0
#13 34.06 Parsing documentation for public_suffix-5.0.4
#13 34.10 Installing ri documentation for public_suffix-5.0.4
#13 34.14 Parsing documentation for addressable-2.8.6
#13 34.28 Installing ri documentation for addressable-2.8.6
#13 34.48 Parsing documentation for signet-0.19.0
#13 34.60 Installing ri documentation for signet-0.19.0
#13 34.71 Parsing documentation for os-1.1.4
#13 34.74 Installing ri documentation for os-1.1.4
#13 34.78 Parsing documentation for google-cloud-env-2.1.1
#13 34.85 Installing ri documentation for google-cloud-env-2.1.1
#13 34.91 Parsing documentation for googleauth-1.11.0
#13 35.12 Installing ri documentation for googleauth-1.11.0
#13 35.26 Parsing documentation for google-apis-core-0.14.0
#13 35.43 Installing ri documentation for google-apis-core-0.14.0
#13 35.52 Parsing documentation for google-apis-discovery_v1-0.16.0
#13 35.61 Installing ri documentation for google-apis-discovery_v1-0.16.0
#13 35.69 Parsing documentation for gems-1.2.0
#13 35.73 Installing ri documentation for gems-1.2.0
#13 35.75 Parsing documentation for zeitwerk-2.6.13
#13 35.84 Installing ri documentation for zeitwerk-2.6.13
#13 35.89 Parsing documentation for minitest-5.22.2
#13 36.08 Installing ri documentation for minitest-5.22.2
#13 36.22 Parsing documentation for i18n-1.14.1
#13 36.47 Installing ri documentation for i18n-1.14.1
#13 36.58 Parsing documentation for activesupport-6.1.7.7
#13 37.57 Installing ri documentation for activesupport-6.1.7.7
#13 38.13 Parsing documentation for google-apis-generator-0.14.0
#13 38.19 Installing ri documentation for google-apis-generator-0.14.0
#13 38.38 Parsing documentation for google-api-client-0.53.0
#13 2149.1 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
#13 2149.1 at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
#13 2149.1 at java.nio.CharBuffer.allocate(CharBuffer.java:335)
#13 2149.1 at org.jruby.RubyEncoding$UTF8Coder.<init>(RubyEncoding.java:376)
#13 2149.1 at org.jruby.RubyEncoding.getUTF8Coder(RubyEncoding.java:458)
#13 2149.1 at org.jruby.RubyEncoding.doEncodeUTF8(RubyEncoding.java:225)
#13 2149.1 at org.jruby.RubyString.encodeBytelist(RubyString.java:6782)
#13 2149.1 at org.jruby.RubyString.<init>(RubyString.java:402)
#13 2149.1 at org.jruby.RubyString.newString(RubyString.java:482)
#13 2149.1 at org.jruby.Ruby.newString(Ruby.java:3569)
#13 2149.1 at org.jruby.ext.ripper.RubyRipper.lex_state_name(RubyRipper.java:393)
#13 2149.1 at org.jruby.ext.ripper.RubyRipper$INVOKER$s$1$0$lex_state_name.call(RubyRipper$INVOKER$s$1$0$lex_state_name.gen)
#13 2149.1 at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:242)
#13 2149.1 at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.ripper.lexer.invokeOther1:lex_state_name(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/ripper/lexer.rb:61)
#13 2149.1 at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.ripper.lexer.RUBY$method$initialize$0(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/ripper/lexer.rb:61)
#13 2149.1 at java.lang.invoke.LambdaForm$DMH/156947070.invokeStatic_L7_L(LambdaForm$DMH)
#13 2149.1 at java.lang.invoke.LambdaForm$MH/1934418561.invokeExact_MT(LambdaForm$MH)
#13 2149.1 at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:165)
#13 2149.1 at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:185)
#13 2149.1 at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:257)
#13 2149.1 at org.jruby.RubyClass.newInstance(RubyClass.java:904)
#13 2149.1 at org.jruby.RubyClass$INVOKER$i$newInstance.call(RubyClass$INVOKER$i$newInstance.gen)
#13 2149.1 at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodZeroOrOneOrNBlock.call(JavaMethod.java:355)
#13 2149.1 at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:242)
#13 2149.1 at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.ripper.lexer.invokeOther3:new(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/ripper/lexer.rb:94)
#13 2149.1 at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.ripper.lexer.RUBY$method$initialize$0(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/ripper/lexer.rb:94)
#13 2149.1 at java.lang.invoke.LambdaForm$DMH/1739595981.invokeStatic_L7_L(LambdaForm$DMH)
#13 2149.1 at java.lang.invoke.LambdaForm$MH/1934418561.invokeExact_MT(LambdaForm$MH)
#13 2149.1 at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)
#13 2149.1 at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)
#13 2149.1 at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:90)
#13 2149.1 at org.jruby.RubyClass.newInstance(RubyClass.java:931)
#13 2149.1 at org.jruby.RubyClass$INVOKER$i$newInstance.call(RubyClass$INVOKER$i$newInstance.gen)
#13 ERROR: process "/bin/sh -c java -jar ./embulk.jar gem install embulk -v ${EMBULK_VERSION} && java -jar ./embulk.jar gem install embulk-output-bigquery embulk-filter-column embulk-input-s3 embulk-input-mysql liquid msgpack" did not complete successfully: exit code: 1
------
> [8/8] RUN java -jar ./embulk.jar gem install embulk -v 0.11.2 && java -jar ./embulk.jar gem install embulk-output-bigquery embulk-filter-column embulk-input-s3 embulk-input-mysql liquid msgpack:
#13 2149.1 at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:242)
#13 2149.1 at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.ripper.lexer.invokeOther3:new(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/ripper/lexer.rb:94)
#13 2149.1 at uri_3a_classloader_3a_.META_minus_INF.jruby_dot_home.lib.ruby.stdlib.ripper.lexer.RUBY$method$initialize$0(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/ripper/lexer.rb:94)
#13 2149.1 at java.lang.invoke.LambdaForm$DMH/1739595981.invokeStatic_L7_L(LambdaForm$DMH)
#13 2149.1 at java.lang.invoke.LambdaForm$MH/1934418561.invokeExact_MT(LambdaForm$MH)
#13 2149.1 at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)
#13 2149.1 at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:112)
#13 2149.1 at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:90)
#13 2149.1 at org.jruby.RubyClass.newInstance(RubyClass.java:931)
#13 2149.1 at org.jruby.RubyClass$INVOKER$i$newInstance.call(RubyClass$INVOKER$i$newInstance.gen)
------
Dockerfile:24
--------------------
23 |
24 | >>> RUN java -jar ./embulk.jar gem install embulk -v ${EMBULK_VERSION} && \
25 | >>> java -jar ./embulk.jar gem install embulk-output-bigquery embulk-filter-column embulk-input-s3 embulk-input-mysql liquid msgpack
26 |
--------------------
ERROR: failed to solve: process "/bin/sh -c java -jar ./embulk.jar gem install embulk -v ${EMBULK_VERSION} && java -jar ./embulk.jar gem install embulk-output-bigquery embulk-filter-column embulk-input-s3 embulk-input-mysql liquid msgpack" did not complete successfully: exit code: 1
I would appreciate any feedback or input on this proposed update. Best regards,
CI failed on JRuby 9.3.10.0
> bundle install
/home/runner/.rubies/jruby-9.3.10.0/bin/bundle config --local path /home/runner/work/embulk-output-bigquery/embulk-output-bigquery/vendor/bundle
/home/runner/.rubies/jruby-9.3.10.0/bin/bundle lock
Fetching gem metadata from https://rubygems.org/........
Resolving dependencies......
Bundler found conflicting requirements for the Ruby version:
In Gemfile:
Ruby
bundler (>= 1.10.6) was resolved to 2.2.33, which depends on
Ruby (>= 2.3.0)
pry-nav was resolved to 1.0.0, which depends on
pry (< 0.15, >= 0.9.10) was resolved to 0.14.2, which depends on
coderay (~> 1.1) was resolved to 1.1.3, which depends on
Ruby (>= 1.8.6)
embulk-output-bigquery was resolved to 0.7.0, which depends on
googleauth (= 1.10.0) was resolved to 1.10.0, which depends on
Ruby (>= 2.7)
embulk (= 0.10.49) was resolved to 0.10.49, which depends on
msgpack (>= 1.1.0) was resolved to 1.7.2, which depends on
Ruby (>= 2.5)
pry-nav was resolved to 1.0.0, which depends on
pry (< 0.15, >= 0.9.10) was resolved to 0.14.2, which depends on
Ruby (>= 2.0.0)
pry-nav was resolved to 1.0.0, which depends on
Ruby (>= 2.1.0)
Error: The process '/home/runner/.rubies/jruby-9.3.10.0/bin/bundle' failed with exit code 6
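The conflict in the Bundler output above can be reproduced with RubyGems' own version classes. This is a minimal sketch that assumes JRuby 9.3.x reports a Ruby 2.6-compatible version (2.6.8 is used here as an assumed value); the requirement strings are copied from the log.

```ruby
require "rubygems"

# Assumption: JRuby 9.3.x is compatible with Ruby 2.6 and reports 2.6.8.
jruby_93_ruby_version = Gem::Version.new("2.6.8")

# Required Ruby versions, copied from the Bundler output above.
required_ruby = {
  "bundler 2.2.33"    => ">= 2.3.0",
  "coderay 1.1.3"     => ">= 1.8.6",
  "googleauth 1.10.0" => ">= 2.7",
  "msgpack 1.7.2"     => ">= 2.5",
  "pry 0.14.2"        => ">= 2.0.0",
  "pry-nav 1.0.0"     => ">= 2.1.0",
}

# Keep only the gems whose Ruby requirement the assumed version does not meet.
unsatisfied_map = required_ruby.reject do |_gem_name, requirement|
  Gem::Requirement.new(requirement).satisfied_by?(jruby_93_ruby_version)
end
unsatisfied = unsatisfied_map.keys

puts unsatisfied.inspect
```

Under that assumption, only the googleauth pin (Ruby >= 2.7) is unsatisfied, which would explain why the lock step fails on this JRuby version.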
Have you tried the docker build -m option?
https://docs.docker.com/config/containers/resource_constraints/
@hiroyuki-sato Thanks for your comment! I fixed the code based on your review comments. Please check the changes when you have time.
Could you tell me why you want to update gems?
I want to be able to install embulk-output-bigquery in a 2GB or 4GB memory environment, so I want to update the gems.
I also tried `docker build -t embulk:hoge -m 4g .`, but I still got a java.lang.OutOfMemoryError.
LGTM👍 Could you squash commits? I'll request another maintainer's review.
I squashed commits. Thank you!
@Nozomuts LGTM 👍 Thanks
@joker1007 Could you review this PR when you get a chance? This PR reduces the storage size of the dependency gems.
The total size of the dependency gems decreases dramatically.
Dependency gem size
| Before this PR | After this PR |
|---|---|
| 124 MB | 10 MB |
I also tested this PR locally.
Thanks!