webmock
webmock copied to clipboard
webmock failures inside sidekiq workers not bubbled up to rspec
Using webmock 3.8.3 with ruby 2.6.6 and rails 5.2.4
This is related to #897 now.
When I enable and then disable, I see the webmock failures in the logs, but they are not exposed up to rspec.
In other words, the request fails properly:
error: Real HTTP connections are disabled. Unregistered request: GET.......
But the test is considered passed because the failure inside the worker doesn't bubble up to rspec:
Finished in 0.55596 seconds (files took 4.35 seconds to load)
1 example, 0 failures
I'm not sure if this is expected or not.
I also added the above as the last comment. The rest of this post and my other comments is mostly background details that aren't totally relevant at this point. I also updated the title of the issue to reflect the "real" problem, as it was mostly user error the first threve-teen go throughs.
Test scenario (rspec):
it "should fetch token vector for article content when content is not nil" do
require 'webmock/rspec'
# don't wait for async sidekiq job
Sidekiq::Testing.inline!
request_url = "#{ENV['TOKENIZER_URL']}/tokenize"
# webmock the expected request and response
stub = stub_request(:post, request_url)
.with(body: 'content=y')
.to_return(body: 'y')
# create an attrs to hand to ingest
attrs = {
data_source: @data_source,
title: Faker::Book.title,
url: Faker::Internet.url,
content: "y",
abstract: Faker::Book.genre,
published_on: DateTime.now,
created_at: DateTime.now
}
# ingest an article from the attrs
status = Article.ingest(attrs)
# the ingest occurs roughly simultaneously to the submission to the
# worker so we need to re-fetch the article by the id because at that
# point it will have gotten the vector saved to the DB
@token_vector_article = Article.find(status[1].id)
# we should have sent a request with content=y
expect(stub).to have_been_requested
# we should've saved "y" as the token_vector
expect(@token_vector_article.token_vector).not_to eq(nil)
expect(@token_vector_article.token_vector).to eq("y")
end
Result:
root@c18df30d6d22:/usr/src/app# bundle exec rspec spec/models/article_spec.rb:174
database: seraph_aa_test
Run options: include {:locations=>{"./spec/models/article_spec.rb"=>[174]}}
.2020-06-19T12:34:48.961Z 2693 TID-gso1icmp1 INFO: before connection
2020-06-19T12:34:49.068Z 2693 TID-gso1icmp1 INFO: after connection
F
Failures:
1) Article tokenization and vectorization should fetch token vector for article content when content is not nil
Failure/Error: expect(stub).to have_been_requested
The request POST https://xxx/tokenize with body "content=y" was expected to execute 1 time but it executed 0 times
The following requests were made:
No requests were made.
============================================================
# ./spec/models/article_spec.rb:213:in `block (3 levels) in <top (required)>'
# ./spec/rails_helper.rb:49:in `block (3 levels) in <top (required)>'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/generic/base.rb:16:in `cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:87:in `block (2 levels) in cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:88:in `cleaning'
# ./spec/rails_helper.rb:48:in `block (2 levels) in <top (required)>'
Finished in 32.2 seconds (files took 3.93 seconds to load)
2 examples, 1 failure
Failed examples:
rspec ./spec/models/article_spec.rb:180 # Article tokenization and vectorization should fetch token vector for article content when content is not nil
You can see that the Sidekiq worker is being executed because its log messages are showing (before connection / after connection)
Here's the code in the worker:
class TokenizeAndVectorizeWorker
include Sidekiq::Worker
sidekiq_options queue: 'tokenizer_vectorizer', retry: true, backtrace: true
def perform(article_id)
article = Article.find(article_id)
tokenizer_url = ENV['TOKENIZER_URL']
if article.content.nil?
send_content = article.abstract
else
send_content = article.content
end
# configure Faraday
conn = Faraday.new(tokenizer_url) do |c|
c.use Faraday::Response::RaiseError
c.headers['Content-Type'] = 'application/x-www-form-urlencoded'
end
logger.info('before connection')
# get the response from the tokenizer
resp = conn.post '/tokenize', "content=#{URI.encode(send_content)}"
logger.info('after connection')
# the response's body contains the JSON for the tokenized and vectorized article content
article.token_vector = resp.body
article.save
end
end
You can see where I am making the Faraday call and it is definitely reaching out to the remote system because I get back an expected output from that remote system:
Article Update (1.2ms) UPDATE "articles" SET "token_vector" = $1, "updated_at" = $2, "data" = $3 WHERE "articles"."id" = $4 [["token_vector", "{\"py/object\": \"scipy.sparse.csr.csr_matrix\", \"_shape\": {\"py/tuple\": [1, 262144]}, \"maxprint\": 50, \"indices\": {\"py/reduce\": [{\"py/function\": \"numpy.core.multiarray._reconstruct\"}, {\"py/tuple\": [{\"py/type\": \"numpy.ndarray\"}, {\"py/tuple\": [0]}, {\"py/b64\": \"Yg==\"}]}, {\"py/tuple\": [1, {\"py/tuple\": [1]}, {\"py/reduce\": [{\"py/type\": \"numpy.dtype\"}, {\"py/tuple\": [\"i4\", 0, 1]}, {\"py/tuple\": [3, \"<\", null, null, null, -1, -1, 0]}]}, false, {\"py/b64\": \"FpIBAA==\"}]}]}, \"indptr\": {\"py/reduce\": [{\"py/function\": \"numpy.core.multiarray._reconstruct\"}, {\"py/tuple\": [{\"py/type\": \"numpy.ndarray\"}, {\"py/tuple\": [0]}, {\"py/b64\": \"Yg==\"}]}, {\"py/tuple\": [1, {\"py/tuple\": [2]}, {\"py/id\": 2}, false, {\"py/b64\": \"AAAAAAEAAAA=\"}]}]}, \"data\": {\"py/reduce\": [{\"py/function\": \"numpy.core.multiarray._reconstruct\"}, {\"py/tuple\": [{\"py/type\": \"numpy.ndarray\"}, {\"py/tuple\": [0]}, {\"py/b64\": \"Yg==\"}]}, {\"py/tuple\": [1, {\"py/tuple\": [1]}, {\"py/reduce\": [{\"py/type\": \"numpy.dtype\"}, {\"py/tuple\": [\"f8\", 0, 1]}, {\"py/tuple\": [3, \"<\", null, null, null, -1, -1, 0]}]}, false, {\"py/b64\": \"AAAAAAAA8D8=\"}]}]}, \"_has_canonical_format\": true, \"_has_sorted_indices\": true}"], ["updated_at", "2020-06-19 12:34:49.078913"], ["data", "{\"tags\":[],\"tag_ids\":[],\"added_tags\":[],\"removed_tags\":[],\"bookmarks\":[]}"], ["id", 11]]
If I put require 'webmock/rspec'
at the top of this particular test file or even in my spec_helper
, it causes other things in my test to blow up (namely the integration with Elasticsearch and setting up the tests):
# top of spec file
require 'spec_helper'
require 'rails_helper'
require 'sidekiq/testing'
require 'webmock/rspec'
Result:
root@c18df30d6d22:/usr/src/app# bundle exec rspec spec/models/article_spec.rb:174
database: seraph_aa_test
Run options: include {:locations=>{"./spec/models/article_spec.rb"=>[174]}}
There was an error creating the elasticsearch index for Article: #<NameError: uninitialized constant Faraday::Error::ConnectionFailed>
There was an error removing the elasticsearch index for Article: #<NameError: uninitialized constant Faraday::Error::ConnectionFailed>
F
Failures:
1) Article extract image url article with improper image url
Failure/Error: Article.import
ArgumentError:
seraph_aa_application_test does not exist to be imported into. Use create_index! or the :force option to create it.
# /usr/local/bundle/gems/elasticsearch-model-5.1.0/lib/elasticsearch/model/importing.rb:118:in `import'
# /usr/local/bundle/gems/elasticsearch-model-5.1.0/lib/elasticsearch/model.rb:118:in `import'
# ./spec/models/article_spec.rb:18:in `block (2 levels) in <top (required)>'
# ./spec/rails_helper.rb:49:in `block (3 levels) in <top (required)>'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/generic/base.rb:16:in `cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:87:in `block (2 levels) in cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:88:in `cleaning'
# ./spec/rails_helper.rb:48:in `block (2 levels) in <top (required)>'
Finished in 0.50151 seconds (files took 4.75 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/models/article_spec.rb:166 # Article extract image url article with improper image url
It seems to "work" with webmock/rspec
loaded in the specific test in the sense that it's clearly showing what webmock expects. However it's not detecting any requests being made, let alone the wrong request.
Related to: https://stackoverflow.com/questions/62461780/how-do-i-test-whether-a-sidekiq-worker-is-sending-the-right-data-to-an-external
I tried adding a WebMock.ebable!
to the test per #622 but that just ends up breaking ElasticSearch:
1) Article tokenization and vectorization should fetch token vector for article content when content is not nil
Failure/Error: record.__elasticsearch__.__send__ "#{operation}_document"
NameError:
uninitialized constant Faraday::Error::ConnectionFailed
# /usr/local/bundle/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/http/faraday.rb:47:in `host_unreachable_exceptions'
# /usr/local/bundle/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/base.rb:282:in `rescue in perform_request'
# /usr/local/bundle/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/base.rb:252:in `perform_request'
# /usr/local/bundle/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
# /usr/local/bundle/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/client.rb:131:in `perform_request'
# /usr/local/bundle/gems/elasticsearch-api-5.0.5/lib/elasticsearch/api/actions/index.rb:100:in `index'
# /usr/local/bundle/gems/elasticsearch-model-5.1.0/lib/elasticsearch/model/indexing.rb:346:in `index_document'
# ./app/workers/indexer.rb:27:in `perform'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/testing.rb:301:in `execute_job'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/testing.rb:296:in `block in process_job'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/middleware/chain.rb:128:in `block in invoke'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/middleware/chain.rb:133:in `invoke'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/testing.rb:295:in `process_job'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/testing.rb:89:in `block in raw_push'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/testing.rb:85:in `each'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/testing.rb:85:in `raw_push'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/client.rb:74:in `push'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/worker.rb:149:in `client_push'
# /usr/local/bundle/gems/sidekiq-5.2.8/lib/sidekiq/worker.rb:93:in `perform_async'
# ./app/models/concerns/searchable.rb:74:in `block (2 levels) in <module:Searchable>'
# /usr/local/bundle/gems/airbrake-6.3.0/lib/airbrake/rails/active_record.rb:27:in `run_callbacks'
# ./app/models/article.rb:306:in `ingest'
# ./spec/models/article_spec.rb:205:in `block (3 levels) in <top (required)>'
# ./spec/rails_helper.rb:49:in `block (3 levels) in <top (required)>'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/generic/base.rb:16:in `cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:87:in `block (2 levels) in cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:88:in `cleaning'
# ./spec/rails_helper.rb:48:in `block (2 levels) in <top (required)>'
# ------------------
# --- Caused by: ---
# WebMock::NetConnectNotAllowedError:
# Real HTTP connections are disabled. Unregistered request: PUT http://localhost:9200/seraph_aa_application_test/article/11 with
...
So, I think I really only want to webmock specific requests -- this doesn't seem like the right path.
Continuing my troubleshooting, I tried the following in the test:
WebMock.disable_net_connect!(allow_localhost: true)
This allowed the ElasticSearch things to work, but I still had the problem of Sidekiq's call of Faraday not being intercepted:
1) Article tokenization and vectorization should fetch token vector for article content when content is not nil
Failure/Error: expect(stub).to have_been_requested
The request POST https://xxx/tokenize with body "content=y" was expected to execute 1 time but it executed 0 times
The following requests were made:
No requests were made.
============================================================
So this appears to me to definitely be some kind of issue with Sidekiq's test environment not being affected by webmock.
This might be related to #578 because I'm using a connection object. I'm currently trying to figure out how to NOT use a connection object (to validate the problem) but still make the same configuration options.
I tried switching to Excon because of #578 and I am seeing the same behavior.
Worker code:
class TokenizeAndVectorizeWorker
include Sidekiq::Worker
sidekiq_options queue: 'tokenizer_vectorizer', retry: true, backtrace: true
def perform(article_id)
article = Article.find(article_id)
tokenizer_url = ENV['TOKENIZER_URL']
if article.content.nil?
send_content = article.abstract
else
send_content = article.content
end
response = Excon.post("#{tokenizer_url}/tokenize",
:body => URI.encode_www_form(:content => send_content),
:headers => { "Content-Type" => "application/x-www-form-urlencoded" },
:expects => [200])
# the response's body contains the JSON for the tokenized and vectorized article content
article.token_vector = response.body
article.save
end
end
Test:
it "should fetch token vector for article content when content is not nil" do
# don't wait for async sidekiq job
Sidekiq::Testing.inline!
require 'webmock/rspec'
request_url = "#{ENV['TOKENIZER_URL']'}tokenize"
# webmock the expected request and response
stub = stub_request(:post, request_url)
.with(body: 'content=y')
.to_return(body: 'y')
# create an attrs to hand to ingest
attrs = {
data_source: @data_source,
title: Faker::Book.title,
url: Faker::Internet.url,
content: "y",
abstract: Faker::Book.genre,
published_on: DateTime.now,
created_at: DateTime.now
}
# ingest an article from the attrs
status = Article.ingest(attrs)
# the ingest occurs roughly simultaneously to the submission to the
# worker so we need to re-fetch the article by the id because at that
# point it will have gotten the vector saved to the DB
@token_vector_article = Article.find(status[1].id)
# we should have sent a request with content=y
expect(stub).to have_been_requested
# we should've saved "y" as the token_vector
expect(@token_vector_article.token_vector).not_to eq(nil)
expect(@token_vector_article.token_vector).to eq("y")
end
Result:
root@c18df30d6d22:/usr/src/app# bundle exec rspec spec/models/article_spec.rb:174
database: seraph_aa_test
Run options: include {:locations=>{"./spec/models/article_spec.rb"=>[174]}}
.F
Failures:
1) Article tokenization and vectorization should fetch token vector for article content when content is not nil
Failure/Error: expect(stub).to have_been_requested
The request POST https://xxx/tokenize with body "content=y" was expected to execute 1 time but it executed 0 times
The following requests were made:
No requests were made.
============================================================
# ./spec/models/article_spec.rb:212:in `block (3 levels) in <top (required)>'
# ./spec/rails_helper.rb:49:in `block (3 levels) in <top (required)>'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/generic/base.rb:16:in `cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:87:in `block (2 levels) in cleaning'
# /usr/local/bundle/gems/database_cleaner-1.8.4/lib/database_cleaner/configuration.rb:88:in `cleaning'
# ./spec/rails_helper.rb:48:in `block (2 levels) in <top (required)>'
Finished in 3.03 seconds (files took 3.86 seconds to load)
2 examples, 1 failure
Failed examples:
rspec ./spec/models/article_spec.rb:180 # Article tokenization and vectorization should fetch token vector for article content when content is not nil
How could I use binding.pry
inside the worker to see if WebMock is "present" and listening for / intercepting HTTP requests?
I switched to using excon and simplified things with a service class, so closing this.
I'm reopening this because I ran into it again in a different place. I have several workers that interact with the internet. I don't think I should have to use service classes for webmock to intercept traffic happening inside a Sidekiq worker. Or there's something I don't understand about how to make it do so.
Simply put, if you have a model that does SomethingWorker.perform
and that Sidekiq worker does Excon.get
, that request is not seen by Webmock, even when that test is "inline". My guess is it's because the worker is not being directly launched via rspec, but rather rspec is doing something with a model and the model (which doesn't load webmock?) is doing something with a worker (which also doesn't load webmock?) and the worker is what's interacting with the internet... and technically that worker is outside of rspec?
This is related to #897 now.
When I enable and then disable, I see the webmock failures in the logs, but they are not exposed up to rspec.
In other words, the request fails properly:
error: Real HTTP connections are disabled. Unregistered request: GET.......
But the test is considered passed because the failure inside the worker doesn't bubble up to rspec:
Finished in 0.55596 seconds (files took 4.35 seconds to load)
1 example, 0 failures
I'm not sure if this is expected or not.