WIP: Reduce use of Selenium in favour of rack-test
Just a proposal: would it make sense to use rack-test for system tests, where possible?
If we can reduce the use of Selenium, the test suite will run faster and more reliably. The code in this PR is just a quick experiment, showing that only 40% of system tests need Selenium. The relevant output is the following:
272 runs, 1285 assertions, 53 failures, 55 errors, 0 skips
I'm not familiar with rack-test, so can you explain a bit about how it works and how it differs from using Selenium?
Presumably it's not actually using a browser so all you're really doing is checking that the expected HTML is returned? So what does it offer over a controller or integration test?
Right, that's exactly it. The difference with controller tests is that you get to browse across different controllers, not just the one. The difference with integration tests is that you browse via clicking links, submitting forms, etc, as opposed to sending low-level HTTP requests.
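To illustrate the difference in style (the route names, labels, and page content below are hypothetical, not taken from the actual suite), the same flow might look like this in the two approaches — a minimal sketch assuming Capybara's rack-test driver:

```ruby
# Integration-test style: one low-level HTTP request per step.
class DiaryFlowTest < ActionDispatch::IntegrationTest
  test "posting a diary entry" do
    post "/diary_entries", :params => { :diary_entry => { :title => "Hello" } }
    follow_redirect!
    assert_response :success
  end
end

# System-test style with the rack-test driver: no browser process,
# but you navigate by clicking links and filling in forms, crossing
# controllers freely as a user would.
class DiaryFlowSystemTest < ApplicationSystemTestCase
  driven_by :rack_test

  test "posting a diary entry" do
    visit "/diary"
    click_on "New Diary Entry"
    fill_in "Title", :with => "Hello"
    click_on "Publish"
    assert_text "Hello"
  end
end
```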
In general, my view is that it's best to avoid Selenium unless you really need JS. In this case, the tests involved typically are about the profile section and similar, which don't need JS most of the time. Anywhere we think the JS is relevant, we can add back.
The implementation would look something like "Selenium by default, except in these tests where we use rack-test" or the other way around.
I'd be happy to see the tests updated to use rack-test by default, and a full browser only when needed for full javascript execution. I don't think there's any noticeable gap between the html generation of rack-test and a browser, whereas there were often issues back in the poltergeist-vs-real-browser days for JS execution.
For me, the "visit link" style testing is the most important bit. I'd prefer if the controller-style and integration-style tests were all rewritten away from raw POSTs etc (apart from the tests that deliberately try to inject incorrect form submissions to validate the error handling) but that's such a huge project I've never thought it was worthwhile.
I'm used to using rspec in other projects, and only tagging the specific system tests that need javascript with an option, e.g. `it "does something", :js do`. I don't know if there's a way to do similar tagging with minitest, or whether trying to do that becomes more complex (especially for novice developers) than it's worth.
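For reference, the RSpec pattern I mean looks roughly like this (a sketch; the driver names are Capybara's stock registrations, and the spec content is made up):

```ruby
# spec/rails_helper.rb
RSpec.configure do |config|
  # Fast driver by default for system specs.
  config.before(:each, :type => :system) do
    driven_by :rack_test
  end

  # Real browser only for specs tagged :js.
  config.before(:each, :type => :system, :js => true) do
    driven_by :selenium_chrome_headless
  end
end

# spec/system/dashboard_spec.rb
RSpec.describe "Dashboard", :type => :system do
  it "renders without JS" do
    visit "/dashboard"
    expect(page).to have_content("Dashboard")
  end

  it "does something", :js do
    visit "/dashboard"
    # real browser interactions here
  end
end
```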
So, as @gravitystorm says, I know he's long been keen to move away from controller and integration tests to system tests, and I understand the logic: you're actually properly testing the behaviour of the site as the user interacts with it. But the downside has always been the cost in time/resources of running everything through Selenium.
So it sounds like this is a solution to that, but with the downside that you're not really testing how the site will behave in a browser any more.
Is there a tension here with more use of turbo? Presumably rack-test won't be doing any additional loads that turbo might call for either or will it?
100% agree with leaving controller/integration tests for low-level testing. My experience is that these are typically used for testing specific technical details, variants of "happy paths" reflected in system tests, and malicious usage of APIs.
My main experience is also with RSpec, with the syntax shared by @gravitystorm. I want to think that with minitest it can be done with something like `driven_by :rack_test`, similar to how there are `driven_by` declarations on https://github.com/openstreetmap/openstreetmap-website/blob/master/test/system/embed_test.rb, but I haven't gone that far into testing this idea yet.
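If that pans out, the base-class change could be as small as the following sketch (based on the standard Rails generator output; the overriding test class name is hypothetical and I haven't verified this against the actual suite):

```ruby
# test/application_system_test_case.rb
require "test_helper"

class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  # Fast, browser-less driver by default.
  driven_by :rack_test
end

# A test class that genuinely needs JS overrides the driver wholesale:
class NoteLayersTest < ApplicationSystemTestCase
  driven_by :selenium, :using => :headless_chrome, :screen_size => [1400, 1400]
end
```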
Re: Turbo, yes, that's a tradeoff. With rack-test, links and form submissions will always trigger full-page loads. This creates a difference between what the tests do and what real users do. At this point the question is: how much do we trust Rails to do what it promises to do?
When using an external tool, the idea is that we don't need to test the tool itself, just the things we do with it, if that makes sense. Having said that, things can still go wrong if something is misconfigured, e.g. if we mislabel a Turbo frame by mistake. Still, we can go case by case: for example, using Selenium only for the happy paths, or for specific situations where there's a known risk.
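As a concrete (made-up) example of the kind of Turbo misconfiguration rack-test can't catch: if a frame's id and the responding template's frame id drift apart, a real browser shows an empty frame, while rack-test still sees the full HTML of the response and passes:

```erb
<%# app/views/messages/index.html.erb (hypothetical view) %>
<%= turbo_frame_tag "inbox" do %>
  <%= link_to "Older messages", messages_path(:page => 2) %>
<% end %>

<%# If the paginated response renders turbo_frame_tag "messages"
    instead of "inbox", Turbo finds no matching frame and the user
    sees nothing, but an assertion on the raw HTML under rack-test
    would still pass. %>
```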
In exchange, we gain a faster execution of the test suite, a reduction of flaky tests, and more easily debuggable system tests.
So it sounds like this is a solution to that, but with the downside that you're not really testing how the site will behave in a browser any more.
I have the "behave like a browser" split into two aspects in my mind:
- With regards to the manner that the tests are written in, I prefer "click_on link, fill_in foo, click submit" as more closely mimicking what a user is doing. The alternative is "assert page has_link(foo), get foo, assert_dom has_input_field, post bar" and I find that more fragile and generally less like mimicking the user experience. So that's in favour of the system test syntax.
- Then when it comes to the manner the tests are run, I'd love to just push everything through a real (albeit headless) browser, but for practical reasons it can be painfully slow. I console (delude?) myself that using e.g. rack-test (or similar) is like "using the site with javascript disabled". It's not ideal, but I haven't come across many actual problems with doing that.
I live in hope that someone can create something that's much closer to a real browser, in terms of layout engines and js execution, without having to actually be an entire browser and all the headaches that brings.
I don't know if there's a way to do similar tagging with minitest
- https://github.com/ordinaryzelig/minispec-metadata - but this only works if we're already using the spec syntax which would be a ton of work - almost at the "might as well use rspec" level of effort, perhaps.
- https://github.com/bernardoamc/minitag - as far as I can tell, this is only useful for choosing which tests to run, rather than being a source of metadata that e.g. `before` blocks can use to switch drivers. I could be wrong on this. I'm not keen on the syntax.
So I think the only approach is to be more explicit with choosing the driver in each class or in specific tests. Something like:
```ruby
class MixedDriverTest < ApplicationSystemTestCase
  driven_by :rack_test # default

  test "simple non-JS test" do
    visit "/status"
    assert_text "OK"
  end

  test "switch to selenium just for this test" do
    using_driver(:selenium, :using => :chrome) do
      visit "/dashboard"
      # JS interactions here
      assert_selector ".map"
    end
  end
end
```
I would probably default to :rack_test at the ApplicationSystemTestCase level, and then provide a custom method in place of the using_driver stuff, for ease of refactoring and being more explicit as to why we're doing it e.g.
```ruby
test "switch to selenium just for this test" do
  using_javascript do
    visit "/dashboard"
    # JS interactions here
    assert_selector ".map"
  end
end
```
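A possible implementation of that helper — a sketch only. `Capybara.current_driver=` is a real Capybara API and `:selenium_chrome_headless` is one of its stock driver registrations, but note the caveat in the comment: switching drivers mid-test starts a fresh browser session, so any login state built up under rack-test does not carry over.

```ruby
# test/application_system_test_case.rb (hypothetical helper)
class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  driven_by :rack_test

  private

  # Run the given block in a real headless browser. Caveat: the browser
  # gets a brand-new session, so cookies/login state from the rack-test
  # part of the test are not shared with it.
  def using_javascript
    original = Capybara.current_driver
    Capybara.current_driver = :selenium_chrome_headless
    yield
  ensure
    Capybara.current_driver = original
  end
end
```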
I'd be interested in a comparison of test suite runtimes, to see if it's worthwhile.