|
Source: ONLamp.com
Why the Lucky Stiff’s fast, enjoyable Hpricot library makes hard Rails View tests effective and fun. Hpricot is a deep and useful HTML parser with a wide, flexible interface. It supports many clever systems to read and edit HTML. When we put it to work in Rails functional tests, it offers lots of different ways to solve hard problems.
This dissertation depends on Ruby on Rails; the general techniques apply to any web development. The assertions presented here are available in the Rails plugin assert_xpath.
Hpricot’s arch enemy is REXML, an XML parser bundled with Ruby. Here’s their score chart:
|            | Hpricot | REXML |
|------------|---------|-------|
| compliance | forgives anything vaguely resembling HTML | too strict for Transitional XHTML |
| utility    | several creative Domain Specific Languages, for lean, clear expressions | XPath, and a terse Object Model |
| queries    | good CSS selectors and poor XPath | no CSS selectors and perfect XPath |
| speed      | optimized with C | pessimized with Regexps |
The test plugin assert_xpath supports both systems, and enhances their DSLs.
A Rails functional test works by mocking the web server, and generating a sample web page as a big string, in the variable @response.body. Then a test case parses this string, looking for its important details. This technique avoids the overhead of invoking a real web server and browser, and commanding each to do something outside its performance envelope. The two query languages for HTML are CSS selectors and XPath.
Here’s a test case using raw Hpricot, before we cook it up in reusable assertions:

    def test_raw_Hpricot
      get :index, :id => 'FrontPage'  # serve a WikiWiki
      doc = Hpricot(@response.body)  # read the mock server response
      script = doc.search('script[3]').first  # locate our target
      assert_equal 'text/javascript', script['type'], 'script should be JS'
      assert_match /&/, script.to_s, 'oh no! our script has a & character!'
    end
Test cases can choose between Hpricot and REXML, to leverage each one’s advantages. assert_xml uses either, depending on a recent call to invoke_hpricot or invoke_rexml. (Use this technique with the Abstract Test Pattern, to run assertions twice.) Call assert_hpricot or assert_rexml directly, to override this default.
Assertiveness Counseling
Now we bundle those Hpricot calls up into two assertions, assert_hpricot and assert_xpath:

    def test_with_assert_hpricot
      get :index, :id => 'FrontPage'
      assert_hpricot  # @response.body is the default
      script = assert_xpath('script[3]')
      assert_equal 'text/javascript', script['type'], 'script should be JS'
      assert_match /&/, script.to_s, 'our script has a & character!'
    end
Because Hpricot is forgiving, assert_hpricot itself does not actually assert very much! (Use assert_rexml or assert_tidy to validate your code.) The important part is the next line, assert_xpath, because it wraps doc.search, so we can put a wide subset of XPath into it. In this case, we only put in a [3], to select the third <script>.
You Are all Forgiven
Like a web browser, Hpricot forgives your HTML for its sins. Some test cases should not. But REXML is so unforgiving that Transitional XHTML might break it. The fun starts when your XML contains an & without its escapes:

    # both Hpricot and REXML like well-formed &amp; escapes:
    assert_xml '<a>&amp;</a>', 'a[ "&" = . ]'
    #          ^ input XML     ^ XPath to satisfy

    # only Hpricot likes ill-formed escapes;
    assert_hpricot '<a>&</a>', 'a[ "&" = . ]'
    assert_raise_message REXML::ParseException, /Illegal character '&'/ do
      assert_rexml '<a>&</a>', 'a[ "&" = . ]'
    end

    # and both like incomplete escapes!
    assert_xml '<a>&yo</a>', 'a[ "&yo" = . ]'
Why is that important? Because web browsers don’t process the escapes found in embedded JavaScript. That forces our tools to incorrectly escape these escapes when they generate HTML. So a Rails call to javascript_tag("document.write('&amp;');"), for example, will emit this:

    <script type="text/javascript">
    //<![CDATA[
    document.write('&amp;');
    //]]>
    </script>

Bless ActionView’s pointy head for escaping the entire block correctly, but according to the “law” (or “recommendations”), that output should contain &amp;amp;. Browsers should interpret that and pass &amp; as a source code literal to JavaScript, and this should push &amp; into the browser’s surface, which should then display & to your user. If an HTML tool like javascript_tag corrected that &amp;, modern browsers would not interpret it before the JavaScript layer, and your users would see &amp;. That’s not really what you wanted, and browsers can’t upgrade until everyone in the world who wrote their websites with Notepad upgrades their source. Don’t hold your breath. And so javascript_tag doesn’t escape the &amp; to &amp;amp;.
The culture of XML enforces well-formed contents, typically machine-generated. So even if REXML does not choke on any appearance of & followed by alphabetic characters, it still chokes on all the other appearances of &, such as && for AND operations. And you can’t escape them because your browser won’t de-escape them. If these problems prevent you from using assert_rexml, prepare your XHTML first with a call like:

    @response.body.gsub!(%r/&(?=[^a-z])/i, '&amp;')
Hpricot doesn’t have all these problems.
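The gsub! preparation step above can be tried in isolation, without Rails or either parser. This is a minimal sketch: the pattern is the one quoted in the text, while the helper name escape_bare_ampersands and the sample string are assumptions made up for illustration.

```ruby
# Pattern quoted from the text: an ampersand followed by any non-letter.
BARE_AMPERSAND = %r/&(?=[^a-z])/i

# escape_bare_ampersands is a hypothetical helper name, not part of any plugin.
# It returns a copy of the markup with each bare ampersand escaped.
def escape_bare_ampersands(markup)
  markup.gsub(BARE_AMPERSAND, '&amp;')
end

# Invented sample: one && operator and one incomplete escape.
sample = "if (a && b) { document.write('&yo'); }"
puts escape_bare_ampersands(sample)
# prints: if (a &amp;&amp; b) { document.write('&yo'); }
```

Both halves of && get escaped, while &yo is left alone because a letter follows the ampersand. Note one limit of the pattern: a numeric reference such as &#38; starts with a non-letter after the &, so it would be escaped a second time.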
Functional Tests for Views
A Rails test that operates on a controller is a “functional test”. These should guide the operations of complete features. Ideally, all our low-level data manipulations should appear inside models. Controllers control data transactions, and send results to Views. So the place to start view testing is the functional tests, where each page we render comes back as a big string.

    def test_buy_item_form
      login_as :tygr
      get :index
      assert_hpricot
      action = url_for(:action => :buy_items)
      assert_xpath "//form[ '#{ action }' = @action ]"
    end
The login_as method comes from one of Rails’s nifty authentication plugins. Then get :index simulates fetching the index page of our current controller. The assert_hpricot absorbs its output, and the assert_xpath reaches out to a suspect FORM.
Note that we always concoct URIs using url_for(), and we never hard-code FORM actions, such as “/training/buy_items”. We don’t want our tests to break just because we changed the file routes.rb.
The test is not complete yet because it doesn’t do anything with the FORM. First, we will upgrade its Hpricot stylings.
CSS Selectors
The search above used XPath to query for a given FORM; the next one uses CSS selector notation to identify the same FORM. Hpricot supports a subset of XPath, and CSS selectors, through the same interface, so we can always use the system that’s most convenient. For example, if we must target a div with multiple classes, one of them class_D, our first attempt at a matching XPath is odious and fragile:

    .//div[ contains(@class, "class_D") ]

That’s fragile because a different class, “class_Dismissed”, would provide a false match. A better XPath would require more tedious string manipulations in its [predicate] filter. The CSS notation is clearer and more accurate: “div.class_D”.
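The false match is easy to reproduce with plain strings. The sketch below is not how Hpricot evaluates either notation; it only contrasts a substring test, which is what contains() performs, with a comparison of whole class tokens, which is what a CSS class selector implies. Both method names are hypothetical.

```ruby
# Mirrors contains(@class, "class_D"): a plain substring test.
def substring_class_match?(class_attr, name)
  class_attr.include?(name)
end

# Compares whole whitespace-separated class tokens, as div.class_D implies.
def token_class_match?(class_attr, name)
  class_attr.split.include?(name)
end

# Two invented class attributes: a real target and a lookalike.
['class_A class_D', 'class_Dismissed'].each do |class_attr|
  puts "#{class_attr.inspect}: " \
       "substring=#{substring_class_match?(class_attr, 'class_D')} " \
       "token=#{token_class_match?(class_attr, 'class_D')}"
end
```

The substring test accepts both attributes, while the token test accepts only the first.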
So this test case finds our FORM using its unique id, not its action:

    form = assert_xpath('form#buying_items')
    action = url_for(:action => :buy_items)
    assert_equal action, form[:action]
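For readers who want to see the same check without the plugin, here is a rough equivalent using REXML directly. It is only a sketch under stated assumptions: the markup stands in for @response.body and is invented, the expected action is hard-coded where a real test would call url_for, and REXML needs the XPath form of the query because it has no CSS selectors.

```ruby
require 'rexml/document'

# Hypothetical stand-in for @response.body; the markup is invented for this sketch.
body = <<~XHTML
  <html>
    <body>
      <form id="buying_items" action="/training/buy_items" method="post">
        <input type="hidden" name="user[id]" value="42" />
      </form>
    </body>
  </html>
XHTML

doc  = REXML::Document.new(body)
# XPath spelling of the CSS selector form#buying_items.
form = REXML::XPath.first(doc, "//form[@id='buying_items']")

expected_action = '/training/buy_items'  # a real test would call url_for instead
raise 'FORM not found' if form.nil?
raise 'wrong action' unless form.attributes['action'] == expected_action
puts "found FORM with action #{form.attributes['action']}"
```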
This raises the question of how to test the link from that URI to its target action in the controller. We could change that action’s name, and this test wouldn’t break. Because unit tests for web sites cannot (yet) work with real servers and browsers, we must at least test each step, with overlapping test cases. One case will test that we have a FORM, the next tests that it calls the right controller action, the next tests that the controller action does the right thing, and so on.
Submitting Forms
The Rails plugin form_test_helper works with assert_select (another useful assertion system based on an HTML parser and CSS selectors) to read a FORM’s input variables, and present each one as a helpful little collection. We can assert that our FORM contains the right action, then assert that submitting our FORM, with its current fields, will call the action correctly.

    def test_buy_item_submit_form
      login_as :tygr
      get :index
      assert_hpricot
      form = assert_xpath('form#buying_items')
      action = url_for(:action => :buy_items)
      assert_equal action, form[:action]

      submit_form form[:action] do |post|
        assert_equal users(:tygr).id.to_s, post['user[id]'].value
        post[:prop_1].check
        post[:prop_4].check
      end
      # assertions here should check the controller
      # updated the model and database correctly
    end
submit_form passes its post information into our block for treatment. We can assert that some automatic fields are populated correctly (including hidden ones), and we can simulate user input by changing some fields.
(Tip: Temporarily run p post.field_names, to remind yourself what your FORM contains.)
Conclusion
Hpricot’s XPath system cannot handle long elaborate queries. Use REXML if you need those. And Hpricot’s forgiveness envelope is a benefit when retrofitting tests to ill-formed HTML, but it’s a liability when building a site from scratch. Test cases should always incidentally coerce your code to improve its quality. If a super-strict test case, based on REXML, suddenly fails, you should revert your most recent edit and try again. This time you might not make the same mistake. Hpricot, in its default configuration, would not have warned you.
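To make the strictness concrete, the sketch below feeds REXML one well-formed fragment and one with an unclosed tag. The helper name well_formedness_error and both fragments are assumptions for illustration; the point is only that REXML raises REXML::ParseException on markup a forgiving parser would accept.

```ruby
require 'rexml/document'

# Hypothetical helper: returns nil when the markup is well-formed,
# or the parser's complaint when it is not.
def well_formedness_error(markup)
  REXML::Document.new(markup)
  nil
rescue REXML::ParseException => e
  e.message
end

good = '<ul><li>one</li><li>two</li></ul>'
bad  = '<ul><li>one<li>two</li></ul>'  # the first <li> never closes

puts "good: #{well_formedness_error(good).nil? ? 'parsed' : 'rejected'}"
puts "bad:  #{well_formedness_error(bad).nil? ? 'parsed' : 'rejected'}"
```

A strict test built this way fails on the edit that introduced the unclosed tag, which is the early warning the paragraph above describes.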
A test case can mix-and-match REXML and Hpricot freely, by passing the results of one into the base method of the other:

    def test_handoff
      assert_rexml '' + ' ' + ' ' + ' ' + ''

      assert_xpath '/anna/marie/candy/lights' do |lights|
        lights = assert_hpricot(lights.to_s)  # transfer a fragment of XML
        lights.since.imp.pulp{ @lay == 'things' }
      end  # both assert_rexml and assert_hpricot
    end    # support these query notations
These assertions allow Rails view tests to move beyond reacting to code changes. You can upgrade a test to fail for the right reason, and then upgrade your code to pass the test. This improves confidence that your tests cover the right things, and you can change your code more freely without making mistakes.
|