stockboy
Expose Nokogiri parsing options for xml reader
Currently:
- The `:soap` provider returns a Nori hash from Savon by default
- The `:xml` reader either accepts a Nori hash, or parses XML strings using Nori
Nori is fine for the easy cases but is too lightweight as a general XML tool. We should expose the true XML parsing power of Nokogiri.
Can we make the options work sensibly so that a single `reader :xml` block can provide either lightweight Nori hashes or Nokogiri nodes?
The attribute map can read keys from anything that responds to the `[]` method (remember: `proc.call(x) == proc[x]`), so it might just be an option for what kind of object we yield for each record, such as:
- a Nori hash (current default)
- a wrapper for XPath as `->(key) { node.at_xpath(key) }`
- a wrapper for CSS as `->(key) { node.at_css(key) }`
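To illustrate why the `[]` duck-typing makes this cheap, here is a minimal sketch. `StubNode` is a hypothetical stand-in for a Nokogiri node (it only imitates `at_xpath`), and `read_field` is a hypothetical helper standing in for whatever the attribute map does internally; neither is part of stockboy.

```ruby
# Hypothetical stand-in for a Nokogiri node: only at_xpath is imitated,
# backed by a plain lookup table instead of a real XML document.
StubNode = Struct.new(:values) do
  def at_xpath(key)
    values[key]
  end
end

# Hypothetical helper: reads one field from any record that responds to [].
def read_field(record, key)
  record[key]
end

# A Nori-style hash record.
hash_record = { "email" => "a@example.com" }

# A lambda wrapper around a node. Proc#[] is an alias for Proc#call,
# so the lambda can be read with the same [] syntax as the hash.
node = StubNode.new({ "/Details/EmailAddress" => "a@example.com" })
xpath_record = ->(key) { node.at_xpath(key) }

read_field(hash_record, "email")
read_field(xpath_record, "/Details/EmailAddress")
```

Both calls go through the same `record[key]` expression, which is why the attribute map itself would not need to know which kind of record the reader yields.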
Proposing:
```ruby
reader :xml do
  # Altered "elements" option, with current nori default aliased to `:hash` option:
  elements "body", "ul", "li"
  elements hash: ["body", "ul", "li"] # same (default)
  elements xpath: "//body/ul/li"
  elements css: "body > ul > li"

  # New "keys" option for wrapping nodes passed to the attribute map:
  keys :hash # (default)
  keys :xpath
  keys :css
end
```
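One way the three `elements` call forms could be told apart is sketched below. `normalize_elements` is a hypothetical helper, not an existing stockboy method; it only assumes that a bare list of names means the current Nori behaviour and that a single-key hash selects the strategy.

```ruby
# Hypothetical helper, not part of stockboy: normalizes the proposed
# `elements` arguments into a [strategy, selector] pair.
def normalize_elements(*args)
  if args.size == 1 && args.first.is_a?(Hash)
    opts = args.first
    unless opts.size == 1
      raise ArgumentError, "expected exactly one of :hash, :xpath, :css"
    end

    strategy, selector = opts.first
    unless %i[hash xpath css].include?(strategy)
      raise ArgumentError, "unknown elements strategy: #{strategy.inspect}"
    end

    # The :hash strategy takes a list of nested element names;
    # :xpath and :css take a single selector string.
    [strategy, strategy == :hash ? Array(selector) : selector]
  else
    # A bare list of names keeps the current Nori-based default.
    [:hash, args.flatten]
  end
end

normalize_elements("body", "ul", "li")
normalize_elements(hash: ["body", "ul", "li"])
normalize_elements(xpath: "//body/ul/li")
normalize_elements(css: "body > ul > li")
```

With this shape, the first two calls normalize to the same pair, so the existing positional form stays backward compatible.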
The attribute map could remain unchanged:
```ruby
attributes do
  email from: "/Details/EmailAddress", as: [:string] # when reader gives keys: :xpath
end
```
@markedmondson would this be in line with what you have done in your "xpath" reader, or do you still think we need to expose `reader :xpath` as a separate option from `:xml`?