Expose Nokogiri parsing options for xml reader

Open • avit opened this issue 11 years ago • 2 comments

Currently:

  • The :soap provider returns a Nori hash from Savon by default
  • The :xml reader either accepts a Nori hash, or parses XML strings using Nori

Nori is fine for the easy cases but is too lightweight as a general XML tool. We should expose the true XML parsing power of Nokogiri.

Can we make the options work sensibly so that a single reader :xml block can provide either lightweight Nori hashes or Nokogiri nodes?

The attribute map can read keys from anything that responds to the [] method (remember: proc.call(x) == proc[x]), so it might just be an option for what kind of object we yield for each record, such as:

  • a Nori hash (current default)
  • a wrapper for xpath as ->(key) { node.at_xpath(key) }
  • a wrapper for css as ->(key) { node.at_css(key) }
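
To illustrate the point about [] above, here is a minimal sketch in plain Ruby. It assumes the attribute map only ever calls [] on the record; read_field and StubNode are hypothetical names made up for this example and are not part of stockboy. StubNode stands in for a Nokogiri node so the sketch runs without the gem (a real node would supply at_xpath and at_css itself).

```ruby
# Hypothetical helper standing in for the attribute map's lookup.
# It only requires that the record responds to [].
def read_field(record, key)
  record[key]
end

# Stand-in for a parsed XML node (assumption: not a real Nokogiri node).
# It answers at_xpath and at_css from a fixed lookup table.
class StubNode
  def initialize(values)
    @values = values
  end

  def at_xpath(path)
    @values[path]
  end

  def at_css(selector)
    @values[selector]
  end
end

node = StubNode.new({
  "/Details/EmailAddress"  => "a@example.com",
  "Details > EmailAddress" => "a@example.com"
})

hash_record  = { "email" => "a@example.com" }  # current default shape
xpath_record = ->(key) { node.at_xpath(key) }  # wrapper for xpath
css_record   = ->(key) { node.at_css(key) }    # wrapper for css

# The same lookup works for all three, because Proc#[] is an alias of Proc#call.
emails = [
  read_field(hash_record, "email"),
  read_field(xpath_record, "/Details/EmailAddress"),
  read_field(css_record, "Details > EmailAddress")
]
```

If this holds, the reader would only need to decide which of the three objects to yield per record; the code that consumes the record would not need to know the difference.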

avit, Feb 17 '14 23:02

Proposing:

reader :xml do

  # Altered "elements" option, with current nori default aliased to `:hash` option:
  elements "body", "ul", "li"
  elements hash: ["body", "ul", "li"] # same (default)
  elements xpath: "//body/ul/li"
  elements css: "body > ul > li"

  # New "keys" option for wrapping nodes passed to the attribute map:

  keys :hash   # (default)
  keys :xpath
  keys :css
end
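
As a rough sketch of how the three forms of the proposed "elements" option could be reduced to one internal shape, the reader might normalize its arguments to a [strategy, selector] pair. The method name normalize_elements and the pair representation are assumptions made for illustration; nothing here reflects existing stockboy code.

```ruby
# Hypothetical: accepts the argument forms from the proposal and returns
# [strategy, selector], where strategy is :hash, :xpath or :css.
ELEMENT_STRATEGIES = [:hash, :xpath, :css].freeze

def normalize_elements(*args)
  if args.length == 1 && args.first.is_a?(Hash)
    opts = args.first
    unless opts.size == 1
      raise ArgumentError, "expected exactly one of #{ELEMENT_STRATEGIES.inspect}"
    end

    strategy, selector = opts.first
    unless ELEMENT_STRATEGIES.include?(strategy)
      raise ArgumentError, "unknown strategy: #{strategy.inspect}"
    end

    # The :hash form keeps the existing list-of-keys shape.
    selector = Array(selector).map(&:to_s) if strategy == :hash
    [strategy, selector]
  else
    # Bare arguments keep the current behaviour, aliased to :hash.
    [:hash, args.flatten.map(&:to_s)]
  end
end

normalize_elements("body", "ul", "li")
normalize_elements(hash: ["body", "ul", "li"])
normalize_elements(xpath: "//body/ul/li")
normalize_elements(css: "body > ul > li")
```

The first two calls are meant to produce the same pair, which matches the "same (default)" comment in the proposal.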

The attribute map could remain unchanged:

attributes do
  email from: "/Details/EmailAddress", as: [:string]  # when reader gives keys: :xpath
end

avit, Feb 17 '14 23:02

@markedmondson, would this be in line with what you have done in your "xpath" reader, or do you still think we need to expose reader :xpath as a separate option from :xml?

avit, Feb 17 '14 23:02