node-osmosis
v1.1.0 - When a query fails in a nested context, the innerHTML of the last successful query is returned.
In a nested context, when the query fails (the element, attribute, etc. doesn't exist or doesn't match), osmosis sets the value of the object to what appears to be innerHTML. I'd expect null, or an empty array if an array wraps the context.
example:
osmosis.parse(myhtml)
  .find('.may-or-may-not-have-form')
  .set({
    name: 'form > .name',
    inputs: [
      osmosis.find('input').set({name: '.name'})
    ]
  })
When the form exists I get an array of inputs (as expected), but when the form doesn't exist, inputs is set to a single-element array containing a string.
Try changing osmosis.find('input') to osmosis.find('form input')
No matter what I use (select or find, a greedy or strict selector query), I still get the innerHTML of an element that doesn't match the query. For the input that does match, I get the expected results.
Here's what I think is a minimal working example showcasing the issue.
.html
<html>
  <head><title>osmosis input</title></head>
  <body>
    <div class="section">
      <div class="title">Section without a form.</div>
    </div>
    <div class="section">
      <div class="title">Section with a form.</div>
      <form>
        <input type="text" name="sectionText" />
      </form>
    </div>
    <div class="section">
      <div class="title">Another section without a form.</div>
    </div>
  </body>
</html>
.js
import fs from 'fs';
import path from 'path';
import osmosis from 'osmosis';

const INPUTFILE = path.join('ext', 'example.html');

fs.readFile(INPUTFILE, null, function(err, data) {
  osmosis
    .parse(data)
    .find('.section')
    .set({
      title: '.title',
      inputs: [
        osmosis
          .find('form input')
          .set({
            name: '@name'
          })
      ]
    })
    .data(function(data) {
      console.log(JSON.stringify(data, null, 1));
    })
    .done(function() {
      console.log('done');
    })
    .log(console.log)
    .error(console.log);
});
console
(find) found 3 results for ".section"
{
 "title": "Section without a form.",
 "inputs": [
  "\n Section without a form.\n "
 ]
}
(find) no results for "form input"
(find) found 1 results for "form input"
{
 "title": "Section with a form.",
 "inputs": [
  {
   "name": "sectionText"
  }
 ]
}
{
 "title": "Another section without a form.",
 "inputs": [
  "\n Another section without a form.\n "
 ]
}
(find) no results for "form input"
done
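Until this is addressed in osmosis itself, one possible stopgap is to clean each record after it arrives. The sketch below is only a workaround, not a fix: `dropStrayStrings` is a hypothetical helper (not part of osmosis), and it assumes that every legitimate entry in the nested array is an object, as in the `inputs` example above, so any string entry can be treated as the stray text.

```javascript
// Hypothetical helper (not part of osmosis): return a copy of `record`
// in which `record[key]` keeps only its object entries. Stray strings,
// like the text shown in the console output above, are dropped.
function dropStrayStrings(record, key) {
  const value = record[key];
  // Treat a missing or non-array value as an empty array.
  const items = Array.isArray(value) ? value : [];
  const cleaned = items.filter(
    (item) => item !== null && typeof item === 'object'
  );
  return { ...record, [key]: cleaned };
}

// Sample records copied from the console output above.
const withoutForm = {
  title: 'Section without a form.',
  inputs: ['\n Section without a form.\n ']
};
const withForm = {
  title: 'Section with a form.',
  inputs: [{ name: 'sectionText' }]
};

console.log(JSON.stringify(dropStrayStrings(withoutForm, 'inputs')));
console.log(JSON.stringify(dropStrayStrings(withForm, 'inputs')));
```

In the example script this could be called inside the `.data` callback, e.g. `dropStrayStrings(data, 'inputs')`, before the record is logged or stored.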
Yes, if you are scraping multiple contexts and any earlier one throws an error, the later ones will not return any values. :( This is not good.