node-osmosis

URL encoding (follow method) - ERR_UNESCAPED_CHARACTERS

Open andriibieriezhnoi opened this issue 6 years ago • 4 comments

Hello, I got this error while using the .follow() method.

_http_client.js:115
      throw new ERR_UNESCAPED_CHARACTERS('Request path');
      ^

TypeError [ERR_UNESCAPED_CHARACTERS]: Request path contains unescaped characters
    at new ClientRequest (_http_client.js:115:13)
    at Object.request (http.js:41:10)
    at Needle.send_request (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/needle/lib/needle.js:465:26)
    at next (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/needle/lib/needle.js:361:10)
    at Needle.start (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/needle/lib/needle.js:364:17)
    at Object.exports.request (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/needle/lib/needle.js:746:56)
    at Request (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/osmosis/lib/Request.js:15:19)
    at Osmosis.request (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/osmosis/index.js:187:5)
    at Osmosis.dequeueRequest (/Users/andrey/Projects/kankakeecountyed/scripts/node_modules/osmosis/index.js:269:10)
    at /Users/andrey/Projects/kankakeecountyed/scripts/node_modules/osmosis/index.js:223:22
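The trace above shows the exception coming from the `ClientRequest` constructor in Node's own HTTP client, not from osmosis or needle. A minimal sketch that triggers the same error code without either library is below. The host `example.com` and the path with spaces are made-up values for illustration; the thread does not say which href actually failed.

```javascript
const http = require('http');

// Hypothetical path containing spaces; not taken from the scraped site.
const badPath = '/news and updates.aspx';

let caught = null;
try {
  // The path is validated when the request object is constructed,
  // so this throws synchronously and nothing is sent.
  http.request({ host: 'example.com', path: badPath });
} catch (err) {
  caught = err;
}

console.log(caught && caught.code);
console.log(caught && caught.message);
```

If this reproduces on your Node version, the failing input is most likely one of the URLs passed down to the HTTP client rather than anything in the osmosis chain itself.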

My config:

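const fs = require('fs');
const osmosis = require('osmosis');

// Page being scraped (the URL given later in this thread) and the result accumulator
const SOURCE_URL = 'http://kankakeecountyed.org/about-us/news-and-updates.aspx';
const savedNews = [];

osmosis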
  .get(SOURCE_URL)
  .find('.newsSummaryItem')
  .set({
    title: '.newsTitle',
    date: '.newsDate',
    excerpt: '.newsSummary',
  })
  .follow('.readMore@href')
  .set({
    content: '.newsSummary',
  })
  .data(data => savedNews.push(data))
  .log(console.log)
  .error(console.log)
  .debug(console.log)
  .done(() => {
    fs.writeFile('news.json', JSON.stringify(savedNews, null, 4), (err) => {
      if (err) {
        console.log(err);
      } else {
        console.log(`Data saved to news.json file.\nNews count: ${savedNews.length}`);
      }
    });
  });

andriibieriezhnoi, Dec 28 '18 10:12

I think SOURCE_URL has unescaped characters, please paste it here.

BitFros7y, Jan 07 '19 20:01

@NegativeIQ http://kankakeecountyed.org/about-us/news-and-updates.aspx

It works on Node.js 9.7.1, but I get this error on Node 10.14.* and 10.15.

andriibieriezhnoi, Jan 07 '19 21:01

@NegativeIQ https://repl.it/@andreyberezhnoy/news-scrapping

repl.it uses Node 9.7.1.

andriibieriezhnoi, Jan 07 '19 21:01

Did you manage to solve the problem? I ran into the same issue.

andykov, Jul 17 '19 11:07
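The thread has no confirmed fix. Since SOURCE_URL itself contains nothing that needs escaping, one possible cause (an assumption, not confirmed here) is that an href matched by `.readMore@href` contains spaces or non-ASCII characters. The sketch below shows one way to detect and escape such an href with Node's built-in WHATWG `URL` class. `hasUnescapedChars` and `escapeHref` are hypothetical helpers, not part of osmosis or needle, and the sample href is invented. How to hook the escaping into `.follow()` is not shown, because the thread does not document an osmosis option for it.

```javascript
const { URL } = require('url');

// Assumption: treat anything outside U+0021..U+00FF (spaces, control
// characters, most non-Latin characters) as needing escaping.
const SUSPECT_CHARS = /[^\u0021-\u00ff]/;

// Hypothetical helper: does this href contain characters that need escaping?
function hasUnescapedChars(href) {
  return SUSPECT_CHARS.test(href);
}

// Hypothetical helper: resolve href against the page URL. The WHATWG URL
// parser percent-encodes spaces and non-ASCII characters in the path and
// leaves existing %XX sequences untouched.
function escapeHref(href, baseUrl) {
  return new URL(href, baseUrl).href;
}

const base = 'http://kankakeecountyed.org/about-us/news-and-updates.aspx';
const rawHref = '/about-us/news and updates/teacher\u2019s day.aspx'; // invented example

const escaped = escapeHref(rawHref, base);
console.log(hasUnescapedChars(rawHref)); // true
console.log(escaped);
// http://kankakeecountyed.org/about-us/news%20and%20updates/teacher%E2%80%99s%20day.aspx
console.log(hasUnescapedChars(escaped)); // false
```

Logging the raw hrefs and running them through a check like `hasUnescapedChars` would at least show whether a followed link is what triggers the error.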