hlidac-shopu

rozetka 3 actors added

Open vladyslav-n opened this issue 3 years ago • 8 comments

These are 3 new TypeScript actors for the Hlidac Rozetka project by Geniusee, with edits made after the previous pull request attempt.

vladyslav-n Apr 14 '21 12:04

Please do not use a custom editorconfig, eslintrc, or prettierrc. Also, please do not use TypeScript. Actors have to be directly usable without any compilation step.

rarous Apr 14 '21 12:04

Dear Aleš,

I’m a developer at Geniusee, and we are partners with Apify. We usually develop our actors for Apify in TypeScript and include a dist folder so that the actor doesn’t require any build step before it runs. Does that still mean we should change the TypeScript code to JavaScript? If it’s crucial for the project requirements, we’ll certainly do so.

Sincerely yours,

Vlad

vladyslav-n Apr 14 '21 13:04

Oh, I see Hlidac has the dist/ directory in .gitignore, so I missed that the dist directory was never actually pushed to the repo. Sorry about that. I could simply rename it to ‘build/’ so that it is definitely pushed. In that case, would the TypeScript still need to be replaced with JavaScript? Thanks for your time!

vladyslav-n Apr 14 '21 14:04

Hey @rarous, I ported the Rozetka actors to JavaScript. Please check them out. Have a nice day!

vladyslav-n Apr 16 '21 11:04

@rarous @metalwarrior665 Hi guys! I think the code is ready for the second round of code review :)

vladyslav-n Apr 21 '21 12:04

@vladyslav-n I tried running the actor with "type" = "COUNT" and "type" = "DAILY", and I quite frequently get errors like this:

ArgumentError: Expected property string "url" to be a URL, got "/laboratornoe-osnashchenie/c4644808/page=11/" in object "requestLike"
      at Object.ow [as default] (/Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/node_modules/ow/dist/index.js:19:23)
      at RequestQueue.addRequest (/Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/node_modules/apify/build/storages/request_queue.js:173:21)
      at enqueueLastPage (file:///Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/actors/rozetka-daily/src/routes/helpers/enqueueLastPage.js:17:24)
      at countProductsOrSplitPriceRange (file:///Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/actors/rozetka-daily/src/routes/helpers/countProductsOrSplitPriceRange.js:21:15)
      at handleProductList (file:///Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/actors/rozetka-daily/src/routes/handleProductList.js:43:15)
      at CheerioCrawler.handlePageFunction [as userProvidedHandler] (file:///Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/actors/rozetka-daily/src/main.js:68:27)
      at CheerioCrawler._handleRequestFunction (/Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/node_modules/apify/build/crawlers/cheerio_crawler.js:452:49)
      at processTicksAndRejections (node:internal/process/task_queues:96:5)
      at async CheerioCrawler._runTaskFunction (/Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/node_modules/apify/build/crawlers/basic_crawler.js:423:13)
      at async AutoscaledPool._maybeRunTask (/Users/janfiedler/Work/TopMonks/GitHub/hlidac-eshopu/node_modules/apify/build/autoscaling/autoscaled_pool.js:399:13)

It looks like this happens only when the URL contains /page=x/.

janfiedler Jul 12 '21 10:07

@janfiedler Hi! It seems there's an issue with relative paths in URLs. I'll look into it and give some feedback tomorrow.
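
For illustration, here is a minimal sketch of one way to avoid the "Expected property string "url" to be a URL" error: resolve each link against the URL of the page it was found on before enqueueing it. The helper name toAbsoluteUrl and the https://example.com base are assumptions made for this example only; they are not taken from the actor's code, and this may not be the fix that was eventually applied.

```javascript
// Hypothetical helper (not from the actor's source): turn a possibly
// relative href into an absolute URL using the WHATWG URL constructor.
function toAbsoluteUrl(href, baseUrl) {
  // new URL(input, base) resolves a relative input against base;
  // an input that is already absolute is returned unchanged.
  return new URL(href, baseUrl).href;
}

// Placeholder base; in a crawler this would be the URL of the page
// on which the pagination link was found.
const pageUrl = "https://example.com/laboratornoe-osnashchenie/c4644808/";

// The relative path from the error message above.
const lastPage = toAbsoluteUrl(
  "/laboratornoe-osnashchenie/c4644808/page=11/",
  pageUrl
);

console.log(lastPage);
// https://example.com/laboratornoe-osnashchenie/c4644808/page=11/
```

The resolved string could then be passed as the url of the request being enqueued, since it now includes a scheme and host.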

vladyslav-n Jul 12 '21 11:07

@janfiedler Hi! Sorry for the delay. I made some updates to the actor's crawling and also explicitly added a new allowed content-type to the actor; it seems its absence was preventing the actor from working at all in both COUNT and DAILY modes. The crawling changes were needed because of some changes in the site's category logic.

vladyslav-n Aug 02 '21 17:08