ExtractCSSPat issue on search results

Open hamedf62 opened this issue 6 years ago • 2 comments

Dear Salim, thank you so much for your great efforts and this useful package!

I have an issue with crawling and gathering data from search result pages.

I passed a few CSS selectors in ExtractCSSPat, but some of the pages do not include the required data, so those selectors match nothing on them.

The following error occurs:

    Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
      arguments imply differing number of rows: 1, 0
    In addition: Warning messages:
    1: In UseMethod("xml_remove") :
      closing unused connection 7 (<-DESKTOP-K73RC4R:11502)
    2: In UseMethod("xml_remove") :
      closing unused connection 6 (<-DESKTOP-K73RC4R:11502)
    3: In UseMethod("xml_remove") :
      closing unused connection 5 (<-DESKTOP-K73RC4R:11502)
    4: In UseMethod("xml_remove") :
      closing unused connection 4 (<-DESKTOP-K73RC4R:11502)
    5: In UseMethod("xml_remove") :
      closing unused connection 3 (C:/Users/Hamed/Documents/tripadvisor.com-281027/extracted_data.csv)

I thought it might be possible to add a conditional statement that checks whether the CSS selector matches anything and, if not, returns NULL in the data set.
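To illustrate the kind of check meant here, below is a minimal base-R sketch. It is not Rcrawler's actual code: `na_if_empty`, `title`, and `price` are hypothetical names, and the extraction results are simulated rather than scraped. It uses NA instead of NULL as the placeholder, because a NULL column would be dropped from the data frame.

```r
# Hypothetical helper (not part of Rcrawler): replace an empty
# extraction result with NA so every column has at least one value.
na_if_empty <- function(x) {
  if (length(x) == 0) NA_character_ else x
}

# Simulated results for one page: the second selector matched nothing.
title <- "Hotel Example"
price <- character(0)

# Building the row directly fails because the columns have lengths 1 and 0.
err <- tryCatch(
  data.frame(title = title, price = price),
  error = function(e) conditionMessage(e)
)

# With the check in place, the missing field becomes NA and the row is kept.
row <- data.frame(
  title = na_if_empty(title),
  price = na_if_empty(price),
  stringsAsFactors = FALSE
)
```

Here `err` captures the failure from the unguarded call, while `row` is a one-row data frame whose `price` field is NA.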

hamedf62, Nov 28 '18 07:11

Dear Salim, I made two important corrections in the source and would like to contact you to explain them.

Regards

hamedf62, Nov 30 '18 04:11

By default, the crawler returns an NA value for non-existent CSS patterns. I need to check your command and run it locally to investigate the issue. Send it by email if you don't want it to be public.

salimk, Nov 30 '18 10:11