Shai Yallin
@itamararjuan please handle ASAP
Is it worthwhile to attempt scraping the PDF files?
> I'm probably missing something, but can't we just download the CSV and parse that?

This is the URL I was referring to: https://www.gov.il/he/departments/guides/pro_felling_trees
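If that page does expose a downloadable CSV, parsing it could be as simple as something like the sketch below. The direct CSV URL is a placeholder and the column handling is naive, since I haven't verified what the gov.il page actually links to:

```
// Placeholder URL: the real download link on the gov.il page needs to be verified.
const CSV_URL = 'https://www.gov.il/.../felling_permits.csv';

async function fetchPermitsCsv() {
  const res = await fetch(CSV_URL); // global fetch, Node 18+
  if (!res.ok) throw new Error(`CSV download failed: ${res.status}`);
  const text = await res.text();
  // Naive split-based parsing; a real implementation should use a proper CSV
  // parser (e.g. csv-parse) to handle quoted fields and embedded commas.
  const [headerLine, ...rows] = text.trim().split('\n');
  const headers = headerLine.split(',').map((h) => h.trim());
  return rows.map((row) => {
    const cells = row.split(',');
    return Object.fromEntries(headers.map((h, i) => [h, (cells[i] || '').trim()]));
  });
}

fetchPermitsCsv()
  .then((permits) => console.log(`parsed ${permits.length} permits`))
  .catch(console.error);
```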
`server/api/model/tree_permit_constants.js`:

```
exports.UNSUPPORTED_PLACES = [
  'אשדוד',
  'באר שבע',
  'גבעתיים',
  'גבעת שמואל',
  'הוד השרון',
  'חיפה',
  'יבנה',
  'יפו',
  'נתניה',
  'פתח תקווה',
  'ראשון לציון',
  'רחובות',
  'רמת גן',
  'תל אביב-יפו',
];
```

Why are...
I get this by running the crawler against the above URL:

```
TreePermit {
  attributes: [Object: null prototype] {
    regional_office: 'פקיד יערות מחוז תל אביב',
    permit_number: '1116',
    person_request_name: 'אברהם הרב',
    ...
```
Interesting. So we're back to wondering whether it's worthwhile to parse the PDFs and scrape whatever data we can out of them.
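For reference, this is roughly what "scrape whatever we can" from a permit PDF could look like with the pdf-parse package. The file path and the permit-number regex are illustrative assumptions, not the project's actual parsing logic:

```
const fs = require('fs');
const pdf = require('pdf-parse');

// Hypothetical extraction: pull the raw text out of the permit PDF and look
// for something that resembles a permit number. The regex is a guess at the
// document wording, not the project's real logic.
async function extractPermitFields(pdfPath) {
  const { text } = await pdf(fs.readFileSync(pdfPath));
  const permitNumber = (text.match(/\b\d{3,6}\b/) || [null])[0];
  return { permitNumber, rawText: text };
}

extractPermitFields('./permit.pdf')
  .then((fields) => console.log(fields.permitNumber))
  .catch(console.error);
```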
I see that there are no tests for the tree-felling crawling process; is that on purpose? I'll try to contribute (rough test sketch below). The flow as I see it is: 1)...
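To make the contribution concrete, here's a rough idea of what a first test could look like, assuming a Jest-style runner and a hypothetical parseStreetFromFileName helper; the real module layout will probably look different:

```
// Hypothetical helper, standing in for "parse the street address out of the
// PDF file name"; the real crawler probably exposes something else.
const parseStreetFromFileName = (fileName) =>
  fileName.replace(/\.pdf$/i, '').split('_')[0];

describe('tree felling crawler', () => {
  it('extracts the street address from the PDF file name', () => {
    expect(parseStreetFromFileName('הרצל 10_תל אביב-יפו_2020-05-01.pdf')).toBe('הרצל 10');
  });
});
```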
Then again, we can extract 100% of the street addresses from the PDF file names, and we can conceivably create a unique id from the street address, city, and publication date. So we...
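Something like this is what I have in mind for the unique id; the field names are assumptions, not the actual model schema:

```
const crypto = require('crypto');

// Derive a stable id from street address, city, and publication date.
// Field names here are assumptions, not the model's actual schema.
function permitId({ street, city, publicationDate }) {
  // Normalize whitespace so the same permit always hashes to the same id.
  const key = [street, city, publicationDate].map((s) => String(s).trim()).join('|');
  return crypto.createHash('sha1').update(key).digest('hex');
}

// Two crawls of the same permit produce the same id.
console.log(permitId({ street: 'הרצל 10', city: 'תל אביב-יפו', publicationDate: '2020-05-01' }));
```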
Wow. That was fast. I'll check soon.
Nope, same behavior.